Updated README, no need for the PR since the base nightly works pretty well on Blackwell (the PR is merged now)
#1
by crazymartian - opened
README.md CHANGED
@@ -116,7 +116,7 @@ This model is NVFP4 quantized with nvidia-modelopt **v0.44.0**
  116 | This model was obtained by quantizing the weights and activations of Minimax-M3 to NVFP4 data type. This optimization reduces the number of bits per parameter from 8 to 4, reducing disk size and GPU memory requirements by approximately 2x.
  117 |
  118 | ## Usage
- 119 | To serve this checkpoint with [vLLM](https://github.com/vllm-project/vllm), you currently need the nightly docker image
+ 119 | To serve this checkpoint with [vLLM](https://github.com/vllm-project/vllm), you currently need the nightly docker image. Launch the nightly image and run the sample command below:
  120 |
  121 | ```
  122 | vllm serve nvidia/MiniMax-M3-NVFP4 \