Update README.md
README.md CHANGED
@@ -61,7 +61,7 @@ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated sys
 
 ## Software Integration:
 **Supported Runtime Engine(s):** <br>
-* TRTLLM <br>
 
 **Supported Hardware Microarchitecture Compatibility:** <br>
 * NVIDIA Blackwell <br>
@@ -101,7 +101,7 @@ The model is quantized with nvidia-modelopt **v0.42.0** <br>
 
 
 ## Inference:
-**Acceleration Engine:** TRTLLM <br>
 **Test Hardware:** B200 <br>
 
 ## Post Training Quantization
@@ -112,7 +112,34 @@ This model was obtained by quantizing the weights and activations of Wan2.2-T2V-
 To serve this checkpoint with [TRTLLM](https://github.com/NVIDIA/TensorRT-LLM):
 
 ```sh
 trtllm-serve nvidia/Wan2.2-T2V-A14B-Diffusers-FP8 --extra_visual_gen_options ./examples/visual_gen/serve/configs/wan.yml
 ```
 
 ### Model Characteristics
@@ -61,7 +61,7 @@
 
 ## Software Integration:
 **Supported Runtime Engine(s):** <br>
+* TRTLLM,SGLang <br>
 
 **Supported Hardware Microarchitecture Compatibility:** <br>
 * NVIDIA Blackwell <br>
@@ -101,7 +101,7 @@
 
 
 ## Inference:
+**Acceleration Engine:** TRTLLM,SGLang <br>
 **Test Hardware:** B200 <br>
 
 ## Post Training Quantization
@@ -112,7 +112,34 @@
 To serve this checkpoint with [TRTLLM](https://github.com/NVIDIA/TensorRT-LLM):
 
 ```sh
+# TRTLLM
 trtllm-serve nvidia/Wan2.2-T2V-A14B-Diffusers-FP8 --extra_visual_gen_options ./examples/visual_gen/serve/configs/wan.yml
+
+# SGLang
+
+PROMPT='A cat and a dog baking a cake together in a cozy kitchen. The cat carefully measures flour while the dog stirs batter in a glass bowl, sunlight through the window, smooth cinematic camera motion.'
+
+FLASHINFER_DISABLE_VERSION_CHECK=1 \
+python -m sglang.multimodal_gen.runtime.entrypoints.cli.main generate \
+  --model-path nvidia/Wan2.2-T2V-A14B-Diffusers-FP8 \
+  --backend sglang \
+  --attention-backend torch_sdpa \
+  --performance-mode speed \
+  --dit-cpu-offload false \
+  --dit-layerwise-offload false \
+  --text-encoder-cpu-offload false \
+  --image-encoder-cpu-offload false \
+  --vae-cpu-offload false \
+  --pin-cpu-memory false \
+  --width 832 \
+  --height 480 \
+  --num-frames 81 \
+  --fps 16 \
+  --num-inference-steps 50 \
+  --guidance-scale 5.0 \
+  --seed 0 \
+  --warmup false \
+  --prompt "$PROMPT"
 ```
 
 ### Model Characteristics
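
Review note: the SGLang invocation added in this hunk passes `--num-frames 81` and `--fps 16`, so it may help to check the clip length those two values imply. The snippet below is a minimal sketch of that arithmetic. `clip_seconds` is a hypothetical helper written for this note, not part of SGLang or TRTLLM, and it assumes the output plays back at exactly the requested frame rate.

```sh
# Hypothetical helper (not an SGLang or TRTLLM command):
# prints the clip duration in seconds for a frame count and a frame rate.
clip_seconds() {
  # $1 = number of frames, $2 = frames per second
  awk -v n="$1" -v f="$2" 'BEGIN { printf "%.4f\n", n / f }'
}

# Values taken from the SGLang command in the diff above.
clip_seconds 81 16
```

Under that assumption, the flags in the diff describe a clip of roughly five seconds at 832x480.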