Add FP8 weight quantization guide to README

#9
Files changed (1): README.md (+32 -0)
README.md CHANGED
@@ -243,6 +243,38 @@ This moves each component (text encoder → transformer → VAE) to GPU only whe
  | `pipe.to("cuda")` | ~30 GB | Fastest | A100, H100, H200 |
  | `enable_model_cpu_offload()` | ~19 GB | Similar | RTX 4090, RTX 3090 |
 
+ #### FP8 Weight Quantization (Optional)
+
+ For further VRAM reduction, you can quantize the transformer weights to FP8 using [torchao](https://github.com/pytorch/ao):
+
+ ```bash
+ pip install torchao
+ ```
+
+ ```python
+ from torchao.quantization import quantize_, Float8WeightOnlyConfig
+
+ pipe = DiffusionPipeline.from_pretrained(
+     "Motif-Technologies/Motif-Video-2B",
+     custom_pipeline="pipeline_motif_video",
+     trust_remote_code=True,
+     torch_dtype=torch.bfloat16,
+     guider=guider,  # see T2V example above
+ )
+ quantize_(pipe.transformer, Float8WeightOnlyConfig())
+ pipe.enable_model_cpu_offload()
+
+ output = pipe(prompt="...", height=736, width=1280, num_frames=121, num_inference_steps=50)
+ export_to_video(output.frames[0], "output.mp4", fps=24)
+ ```
+
+ This stores the transformer weights in FP8 (8-bit) instead of BF16 (16-bit), reducing peak VRAM from ~19 GB to ~15 GB while keeping all computation in BF16 precision.
+
+ | Mode | Peak VRAM | Notes |
+ |------|-----------|-------|
+ | `enable_model_cpu_offload()` | ~19 GB | BF16 baseline |
+ | `+ Float8WeightOnlyConfig` | ~15 GB | FP8 weights, BF16 compute |
+
  ### 🖥️ ComfyUI
 
  Official ComfyUI custom nodes for Motif-Video 2B are currently in development. Stay tuned for updates.
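
As a rough sanity check on the VRAM figures in the added section, weight storage can be estimated as parameter count times bytes per parameter. The sketch below is illustrative only: `weight_gib`, `fp8_saving_gib`, and the example parameter count are hypothetical and not taken from the README. The estimate ignores activations, the other pipeline components, and the small scale metadata that FP8 quantization typically adds, so it is not expected to reproduce the measured ~19 GB and ~15 GB peaks.

```python
# Bytes needed to store one weight in each format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_gib(num_params: int, dtype: str) -> float:
    """Approximate weight storage in GiB for `num_params` weights."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def fp8_saving_gib(num_params: int) -> float:
    """GiB saved by storing weights in FP8 instead of BF16."""
    return weight_gib(num_params, "bf16") - weight_gib(num_params, "fp8")


if __name__ == "__main__":
    # Hypothetical parameter count, for illustration only;
    # substitute the real transformer size when it is known.
    example_params = 2_000_000_000
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_gib(example_params, dtype):.2f} GiB")
    print(f"saving: {fp8_saving_gib(example_params):.2f} GiB")
```

Because FP8 halves the bytes per weight, the saving equals the FP8 footprint itself. Any gap between this estimate and the measured peak comes from everything other than the transformer weights.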