Motif-Technologies/Motif-Video-2B

video-generation

diffusion-transformer


Add FP8 weight quantization guide to README

#9

by gkalstn0 - opened Apr 22

base: refs/heads/main ← from: refs/pr/9


docs: add FP8 weight quantization guide to Memory-efficient Inference (2e572b38)

Motif Technologies org Apr 22

Summary

Adds torchao Float8WeightOnlyConfig instructions to the Memory-efficient Inference section of the README
Reduces peak VRAM from ~19 GB to ~15 GB when used together with enable_model_cpu_offload()
Stores transformer weights in FP8 while keeping all computation in BF16
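
The summary above can be sketched roughly as follows. This is a minimal sketch, not the README text added by this PR: it assumes the model loads through diffusers' `DiffusionPipeline` with `trust_remote_code=True` and that the denoiser is exposed as `pipe.transformer`, neither of which is stated in this thread. The function name `load_fp8_pipeline` is hypothetical.

```python
def load_fp8_pipeline(model_id="Motif-Technologies/Motif-Video-2B"):
    """Load the pipeline in BF16, then store the transformer weights in FP8."""
    # Imported lazily so that defining this function needs no GPU stack.
    import torch
    from diffusers import DiffusionPipeline
    from torchao.quantization import Float8WeightOnlyConfig, quantize_

    # Assumption: the repo loads via DiffusionPipeline with remote code enabled.
    pipe = DiffusionPipeline.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
    )

    # Assumption: the denoiser is exposed as `pipe.transformer`.
    # quantize_ modifies the module in place; only the weights are stored
    # in FP8, so activations and matmuls stay in BF16.
    quantize_(pipe.transformer, Float8WeightOnlyConfig())

    # Keep idle components on the CPU and move each to the GPU only when used.
    pipe.enable_model_cpu_offload()
    return pipe
```

Calling `load_fp8_pipeline()` downloads the weights and needs `torch`, `diffusers`, and `torchao` installed; the README section changed by this PR remains the authoritative recipe.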

Test plan

Fresh venv using the README pip install recipe plus torchao
720p, 121 frames, 50 steps: peak VRAM confirmed at ~15 GB (vs. ~19 GB baseline)
Video output quality verified
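
The thread does not say how the VRAM figures were measured. One way to reproduce such a check, assuming PyTorch's CUDA allocator statistics are an acceptable proxy, is sketched below; `measure_peak_vram_gib` and `bytes_to_gib` are hypothetical helper names, and the result can read lower than what `nvidia-smi` reports because it counts only tensors allocated by PyTorch.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024**3


def measure_peak_vram_gib(run_inference):
    """Call run_inference() and return the peak CUDA memory allocated, in GiB."""
    import torch  # imported lazily; requires a CUDA-enabled PyTorch build

    # Clear the peak counter so earlier allocations do not skew the result.
    torch.cuda.reset_peak_memory_stats()
    run_inference()
    # Wait for queued GPU work to finish before reading the statistic.
    torch.cuda.synchronize()
    return bytes_to_gib(torch.cuda.max_memory_allocated())
```

Passing a zero-argument callable that runs one generation (for example a lambda wrapping the pipeline call at 720p, 121 frames, 50 steps) with and without quantization gives two numbers to compare.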

gkalstn0 changed pull request status to open Apr 22

gkalstn0 changed pull request status to merged Apr 22
