Lower VRAM Mode or CPU Support for Step-Video-T2V?

by HassanStar - opened Feb 19, 2025

Hey StepFun team,

Step-Video-T2V looks incredible, but the high VRAM requirement (77 GB for 204 frames) makes it difficult for many users to run.

Are there any plans for a low-VRAM mode, quantized version, or even a CPU-compatible variant for research and experimentation?

Would love to hear if optimizations like Mixture of Experts (MoE), FP8 compression, or distillation techniques are being explored.

Thanks for your amazing work!

HassanStar

Feb 19, 2025

Also, will future versions support videos longer than 204 frames?

fisherma

Feb 20, 2025

The Modelscope community has developed an FP8 inference framework. Although it is still in the early stages, you can give it a try.
See it here: step-video-t2v-in-fp8.
