step t2v

by jabbamaster - opened Jun 18, 2025


This PR is in draft mode

Files changed (3)

.gitattributes CHANGED

@@ -51,4 +51,3 @@ demos/output_lightx2v_wan_t2v_t03.mp4 filter=lfs diff=lfs merge=lfs -text
 demos/output_lightx2v_wan_t2v_t06.mp4 filter=lfs diff=lfs merge=lfs -text
 demos/output_lightx2v_wan_t2v_t05.mp4 filter=lfs diff=lfs merge=lfs -text
 demos/output_lightx2v_wan_t2v_t01.mp4 filter=lfs diff=lfs merge=lfs -text
-assets/img_lightx2v.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -5,11 +5,7 @@ language:
 - zh
 pipeline_tag: text-to-video
 tags:
-  - video generation
-  - diffusion-single-file
-  - comfyui
-  - distillation
-  - LoRA
+- video generation
 library_name: diffusers
 inference:
   parameters:
@@ -17,13 +13,9 @@ inference:
 ---
 # Wan2.1-T2V-14B-StepDistill-CfgDistill
-<p align="center">
-    <img src="assets/img_lightx2v.png" width=75%/>
-<p>
 ## Overview
-Wan2.1-T2V-14B-StepDistill-CfgDistill is an advanced text-to-video generation model built upon the Wan2.1-T2V-14B foundation. This approach allows the model to generate videos with significantly fewer inference steps (4 steps) and without classifier-free guidance, substantially reducing video generation time while maintaining high quality outputs.
+Wan2.1-T2V-14B-StepDistill-CfgDistill is an advanced text-to-video generation model built upon the Wan2.1-T2V-14B foundation. This approach allows the model to generate videos with significantly fewer inference steps (4 or 8 steps) and without classifier-free guidance, substantially reducing video generation time while maintaining high quality outputs.
 ## Video Demos
@@ -38,7 +30,7 @@ Our training code is modified based on the [Self-Forcing](https://github.com/gua
 Our inference framework utilizes [lightx2v](https://github.com/ModelTC/lightx2v), a highly efficient inference engine that supports multiple models. This framework significantly accelerates the video generation process while maintaining high quality output.
 ```bash
-bash scripts/wan/run_wan_t2v_distill_4step_cfg.sh
+bash scripts/run_wan_t2v_distill.sh
 ```
 ## License Agreement

assets/img_lightx2v.png DELETED