zhuhz22 committed · Commit d8ba72e · verified · 1 Parent(s): 7b789f0

Update README.md

Files changed (1): README.md +11 -38
README.md CHANGED
@@ -1,49 +1,22 @@
  ---
  license: mit
  pipeline_tag: image-to-video
  ---

- # Causal Forcing

- [**Causal Forcing**](https://huggingface.co/papers/2602.02214) is a framework for high-quality real-time interactive video generation. It distills pretrained bidirectional video diffusion models into few-step autoregressive (AR) models by bridging the architectural gap between bidirectional and causal attention.

- - **Project Page:** [https://thu-ml.github.io/CausalForcing.github.io/](https://thu-ml.github.io/CausalForcing.github.io/)
- - **Code:** [https://github.com/thu-ml/Causal-Forcing](https://github.com/thu-ml/Causal-Forcing)
- - **Paper:** [Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation](https://huggingface.co/papers/2602.02214)

- ## Overview
-
- Causal Forcing uses an autoregressive teacher for ODE initialization to bridge the architectural gap, then applies an asymmetric DMD procedure. It significantly outperforms existing baselines in visual quality and motion dynamics while maintaining inference efficiency. The frame-wise models natively support both Text-to-Video (T2V) and Image-to-Video (I2V) generation.
-
- ## Inference
-
- Please refer to the [official GitHub repository](https://github.com/thu-ml/Causal-Forcing) for installation instructions.
-
- ### Text-to-Video (T2V)
-
- To generate video using the chunk-wise model:
-
- ```bash
- python inference.py \
- --config_path configs/causal_forcing_dmd_chunkwise.yaml \
- --output_folder output/chunkwise \
- --checkpoint_path checkpoints/chunkwise/causal_forcing.pt \
- --data_path prompts/demos.txt
- ```
-
- ### Image-to-Video (I2V)
-
- The frame-wise setting natively supports I2V. Set the first latent initial frame as your conditional image:
-
- ```bash
- python inference.py \
- --config_path configs/causal_forcing_dmd_framewise.yaml \
- --output_folder output/framewise \
- --checkpoint_path checkpoints/framewise/causal_forcing.pt \
- --data_path prompts/i2v \
- --i2v \
- --use_ema
- ```

  ## Citation
 
 
  ---
  license: mit
  pipeline_tag: image-to-video
+ datasets:
+ - MIN-Lab/minWM-data
+ tags:
+ - Video
+ - WorldModels
+ - Stream
+ - Diffusion
  ---

+ # 🌍 minWM: The First Full-Stack Open-Source World Model Framework

+ > ***A full-stack framework and tutorial for newcomers, rather than a specific model.***
 
+ **minWM** is our contribution to the world-model community: a **full-stack open-source framework** that walks you end-to-end through turning a bidirectional T2V foundation model into an action-conditioned video world model — with example data, runnable scripts, **Claude skills** capturing our hands-on experience, and **onboarding knowledge** for newcomers. We hope more researchers and developers join us in growing the community together.
 
 
+ ## Code: https://github.com/shengshu-ai/minWM

  ## Citation