Committed by wuruiqi0722
Commit d42d8bd · verified · 1 Parent(s): 4794722

Upload folder using huggingface_hub

.gitignore CHANGED
@@ -1 +1,2 @@
- outputs
+ outputs
+ checkpoints
README.md CHANGED
@@ -1,3 +1,144 @@
- ---
- license: apache-2.0
- ---
+ <h1 align="center">Infinite-World</h1>
+
+ <h3 align="center">Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory</h3>
+
+ <p align="center">
+ <a href="http://arxiv.org/abs/2602.02393"><img src="https://img.shields.io/badge/arXiv-2602.02393-b31b1b.svg" alt="arXiv"></a>
+ <a href="https://rq-wu.github.io/projects/infinite_world"><img src="https://img.shields.io/badge/Project-Page-blue.svg" alt="Project Page"></a>
+ </p>
+
+ <p align="center">
+ <strong>Ruiqi Wu</strong><sup>1,2,3*</sup>, <strong>Xuanhua He</strong><sup>4,2*</sup>, <strong>Meng Cheng</strong><sup>2*</sup>, <strong>Tianyu Yang</strong><sup>2</sup>, <strong>Yong Zhang</strong><sup>2‡</sup>, <strong>Chunle Guo</strong><sup>1,3†</sup>, <strong>Chongyi Li</strong><sup>1,3</sup>, <strong>Ming-Ming Cheng</strong><sup>1,3</sup>
+ </p>
+
+ <p align="center">
+ <sup>1</sup>Nankai University &nbsp; <sup>2</sup>Meituan &nbsp; <sup>3</sup>NKIARI &nbsp; <sup>4</sup>HKUST
+ </p>
+
+ <p align="center">
+ <sup>*</sup>Equal Contribution &nbsp; <sup>†</sup>Corresponding Author &nbsp; <sup>‡</sup>Project Leader
+ </p>
+
+ ---
+
+ ## Highlights
+
+ **Infinite-World** is a robust interactive world model with:
+
+ - **Real-World Training**: trained on real-world videos, with no need for perfect pose annotations or synthetic data
+ - **1000+ Frame Memory**: maintains coherent visual memory over 1000+ frames via the Hierarchical Pose-free Memory Compressor (HPMC)
+ - **Robust Action Control**: uncertainty-aware action labeling ensures accurate action-response learning from noisy trajectories
+
+ <p align="center">
+ <img src="./assets/framework.png" alt="Infinite-World Framework" width="100%">
+ </p>
+
+
+ ## Installation
+
+ **Environment:** Python 3.10; CUDA 12.4 is recommended.
+
+ ### 1. Create conda environment
+
+ ```bash
+ conda create -n infworld python=3.10
+ conda activate infworld
+ ```
+
+ ### 2. Install PyTorch with CUDA 12.4
+
+ Install from the official PyTorch index (no local wheel file is needed):
+
+ ```bash
+ pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
+ ```
+
+
+ ### 3. Install Python dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
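Once the three installation steps have run, a short check can confirm that the pinned PyTorch build is importable and can see a GPU. This is a minimal sketch and not part of the repository: the helper name `describe_torch` is hypothetical, and it only reports what the active environment contains.

```python
import importlib.util


def describe_torch():
    """Report the installed PyTorch version, its CUDA build, and GPU visibility."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in the active environment.
        return {"installed": False, "version": None, "cuda": None, "gpu": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,      # should start with "2.6.0" for the pinned install
        "cuda": torch.version.cuda,        # CUDA version the wheel was built against
        "gpu": torch.cuda.is_available(),  # False when no usable GPU or driver is found
    }


print(describe_torch())
```

If `installed` is `False`, the `infworld` environment is probably not active; if `gpu` is `False`, the driver or CUDA runtime is worth checking before running inference.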
+ ---
+
+ ## Checkpoint Configuration
+
+ All model paths are configured in **`configs/infworld_config.yaml`**. Relative paths are resolved against the project root.
+
+ ### Download checkpoints
+
+ Download the base-model files from [Wan-AI/Wan2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B) and place them under `checkpoints/`:
+
+ | File / directory | Config key | Description |
+ |------------------|------------|-------------|
+ | `models/Wan2.1_VAE.pth` | `vae_cfg.vae_pth` | VAE weights |
+ | `models/models_t5_umt5-xxl-enc-bf16.pth` | `text_encoder_cfg.checkpoint_path` | T5 text encoder |
+ | `models/google/umt5-xxl` (folder) | `text_encoder_cfg.tokenizer_path` | T5 tokenizer |
+ | `infinite_world_model.ckpt` | `checkpoint_path` | DiT model weights |
+
+
+
+ - **DiT checkpoint:** Can be downloaded from [TBD]().
+
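Before launching anything, it can help to confirm that every file in the table is where the config expects it. The sketch below is an illustration, not repository code: the helper `missing_checkpoints` is hypothetical, and it assumes the table's paths sit directly under `checkpoints/`.

```python
from pathlib import Path

# Config key -> path relative to checkpoints/, copied from the table above.
# The layout under checkpoints/ is an assumption.
EXPECTED = {
    "vae_cfg.vae_pth": "models/Wan2.1_VAE.pth",
    "text_encoder_cfg.checkpoint_path": "models/models_t5_umt5-xxl-enc-bf16.pth",
    "text_encoder_cfg.tokenizer_path": "models/google/umt5-xxl",
    "checkpoint_path": "infinite_world_model.ckpt",
}


def missing_checkpoints(root="checkpoints"):
    """Return {config key: path} for every expected file or folder that is absent."""
    root = Path(root)
    return {
        key: str(root / rel)
        for key, rel in EXPECTED.items()
        if not (root / rel).exists()
    }


for key, path in missing_checkpoints().items():
    print(f"missing {key}: {path}")
```

An empty result means all four entries were found; it does not validate the contents of the files.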
+ ---
+
+ ## Upload to Hugging Face (including checkpoints)
+
+ To upload this repo to the Hugging Face Hub (code + `checkpoints/`):
+
+ 1. **Log in**
+ ```bash
+ pip install huggingface_hub
+ huggingface-cli login
+ ```
+ Use a token from [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens); the token needs write permission.
+
+ 2. **Upload**
+ From the project root (`infinite-world/`):
+ ```bash
+ python scripts/upload_to_hf.py YOUR_USERNAME/infinite-world
+ ```
+ Or set the repo ID in the environment and run:
+ ```bash
+ export HF_REPO_ID=YOUR_USERNAME/infinite-world
+ python scripts/upload_to_hf.py
+ ```
+
+ The script uploads the whole directory (including `checkpoints/`) and skips `__pycache__`, `outputs`, `.git`, etc. Large checkpoint files are uploaded via the Hub API; the first run may take a while depending on size and network.
+
+ 3. **Create repo manually (optional)**
+ You can create the model repo first at [https://huggingface.co/new](https://huggingface.co/new) (type: **Model**), then run the script with that `repo_id`.
+
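The contents of `scripts/upload_to_hf.py` are not shown in this commit. As a rough sketch of what an upload along these lines could look like, the code below uses `HfApi.create_repo` and `HfApi.upload_folder` from `huggingface_hub`. The function names and the ignore patterns are assumptions that mirror the behaviour described above, not the script's actual code.

```python
import os
import sys

# Assumed ignore list, mirroring the folders the README says are skipped.
IGNORE_PATTERNS = ["*__pycache__*", "outputs/*", ".git/*"]


def resolve_repo_id(argv=None, env=None):
    """Take the repo id from the first CLI argument, else from HF_REPO_ID."""
    argv = sys.argv if argv is None else argv
    env = os.environ if env is None else env
    if len(argv) > 1:
        return argv[1]
    repo_id = env.get("HF_REPO_ID")
    if not repo_id:
        raise SystemExit("usage: upload_to_hf.py USER/REPO (or set HF_REPO_ID)")
    return repo_id


def upload(repo_id, folder="."):
    """Create the model repo if it does not exist, then upload the folder."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()  # uses the token stored by `huggingface-cli login`
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="model",
        ignore_patterns=IGNORE_PATTERNS,
    )
```

Calling `upload(resolve_repo_id())` from the project root would cover both invocations shown in step 2. Because `exist_ok=True` makes repo creation idempotent, step 3 stays optional.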
+ ---
+
+ ## Results
+
+ ### Quantitative Comparison
+
+ | Model | Motion Smoothness↑ | Dynamic Degree↑ | Aesthetic Quality↑ | Imaging Quality↑ | Avg. Score↑ | Memory↓ | Fidelity↓ | Action↓ | ELO Rating↑ |
+ |:------|:----------:|:----------:|:-----------:|:-----------:|:-----------:|:-------:|:---------:|:-------:|:-----------:|
+ | Hunyuan-GameCraft | 0.9855 | 0.9896 | 0.5380 | 0.6010 | 0.7785 | 2.67 | 2.49 | 2.56 | 1311 |
+ | Matrix-Game 2.0 | 0.9788 | **1.0000** | 0.5267 | **0.7215** | 0.8068 | 2.98 | 2.91 | 1.78 | 1432 |
+ | Yume 1.5 | 0.9861 | 0.9896 | **0.5840** | 0.6969 | **0.8141** | <u>2.43</u> | <u>1.91</u> | 2.47 | 1495 |
+ | HY-World-1.5 | **0.9905** | **1.0000** | 0.5280 | 0.6611 | 0.7949 | 2.59 | 2.78 | **1.50** | <u>1542</u> |
+ | **Infinite-World** | <u>0.9876</u> | **1.0000** | <u>0.5440</u> | <u>0.7159</u> | <u>0.8119</u> | **1.92** | **1.67** | <u>1.54</u> | **1719** |
+
+
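The Avg. Score column appears to be the unweighted mean of the first four metric columns. The README does not state this; it is an inference from the numbers. The sketch below recomputes the mean for each row and reports the largest gap to the published value. The helper names are illustrative.

```python
# Four per-metric scores per model, copied from the first four metric columns.
SCORES = {
    "Hunyuan-GameCraft": (0.9855, 0.9896, 0.5380, 0.6010),
    "Matrix-Game 2.0": (0.9788, 1.0000, 0.5267, 0.7215),
    "Yume 1.5": (0.9861, 0.9896, 0.5840, 0.6969),
    "HY-World-1.5": (0.9905, 1.0000, 0.5280, 0.6611),
    "Infinite-World": (0.9876, 1.0000, 0.5440, 0.7159),
}

# Avg. Score values as published in the table.
REPORTED_AVG = {
    "Hunyuan-GameCraft": 0.7785,
    "Matrix-Game 2.0": 0.8068,
    "Yume 1.5": 0.8141,
    "HY-World-1.5": 0.7949,
    "Infinite-World": 0.8119,
}


def mean_score(values):
    """Unweighted arithmetic mean of the per-metric scores."""
    return sum(values) / len(values)


def max_deviation():
    """Largest absolute gap between a recomputed mean and the reported average."""
    return max(abs(mean_score(SCORES[m]) - REPORTED_AVG[m]) for m in SCORES)


print(f"largest deviation from reported Avg. Score: {max_deviation():.6f}")
```

Every row agrees with the reported value to within rounding at four decimal places, which is consistent with a plain mean.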
+ ## Citation
+
+ If you find this work useful, please consider citing:
+
+ ```bibtex
+ @article{wu2026infiniteworld,
+ title={Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory},
+ author={Wu, Ruiqi and He, Xuanhua and Cheng, Meng and Yang, Tianyu and Zhang, Yong and Kang, Zhuoliang and Cai, Xunliang and Wei, Xiaoming and Guo, Chunle and Li, Chongyi and Cheng, Ming-Ming},
+ journal={arXiv preprint arXiv:2602.02393},
+ year={2026}
+ }
+ ```
+
+
+ ## License
+
+ This project is released under the [MIT License](LICENSE).
infworld/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (207 Bytes)
infworld/configs/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (215 Bytes)
infworld/configs/__pycache__/bucket_config.cpython-310.pyc ADDED
Binary file (6.16 kB)
infworld/context_parallel/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (224 Bytes)
infworld/context_parallel/__pycache__/context_parallel_util.cpython-310.pyc ADDED
Binary file (9.21 kB)
infworld/models/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (214 Bytes)
infworld/models/__pycache__/checkpoint.cpython-310.pyc ADDED
Binary file (1.11 kB)
infworld/models/__pycache__/dit_model.cpython-310.pyc ADDED
Binary file (32.7 kB)
infworld/models/__pycache__/scheduler.cpython-310.pyc ADDED
Binary file (8.13 kB)
infworld/models/__pycache__/umt5.cpython-310.pyc ADDED
Binary file (15.6 kB)
infworld/utils/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (213 Bytes)
infworld/utils/__pycache__/data_utils.cpython-310.pyc ADDED
Binary file (21.6 kB)
infworld/utils/__pycache__/dataset_utils.cpython-310.pyc ADDED
Binary file (20.7 kB)
infworld/utils/__pycache__/prepare_dataloader.cpython-310.pyc ADDED
Binary file (3.24 kB)
infworld/vae/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (1.71 kB)
infworld/vae/__pycache__/vae.cpython-310.pyc ADDED
Binary file (17.5 kB)