---
license: apache-2.0
library_name: pytorch
pipeline_tag: text-to-video
tags:
- text-to-video
- video-generation
- streaming
- self-forcing
- wan2.1
- 3d-aware
base_model: Wan-AI/Wan2.1-T2V-1.3B
---
# EndlessWorld — Real-Time 3D-Aware Long Video Generation
Checkpoint for **EndlessWorld**, a streaming video diffusion model that produces
*unbounded-length*, 3D-consistent videos in real time on a single GPU.
- **Paper:** [arXiv:2512.12430](https://arxiv.org/abs/2512.12430)
- **Code:** [github.com/BWGZK-keke/EndlessWorld](https://github.com/BWGZK-keke/EndlessWorld)
- **Base model:** [Wan-AI/Wan2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B)
- **3D encoder:** [lhjiang/anysplat](https://huggingface.co/lhjiang/anysplat)
## What's in this repo
| File | Description |
|------------|-------------------------------------------------------------------------|
| `model.pt` | DMD-distilled generator weights for the EndlessWorld causal Wan model (step 1000 of the `self_forcing_dmd_separate` SOTA run). |
This is the generator checkpoint only. To run inference you also need:
1. The Wan2.1-T2V-1.3B base weights (text encoder, VAE, etc.)
2. The AnySplat 3D Gaussian feature encoder
See the [GitHub README](https://github.com/BWGZK-keke/EndlessWorld#installation)
for the full setup.
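As a rough illustration of fetching the two extra dependencies, the sketch below uses `huggingface_hub.snapshot_download`. The repo IDs are the ones linked on this card; the local folder names are assumptions made for the example, so defer to the GitHub README for the layout the code actually expects.

```python
from pathlib import Path

# Repo IDs are taken from the links on this card. The local folder names are
# hypothetical; use whatever layout the GitHub README asks for.
DEPENDENCIES = {
    "Wan-AI/Wan2.1-T2V-1.3B": "wan_models/Wan2.1-T2V-1.3B",
    "lhjiang/anysplat": "checkpoints/anysplat",
}


def download_dependencies(root="."):
    """Download each dependency repo into its folder under `root`."""
    # Imported lazily so the mapping above can be inspected without the package.
    from huggingface_hub import snapshot_download

    paths = {}
    for repo_id, subdir in DEPENDENCIES.items():
        target = Path(root) / subdir
        # snapshot_download returns the local path of the downloaded snapshot.
        paths[repo_id] = snapshot_download(repo_id=repo_id, local_dir=str(target))
    return paths
```

Calling `download_dependencies()` needs network access and enough disk space for both repositories.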
## Method
EndlessWorld extends the **Self-Forcing** causal diffusion framework
(Wan2.1-T2V-1.3B backbone) with a **Global 3D-Aware Attention** module that
injects scene geometry, extracted on the fly by AnySplat, into the conditional
embedding of every autoregressive chunk.

Three ingredients:
- **Conditional autoregressive (self-forcing) training** — frames are denoised
block-by-block with KV-cache, conditioning each new block on previously
generated content.
- **Global 3D-Aware Attention** — `CrossAttentionFusion` + `To3D` modules ingest
3D Gaussian features produced by AnySplat and fuse them with the text
embedding, giving the generator a persistent geometric memory of the world
rendered so far.
- **Real-time streaming inference** — the rollout loop re-extracts 3D features
from the most recently decoded chunk and feeds the fused embedding back into
the causal generator, enabling indefinite extension on a single GPU.
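The control flow of the streaming rollout described above can be sketched in plain Python. This is only an illustration of the loop structure, not the repository's implementation: `stream_chunks`, `generate_chunk`, `extract_3d`, and `fuse` are hypothetical names, lists of floats stand in for tensors, and conditioning the first chunk on text alone is an assumption.

```python
from typing import Callable, Iterator, List, Optional, Sequence

Chunk = List[float]      # stand-in for a decoded block of frames
Embedding = List[float]  # stand-in for a conditioning embedding


def stream_chunks(
    text_emb: Embedding,
    generate_chunk: Callable[[Embedding, Sequence[Chunk]], Chunk],
    extract_3d: Callable[[Chunk], Embedding],
    fuse: Callable[[Embedding, Embedding], Embedding],
    max_chunks: Optional[int] = None,
) -> Iterator[Chunk]:
    """Yield chunks one at a time, refreshing the fused condition each step."""
    history: List[Chunk] = []  # plays the role of the KV-cache
    cond = text_emb            # assumption: first chunk sees the text only
    n = 0
    while max_chunks is None or n < max_chunks:
        chunk = generate_chunk(cond, history)
        history.append(chunk)
        yield chunk
        # Re-extract geometry from the newest chunk and fuse it with the text.
        cond = fuse(text_emb, extract_3d(chunk))
        n += 1


# Toy callables, chosen only so the loop can be run end to end.
chunks = list(
    stream_chunks(
        text_emb=[1.0],
        generate_chunk=lambda cond, hist: [sum(cond) + len(hist)],
        extract_3d=lambda chunk: [chunk[0] * 0.5],
        fuse=lambda text, geo: text + geo,
        max_chunks=3,
    )
)
print(chunks)  # [[1.0], [2.5], [4.25]]
```

Writing the loop as a generator means the caller decides when to stop: with `max_chunks=None` it keeps yielding chunks for as long as they are consumed, which mirrors the unbounded-length setting.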
## Quick start
```bash
git clone https://github.com/BWGZK-keke/EndlessWorld
cd EndlessWorld
pip install -r requirements.txt
# Download this checkpoint
huggingface-cli download BWGZK/EndlessWorld model.pt --local-dir checkpoints/
# Update configs/self_forcing_dmd.yaml -> generator_ckpt: checkpoints/model.pt
bash test.sh
```
Loading directly from Python:
```python
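import torch
from huggingface_hub import hf_hub_download

# Download model.pt from the Hub (cached locally after the first call).
ckpt = hf_hub_download(repo_id="BWGZK/EndlessWorld", filename="model.pt")
# Load on the CPU; move the weights to the GPU once the model is built.
state_dict = torch.load(ckpt, map_location="cpu")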
```
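This card does not document how `model.pt` is laid out internally. Training checkpoints are often wrapped in an outer dictionary, so a small helper like the one below may be useful before calling `load_state_dict`. It is a minimal sketch: `unwrap_state_dict` is a hypothetical helper, and the candidate key names are guesses rather than something confirmed for this checkpoint.

```python
from collections.abc import Mapping


def unwrap_state_dict(obj, candidate_keys=("generator", "generator_ema")):
    """Return (key, inner_dict) for the first candidate key holding a mapping.

    If no candidate matches, the object is returned unchanged with key None.
    The default candidate keys are guesses, not a documented layout.
    """
    if isinstance(obj, Mapping):
        for key in candidate_keys:
            inner = obj.get(key)
            if isinstance(inner, Mapping):
                return key, inner
    return None, obj


# Plain dictionaries stand in for a loaded checkpoint here.
wrapped = {"generator": {"blocks.0.weight": [0.0]}, "step": 1000}
flat = {"blocks.0.weight": [0.0]}

key, params = unwrap_state_dict(wrapped)
flat_key, flat_params = unwrap_state_dict(flat)
```

With the real file, `unwrap_state_dict(state_dict)` would be called on the result of `torch.load`, and printing the returned key and `len(params)` is a quick sanity check.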
## Training
- **Framework:** Multi-GPU FSDP via the [`train.py`](https://github.com/BWGZK-keke/EndlessWorld/blob/main/train.py)
entry point with [`configs/self_forcing_dmd.yaml`](https://github.com/BWGZK-keke/EndlessWorld/blob/main/configs/self_forcing_dmd.yaml).
## Citation
```bibtex
@article{zhang2025endlessworld,
title = {Endless World: Real-Time 3D-Aware Long Video Generation},
author = {Zhang, Ke and others},
journal = {arXiv preprint arXiv:2512.12430},
year = {2025}
}
```
## License
Apache 2.0 — same as the upstream Wan2.1 and Self-Forcing projects.