	---
	pipeline_tag: video-to-video
	license: apache-2.0
	language:
	- en
	base_model:
	- Wan-AI/Wan2.1-T2V-1.3B
	---

	# DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory

	We propose DecMem, a decoupled memory architecture that employs Sparse Global Memory for efficient fine-grained access to global history and Anchored Local Memory for stable and high-quality extrapolation.

	[Project Page](https://jeffreyyzh.github.io/DecMem-Page/) \| [Paper](https://arxiv.org/abs/2605.31336) \| [Code](https://github.com/KlingAIResearch/DecMem)

	## Checkpoints

	Download the Wan2.1 backbone (VAE + tokenizer weights used by the pipeline):

	```bash
	huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B \
	--local-dir-use-symlinks False \
	--local-dir wan_models/Wan2.1-T2V-1.3B
	```

	Download the trained DecMem checkpoints from Hugging Face:

	```bash
	huggingface-cli download KlingTeam/DecMem --local-dir checkpoints
	```

	Checkpoint layout expected by the training and inference scripts:

	```
	checkpoints/
	└── decmem.pt # released weights
	```
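Before launching a script, it can help to confirm that both downloads landed where the commands above put them. The snippet below is a minimal sketch, not part of the DecMem repository: `check_decmem_layout` is a hypothetical helper name, and it assumes both download commands were run from the same directory. It only checks that `checkpoints/decmem.pt` exists and that the Wan2.1 directory is present and non-empty, because this README does not list the individual backbone files.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with DecMem): verify the on-disk layout
# described in this README before running training or inference.
check_decmem_layout() {
  # Directory from which the download commands were run (defaults to ".").
  local root="${1:-.}"
  local backbone="$root/wan_models/Wan2.1-T2V-1.3B"
  local missing=0

  # Released DecMem weights, per the checkpoint layout above.
  if [ ! -f "$root/checkpoints/decmem.pt" ]; then
    echo "missing: $root/checkpoints/decmem.pt" >&2
    missing=1
  fi

  # Wan2.1 backbone: only presence and non-emptiness are checked, since the
  # file names inside the directory are not specified in this README.
  if [ ! -d "$backbone" ] || [ -z "$(ls -A "$backbone" 2>/dev/null)" ]; then
    echo "missing or empty: $backbone" >&2
    missing=1
  fi

  # Return 0 when everything was found, 1 otherwise.
  return "$missing"
}
```

For example, `check_decmem_layout . && bash scripts/infer_example.sh` runs the example only when both paths are in place.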

	## Quick start

	We provide example video-pose pairs for quick inference. Inference runs block by block, using causal denoising with a KV cache.

	```bash
	bash scripts/infer_example.sh
	```

	## Citation

	If you find our work helpful, please cite our paper:

	```bibtex
	@misc{yang2026decmemminutelongconsistentworld,
	title={DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory},
	author={Zhenhao Yang and Xiaoshi Wu and Zhengyao Lv and Xiaoyu Shi and Xintao Wang and Pengfei Wan and Kun Gai and Kwan-Yee K. Wong},
	year={2026},
	eprint={2605.31336},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2605.31336},
	}
	```