---
base_model:
- Wan-AI/Wan2.2-TI2V-5B
library_name: diffusion-single-file
tags:
- diffusion
- perspective-to-360
extra_gated_eu_disallowed: true
language:
- en
- zh
pipeline_tag: video-to-video
---

<p align="center"> <b> CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video </b> </p>

<p align="center"> Lingen Li, Guangzhi Wang, Xiaoyu Li, Zhaoyang Zhang, Qi Dou, Jinwei Gu, Tianfan Xue, Ying Shan </p>

<p align="center"> CVPR 2026 </p>

**TL;DR**: CubeComposer generates one cubemap face per temporal window, guided by an effective and efficient context mechanism. This turns a perspective video into a 4K 360° video while avoiding both the memory blow-up and the low-res-then-upscale workaround.
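
To make the "one cubemap face per temporal window" idea concrete, here is a minimal sketch of how such a generation schedule could be enumerated. It is not the released implementation: the face names and their order, the non-overlapping windows, and the helper names `temporal_windows` and `generation_schedule` are all assumptions made for illustration.

```python
# Illustrative only: enumerates (temporal window, cubemap face) steps.
# ASSUMPTIONS (not from the CubeComposer code): face order, non-overlapping
# windows, and both function names are hypothetical.

FACES = ("front", "right", "back", "left", "up", "down")  # assumed order


def temporal_windows(num_frames: int, window: int) -> list[tuple[int, int]]:
    """Split frame indices [0, num_frames) into half-open windows of at most `window` frames."""
    if num_frames <= 0 or window <= 0:
        raise ValueError("num_frames and window must be positive")
    return [(start, min(start + window, num_frames))
            for start in range(0, num_frames, window)]


def generation_schedule(num_frames: int, window: int) -> list[tuple[tuple[int, int], str]]:
    """List every (window, face) pair, visiting all six faces of a window before moving on."""
    return [(win, face)
            for win in temporal_windows(num_frames, window)
            for face in FACES]


if __name__ == "__main__":
    # Example: 18 frames with a 9-frame window gives 2 windows x 6 faces.
    for (start, end), face in generation_schedule(num_frames=18, window=9):
        print(f"frames [{start}, {end}) -> face {face}")
```

Each step in this sketch only ever covers one face over a few frames, which is the intuition behind avoiding the memory cost of a full 4K panorama; how previously generated faces are fed back as context is described in the paper and not modeled here.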

For more details, please visit our [project page](https://lg-li.github.io/project/cubecomposer/), [paper](https://arxiv.org/abs/2603.04291), and [GitHub repo](https://github.com/TencentARC/CubeComposer).

### Model variants

We provide two variants of CubeComposer in this repo:

 - `cubecomposer-3k`: supports 2K/3K generation, cubemap size = 512/768, temporal window length = 9 frames.
 - `cubecomposer-4k`: supports 4K generation, cubemap size = 960, temporal window length = 5 frames.
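
The sketch below shows how the cubemap sizes above line up with the 2K/3K/4K labels. It assumes the common convention that an equirectangular panorama matching a cubemap's sampling density is `4 * face` pixels wide and `2 * face` pixels tall; this mapping is an assumption rather than something the model card states, and `equirect_resolution` and `VARIANTS` are hypothetical names.

```python
# Illustrative only: relates cubemap face size to equirectangular resolution.
# ASSUMPTION: width = 4 * face and height = 2 * face (a common convention,
# not confirmed by this repo). All names here are hypothetical.

VARIANTS = {
    "cubecomposer-3k": {"face_sizes": (512, 768), "window_frames": 9},
    "cubecomposer-4k": {"face_sizes": (960,), "window_frames": 5},
}


def equirect_resolution(face_size: int) -> tuple[int, int]:
    """Return (width, height) of the equirectangular frame for a given face size."""
    if face_size <= 0:
        raise ValueError("face_size must be positive")
    return 4 * face_size, 2 * face_size


if __name__ == "__main__":
    for name, cfg in VARIANTS.items():
        for face in cfg["face_sizes"]:
            width, height = equirect_resolution(face)
            print(f"{name}: face {face} -> {width}x{height}, "
                  f"window of {cfg['window_frames']} frames")
```

Under this assumption, faces of 512, 768, and 960 pixels correspond to panoramas 2048, 3072, and 3840 pixels wide, which is consistent with the 2K, 3K, and 4K names.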

### Citation

If you find our model helpful in your research, please like this repo, star the [GitHub repo](https://github.com/TencentARC/CubeComposer), and cite:

```
@article{li2026cubecomposer,
    title={CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video},
    author={Li, Lingen and Wang, Guangzhi and Li, Xiaoyu and Zhang, Zhaoyang and Dou, Qi and Gu, Jinwei and Xue, Tianfan and Shan, Ying},
    journal={arXiv preprint arXiv:2603.04291},
    year={2026}
}
```

### License

This repository is released under the terms of the [LICENSE file](./LICENSE).

By cloning, downloading, using, or distributing this repository or any of its models or weights, you agree to comply with the terms and conditions specified in the LICENSE.