---
license: mit
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
pipeline_tag: text-to-video
---
<p align="center">
<h1 align="center">HiAR</h1>
<h3 align="center">Hierarchical Autoregressive Video Generation with Pipelined Parallel Inference</h3>
</p>
<p align="center">
  <h3 align="center"><a href="https://arxiv.org/abs/2603.08703">arXiv</a> | <a href="https://jacky-hate.github.io/HiAR/">Website</a> | <a href="https://github.com/Jacky-hate/HiAR">Code</a> | <a href="https://huggingface.co/jackyhate/HiAR/tree/main">Model</a></h3>
</p>

---

HiAR proposes **hierarchical denoising** for autoregressive video diffusion models, replacing the conventional block-first denoising order with a **step-first** order. By conditioning each block on context at a matched noise level, HiAR attenuates error propagation while preserving temporal causality, achieving **state-of-the-art long video generation** (20s+) with significantly reduced quality drift.
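
To make the difference between the two denoising orders concrete, the toy sketch below enumerates the order in which (block, step) pairs would be visited under each scheme. It illustrates the ordering only and is not the HiAR implementation: the function names `block_first_order` and `step_first_order` are hypothetical, and the sketch does not model the network, the noise schedule, how context is attended to, or pipelined parallel execution.

```python
from typing import List, Tuple

# Each entry is a (block_index, step_index) pair, listed in visiting order.
Schedule = List[Tuple[int, int]]


def block_first_order(num_blocks: int, num_steps: int) -> Schedule:
    """Conventional order: run every denoising step on block 0, then block 1, ..."""
    return [(b, s) for b in range(num_blocks) for s in range(num_steps)]


def step_first_order(num_blocks: int, num_steps: int) -> Schedule:
    """Step-first order: run step 0 on every block in causal order, then step 1, ..."""
    return [(b, s) for s in range(num_steps) for b in range(num_blocks)]


if __name__ == "__main__":
    # Hypothetical sizes chosen only to keep the printout short.
    print("block-first:", block_first_order(num_blocks=3, num_steps=2))
    print("step-first: ", step_first_order(num_blocks=3, num_steps=2))
```

In the block-first listing, a block is visited only after all earlier blocks have finished every step, so its context is fully denoised. In the step-first listing, blocks advance through the steps together, so at any given step the earlier blocks have reached that same step.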