---
extra_gated_fields:
Name: text
Institute: text
Institutional Email: text
I agree to use this model for non-commercial use ONLY: checkbox
---
# Model Card for NVG series
<p align="center">
<h1 align="center">Next Visual Granularity Generation</h1>
<center>Yikai Wang, Zhouxia Wang, Zhonghua Wu, Qingyi Tao, Kang Liao, Chen Change Loy.<br>
S-Lab, Nanyang Technological University; SenseTime Research<br> </center>
<p align="center">
<a href="https://arxiv.org/abs/2508.12811"><img alt='arXiv' src="https://img.shields.io/badge/arXiv-2508.12811-b31b1b.svg"></a>
<a href="https://yikai-wang.github.io/nvg/"><img alt='page' src="https://img.shields.io/badge/Project-Website-orange"></a>
</p>
</p>
## Model Details
### Model Description
We propose a novel approach to image generation that decomposes an image into a structured sequence, where each element of the sequence shares the same spatial resolution but differs in the number of unique tokens used, capturing a different level of visual granularity.<br>
Image generation is carried out through our newly introduced Next Visual Granularity (NVG) generation framework, which generates a visual granularity sequence beginning from an empty image and progressively refines it, from global layout to fine details, in a structured manner. This iterative process encodes a hierarchical, layered representation that offers fine-grained control over the generation process across multiple granularity levels.<br>
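The granularity decomposition above (same spatial resolution, an increasing number of unique tokens per step) can be illustrated with a minimal, self-contained sketch. This toy example simply quantizes a continuous image to `k` unique values per level; the function names and quantization scheme are illustrative assumptions, not the released NVG tokenizer:

```python
import numpy as np

def quantize_to_k_levels(img, k):
    """Toy granularity step: same resolution, at most k unique values.

    Bins pixel values into k buckets and replaces each pixel with its
    bucket center, mimicking "fewer unique tokens = coarser granularity".
    (Illustrative only; NVG uses a learned tokenizer, not value binning.)
    """
    edges = np.linspace(img.min(), img.max(), k + 1)
    # digitize against the k-1 interior edges gives bucket indices 0..k-1
    idx = np.clip(np.digitize(img, edges[1:-1]), 0, k - 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[idx]

rng = np.random.default_rng(0)
img = rng.random((8, 8))
# A coarse-to-fine granularity sequence: every element keeps the 8x8
# resolution, but later elements may use more unique values.
sequence = [quantize_to_k_levels(img, k) for k in (2, 4, 16, 64)]
```

In the actual framework, each level is produced by the generator refining the previous one (starting from an empty image), rather than by quantizing a known target image as done here.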
We train a series of NVG models for class-conditional image generation on the ImageNet dataset and observe clear scaling behavior. NVG consistently outperforms the corresponding VAR series in FID (3.30 → 3.03, 2.57 → 2.44, 2.09 → 2.06). We also conduct extensive analysis to showcase the capability and potential of the NVG framework. Our code and models will be released.<br>
- **License:** S-Lab License 1.0
### Model Sources
<!-- Provide the basic links for the model. -->
- **Code:** https://github.com/Yikai-Wang/nvg
- **Paper:** https://arxiv.org/abs/2508.12811
## Uses
Usage instructions and examples are provided in the GitHub repository linked above.
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
```
@article{wang2025next,
title={Next Visual Granularity Generation},
author={Wang, Yikai and Wang, Zhouxia and Wu, Zhonghua and Tao, Qingyi and Liao, Kang and Loy, Chen Change},
journal={arXiv preprint arXiv:2508.12811},
year={2025}
}
```