---
language:
- en
license: mit
pipeline_tag: text-to-image
---
# InfinityCC: Spherical Leech Quantization for Visual Tokenization and Generation
This repository hosts **InfinityCC**, a working example that applies [Non-Parametric Quantization (NPQ)](https://cs.stanford.edu/~yzz/npq/) to ImageNet-1k class-conditional image generation.

The model is based on the paper [**Spherical Leech Quantization for Visual Tokenization and Generation**](https://huggingface.co/papers/2512.14697) by Yue Zhao, Hanwen Jiang, Zhenlin Xu, Chutong Yang, Ehsan Adeli, and Philipp Krähenbühl.

Project Page: [https://cs.stanford.edu/~yzz/npq/](https://cs.stanford.edu/~yzz/npq/)

Code: [https://github.com/zhaoyue-zephyrus/InfinityCC](https://github.com/zhaoyue-zephyrus/InfinityCC)

<img src="https://github.com/zhaoyue-zephyrus/InfinityCC/raw/main/assets/npq.png" width="640">
## Introduction
In this work, we explore Spherical Leech Quantization ($\Lambda_{24}$-SQ), a non-parametric quantization method rooted in lattice coding. Because its codewords are highly symmetric and evenly distributed on the hypersphere, the approach simplifies the training recipe and improves the reconstruction-compression tradeoff. It achieves better reconstruction quality than prior art in image tokenization and compression, and the gains carry over to state-of-the-art autoregressive image generation frameworks. InfinityCC is a practical demonstration of this quantization technique for visual generation.
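
To make the idea of a fixed, lattice-derived codebook on a hypersphere concrete, here is a minimal sketch of non-parametric spherical quantization. It is not the $\Lambda_{24}$ codebook from the paper: as a much smaller stand-in it uses the 240 shortest vectors of the 8-dimensional E8 lattice, and the function names `e8_unit_codebook` and `spherical_quantize` are hypothetical. The point it illustrates is that the codebook is constructed rather than learned, and quantization reduces to a nearest-neighbor lookup by cosine similarity.

```python
import itertools
import math


def e8_unit_codebook():
    """Build the 240 shortest E8 lattice vectors, scaled to unit norm.

    Stand-in for a lattice codebook; the paper's codebook lives in 24 dimensions.
    """
    vectors = []
    # 112 vectors with exactly two nonzero entries, each +1 or -1.
    for i, j in itertools.combinations(range(8), 2):
        for si, sj in itertools.product((1.0, -1.0), repeat=2):
            v = [0.0] * 8
            v[i], v[j] = si, sj
            vectors.append(v)
    # 128 vectors with all entries +/-1/2 and an even number of minus signs.
    for signs in itertools.product((0.5, -0.5), repeat=8):
        if sum(s < 0 for s in signs) % 2 == 0:
            vectors.append(list(signs))
    scale = math.sqrt(2.0)  # every vector above has squared norm 2
    return [[x / scale for x in v] for v in vectors]


def spherical_quantize(z, codebook):
    """Project z onto the unit sphere and return (index, codeword) of the
    codeword with the highest cosine similarity."""
    norm = math.sqrt(sum(x * x for x in z))
    unit = [x / norm for x in z]
    scores = [sum(a * b for a, b in zip(unit, c)) for c in codebook]
    index = max(range(len(codebook)), key=scores.__getitem__)
    return index, codebook[index]


codebook = e8_unit_codebook()
# A hypothetical 8-dimensional latent vector, dominated by its first two entries.
index, code = spherical_quantize([0.9, 1.1, 0.1, -0.2, 0.0, 0.05, 0.0, 0.1], codebook)
# 240 codewords, about 7.91 bits per code
print(len(codebook), round(math.log2(len(codebook)), 2))
```

Because nothing in the codebook is trained, there are no codebook parameters to update in this sketch; only the encoder that produces `z` would be learned.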
## Installation
We use [uv](https://docs.astral.sh/uv/) to manage all dependencies.
```bash
uv sync
source .venv/bin/activate
```
To evaluate ImageNet generation with the ADM evaluator, fetch the evaluation code and the 256×256 reference batch:
```bash
mkdir third_party/ && cd third_party/
git clone https://${GIT_TOKEN}@github.com/openai/guided-diffusion.git
cd guided-diffusion/evaluations
wget https://openaipublic.blob.core.windows.net/diffusion/jul-2021/ref_batches/imagenet/256/VIRTUAL_imagenet256_labeled.npz
```
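
Once generated samples have been packed into an `.npz` batch, the evaluator script in `guided-diffusion/evaluations` takes the reference batch and the sample batch as positional arguments. The sketch below wraps that call; the sample file name `samples_256.npz` and the helper names are hypothetical, and it assumes the evaluator's own dependencies (see that directory in the cloned repository) are installed in the active environment.

```python
import subprocess
import sys
from pathlib import Path

# Paths follow the setup commands above.
EVAL_DIR = Path("third_party/guided-diffusion/evaluations")
REF_BATCH = "VIRTUAL_imagenet256_labeled.npz"


def build_eval_command(sample_batch):
    """Return the argv list for: python evaluator.py <ref_batch> <sample_batch>."""
    # Resolve to an absolute path because the evaluator runs from EVAL_DIR.
    sample_path = str(Path(sample_batch).resolve())
    return [sys.executable, "evaluator.py", REF_BATCH, sample_path]


def run_adm_evaluator(sample_batch):
    """Run the evaluator from its own directory so it finds the reference batch."""
    subprocess.run(build_eval_command(sample_batch), cwd=EVAL_DIR, check=True)


# Usage (hypothetical sample file):
# run_adm_evaluator("samples_256.npz")
```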
## Results
### InfinityCC Performance
| Model | Resolution | #Layers | Tokenizer (HF weights🤗) | VAR Model (HF weights🤗) | FID |
|:----------:|:-----:|:--------:|:---------:|:-----------------------------------------------------------------------------------:|:----:|
| InfinityCC | 256 | 12 | [bitvae_l24_xl](https://huggingface.co/zhaoyue-zephyrus/InfinityCC_L24SQ/tree/main/tokenization/infinity_l24_stage1_xl) | [infinitycc_12layer_weights](https://huggingface.co/zhaoyue-zephyrus/InfinityCC_L24SQ/tree/main/generation/infinitycc_12layer_256x256_l24_xl_ep50_cce_zloss_improved_schedule_dion) | 6.66 |
| InfinityCC | 256 | 24 | [bitvae_l24_xl_vf](https://huggingface.co/zhaoyue-zephyrus/InfinityCC_L24SQ/tree/main/tokenization/infinity_l24_stage1_xl_vf) | [infinitycc_24layer_weights](https://huggingface.co/zhaoyue-zephyrus/InfinityCC_L24SQ/tree/main/generation/infinitycc_24layer_256x256_l24_xl_vf_ep350_cce_zloss_improved_schedule_dion_unsharedaln) | 2.21 |
| InfinityCC-2B | 256 | 32 | TBD | TBD | 1.80 |
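
The released checkpoints sit in subfolders of a single Hugging Face repository, so one way to fetch only what you need is `snapshot_download` from `huggingface_hub` with `allow_patterns`. This is a sketch under assumptions: it presumes `huggingface_hub` is installed in your environment, and the helper names and the local `checkpoints` directory are hypothetical. The repository ID and folder names are taken from the links for the 12-layer model in the table above.

```python
REPO_ID = "zhaoyue-zephyrus/InfinityCC_L24SQ"
TOKENIZER_DIR = "tokenization/infinity_l24_stage1_xl"
GENERATOR_DIR = (
    "generation/"
    "infinitycc_12layer_256x256_l24_xl_ep50_cce_zloss_improved_schedule_dion"
)


def checkpoint_patterns(*subdirs):
    """Turn repository subfolders into glob patterns for allow_patterns."""
    return [f"{d.strip('/')}/*" for d in subdirs]


def download_checkpoints(local_dir="checkpoints"):
    """Download only the tokenizer and generator folders; returns the local path."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=checkpoint_patterns(TOKENIZER_DIR, GENERATOR_DIR),
        local_dir=local_dir,
    )


# Usage (requires network access):
# path = download_checkpoints()
```

Swapping in the 24-layer folder names from the table selects the larger model instead.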
## Citation
If our work assists your research, feel free to give us a star ⭐ or cite us using:
```bibtex
@article{zhao2025spherical,
title={Spherical Leech Quantization for Visual Tokenization and Generation},
author={Zhao, Yue and Jiang, Hanwen and Xu, Zhenlin and Yang, Chutong and Adeli, Ehsan and Krähenbühl, Philipp},
journal={arXiv preprint arXiv:2512.14697},
year={2025}
}
```
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.