---
pipeline_tag: text-to-image
---

# SphereAR: Hyperspherical Latents Improve Continuous-Token Autoregressive Generation

This repository contains the official PyTorch implementation of the paper [Hyperspherical Latents Improve Continuous-Token Autoregressive Generation](https://huggingface.co/papers/2509.24335).

SphereAR proposes a simple yet effective approach to continuous-token autoregressive (AR) image generation. It addresses the heterogeneous variance of VAE latents, which is amplified during AR decoding, by constraining all AR inputs and outputs (including those produced after Classifier-Free Guidance, CFG) to lie on a fixed-radius hypersphere, i.e. to have a constant $\ell_2$ norm, using hyperspherical VAEs. Removing the scale component in this way stabilizes AR decoding.

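As a rough illustration of the constraint described above, the sketch below rescales a latent vector to a fixed $\ell_2$ norm and re-applies that projection after a CFG step. It is written in plain Python rather than PyTorch so that it runs without dependencies. The function names, the `radius` value, and the use of the standard CFG extrapolation formula are assumptions made for illustration; the repository's actual implementation may differ.

```python
import math


def l2_norm(v):
    """Return the Euclidean (l2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def project_to_sphere(v, radius=1.0, eps=1e-12):
    """Rescale v so that its l2 norm equals `radius` (hypothetical helper).

    Only the direction of v is kept; its scale component is discarded.
    """
    scale = radius / max(l2_norm(v), eps)
    return [x * scale for x in v]


def cfg_on_sphere(cond, uncond, guidance_scale, radius=1.0):
    """Apply standard CFG extrapolation, then project back onto the sphere.

    Assumption: guidance is uncond + s * (cond - uncond). The extrapolated
    vector generally leaves the sphere, so it is re-projected afterwards.
    """
    guided = [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
    return project_to_sphere(guided, radius)
```

For example, `cfg_on_sphere([1.0, 0.0], [0.0, 1.0], guidance_scale=3.0, radius=2.0)` first extrapolates to `[3.0, -2.0]` and then rescales that vector so its norm is 2 again. Every token the AR model consumes therefore has the same scale.
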
The model is a pure next-token AR generator that decodes tokens in raster order. On ImageNet 256×256 generation, SphereAR-H (943M parameters) sets a new state of the art among AR models, reaching an FID of 1.34.

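To make "next-token generation in raster order" concrete, here is a minimal sketch of the decoding loop over a token grid. The `predict_next` callback is a hypothetical stand-in for the AR model, and the grid size is arbitrary; neither is taken from the repository.

```python
def raster_order(height, width):
    """Yield (row, col) positions left to right, top to bottom."""
    for row in range(height):
        for col in range(width):
            yield row, col


def generate_raster(height, width, predict_next):
    """Fill a height x width token grid in raster order.

    `predict_next` (hypothetical) receives the list of tokens generated so
    far and returns the next token, so each step sees only earlier positions.
    """
    tokens = []
    for _ in raster_order(height, width):
        tokens.append(predict_next(tokens))
    # Reshape the flat sequence back into rows.
    return [tokens[r * width:(r + 1) * width] for r in range(height)]
```

Calling `generate_raster(2, 3, lambda prev: len(prev))` fills the grid with each token's position in the sequence, which makes the visiting order visible.
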
For more details, including implementation, training, and evaluation scripts, please refer to the [official GitHub repository](https://github.com/guolinke/SphereAR).

## Model Checkpoints

Pre-trained model checkpoints are available:

| Name | Params | FID (256×256) | Weights |
|---|:---:|:---:|:---:|
| S-VAE | 75M | - | [vae.pt](https://huggingface.co/guolinke/SphereAR/blob/main/vae.pt) |
| SphereAR-B | 208M | 1.92 | [SphereAR_B.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_B.pt) |
| SphereAR-L | 479M | 1.54 | [SphereAR_L.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_L.pt) |
| SphereAR-H | 943M | 1.34 | [SphereAR_H.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_H.pt) |

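The links in the table point to the Hub's file viewer (`/blob/`) pages. One possible way to fetch a checkpoint from a script is sketched below, relying on the Hugging Face Hub convention that the same path with `/resolve/` serves the raw file. The helper names and the destination path are hypothetical, and the sketch does not cover loading the weights into a model.

```python
from urllib.request import urlretrieve


def direct_download_url(blob_url):
    """Turn a Hub file-viewer URL (/blob/) into a raw-file URL (/resolve/)."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def download_checkpoint(blob_url, destination):
    """Download the file behind `blob_url` to `destination` (hypothetical helper)."""
    urlretrieve(direct_download_url(blob_url), destination)
    return destination
```

For instance, `download_checkpoint("https://huggingface.co/guolinke/SphereAR/blob/main/vae.pt", "vae.pt")` would save the S-VAE weights to the current directory.
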
## Citation

If you find this work useful, please consider citing the paper:

```bibtex
@article{ke2025hyperspherical,
  title={Hyperspherical Latents Improve Continuous-Token Autoregressive Generation},
  author={Guolin Ke and Hui Xue},
  journal={arXiv preprint arXiv:2509.24335},
  year={2025}
}
```