	---
	pipeline_tag: text-to-image
	---

	# SphereAR: Hyperspherical Latents Improve Continuous-Token Autoregressive Generation

	This repository contains the official PyTorch implementation of the paper [Hyperspherical Latents Improve Continuous-Token Autoregressive Generation](https://huggingface.co/papers/2509.24335).

	SphereAR is a simple yet effective approach to continuous-token autoregressive (AR) image generation. VAE latents have heterogeneous variance, and AR decoding amplifies it. SphereAR addresses this by constraining all AR inputs and outputs, including those produced after Classifier-Free Guidance (CFG), to lie on a fixed-radius hypersphere (constant $\ell_2$ norm) using hyperspherical VAEs. Removing the scale component in this way stabilizes AR decoding.
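
The constraint described above can be illustrated with a minimal sketch. It is not the official implementation: the function names `project_to_sphere` and `cfg_on_sphere`, the radius value, and the use of the common CFG extrapolation `uncond + scale * (cond - uncond)` are all assumptions made for illustration, and plain Python lists stand in for PyTorch tensors so the example has no dependencies.

```python
import math


def l2_norm(vec):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def project_to_sphere(vec, radius):
    """Rescale vec so its L2 norm equals radius; the direction is unchanged."""
    norm = l2_norm(vec)
    if norm == 0.0:
        raise ValueError("cannot project the zero vector onto a sphere")
    return [x * radius / norm for x in vec]


def cfg_on_sphere(cond, uncond, guidance_scale, radius):
    """Apply a common form of CFG, then project the result back onto the sphere.

    The extrapolation step is an assumed, generic CFG formula; it generally
    moves the vector off the sphere, so the projection restores the norm.
    """
    guided = [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
    return project_to_sphere(guided, radius)


# Hypothetical 4-dimensional tokens and radius, for illustration only.
radius = 2.0
cond = project_to_sphere([3.0, 4.0, 0.0, 0.0], radius)
uncond = project_to_sphere([0.0, 4.0, 3.0, 0.0], radius)
guided = cfg_on_sphere(cond, uncond, guidance_scale=3.0, radius=radius)
print(l2_norm(guided))
```

Because only the direction of each token survives the projection, every vector fed back into the AR model has the same norm regardless of the guidance scale. For the actual latent dimension, radius, and CFG schedule, see the official GitHub repository.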

	The model is a pure next-token AR generator that decodes in raster order. On ImageNet 256×256 generation, SphereAR-H (943M parameters) sets a new state of the art for AR models with an FID of 1.34.

	For more details, including implementation, training, and evaluation scripts, please refer to the [official GitHub repository](https://github.com/guolinke/SphereAR).

	## Model Checkpoints

	Pre-trained model checkpoints are available:

	| Name | Params | FID (256×256) | Weights |
	|---|:---:|:---:|:---:|
	| S-VAE | 75M | - | [vae.pt](https://huggingface.co/guolinke/SphereAR/blob/main/vae.pt) |
	| SphereAR-B | 208M | 1.92 | [SphereAR_B.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_B.pt) |
	| SphereAR-L | 479M | 1.54 | [SphereAR_L.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_L.pt) |
	| SphereAR-H | 943M | 1.34 | [SphereAR_H.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_H.pt) |
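
The links in the table point to the Hub's file viewer (`/blob/main/`). A minimal sketch for fetching the raw files from a script is shown below; it relies on the Hub's `/resolve/<revision>/` URL pattern for direct downloads. The helper names `checkpoint_url` and `download_checkpoint` are hypothetical, and loading the weights into a model is not covered here (see the official GitHub repository for that).

```python
import urllib.request

REPO_ID = "guolinke/SphereAR"

# File names taken from the checkpoint table above.
CHECKPOINTS = {
    "S-VAE": "vae.pt",
    "SphereAR-B": "SphereAR_B.pt",
    "SphereAR-L": "SphereAR_L.pt",
    "SphereAR-H": "SphereAR_H.pt",
}


def checkpoint_url(name, revision="main"):
    """Build the direct-download URL for a checkpoint listed in the table."""
    filename = CHECKPOINTS[name]
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


def download_checkpoint(name, dest=None):
    """Download a checkpoint to dest (defaults to the file name) and return the path."""
    dest = dest or CHECKPOINTS[name]
    urllib.request.urlretrieve(checkpoint_url(name), dest)
    return dest


print(checkpoint_url("SphereAR-B"))
```

Calling `download_checkpoint("SphereAR-B")` would save `SphereAR_B.pt` to the current directory; the AR checkpoints are presumably used together with the S-VAE weights in `vae.pt`, so both files are likely needed for generation.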

	## Citation

	If you find this work useful, please consider citing the paper:

	```bibtex
	@article{ke2025hyperspherical,
	title={Hyperspherical Latents Improve Continuous-Token Autoregressive Generation},
	author={Guolin Ke and Hui Xue},
	journal={arXiv preprint arXiv:2509.24335},
	year={2025}
	}
	```