Add model card for SphereAR

#3
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +37 -0
README.md ADDED
@@ -0,0 +1,37 @@
+ ---
+ pipeline_tag: text-to-image
+ ---
+
+ # SphereAR: Hyperspherical Latents Improve Continuous-Token Autoregressive Generation
+
+ This repository contains the official PyTorch implementation of the paper [Hyperspherical Latents Improve Continuous-Token Autoregressive Generation](https://huggingface.co/papers/2509.24335).
+
+ SphereAR is a simple yet effective approach to continuous-token autoregressive (AR) image generation. It targets issues such as heterogeneous variance in VAE latents, which is amplified during AR decoding. Using hyperspherical VAEs, it constrains all AR inputs and outputs, including those produced after Classifier-Free Guidance (CFG), to lie on a fixed-radius hypersphere (constant $\ell_2$ norm). Removing the scale component in this way stabilizes AR decoding.
+
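The hypersphere constraint described above can be illustrated with a minimal sketch. This is not the repository's code: the function names, the unit radius, and the use of the standard CFG extrapolation formula followed by re-projection are all assumptions made for illustration, and plain Python lists stand in for PyTorch tensors so the snippet has no dependencies.

```python
import math


def project_to_sphere(vec, radius=1.0, eps=1e-12):
    """Rescale vec so its l2 norm equals radius; the direction is unchanged.

    Hypothetical helper: radius=1.0 is an illustrative choice, not a value
    taken from the paper.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    scale = radius / max(norm, eps)  # eps guards against division by zero
    return [x * scale for x in vec]


def guided_token(cond, uncond, cfg_scale, radius=1.0):
    """Standard CFG extrapolation, then re-projection onto the sphere.

    Assumption: guidance is applied as uncond + s * (cond - uncond). The
    result generally leaves the sphere, so it is projected back.
    """
    mixed = [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
    return project_to_sphere(mixed, radius)


# Two toy 3-dimensional "latent tokens", both placed on the unit sphere.
cond = project_to_sphere([3.0, 4.0, 0.0])
uncond = project_to_sphere([0.0, 1.0, 1.0])

token = guided_token(cond, uncond, cfg_scale=3.0)
token_norm = math.sqrt(sum(x * x for x in token))
```

Because the final step always rescales to the same radius, the guidance scale changes only the direction of `token`, never its norm. This is the sense in which the scale component is removed.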
+ The model is a pure next-token AR generator that decodes in raster order. On ImageNet 256×256 generation, SphereAR-H (943M parameters) reaches an FID of 1.34, a new state of the art among AR models.
+
+ For more details, including implementation, training, and evaluation scripts, please refer to the [official GitHub repository](https://github.com/guolinke/SphereAR).
+
+ ## Model Checkpoints
+
+ Pre-trained model checkpoints are available:
+
+ | Name | Params | FID (256×256) | Weights |
+ |---|:---:|:---:|:---:|
+ | S-VAE | 75M | - | [vae.pt](https://huggingface.co/guolinke/SphereAR/blob/main/vae.pt) |
+ | SphereAR-B | 208M | 1.92 | [SphereAR_B.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_B.pt) |
+ | SphereAR-L | 479M | 1.54 | [SphereAR_L.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_L.pt) |
+ | SphereAR-H | 943M | 1.34 | [SphereAR_H.pt](https://huggingface.co/guolinke/SphereAR/blob/main/SphereAR_H.pt) |
+
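The links in the table point at the Hub's file viewer (`/blob/main/`); the raw files are served from the matching `/resolve/main/` path. As a small sketch, the helper below builds those direct download URLs from the file names listed above. The `CHECKPOINTS` mapping and `download_url` function are hypothetical names introduced here, not part of the SphereAR code base.

```python
REPO_ID = "guolinke/SphereAR"

# Checkpoint names and file names as listed in the table above.
CHECKPOINTS = {
    "S-VAE": "vae.pt",
    "SphereAR-B": "SphereAR_B.pt",
    "SphereAR-L": "SphereAR_L.pt",
    "SphereAR-H": "SphereAR_H.pt",
}


def download_url(name, revision="main"):
    """Return the direct-download URL for a checkpoint in the table.

    Uses the Hub's /resolve/<revision>/ path, which serves the raw file
    instead of the HTML viewer page behind /blob/<revision>/.
    """
    filename = CHECKPOINTS[name]  # raises KeyError for unknown names
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


url = download_url("SphereAR-B")
```

The resulting URL can be handed to any download tool. Alternatively, `hf_hub_download(repo_id="guolinke/SphereAR", filename="SphereAR_B.pt")` from the `huggingface_hub` package fetches the same file and caches it locally.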
+ ## Citation
+
+ If you find this work useful, please consider citing the paper:
+
+ ```bibtex
+ @article{ke2025hyperspherical,
+   title={Hyperspherical Latents Improve Continuous-Token Autoregressive Generation},
+   author={Guolin Ke and Hui Xue},
+   journal={arXiv preprint arXiv:2509.24335},
+   year={2025}
+ }
+ ```