Enhance model card for Neon: Add metadata, links, abstract, benchmarks, and usage
This Pull Request updates the model card for the "Neon: Negative Extrapolation From Self-Training Improves Image Generation" repository.
Key changes include:
- **Metadata:** Added `pipeline_tag: unconditional-image-generation` for better discoverability. The `library_name` has been omitted as the original repository does not provide direct integration with Hugging Face libraries (e.g., `diffusers`) for automated usage snippets.
- **Content:**
  - The model card now features the full paper title as the main heading, along with a direct link to the paper ([Neon: Negative Extrapolation From Self-Training Improves Image Generation](https://huggingface.co/papers/2510.03597)).
  - A link to the official GitHub repository ([https://github.com/SinaAlemohammad/Neon](https://github.com/SinaAlemohammad/Neon)) has been included.
  - An introductory section summarizing the paper's abstract and contributions has been added.
  - The "Benchmark Performance" table from the GitHub README is now included, showcasing model results across various architectures and datasets.
  - The comprehensive "Quickstart" section, detailing environment setup, model download, and evaluation procedures (with original bash commands), has been incorporated to guide users on how to run the models.
  - Additional sections like "Toy Experiment", "Repository Map", "Citation", "Contact", and "Acknowledgments" have been added to provide complete context from the original GitHub README.
These changes aim to make the model card more informative, discoverable, and aligned with best practices for documenting AI artifacts on the Hugging Face Hub.
---
license: mit
pipeline_tag: unconditional-image-generation
---

# Neon: Negative Extrapolation From Self-Training Improves Image Generation

This repository contains the models and code presented in the paper [Neon: Negative Extrapolation From Self-Training Improves Image Generation](https://huggingface.co/papers/2510.03597).

The official PyTorch implementation and code can be found at the [GitHub repository](https://github.com/SinaAlemohammad/Neon).

## About Neon

Scaling generative AI models is bottlenecked by the scarcity of high-quality training data. The ease of synthesizing from a generative model suggests using (unverified) synthetic data to augment a limited corpus of real data for fine-tuning, in the hope of improving performance. Unfortunately, the resulting positive feedback loop leads to model autophagy disorder (MAD, aka model collapse), which causes a rapid degradation in sample quality and/or diversity.

Neon (for Negative Extrapolation frOm self-traiNing) introduces a new learning method that turns the degradation from self-training into a powerful signal for self-improvement. Given a base model, Neon first fine-tunes it on its own self-synthesized data but then, counterintuitively, reverses its gradient updates to extrapolate away from the degraded weights. This approach corrects the predictable anti-alignment between synthetic and real data population gradients, leading to better alignment with the true data distribution. Neon is remarkably easy to implement via a simple post-hoc merge that requires no new real data, works effectively with as few as 1k synthetic samples, and typically uses less than 1% additional training compute. It demonstrates universality across a range of architectures (diffusion, flow matching, autoregressive, and inductive moment matching models) and datasets (ImageNet, CIFAR-10, and FFHQ). On ImageNet 256x256, Neon elevates the xAR-L model to a new state-of-the-art FID of 1.02 with only 0.36% additional training compute.

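In weight space, the post-hoc merge can be read as taking the base weights and stepping away from the self-trained (degraded) weights. Below is a minimal PyTorch sketch of that extrapolation, assuming two state dicts with matching keys; the merge coefficient `w` and the file names are illustrative placeholders (the released checkpoints above are already merged), and in practice the coefficient is tuned per model.

```python
import torch

def neon_merge(base_state: dict, self_state: dict, w: float = 0.5) -> dict:
    """Negative extrapolation: theta_neon = theta_base + w * (theta_base - theta_self).

    `base_state` is the base model's state dict, `self_state` the state dict after
    fine-tuning on self-synthesized data, and `w` a hypothetical merge coefficient.
    """
    merged = {}
    for name, theta_base in base_state.items():
        theta_self = self_state[name]
        if torch.is_floating_point(theta_base):
            merged[name] = theta_base + w * (theta_base - theta_self)
        else:
            # Non-float buffers (e.g. step counters) are copied from the base model.
            merged[name] = theta_base.clone()
    return merged

# Illustrative usage with placeholder file names:
# base = torch.load("base.pth", map_location="cpu")
# self_trained = torch.load("self_trained.pth", map_location="cpu")
# torch.save(neon_merge(base, self_trained, w=0.5), "neon.pth")
```

Setting `w = 0` recovers the base model; increasing `w` extrapolates further away from the self-trained weights, which is the "reversed gradient update" described above.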
## Benchmark Performance

| Model type | Dataset | Base model FID | Neon FID (paper) | Download model |
| :---------- | :--------- | -------------: | ---------------: | :--------------- |
| xAR-L | ImageNet‑256 | 1.28 | **1.02** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_xARL_imagenet256.pth) |
| xAR-B | ImageNet‑256 | 1.72 | **1.31** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_xARB_imagenet256.pth) |
| VAR d16 | ImageNet‑256 | 3.30 | **2.01** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_VARd16_imagenet256.pth) |
| VAR d36 | ImageNet‑512 | 2.63 | **1.70** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_VARd36_imagenet512.pth) |
| EDM (cond.) | CIFAR‑10 (32×32) | 1.78 | **1.38** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_EDM_conditional_CIFAR10.pkl) |
| EDM (uncond.) | CIFAR‑10 (32×32) | 1.98 | **1.38** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_EDM_unconditional_CIFAR10.pkl) |
| EDM | FFHQ‑64×64 | 2.39 | **1.12** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_EDM_FFHQ.pkl) |
| IMM | ImageNet‑256 | 1.99 | **1.46** | [Download](https://huggingface.co/sinaalemohammad/Neon/resolve/main/Neon_imm_imagenet256.pkl) |

## 🚀 Quickstart

### 1) Environment

```bash
# from repo root
conda env create -f environment.yml
conda activate neon
```

### 2) Download pretrained models & FID stats

```bash
bash download_models.sh
```

This populates `checkpoints/` and `fid_stats/`.

**Pretrained Neon models can also be downloaded from Hugging Face:** [https://huggingface.co/sinaalemohammad/Neon](https://huggingface.co/sinaalemohammad/Neon)

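If you only need a single checkpoint, a minimal sketch using the `huggingface_hub` Python client (assuming it is installed; the filename below is the xAR-L entry from the benchmark table) is:

```python
from huggingface_hub import hf_hub_download

# Download one Neon checkpoint from the Hub into the local checkpoints/ directory.
ckpt_path = hf_hub_download(
    repo_id="sinaalemohammad/Neon",
    filename="Neon_xARL_imagenet256.pth",
    local_dir="checkpoints",
)
print(ckpt_path)
```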
### 3) Evaluate (FID/IS)

> All examples assume 8 GPUs; adjust `--nproc_per_node` / batch sizes as needed.

**xAR @ ImageNet‑256**

```bash
# 1) VAE for xAR (credit: MAR)
hf download xwen99/mar-vae-kl16 --include kl16.ckpt --local-dir xAR/pretrained
# 2) Use it via:
#    --vae_path xAR/pretrained/kl16.ckpt

# xAR‑L
PYTHONPATH=xAR torchrun --standalone --nproc_per_node=8 xAR/calculate_fid.py \
  --model xar_large \
  --model_ckpt checkpoints/Neon_xARL_imagenet256.pth \
  --cfg 2.3 --vae_path xAR/pretrained/kl16.ckpt \
  --num_images 50000 --batch_size 64 --flow_steps 40 --img_size 256 \
  --fid_stats fid_stats/adm_in256_stats.npz

# xAR‑B
PYTHONPATH=xAR torchrun --standalone --nproc_per_node=8 xAR/calculate_fid.py \
  --model xar_base \
  --model_ckpt checkpoints/Neon_xARB_imagenet256.pth \
  --cfg 2.7 --vae_path xAR/pretrained/kl16.ckpt \
  --num_images 50000 --batch_size 32 --flow_steps 50 --img_size 256 \
  --fid_stats fid_stats/adm_in256_stats.npz
```

**VAR @ ImageNet‑256 / 512**

```bash
# d16 @ 256
PYTHONPATH=VAR/VAR_imagenet_256 torchrun --standalone --nproc_per_node=8 \
  VAR/VAR_imagenet_256/calculate_fid.py \
  --var_ckpt checkpoints/Neon_VARd16_imagenet256.pth \
  --num_images 50000 --batch_size 64 --img_size 256 \
  --fid_stats fid_stats/adm_in256_stats.npz

# d36 @ 512
PYTHONPATH=VAR/VAR_imagenet_512 torchrun --standalone --nproc_per_node=8 \
  VAR/VAR_imagenet_512/calculate_fid.py \
  --var_ckpt checkpoints/Neon_VARd36_imagenet512.pth \
  --num_images 50000 --batch_size 32 --img_size 512 \
  --fid_stats fid_stats/adm_in512_stats.npz
```

**EDM (Karras et al.) @ CIFAR‑10 / FFHQ**

```bash
# CIFAR‑10 (conditional)
PYTHONPATH=edm torchrun --standalone --nproc_per_node=8 edm/calculate_fid.py \
  --network_pkl checkpoints/Neon_EDM_conditional_CIFAR10.pkl \
  --ref https://nvlabs-fi-cdn.nvidia.com/edm/fid-refs/cifar10-32x32.npz \
  --seeds 0-49999 --max_batch_size 256 --num_steps 18

# CIFAR‑10 (unconditional)
PYTHONPATH=edm torchrun --standalone --nproc_per_node=8 edm/calculate_fid.py \
  --network_pkl checkpoints/Neon_EDM_unconditional_CIFAR10.pkl \
  --ref https://nvlabs-fi-cdn.nvidia.com/edm/fid-refs/cifar10-32x32.npz \
  --seeds 0-49999 --max_batch_size 256 --num_steps 18

# FFHQ‑64 (unconditional)
PYTHONPATH=edm torchrun --standalone --nproc_per_node=8 edm/calculate_fid.py \
  --network_pkl checkpoints/Neon_EDM_FFHQ.pkl \
  --ref https://nvlabs-fi-cdn.nvidia.com/edm/fid-refs/ffhq-64x64.npz \
  --seeds 0-49999 --max_batch_size 256 --num_steps 40
```

**IMM @ ImageNet‑256**

```bash
# IMM @ T = 8
PYTHONPATH=imm torchrun --standalone --nproc_per_node=8 imm/calculate_fid.py \
  --model_ckpt checkpoints/Neon_IMM_imagenet256.pth \
  --num_images 50000 --batch_size 64 --img_size 256 \
  --fid_stats fid_stats/adm_in256_stats.npz
```

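For reference, the FID reported by these scripts is the Fréchet distance between Gaussian fits of Inception features for generated and reference images. A minimal NumPy/SciPy sketch of that formula, shown here only for intuition and not taken from the repository's evaluation code:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        # Discard tiny imaginary parts caused by numerical error.
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```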
---

## 🧪 Toy Experiment (2D Gaussian)

A minimal, visual demo of Neon in action:

* File: `toy_appendix.ipynb`
* **What it does**: learns a 2D Gaussian with (i) a tiny diffusion model and (ii) a tiny autoregressive model, then applies Neon to show how the reverse‑merge restores coverage. Great for building intuition.

---

## 🗺️ Repository Map

```
Neon/
├── VAR/                 # VAR baselines + eval scripts
├── xAR/                 # xAR baselines + eval scripts (uses MAR VAE)
├── edm/                 # EDM baselines + metrics/scripts
├── imm/                 # IMM baselines + eval scripts
├── toy_appendix.ipynb   # 2D Gaussian toy example (diffusion & AR)
├── download_models.sh   # Grab all checkpoints + FID refs
├── environment.yml      # Reproducible env
└── checkpoints/, fid_stats/   # created by the script
```

---

## 📣 Citation

If you find Neon useful, please consider citing the paper:

```bibtex
@article{neon2025,
  title={Neon: Negative Extrapolation from Self-Training for Generative Models},
  author={Alemohammad, Sina and collaborators},
  journal={arXiv preprint},
  year={2025}
}
```

---

## Contact

Questions? Reach out to **Sina Alemohammad** — [sinaalemohammad@gmail.com](mailto:sinaalemohammad@gmail.com).

---

## Acknowledgments

This repository builds upon and thanks the following projects:

* [VAR — Visual AutoRegressive Modeling](https://github.com/FoundationVision/VAR)
* [xAR — Beyond Next‑Token: Next‑X Prediction](https://github.com/OliverRensu/xAR)
* [IMM — Inductive Moment Matching](https://github.com/lumalabs/imm)
* [EDM — Elucidating the Design Space of Diffusion Models](https://github.com/NVlabs/edm)
* [MAR VAE (KL‑16) tokenizer](https://huggingface.co/xwen99/mar-vae-kl16)