Improve model card: add paper link, code link, and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +56 -3
README.md CHANGED
@@ -1,3 +1,56 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ pipeline_tag: image-to-image
4
+ ---
5
+
6
+ # Geometric Autoencoder for Diffusion Models (GAE)
7
+
8
+ Geometric Autoencoder (GAE) is a principled framework that replaces heuristic latent-space design in Latent Diffusion Models (LDMs) with a systematic one. It improves semantic discriminability and latent compactness without compromising reconstruction fidelity.
9
+
10
+ - **Paper:** [Geometric Autoencoder for Diffusion Models](https://huggingface.co/papers/2603.10365)
11
+ - **Code:** [GitHub Repository](https://github.com/freezing-index/Geometric-Autoencoder-for-Diffusion-Models)
12
+
13
+ ## Overview
14
+
15
+ GAE introduces three core innovations:
16
+ 1. **Latent Normalization**: Replaces the restrictive KL-divergence of standard VAEs with **RMSNorm** regularization. By projecting features onto a unit hypersphere, GAE provides a stable, scalable latent manifold optimized for diffusion learning.
17
+ 2. **Latent Alignment**: Leverages Vision Foundation Models (VFMs, e.g., DINOv2) as semantic teachers. Through a carefully designed semantic downsampler, the low-dimensional latent vectors directly inherit strong discriminative semantic priors.
18
+ 3. **Dynamic Noise Sampling**: Specifically addresses the high-intensity noise typical in diffusion processes, ensuring robust reconstruction performance even under extreme noise levels.
19
+
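To make the latent normalization item above concrete, here is a minimal sketch of RMS normalization on a single latent vector, in plain Python. It is an illustration only, not the authors' implementation: the function names `rms_normalize` and `l2_norm` and the `eps` value are assumptions. Note that plain RMSNorm without a learned gain places every vector on a hypersphere of radius sqrt(d); dividing by sqrt(d) as well would give unit length. Which scaling GAE uses is not stated in this card.

```python
import math


def rms_normalize(latent, eps=1e-6):
    """Scale a latent vector so its root-mean-square value is approximately 1.

    Hypothetical helper for illustration; eps guards against division by zero.
    """
    mean_sq = sum(v * v for v in latent) / len(latent)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in latent]


def l2_norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(v * v for v in vec))


z = [0.5, -2.0, 3.0, 1.5]
z_hat = rms_normalize(z)

# After normalization the length depends only on the dimension d = len(z),
# not on the original magnitude of z: it is close to sqrt(4) = 2.0 here.
print(l2_norm(z_hat))
```

Because every normalized latent has the same length, the constraint fixes the scale of the latent space without a KL term pulling it toward a Gaussian prior, which is the property the list item describes.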
20
+ ## Model Zoo
21
+
22
+ | Model | Epochs | Latent Dim | gFID (w/o CFG) | Weights |
23
+ | :--- | :---: | :---: | :---: | :---: |
24
+ | **GAE-LightningDiT-XL** | 80 | 32 | 1.82 | [🔗 Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
25
+ | **GAE-LightningDiT-XL** | 800 | 32 | 1.31 | [🔗 Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
26
+ | **GAE** | 200 | 32 | - | [🔗 Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
27
+
28
+ ## Usage
29
+
30
+ ### 1. Installation
31
+ ```bash
32
+ git clone https://github.com/freezing-index/Geometric-Autoencoder-for-Diffusion-Models.git
33
+ cd Geometric-Autoencoder-for-Diffusion-Models
34
+ conda create -n gae python=3.10.12
35
+ conda activate gae
36
+ pip install -r requirements.txt
37
+ ```
38
+
39
+ ### 2. Inference (Sampling)
40
+ Download the pre-trained weights from Hugging Face and place them in the `checkpoints/` folder. Ensure you update the paths in the `configs/` folder to match your local setup.
41
+
42
+ For class-uniform sampling:
43
+ ```bash
44
+ bash inference_gae.sh "$DIT_CONFIG" "$VAE_CONFIG"
45
+ ```
46
+
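The weight download step in the inference instructions could be scripted roughly as follows. This is a sketch under assumptions: it uses `snapshot_download` from the `huggingface_hub` package (which must be installed separately), takes the repository ID and the `checkpoints/d32` path from the Model Zoo links, and assumes the code expects that same directory layout locally. The helper names are hypothetical.

```python
def checkpoint_download_kwargs(latent_dim=32, local_dir="."):
    """Build the arguments for huggingface_hub.snapshot_download.

    Pure helper with no network access. The repo ID and path pattern are
    taken from the Model Zoo links; the local layout is an assumption.
    """
    return {
        "repo_id": "GK50/GAE-Checkpoints",
        # Only fetch the files for the requested latent dimension.
        "allow_patterns": [f"checkpoints/d{latent_dim}/*"],
        # snapshot_download mirrors the repo's folder structure under local_dir.
        "local_dir": local_dir,
    }


def download_checkpoints(latent_dim=32, local_dir="."):
    """Download the checkpoint files and return the local snapshot path."""
    # Imported lazily so the helper above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(**checkpoint_download_kwargs(latent_dim, local_dir))
```

Calling `download_checkpoints()` from the repository root should place the files under `checkpoints/d32/`. The paths in `configs/` still need to be updated by hand to match.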
47
+ ## Citation
48
+
49
+ ```bibtex
50
+ @article{liu2026geometric,
51
+ title={Geometric Autoencoder for Diffusion Models},
52
+ author={Hangyu Liu and Jianyong Wang and Yutao Sun},
53
+ journal={arXiv preprint arXiv:2603.10365},
54
+ year={2026}
55
+ }
56
+ ```