---
license: apache-2.0
pipeline_tag: image-to-image
---

# Geometric Autoencoder for Diffusion Models (GAE)

Geometric Autoencoder (GAE) is a principled framework designed to systematically address the heuristic nature of latent space design in Latent Diffusion Models (LDMs). GAE significantly enhances semantic discriminability and latent compactness without compromising reconstruction fidelity.

- **Paper:** [Geometric Autoencoder for Diffusion Models](https://huggingface.co/papers/2603.10365)
- **Code:** [GitHub Repository](https://github.com/freezing-index/Geometric-Autoencoder-for-Diffusion-Models)

## Overview

GAE introduces three core innovations:
1. **Latent Normalization**: Replaces the restrictive KL-divergence of standard VAEs with **RMSNorm** regularization. By projecting features onto a unit hypersphere, GAE provides a stable, scalable latent manifold optimized for diffusion learning.
2. **Latent Alignment**: Leverages Vision Foundation Models (VFMs, e.g., DINOv2) as semantic teachers. Through a carefully designed semantic downsampler, the low-dimensional latent vectors directly inherit strong discriminative semantic priors.
3. **Dynamic Noise Sampling**: Specifically addresses the high-intensity noise typical in diffusion processes, ensuring robust reconstruction performance even under extreme noise levels.
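
The Latent Normalization idea can be illustrated with a minimal pure-Python sketch. This is not the repository's implementation: the function names, the `eps` value, and the example vector are assumptions chosen for illustration. It applies RMSNorm without a learnable gain, which rescales each latent vector to unit root-mean-square, so every vector of dimension `d` ends up on a hypersphere of L2 radius `sqrt(d)`. Whether GAE additionally rescales to exactly unit L2 norm is not stated here.

```python
import math

def rms_normalize(latent, eps=1e-6):
    """Rescale a latent vector to (approximately) unit root-mean-square.

    Illustrative RMSNorm without a learnable gain; eps is an assumed
    stabilizer, not a value taken from the GAE code.
    """
    mean_square = sum(x * x for x in latent) / len(latent)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [x * scale for x in latent]

def l2_norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))

# Hypothetical 32-dimensional latent (the Model Zoo lists latent dim 32).
latent = [0.1 * (i - 16) for i in range(32)]
normalized = rms_normalize(latent)

# After normalization the L2 norm is close to sqrt(d), whatever the input scale.
print(l2_norm(normalized), math.sqrt(len(latent)))
```

Because the output norm no longer depends on the input scale, the latent distribution has a fixed radius by construction rather than being pulled toward a prior by a KL term.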

## Model Zoo

| Model | Epochs | Latent Dim | gFID (w/o CFG) | Weights |
| :--- | :---: | :---: | :---: | :---: |
| **GAE-LightningDiT-XL** | 80 | 32 | 1.82 | [🔗 Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
| **GAE-LightningDiT-XL** | 800 | 32 | 1.31 | [🔗 Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
| **GAE** | 200 | 32 | - | [🔗 Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |

## Usage

### 1. Installation
```bash
git clone https://github.com/freezing-index/Geometric-Autoencoder-for-Diffusion-Models.git
cd Geometric-Autoencoder-for-Diffusion-Models
conda create -n gae python=3.10.12
conda activate gae
pip install -r requirements.txt
```

### 2. Inference (Sampling)
Download the pre-trained weights from Hugging Face and place them in the `checkpoints/` folder. Ensure you update the paths in the `configs/` folder to match your local setup.
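
As one possible way to fetch the weights, the sketch below uses `snapshot_download` from the `huggingface_hub` package. The repository id and the `checkpoints/d32` sub-folder are taken from the Model Zoo links; the helper name `download_gae_checkpoints` is hypothetical, and the file layout inside that folder is not assumed.

```python
REPO_ID = "GK50/GAE-Checkpoints"        # from the Model Zoo links
ALLOW_PATTERNS = ["checkpoints/d32/*"]  # sub-folder linked in the Model Zoo

def download_gae_checkpoints(local_dir="."):
    """Download only the d32 checkpoint folder and return the local path.

    Hypothetical helper: requires `pip install huggingface_hub` and
    network access when it is called.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=ALLOW_PATTERNS,
        local_dir=local_dir,
    )
```

Calling `download_gae_checkpoints()` from the repository root should place the files under `checkpoints/d32/`, since the Hub folder structure is preserved inside `local_dir`. The paths in `configs/` still need to be updated by hand.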

For class-uniform sampling:
```bash
bash inference_gae.sh $DIT_CONFIG $VAE_CONFIG
```

## Citation

```bibtex
@article{liu2026geometric,
  title={Geometric Autoencoder for Diffusion Models},
  author={Hangyu Liu and Jianyong Wang and Yutao Sun},
  journal={arXiv preprint arXiv:2603.10365},
  year={2026}
}
```