Improve model card: add paper link, code link, and metadata
#1, opened by nielsr (HF Staff)
README.md (changed)
---
license: apache-2.0
pipeline_tag: image-to-image
---

# Geometric Autoencoder for Diffusion Models (GAE)

Geometric Autoencoder (GAE) is a principled framework that replaces the heuristic latent-space design of Latent Diffusion Models (LDMs) with a systematic approach. It significantly enhances semantic discriminability and latent compactness without compromising reconstruction fidelity.

- **Paper:** [Geometric Autoencoder for Diffusion Models](https://huggingface.co/papers/2603.10365)
- **Code:** [GitHub Repository](https://github.com/freezing-index/Geometric-Autoencoder-for-Diffusion-Models)

## Overview

GAE introduces three core innovations:

1. **Latent Normalization**: Replaces the restrictive KL-divergence regularization of standard VAEs with **RMSNorm**. By projecting features onto a unit hypersphere, GAE provides a stable, scalable latent manifold optimized for diffusion learning.
2. **Latent Alignment**: Leverages Vision Foundation Models (VFMs, e.g., DINOv2) as semantic teachers. Through a carefully designed semantic downsampler, the low-dimensional latent vectors directly inherit strong discriminative semantic priors.
3. **Dynamic Noise Sampling**: Addresses the high-intensity noise typical of diffusion processes, ensuring robust reconstruction performance even under extreme noise levels.

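To make the first two ideas concrete, here is a minimal NumPy sketch, not the repository's actual implementation: the function names, the negative-cosine-similarity form of the alignment objective, and the assumption that teacher features are already downsampled to the latent shape are all illustrative choices.

```python
import numpy as np

def rmsnorm(z, eps=1e-6):
    # RMS-normalize each latent vector: z / sqrt(mean(z^2) + eps).
    # This places every vector on a hypersphere of radius ~sqrt(dim).
    rms = np.sqrt(np.mean(z ** 2, axis=-1, keepdims=True) + eps)
    return z / rms

def alignment_loss(z, teacher_feats):
    # Negative mean cosine similarity between latents and teacher
    # features (e.g. DINOv2 outputs after a semantic downsampler).
    z_dir = z / np.linalg.norm(z, axis=-1, keepdims=True)
    t_dir = teacher_feats / np.linalg.norm(teacher_feats, axis=-1, keepdims=True)
    return -np.mean(np.sum(z_dir * t_dir, axis=-1))

rng = np.random.default_rng(0)
z = rmsnorm(rng.normal(size=(4, 32)))  # 32-dim latents, as in the model zoo
print(np.linalg.norm(z, axis=-1))      # each row norm ≈ sqrt(32) ≈ 5.657
print(alignment_loss(z, z))            # perfectly aligned case ≈ -1.0
```

Unlike a KL term, which pulls latents toward a fixed Gaussian, the RMSNorm constraint only fixes their scale, leaving directions free to carry the semantics that the alignment loss distills from the teacher.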
## Model Zoo

| Model | Epochs | Latent Dim | gFID (w/o CFG) | Weights |
| :--- | :---: | :---: | :---: | :---: |
| **GAE-LightningDiT-XL** | 80 | 32 | 1.82 | [Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
| **GAE-LightningDiT-XL** | 800 | 32 | 1.31 | [Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |
| **GAE** | 200 | 32 | - | [Checkpoints](https://huggingface.co/GK50/GAE-Checkpoints/tree/main/checkpoints/d32) |

## Usage

### 1. Installation

```bash
git clone https://github.com/freezing-index/Geometric-Autoencoder-for-Diffusion-Models.git
cd Geometric-Autoencoder-for-Diffusion-Models
conda create -n gae python=3.10.12
conda activate gae
pip install -r requirements.txt
```

### 2. Inference (Sampling)

Download the pre-trained weights from Hugging Face and place them in the `checkpoints/` folder. Ensure you update the paths in the `configs/` folder to match your local setup.

For class-uniform sampling:

```bash
bash inference_gae.sh $DIT_CONFIG $VAE_CONFIG
```

## Citation

```bibtex
@article{liu2026geometric,
  title   = {Geometric Autoencoder for Diffusion Models},
  author  = {Hangyu Liu and Jianyong Wang and Yutao Sun},
  journal = {arXiv preprint arXiv:2603.10365},
  year    = {2026}
}
```