Add model card and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +38 -3
README.md CHANGED
@@ -1,3 +1,38 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: image-to-image
+ ---
+
+ # UAE: Unified Autoencoding
+
+ This repository contains the weights for **Unified Autoencoding (UAE)**, introduced in the paper [The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding](https://huggingface.co/papers/2512.19693).
+
+ UAE harmonizes semantic structure and pixel-level detail through a frequency-band modulator. Following the "Prism Hypothesis," the model unifies semantic abstraction (low-frequency) and pixel-level fidelity (high-frequency) in a single latent space. The paper reports state-of-the-art performance on ImageNet and MS-COCO benchmarks.
+
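The low-frequency versus high-frequency distinction in the description above can be illustrated with a plain Fourier band split. The following is a minimal sketch, not the UAE frequency-band modulator: it applies a naive 1-D discrete Fourier transform to a toy signal, and the names `dft`, `idft`, `split_bands`, and the `cutoff` parameter are hypothetical choices made for this illustration.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(signal, cutoff):
    """Split a real signal into low- and high-frequency parts.

    Bins whose distance from the DC bin is <= cutoff go to the low band;
    all remaining bins go to the high band. (Illustrative assumption only.)
    """
    n = len(signal)
    spectrum = dft(signal)
    low_spec = []
    for k, coeff in enumerate(spectrum):
        freq = min(k, n - k)  # bins k and n-k are the same frequency, mirrored
        low_spec.append(coeff if freq <= cutoff else 0)
    high_spec = [c - l for c, l in zip(spectrum, low_spec)]
    # The mask is symmetric, so both bands are real up to rounding error.
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


toy_signal = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 1.0]
low_band, high_band = split_bands(toy_signal, cutoff=1)
print("low band: ", [round(v, 3) for v in low_band])
print("high band:", [round(v, 3) for v in high_band])
```

With `cutoff=0` the low band reduces to the signal mean, and the two bands always sum back to the input because the split partitions the spectrum.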
+ ## Resources
+
+ - **Paper:** [The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding](https://huggingface.co/papers/2512.19693)
+ - **GitHub Repository:** [https://github.com/WeichenFan/UAE](https://github.com/WeichenFan/UAE)
+
+ ## Evaluation Results
+
+ As reported in the official repository, the model achieves the following performance with a frequency ratio of 1.0:
+
+ | Dataset | PSNR (dB) | SSIM | rFID |
+ |---------|-----------|------|------|
+ | ImageNet | 29.588 | 0.8789 | 0.193 |
+ | MS-COCO | 29.484 | 0.8846 | 0.157 |
+
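To make the PSNR column easier to interpret, here is a small sketch of the textbook PSNR definition, 10 · log10(MAX² / MSE). It is an assumption that the repository uses this form; the data range, color space, and averaging behind the reported numbers are not stated here. The helper names `psnr` and `rmse_fraction_from_psnr` are hypothetical.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no error at all
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_fraction_from_psnr(psnr_db):
    """Root-mean-square error as a fraction of the peak value implied by a PSNR."""
    return 10.0 ** (-psnr_db / 20.0)


# Toy example: a flat gray patch and a slightly perturbed copy.
original = [128, 128, 128, 128]
noisy = [130, 126, 129, 127]
print(f"toy PSNR: {psnr(original, noisy):.2f} dB")
print(f"RMSE fraction at 29.588 dB: {rmse_fraction_from_psnr(29.588):.4f}")
```

Under this definition, a PSNR near 29.6 dB corresponds to a root-mean-square error of roughly 3% of the peak pixel value.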
+ ## Citation
+
+ ```bibtex
+ @misc{fan2025uae,
+ title={The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding},
+ author={Weichen Fan and Haiwen Diao and Quan Wang and Dahua Lin and Ziwei Liu},
+ year={2025},
+ eprint={2512.19693},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2512.19693},
+ }
+ ```