zhaoyangjia nielsr HF Staff committed on
Commit 3a7a1a8 · 1 Parent(s): 187d515

Add authors and paper highlights to model card (#1)

- Add authors and paper highlights to model card (0e40b3cb02329f33cae5c2567aa816a567d9bbb3)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +19 -9
README.md CHANGED
@@ -1,23 +1,33 @@
  ---
- license: mit
- tags:
- - image-compression
- - diffusion
- - codec
- - neural-compression
  language:
- - en
+ - en
+ license: mit
  pipeline_tag: image-to-image
+ tags:
+ - image-compression
+ - diffusion
+ - codec
+ - neural-compression
+ - foundation-model
  ---

  <h2 align="center">CoD: A Diffusion Foundation Model for Image Compression</h2>

  <p align="center">
+ <a href="https://huggingface.co/papers/2511.18706"><img src="https://img.shields.io/badge/Paper-HF%20Paper%20Page-blue.svg" alt="Paper"></a>
  <a href="https://arxiv.org/abs/2511.18706"><img src="https://img.shields.io/badge/arXiv-2511.18706-b31b1b.svg" alt="arXiv"></a>
  <a href="https://github.com/microsoft/GenCodec/tree/main/CoD"><img src="https://img.shields.io/badge/Code-GitHub-blue.svg" alt="GitHub"></a>
  </p>

- **CoD** (**Co**mpression-oriented **D**iffusion) is the first diffusion foundation model designed and trained from scratch specifically for image compression. A lightweight condition encoder image-native features, a VQ information bottleneck compresses them into a compact bitstream, and a Diffusion Transformer reconstructs the image conditioned on the quantized representation.
+ **CoD** (**Co**mpression-oriented **D**iffusion) is the first diffusion foundation model designed and trained from scratch specifically for image compression. It enables end-to-end optimization of both compression and generation.
+
+ ### Authors
+ Zhaoyang Jia, Zihan Zheng, Naifu Xue, Jiahao Li, Bin Li, Zongyu Guo, Xiaoyi Zhang, Houqiang Li, Yan Lu
+
+ ### Key Advantages
+ - **High compression efficiency**: Replaces Stable Diffusion in downstream codecs (like DiffC) to achieve SOTA results, especially at ultra-low bitrates (e.g., 0.0039 bpp).
+ - **Low-cost and reproducible training**: 300$\times$ faster training than Stable Diffusion ($\sim$ 20 vs. $\sim$ 6,250 A100 GPU days) on entirely open image-only datasets.
+ - **Architecture**: Features a lightweight condition encoder for image-native features, a VQ information bottleneck for compact bitstreams, and a Diffusion Transformer (DiT) for reconstruction.

  ## Available Models

@@ -141,4 +151,4 @@ python -m downstream.perceptual_loss_inference \

  ## License

- MIT
+ MIT
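
Reviewer note: the figures added under "Key Advantages" can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the commit or the CoD code base. The helper names `bits_for_image` and `speedup` and the 512x512 image size are assumptions chosen for illustration; only the 0.0039 bpp, 20 and 6,250 GPU-day figures come from the README text.

```python
# Back-of-the-envelope check of the numbers quoted in the model card.
# The 512x512 resolution is a hypothetical example, not taken from the paper.


def bits_for_image(width: int, height: int, bpp: float) -> float:
    """Total bitstream size in bits for an image coded at `bpp` bits per pixel."""
    return width * height * bpp


def speedup(baseline_gpu_days: float, model_gpu_days: float) -> float:
    """Ratio of baseline training cost to the model's training cost."""
    return baseline_gpu_days / model_gpu_days


if __name__ == "__main__":
    bits = bits_for_image(512, 512, 0.0039)
    print(f"512x512 at 0.0039 bpp: {bits:.0f} bits (~{bits / 8:.0f} bytes)")
    print(f"Training cost ratio: {speedup(6250, 20):.1f}x")
```

Under these assumptions a 512x512 image at 0.0039 bpp takes roughly 1,022 bits (about 128 bytes), and the GPU-day ratio is 312.5, which is consistent with the rounded "300x" claim in the README.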