CoD: A Diffusion Foundation Model for Image Compression

CoD (Compression-oriented Diffusion) is the first diffusion foundation model designed and trained from scratch specifically for image compression. A lightweight condition encoder extracts image-native features, a VQ information bottleneck compresses them into a compact bitstream, and a Diffusion Transformer reconstructs the image conditioned on the quantized representation.
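
The VQ bottleneck can be illustrated with a minimal nearest-neighbor quantizer (a generic sketch for intuition only; the codebook size, shapes, and function names here are hypothetical, not CoD's actual implementation):

```python
import numpy as np

def vq_quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry.

    features: (N, D) array of condition-encoder outputs.
    codebook: (K, D) array of learned code vectors.
    Returns (indices, quantized): the integer indices are what gets
    entropy-coded into the bitstream; the quantized vectors condition
    the diffusion decoder.
    """
    # Squared Euclidean distance between every feature and every code.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = d2.argmin(axis=1)       # (N,) symbols to transmit
    quantized = codebook[indices]     # (N, D) dequantized conditioning
    return indices, quantized

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))   # 16 codes -> 4 bits per symbol
feats = rng.normal(size=(8, 4))
idx, q = vq_quantize(feats, codebook)
```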

Available Models

Base CoD Models (cod/)

| Model | Space | BPP | Config | Checkpoint |
|---|---|---|---|---|
| CoD (pixel) | Pixel | 0.0039 | `CoD_pixel_vpred.yaml` | `CoD_pixel_vpred.pt` |
| CoD (latent) | Latent | 0.0039 | `CoD_latent_vpred.yaml` | `CoD_latent_vpred.pt` |
| CoD (latent, 64-bit) | Latent | 0.00024 | `CoD_latent_vpred_64bits.yaml` | `CoD_latent_vpred_64bits.pt` |

One-Step CoD (finetuned_one_step_cod/)

Finetuned to reconstruct in a single forward pass, with better performance and a wider range of bitrates.

| Model | BPP | Config | Checkpoint |
|---|---|---|---|
| bpp_0_0039 | 0.0039 | `bpp_0_0039.yaml` | `bpp_0_0039.pt` |
| bpp_0_0039_noise_1 | 0.0039 | `bpp_0_0039_noise_1.yaml` | `bpp_0_0039_noise_1.pt` |
| bpp_0_0312 | 0.0312 | `bpp_0_0312.yaml` | `bpp_0_0312.pt` |
| bpp_0_1250 | 0.1250 | `bpp_0_1250.yaml` | `bpp_0_1250.pt` |

CoD as Perceptual Loss (perceptual_loss_illm_dec/)

| Model | Checkpoint |
|---|---|
| msillm_quality_vlo2 | `msillm_quality_vlo2.pt` |
| msillm_quality_1 | `msillm_quality_1.pt` |
| msillm_quality_2 | `msillm_quality_2.pt` |
| msillm_quality_3 | `msillm_quality_3.pt` |
| msillm_quality_4 | `msillm_quality_4.pt` |

Performance

Metrics evaluated on Kodak (512x512):

| Model | BPP | PSNR (dB) ↑ | LPIPS ↓ | DISTS ↓ | FID ↓ |
|---|---|---|---|---|---|
| CoD (pixel) | 0.0039 | 16.21 | 0.434 | 0.186 | 46.0 |
| CoD (latent) | 0.0039 | 15.03 | 0.415 | 0.188 | 45.7 |
| CoD (latent, 64-bit) | 0.00024 | 10.09 | 0.686 | 0.288 | 69.5 |

Note: CoD (latent) at 0.0039 bpp uses --cfg 1.25. CoD (latent, 64-bit) uses --cfg 3.0.
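
These bitrates correspond to tiny absolute payloads at 512x512. The arithmetic is just bits-per-pixel times pixel count (simple illustration, not repo code):

```python
def total_bits(bpp, height, width):
    """Payload size in bits: bits-per-pixel times the pixel count."""
    return bpp * height * width

# 0.0039 bpp at 512x512 is about 1022 bits, i.e. roughly 128 bytes
# for the whole image.
print(total_bits(0.0039, 512, 512))

# The "64-bit" model name matches its bitrate: a single 64-bit payload
# per 512x512 image gives 64 / (512 * 512) ≈ 0.000244 bpp.
print(64 / (512 * 512))
```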

Quick Start

Installation

```bash
git clone https://github.com/microsoft/GenCodec/CoD.git
cd CoD
pip install -r requirements.txt
```

Download Checkpoints

```bash
# Download base CoD models
huggingface-cli download jzyustc/CoD --include "cod/*" --local-dir ./pretrained/CoD

# Download one-step models
huggingface-cli download jzyustc/CoD --include "finetuned_one_step_cod/*" --local-dir ./pretrained/CoD

# Download perceptual loss models
huggingface-cli download jzyustc/CoD --include "perceptual_loss_illm_dec/*" --local-dir ./pretrained/CoD

# Download a specific model
huggingface-cli download jzyustc/CoD cod/CoD_pixel_vpred.pt cod/CoD_pixel_vpred.yaml --local-dir ./pretrained/CoD

# Download everything
huggingface-cli download jzyustc/CoD --local-dir ./pretrained/CoD
```

Base CoD Inference

```bash
python -m cod.inference evaluate \
    --ckpt ./pretrained/CoD/cod/CoD_pixel_vpred.pt \
    --config ./pretrained/CoD/cod/CoD_pixel_vpred.yaml \
    --input <image_dir> --output <recon_dir> \
    --step 25 --cfg 3.0 --sampler adam2

# For the latent model, use --cfg 1.25
python -m cod.inference evaluate \
    --ckpt ./pretrained/CoD/cod/CoD_latent_vpred.pt \
    --config ./pretrained/CoD/cod/CoD_latent_vpred.yaml \
    --input <image_dir> --output <recon_dir> \
    --step 25 --cfg 1.25 --sampler adam2
```
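
The `--cfg` flag sets the classifier-free guidance scale. At each sampling step, diffusion samplers typically combine the conditional and unconditional predictions as below (a generic illustration of the standard CFG formula, not CoD's sampler code):

```python
import numpy as np

def apply_cfg(pred_uncond, pred_cond, cfg):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one by the guidance scale."""
    return pred_uncond + cfg * (pred_cond - pred_uncond)

u = np.array([0.0, 1.0])   # toy unconditional prediction
c = np.array([1.0, 3.0])   # toy conditional prediction

# cfg=1.0 recovers the purely conditional prediction ...
print(apply_cfg(u, c, 1.0))   # [1. 3.]
# ... while cfg=3.0 (the pixel model's setting) pushes past it.
print(apply_cfg(u, c, 3.0))   # [3. 7.]
```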

One-Step CoD Inference

```bash
python -m downstream.finetuned_one_step_cod evaluate \
    --ckpt ./pretrained/CoD/finetuned_one_step_cod/bpp_0_0039.pt \
    --config ./pretrained/CoD/finetuned_one_step_cod/bpp_0_0039.yaml \
    --input <image_dir> --output <recon_dir>
```

Perceptual Loss Inference

Requires NeuralCompression (installed automatically via torch.hub).

```bash
python -m downstream.perceptual_loss_inference \
    --ckpt ./pretrained/CoD/perceptual_loss_illm_dec/msillm_quality_1.pt \
    --quality 1 \
    --input <image_dir> --output <recon_dir>
```
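
Using a frozen network as a perceptual loss usually means comparing two images in its feature space rather than in pixel space. The sketch below shows that general pattern only (the `feat` extractor here is a stand-in placeholder, not the repo's decoder or loss definition):

```python
import numpy as np

def perceptual_loss(feat_fn, x, y):
    """Mean squared distance in the feature space of a frozen network."""
    fx, fy = feat_fn(x), feat_fn(y)
    return float(((fx - fy) ** 2).mean())

# Placeholder "feature extractor": any fixed nonlinear map works for
# illustration; in this repo the frozen decoder would play this role.
feat = lambda x: np.tanh(2.0 * x)

a = np.zeros((4, 4))
print(perceptual_loss(feat, a, a))        # identical inputs -> 0.0
print(perceptual_loss(feat, a, a + 0.5))  # differing inputs -> positive
```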

Citation

```bibtex
@inproceedings{jia2025cod,
    title     = {CoD: A Diffusion Foundation Model for Image Compression},
    author    = {Jia, Zhaoyang and Zheng, Zihan and Xue, Naifu and Li, Jiahao and Li, Bin and Guo, Zongyu and Zhang, Xiaoyi and Li, Houqiang and Lu, Yan},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2026}
}
```

License

MIT
