🐱 DiT-IC Model Card

Source code is available on GitHub.

Model Description

  • Developed by: NJU VisionLab
  • Model type: Diffusion Transformer based image compression model
  • Model size: 1.049 B parameters
  • Resolution: Supports arbitrary resolution images

This model performs the diffusion process in a 32× latent space to reduce memory usage and accelerate inference.

The model is based on the pretrained text-to-image generative model SANA-600M.
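To see why the 32× latent space matters, consider the arithmetic: an H×W image maps to an (H/32)×(W/32) latent grid, so the diffusion transformer operates on 32² = 1024× fewer spatial positions than it would in pixel space. A minimal sketch of this calculation (the helper `latent_grid` is illustrative, not part of the released code):

```python
# Illustrative arithmetic only: a 32x spatial downsampling factor shrinks
# an H x W image to an (H/32) x (W/32) latent grid, reducing the number of
# spatial positions the diffusion transformer must process by 32*32 = 1024x.
def latent_grid(height: int, width: int, factor: int = 32):
    """Return the latent spatial size and the spatial-position reduction."""
    lh, lw = height // factor, width // factor
    reduction = (height * width) / (lh * lw)
    return (lh, lw), reduction

size, reduction = latent_grid(1024, 1024)
# a 1024x1024 image becomes a 32x32 latent grid: 1024x fewer positions
```

Since transformer attention cost grows quadratically with the number of tokens, this reduction is what makes arbitrary-resolution inference practical in memory and time.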

Quick Start

For training and inference scripts, please visit our GitHub Repository.

Limitations

  • The model may fail to reconstruct tiny text at low bitrates.
  • Fingers and other fine structures may not be generated properly.

Citation

If you use this model, please cite:

@inproceedings{shi2026ditic,
  title={DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression},
  author={Shi, Junqi and Lu, Ming and Li, Xingchen and Ke, Anle and Zhang, Ruiqi and Ma, Zhan},
  booktitle={CVPR},
  year={2026}
}