---
library_name: diffusers
pipeline_tag: text-to-image
tags:
- text-to-image
- image-generation
- flux
- dc-gen
- diffusers
base_model:
- dc-ai/dc_flux_2K4K
- black-forest-labs/FLUX.1-Krea-dev
---

# blanchon/dc_flux_krea_diffusers

**Diffusers-compatible port of DC-Gen-FLUX (Krea)** for efficient high-resolution text-to-image generation (2K / 4K).

This repository repackages the original **DC-Gen FLUX.1-Krea checkpoint** into a 🧨 **Diffusers** `DiffusionPipeline`, enabling standard Diffusers workflows while preserving the behavior and performance of the upstream model.

---

## Model Details

### Model Description

**FLUX.1 DC-Gen Krea [dev]** is a DC-Gen–adapted FLUX.1-Krea checkpoint that replaces the original FLUX VAE with a **deeply compressed DC-AE latent space**. Using **embedding alignment** followed by **lightweight LoRA fine-tuning**, DC-Gen enables much faster native **2K / 4K image generation** while preserving the base model’s realism and text-rendering quality.

This repository does **not** retrain the model. It only provides a **Diffusers port** of the upstream checkpoint for easier inference and deployment.

- **DC-Gen method & model:** NVIDIA DC-Gen team (Wenkun He*, Yuchao Gu*, Junyu Chen*, Dongyun Zou, Yujun Lin, Zhekai Zhang, Haocheng Xi, Muyang Li, Ligeng Zhu, Jincheng Yu, Junsong Chen, Enze Xie, Song Han, Han Cai)
- **Diffusers port:** @blanchon
- **Model type:** Text-to-image diffusion (FLUX family, rectified flow transformer)
- **License:** FLUX.1 [dev] **Non-Commercial License** (same as upstream)
- **Upstream checkpoint:** `dc-ai/dc_flux_2K4K`
- **Base model family:** `black-forest-labs/FLUX.1-Krea-dev`

---

## Model Sources

- **DC-Gen project:** https://github.com/dc-ai-projects/DC-Gen
- **DC-Gen homepage:** https://hanlab.mit.edu/projects/dc-gen
- **Paper:** https://arxiv.org/abs/2509.25180
- **Upstream checkpoint:** https://huggingface.co/dc-ai/dc_flux_2K4K
- **FLUX.1-Krea base model:** https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev

---

## Uses

### Direct Use

- High-resolution text-to-image generation (1024 → 4096 px)
- Diffusers-based inference, demos, and deployment
- Research on efficient latent-space diffusion and high-resolution synthesis

### Downstream Use

- Further research or finetuning **only if compliant with the upstream license**
- Integration into non-commercial creative or research tools

### Out-of-Scope Use

- Commercial usage (not permitted by the FLUX.1-dev license)
- Illegal, harmful, or deceptive content generation

---

## Bias, Risks, and Limitations

- The model may reproduce societal biases present in its training data.
- High-resolution generation is GPU- and VRAM-intensive.
- Outputs are not guaranteed to be factual or safe without moderation.
- This repo does not introduce new safety mechanisms beyond those of the base model.

### Recommendations

- Review the FLUX.1-dev non-commercial license carefully before use.
- Apply standard content filtering and safety practices in downstream applications.
- Expect memory usage to scale significantly with resolution.
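
To make the last recommendation concrete, the sketch below compares pixel counts across the supported resolution range. It is plain arithmetic, not a VRAM measurement. The function names are illustrative, and the `downsample` value of 32 is a placeholder chosen for the example, not the DC-AE compression ratio, which this card does not state.

```python
def pixel_count(side: int) -> int:
    """Number of pixels in a square image with the given side length."""
    return side * side


def relative_cost(side: int, baseline: int = 1024) -> float:
    """Pixel count of a square image relative to a baseline side length."""
    return pixel_count(side) / pixel_count(baseline)


def latent_positions(side: int, downsample: int) -> int:
    """Spatial positions in the latent grid for a hypothetical downsampling factor."""
    if side % downsample != 0:
        raise ValueError("side must be divisible by downsample")
    return (side // downsample) ** 2


if __name__ == "__main__":
    # downsample=32 is an assumed placeholder, not a documented model property.
    for side in (1024, 2048, 4096):
        print(side, pixel_count(side), relative_cost(side), latent_positions(side, 32))
```

Doubling the side length quadruples the pixel count, so a 4096 px image has 16 times as many pixels as a 1024 px image. For any fixed compression factor the latent grid grows by the same ratio. Actual memory use also depends on the model weights, the attention implementation, and the decoder.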

---

## How to Get Started with the Model

### Minimal Load

```python
import torch
from diffusers import DiffusionPipeline

# trust_remote_code=True allows the custom pipeline code shipped in this repo to run.
pipe = DiffusionPipeline.from_pretrained(
    "blanchon/dc_flux_krea_diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")
```

### Image Generation Example

```python
import torch
from diffusers import DiffusionPipeline

# Load the pipeline in bfloat16 and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "blanchon/dc_flux_krea_diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "a tiny astronaut hatching from an egg on mars"

# Generate a single 2048x2048 image and take the first result.
image = pipe(
    prompt=prompt,
    width=2048,
    height=2048,
    guidance_scale=4.5,
    num_inference_steps=28,
    output_type="pil",
).images[0]

image.save("dc_flux_krea.png")
```

For reproducible results, pass a seeded `torch.Generator(device="cuda")`.

---

## Training Details

### Training Data

This repository does **not** introduce new training data.

According to the DC-Gen paper, post-training uses **synthetic data generated from the base model** to adapt it to a deeply compressed latent space.

### Training Procedure

DC-Gen applies:

1. **Embedding alignment** to bridge the representation gap between latent spaces
2. **LoRA fine-tuning** to recover base-model quality

See the DC-Gen paper for full methodological details.

---

## Evaluation

This repository does not add new evaluation results.
All reported quality, throughput, and latency benchmarks originate from the DC-Gen technical report.
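
Because no benchmarks are reproduced here, readers who want latency numbers for their own hardware may find a small timing helper useful. The following is a minimal sketch and is not part of this repository or of DC-Gen: `time_call` and its parameters are hypothetical names introduced for illustration. It times any zero-argument callable and accepts an optional `sync` hook, which matters for GPU work because CUDA kernels run asynchronously.

```python
import statistics
import time
from typing import Callable, Optional


def time_call(
    fn: Callable[[], object],
    warmup: int = 1,
    runs: int = 3,
    sync: Optional[Callable[[], None]] = None,
) -> dict:
    """Time fn() over several runs and return summary statistics in seconds.

    sync, if given, is called immediately before each timestamp is taken so
    that pending asynchronous work is not attributed to the wrong run.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls are executed but not timed (first calls often include setup cost).
    for _ in range(warmup):
        fn()

    durations = []
    for _ in range(runs):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        durations.append(time.perf_counter() - start)

    return {
        "runs": runs,
        "mean_s": statistics.mean(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }
```

With the pipeline and prompt from the generation example loaded, a call such as `time_call(lambda: pipe(prompt=prompt, width=2048, height=2048), sync=torch.cuda.synchronize)` would report per-image latency. Numbers obtained this way are only comparable to the report's figures if resolution, step count, precision, and hardware match.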

---

## Technical Specifications

### Architecture

* FLUX-family text-to-image diffusion model
* Rectified flow transformer
* Deeply compressed DC-AE latent space (DC-Gen)

### Hardware Requirements

* CUDA-capable GPU strongly recommended
* 2K/4K generation requires substantial VRAM (≥24 GB recommended)

---

## Citation

If you use this model in research, please cite:

```bibtex
@article{he2025dc,
  title={DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space},
  author={He, Wenkun and Gu, Yuchao and Chen, Junyu and Zou, Dongyun and Lin, Yujun and Zhang, Zhekai and Xi, Haocheng and Li, Muyang and Zhu, Ligeng and Yu, Jincheng and others},
  journal={arXiv preprint arXiv:2509.25180},
  year={2025}
}
```

---

## Model Card Authors

* **DC-Gen research & model:** DC-Gen team (NVIDIA)
* **Diffusers port & model card:** @blanchon

## Model Card Contact

* For research questions: see the DC-Gen project page
* For Diffusers port issues: use the Hugging Face Discussions tab