---
library_name: diffusers
pipeline_tag: text-to-image
tags:
  - text-to-image
  - image-generation
  - flux
  - dc-gen
  - diffusers
base_model:
  - dc-ai/dc_flux_2K4K
  - black-forest-labs/FLUX.1-Krea-dev
---

# blanchon/dc_flux_krea_diffusers

**Diffusers-compatible port of DC-Gen-FLUX (Krea)** for efficient high-resolution text-to-image generation (2K / 4K).

This repository repackages the original **DC-Gen FLUX.1-Krea checkpoint** into a 🧨 **Diffusers** `DiffusionPipeline`, enabling standard Diffusers workflows while preserving the behavior and performance of the upstream model.

---

## Model Details

### Model Description

**FLUX.1 DC-Gen Krea [dev]** is a DC-Gen–adapted FLUX.1-Krea checkpoint that replaces the original FLUX VAE with a **deeply compressed DC-AE latent space**.  
Using **embedding alignment** followed by **lightweight LoRA fine-tuning**, DC-Gen enables much faster native **2K / 4K image generation** while preserving the base model’s realism and text-rendering quality.

This repository does **not** retrain the model. It only provides a **Diffusers port** of the upstream checkpoint for easier inference and deployment.

- **DC-Gen method & model:** NVIDIA DC-Gen team  
  (Wenkun He*, Yuchao Gu*, Junyu Chen*, Dongyun Zou, Yujun Lin, Zhekai Zhang, Haocheng Xi, Muyang Li, Ligeng Zhu, Jincheng Yu, Junsong Chen, Enze Xie, Song Han, Han Cai)
- **Diffusers port:** @blanchon
- **Model type:** Text-to-image diffusion (FLUX family, rectified flow transformer)
- **License:** FLUX.1 [dev] **Non-Commercial License** (same as upstream)
- **Upstream checkpoint:** `dc-ai/dc_flux_2K4K`
- **Base model family:** `black-forest-labs/FLUX.1-Krea-dev`

---

## Model Sources

- **DC-Gen project:** https://github.com/dc-ai-projects/DC-Gen  
- **DC-Gen homepage:** https://hanlab.mit.edu/projects/dc-gen  
- **Paper:** https://arxiv.org/abs/2509.25180  
- **Upstream checkpoint:** https://huggingface.co/dc-ai/dc_flux_2K4K  
- **FLUX.1-Krea base model:** https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev  

---

## Uses

### Direct Use

- High-resolution text-to-image generation (from 1024 px up to 4096 px)
- Diffusers-based inference, demos, and deployment
- Research on efficient latent-space diffusion and high-resolution synthesis

### Downstream Use

- Further research or finetuning **only if compliant with the upstream license**
- Integration into non-commercial creative or research tools

### Out-of-Scope Use

- Commercial usage (not permitted by the FLUX.1-dev license)
- Illegal, harmful, or deceptive content generation

---

## Bias, Risks, and Limitations

- The model may reproduce societal biases present in its training data.
- High-resolution generation is GPU- and VRAM-intensive.
- Outputs are not guaranteed to be factual or safe without moderation.
- This repo does not introduce new safety mechanisms beyond those of the base model.

### Recommendations

- Review the FLUX.1-dev non-commercial license carefully before use.
- Apply standard content filtering and safety practices in downstream applications.
- Expect memory usage to scale significantly with resolution.

---

## How to Get Started with the Model

### Minimal Load

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "blanchon/dc_flux_krea_diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")
```

### Image Generation Example

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "blanchon/dc_flux_krea_diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "a tiny astronaut hatching from an egg on mars"

image = pipe(
    prompt=prompt,
    width=2048,
    height=2048,
    guidance_scale=4.5,
    num_inference_steps=28,
    output_type="pil",
).images[0]

image.save("dc_flux_krea.png")
```

For reproducible results, pass a seeded `torch.Generator(device="cuda")`.
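
As a minimal sketch of the seeding advice above, the helpers below build the same call arguments as the example and attach a seeded generator. The helper names `base_call_kwargs` and `with_seed` are hypothetical, and the sketch assumes this pipeline accepts a `generator` keyword the way stock Diffusers pipelines do, which this card does not confirm.

```python
from typing import Any, Dict


def base_call_kwargs(prompt: str, size: int = 2048) -> Dict[str, Any]:
    """Collect the call arguments from the example above for a square image."""
    return {
        "prompt": prompt,
        "width": size,
        "height": size,
        "guidance_scale": 4.5,
        "num_inference_steps": 28,
        "output_type": "pil",
    }


def with_seed(kwargs: Dict[str, Any], seed: int, device: str = "cuda") -> Dict[str, Any]:
    """Return a copy of kwargs with a seeded torch.Generator attached.

    Assumption: the pipeline forwards `generator` like stock Diffusers pipelines.
    """
    import torch  # imported lazily so base_call_kwargs has no torch dependency

    generator = torch.Generator(device=device).manual_seed(seed)
    return {**kwargs, "generator": generator}


# Usage (requires the `pipe` object loaded as shown above):
# kwargs = with_seed(base_call_kwargs("a tiny astronaut hatching from an egg on mars"), seed=0)
# image = pipe(**kwargs).images[0]
```

Creating a fresh generator with the same seed before each call should give matching outputs on the same hardware and software stack; reusing one generator across calls advances its state, so later images will differ.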

---

## Training Details

### Training Data

This repository does **not** introduce new training data.

According to the DC-Gen paper, post-training uses **synthetic data generated from the base model** to adapt it to a deeply compressed latent space.

### Training Procedure

DC-Gen applies:

1. **Embedding alignment** to bridge the representation gap between latent spaces
2. **LoRA fine-tuning** to recover base-model quality

See the DC-Gen paper for full methodological details.
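
For orientation, the generic LoRA formulation is sketched below. It is the standard low-rank update, not a statement of DC-Gen's exact configuration: the rank $r$, the scaling $\alpha$, and the choice of adapted layers are not given in this card.

$$
W' = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Here $W_0$ is a frozen pretrained weight matrix and only $A$ and $B$ are trained, which is why the fine-tuning stage can be described as lightweight.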

---

## Evaluation

This repository does not add new evaluation results.

All reported quality, throughput, and latency benchmarks originate from the DC-Gen technical report.

---

## Technical Specifications

### Architecture

- FLUX-family text-to-image diffusion model
- Rectified flow transformer
- Deeply compressed DC-AE latent space (DC-Gen)

### Hardware Requirements

- CUDA-capable GPU strongly recommended
- 2K/4K generation requires substantial VRAM (≥24 GB recommended)
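
To give a rough sense of why resolution matters, the sketch below compares pixel counts and latent grid sizes across resolutions. It is back-of-the-envelope arithmetic rather than a VRAM estimate, and the `downsample` values are hypothetical placeholders: this card does not state the spatial compression factor of the DC-AE used here.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in the output image."""
    return width * height


def latent_positions(width: int, height: int, downsample: int) -> int:
    """Number of spatial positions in a latent grid for a given downsample factor."""
    if width % downsample or height % downsample:
        raise ValueError("width and height must be divisible by the downsample factor")
    return (width // downsample) * (height // downsample)


base = pixel_count(1024, 1024)
for side in (1024, 2048, 4096):
    ratio = pixel_count(side, side) // base
    # Hypothetical downsample factors, chosen only to illustrate the trend.
    grids = {f: latent_positions(side, side, f) for f in (8, 32)}
    print(f"{side}x{side}: {ratio}x the pixels of 1024x1024, latent positions {grids}")
```

Doubling the side length quadruples both the pixel count and the latent grid, while a larger downsample factor shrinks the grid the transformer has to process at every resolution.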

---

## Citation

If you use this model in research, please cite:

```bibtex
@article{he2025dc,
  title={DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space},
  author={He, Wenkun and Gu, Yuchao and Chen, Junyu and Zou, Dongyun and Lin, Yujun and Zhang, Zhekai and Xi, Haocheng and Li, Muyang and Zhu, Ligeng and Yu, Jincheng and others},
  journal={arXiv preprint arXiv:2509.25180},
  year={2025}
}
```

---

## Model Card Authors

- **DC-Gen research & model:** DC-Gen team (NVIDIA)
- **Diffusers port & model card:** @blanchon

## Model Card Contact

- For research questions: see the DC-Gen project page
- For Diffusers port issues: use the Hugging Face Discussions tab