---
library_name: diffusers
pipeline_tag: text-to-image
tags:
  - text-to-image
  - image-generation
  - flux
  - dc-gen
  - diffusers
base_model:
  - dc-ai/dc_flux_2K4K
  - black-forest-labs/FLUX.1-Krea-dev
---

# blanchon/dc_flux_krea_diffusers

**Diffusers-compatible port of DC-Gen-FLUX (Krea)** for efficient high-resolution text-to-image generation (2K / 4K).

This repository repackages the original **DC-Gen FLUX.1-Krea checkpoint** into a 🧨 **Diffusers** `DiffusionPipeline`, enabling standard Diffusers workflows while preserving the behavior and performance of the upstream model.

---

## Model Details

### Model Description

**FLUX.1 DC-Gen Krea [dev]** is a DC-Gen–adapted FLUX.1-Krea checkpoint that replaces the original FLUX VAE with a **deeply compressed DC-AE latent space**.  
Using **embedding alignment** followed by **lightweight LoRA fine-tuning**, DC-Gen enables much faster native **2K / 4K image generation** while preserving the base model’s realism and text-rendering quality.

This repository does **not** retrain the model. It only provides a **Diffusers port** of the upstream checkpoint for easier inference and deployment.

- **DC-Gen method & model:** NVIDIA DC-Gen team  
  (Wenkun He*, Yuchao Gu*, Junyu Chen*, Dongyun Zou, Yujun Lin, Zhekai Zhang, Haocheng Xi, Muyang Li, Ligeng Zhu, Jincheng Yu, Junsong Chen, Enze Xie, Song Han, Han Cai)
- **Diffusers port:** @blanchon
- **Model type:** Text-to-image diffusion (FLUX family, rectified flow transformer)
- **License:** FLUX.1 [dev] **Non-Commercial License** (same as upstream)
- **Upstream checkpoint:** `dc-ai/dc_flux_2K4K`
- **Base model family:** `black-forest-labs/FLUX.1-Krea-dev`

---

## Model Sources

- **DC-Gen project:** https://github.com/dc-ai-projects/DC-Gen  
- **DC-Gen homepage:** https://hanlab.mit.edu/projects/dc-gen  
- **Paper:** https://arxiv.org/abs/2509.25180  
- **Upstream checkpoint:** https://huggingface.co/dc-ai/dc_flux_2K4K  
- **FLUX.1-Krea base model:** https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev  

---

## Uses

### Direct Use

- High-resolution text-to-image generation (1024 → 4096 px)
- Diffusers-based inference, demos, and deployment
- Research on efficient latent-space diffusion and high-resolution synthesis

### Downstream Use

- Further research or fine-tuning **only if compliant with the upstream license**
- Integration into non-commercial creative or research tools

### Out-of-Scope Use

- Commercial use (not permitted by the FLUX.1 [dev] Non-Commercial License)
- Illegal, harmful, or deceptive content generation

---

## Bias, Risks, and Limitations

- The model may reproduce societal biases present in its training data.
- High-resolution generation is GPU- and VRAM-intensive.
- Outputs are not guaranteed to be factual or safe without moderation.
- This repo does not introduce new safety mechanisms beyond those of the base model.

### Recommendations

- Review the FLUX.1 [dev] Non-Commercial License carefully before use.
- Apply standard content filtering and safety practices in downstream applications.
- Expect memory usage to scale significantly with resolution.
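
To make the scaling concrete, the sketch below compares pixel counts and latent grid sizes across resolutions. The `downsample` values are assumptions for illustration only: 8 corresponds to a conventional 8x VAE, while 32 is a hypothetical deeper compression factor and is not read from this checkpoint. Check the upstream DC-Gen documentation for the actual DC-AE ratio.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of output pixels for a given resolution."""
    return width * height


def latent_grid(width: int, height: int, downsample: int) -> tuple[int, int]:
    """Latent grid size for an autoencoder that shrinks each side by `downsample`.

    `downsample` is an illustrative parameter, not a value read from the model.
    """
    return width // downsample, height // downsample


base = pixel_count(1024, 1024)
for side in (1024, 2048, 4096):
    ratio = pixel_count(side, side) // base
    # 8 = conventional VAE factor; 32 = hypothetical deeper compression factor
    print(side, f"{ratio}x pixels", latent_grid(side, side, 8), latent_grid(side, side, 32))
```

Doubling the side length quadruples the pixel count, so 4096 px output has 16 times the pixels of 1024 px output. A larger downsampling factor shrinks the latent grid the transformer has to process by the square of that factor.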

---

## How to Get Started with the Model

### Minimal Load

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "blanchon/dc_flux_krea_diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")
```

### Image Generation Example

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "blanchon/dc_flux_krea_diffusers",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "a tiny astronaut hatching from an egg on mars"

image = pipe(
    prompt=prompt,
    width=2048,
    height=2048,
    guidance_scale=4.5,
    num_inference_steps=28,
    output_type="pil",
).images[0]

image.save("dc_flux_krea.png")
```

For reproducible results, pass a seeded `torch.Generator(device="cuda")` to the pipeline call; standard Diffusers pipelines accept it as the `generator` argument.
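
A minimal sketch of seeding, assuming this pipeline accepts `generator` the way standard Diffusers pipelines do (this repository loads custom pipeline code, so verify against its call signature). The helper name `make_generator` is hypothetical, and the pipeline call is left as a comment because it needs the loaded `pipe` from the example above.

```python
from typing import Optional

import torch


def make_generator(seed: int, device: Optional[str] = None) -> torch.Generator:
    """Return a torch.Generator seeded with `seed`.

    Falls back to CPU when CUDA is unavailable so the helper also runs
    on machines without a GPU.
    """
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    return torch.Generator(device=device).manual_seed(seed)


# Two generators created with the same seed yield identical noise.
a = torch.randn(4, generator=make_generator(0, "cpu"))
b = torch.randn(4, generator=make_generator(0, "cpu"))
print(torch.equal(a, b))

# Assumed usage with the pipeline loaded above (not executed here):
# image = pipe(prompt=prompt, generator=make_generator(0)).images[0]
```

Create a fresh generator for each run you want to reproduce; reusing one generator object across calls advances its state, so later images will differ.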

---

## Training Details

### Training Data

This repository does **not** introduce new training data.

According to the DC-Gen paper, post-training uses **synthetic data generated from the base model** to adapt it to a deeply compressed latent space.

### Training Procedure

DC-Gen applies:

1. **Embedding alignment** to bridge the representation gap between latent spaces
2. **LoRA fine-tuning** to recover base-model quality

See the DC-Gen paper for full methodological details.
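
As a rough illustration of why the LoRA step is lightweight, the sketch below counts trainable parameters for a single linear layer under generic LoRA. It is not DC-Gen's configuration: the layer width and rank are hypothetical values chosen only for the arithmetic.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one linear layer.

    Full fine-tuning updates the d_out x d_in weight matrix W.
    LoRA freezes W and trains B (d_out x rank) and A (rank x d_in),
    so the effective weight is W + B @ A, up to a scaling factor.
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Hypothetical layer width and rank, chosen only for illustration.
full, lora = lora_param_counts(d_in=3072, d_out=3072, rank=16)
print(full, lora, f"{lora / full:.2%}")
```

For these illustrative numbers, the low-rank factors hold roughly 1% of the parameters of the full weight matrix.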

---

## Evaluation

This repository does not add new evaluation results.

All reported quality, throughput, and latency benchmarks originate from the DC-Gen technical report.

---

## Technical Specifications

### Architecture

- FLUX-family text-to-image diffusion model
- Rectified flow transformer
- Deeply compressed DC-AE latent space (DC-Gen)

### Hardware Requirements

- CUDA-capable GPU strongly recommended
- 2K/4K generation requires substantial VRAM (≥24 GB recommended)
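
A quick way to compare a machine against the recommendation is to query the device's total memory with PyTorch. This is a sketch: the helper name `total_vram_gb` is hypothetical, and the 24 GB threshold is simply the figure quoted above, not a measured requirement.

```python
from typing import Optional

import torch

RECOMMENDED_GB = 24  # figure quoted in the hardware recommendation


def total_vram_gb(device_index: int = 0) -> Optional[float]:
    """Total memory of a CUDA device in decimal GB, or None without CUDA."""
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    # Decimal GB, so a nominal 24 GB card is not rejected by rounding.
    return props.total_memory / 1e9


vram = total_vram_gb()
if vram is None:
    print("No CUDA device detected.")
elif vram < RECOMMENDED_GB:
    print(f"{vram:.1f} GB VRAM: below the recommended {RECOMMENDED_GB} GB.")
else:
    print(f"{vram:.1f} GB VRAM: meets the recommendation.")
```

Total memory is an upper bound; other processes on the same GPU reduce what is actually available at generation time.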

---

## Citation

If you use this model in research, please cite:

```bibtex
@article{he2025dc,
  title={DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space},
  author={He, Wenkun and Gu, Yuchao and Chen, Junyu and Zou, Dongyun and Lin, Yujun and Zhang, Zhekai and Xi, Haocheng and Li, Muyang and Zhu, Ligeng and Yu, Jincheng and others},
  journal={arXiv preprint arXiv:2509.25180},
  year={2025}
}
```

---

## Model Card Authors

- **DC-Gen research & model:** DC-Gen team (NVIDIA)
- **Diffusers port & model card:** @blanchon

## Model Card Contact

- For research questions: see the DC-Gen project page
- For Diffusers port issues: use the Hugging Face Discussions tab