---
license: other
license_name: krea-2-community
license_link: https://www.krea.ai/krea-2-licensing
pipeline_tag: text-to-image
library_name: diffusers
tags:
- text-to-image
- image-generation
- diffusion
- flow-matching
- dit
- krea
base_model: krea/Krea-2-Raw
base_model_relation: finetune
---
# Krea 2 (K2) Base - Diffusers
Diffusers-format conversion of the Krea 2 **Base** checkpoint, the undistilled foundation
model of the Krea 2 family from [Krea](https://krea.ai). The Base checkpoint carries no step
or guidance distillation, which keeps it diverse and highly malleable. It is the checkpoint
intended for fine-tuning, post-training, and LoRA training.
LoRAs trained on Base apply cleanly to Krea 2 Turbo, so the recommended workflow is to train
on Base and run inference on [Krea-2-Turbo-Diffusers](https://huggingface.co/CalamitousFelicitousness/Krea-2-Turbo-Diffusers).
## Model Summary
Krea 2 is a latent-diffusion image model trained from scratch with an emphasis on aesthetics
and stylistic control. The architecture is a single-stream multimodal diffusion transformer.
- **Transformer**: single-stream DiT, 12.9B parameters, 28 blocks at width 6144. It uses grouped-query
attention, a learned output gate, per-head QK normalization, and a 3-axis rotary embedding.
A text-fusion stage inside the transformer collapses twelve text-encoder hidden-state layers
into one conditioning stream.
- **Text encoder**: [Qwen/Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct),
tapped at twelve intermediate layers (text-only conditioning).
- **VAE**: the Qwen-Image autoencoder (`AutoencoderKLQwenImage`, f8, 16 latent channels).
- **Sampler**: flow matching with a resolution-aware timestep shift.
Weights are stored in their original mixed precision (bf16 matmuls, fp32 norms and modulations).
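
As a rough illustration of what the f8, 16-channel VAE means in practice, the sketch below computes the spatial latent shape for a given image size. The function name `latent_shape` is hypothetical, and the calculation ignores the batch dimension and any extra axes the autoencoder may add.

```python
def latent_shape(height: int, width: int, downsample: int = 8, channels: int = 16):
    """Return (channels, latent_height, latent_width) for an f8, 16-channel VAE.

    Assumption: image sides must be divisible by the downsampling factor.
    """
    if height % downsample or width % downsample:
        raise ValueError("height and width must be multiples of the downsampling factor")
    return (channels, height // downsample, width // downsample)


# A 1024 x 1024 image maps to a 16 x 128 x 128 latent.
print(latent_shape(1024, 1024))
```
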
## Recommended Settings
Base is undistilled and uses classifier-free guidance with a negative prompt.
| Setting | Value |
| ------- | ----- |
| Steps | 52 |
| Guidance (CFG) | 3.5 |
| Resolution | up to 1024 x 1024 |
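
A minimal sketch of how these settings might be passed to a Diffusers pipeline follows. It assumes that the installed `diffusers` version can resolve the pipeline class for this repository through `DiffusionPipeline.from_pretrained`, and that the pipeline accepts the common keyword names `negative_prompt`, `num_inference_steps`, and `guidance_scale`; neither is confirmed by this card. `model_id` is a placeholder for this repository's id, and both helper names are hypothetical.

```python
def recommended_call_kwargs(prompt, negative_prompt="", height=1024, width=1024):
    """Collect the recommended Base settings as pipeline call arguments.

    Assumption: the pipeline uses the usual Diffusers keyword names.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,  # Base uses CFG with a negative prompt
        "num_inference_steps": 52,
        "guidance_scale": 3.5,
        "height": height,
        "width": width,
    }


def generate(model_id, prompt, negative_prompt="", device="cuda"):
    """Load the pipeline and render one image (downloads the weights)."""
    # Imported lazily so the settings helper can be used without diffusers installed.
    from diffusers import DiffusionPipeline

    # dtype handling is left at the library default here; adjust as needed.
    pipe = DiffusionPipeline.from_pretrained(model_id)
    pipe.to(device)
    result = pipe(**recommended_call_kwargs(prompt, negative_prompt))
    return result.images[0]
```

If the pipeline uses a different name for the guidance argument, only `recommended_call_kwargs` needs to change.
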
The timestep shift is resolution-aware: the shift is interpolated between its low-resolution and
high-resolution values, so no manual tuning is required across sizes.
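
To make the idea concrete, here is a sketch of one common way a resolution-aware shift is computed in flow-matching samplers. It is not taken from the Krea 2 code: the linear interpolation over token count, the exponential shift formula, and every constant (`base_seq_len`, `max_seq_len`, `base_shift`, `max_shift`) are assumptions borrowed from similar models, as is the 2 x 2 patchification used in the usage comment.

```python
import math


def calculate_mu(image_seq_len, base_seq_len=256, max_seq_len=4096,
                 base_shift=0.5, max_shift=1.15):
    """Linearly interpolate the shift parameter between a low- and a
    high-resolution value, keyed on the number of image tokens.

    All default constants are illustrative assumptions.
    """
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    return base_shift + slope * (image_seq_len - base_seq_len)


def shift_sigma(sigma, mu):
    """Apply an exponential time shift to a noise level sigma in (0, 1].

    mu = 0 leaves sigma unchanged; larger mu pushes sigma toward 1,
    so more of the schedule is spent at high noise.
    """
    return math.exp(mu) / (math.exp(mu) + (1.0 / sigma - 1.0))


# Assuming f8 latents and 2 x 2 patches, a 1024 x 1024 image has
# (1024 // 16) ** 2 = 4096 tokens.
mu = calculate_mu((1024 // 16) ** 2)
shifted = shift_sigma(0.5, mu)
```
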
## Prompting
Natural-language prompts are recommended. Long, detailed descriptions yield the best results,
though short prompts also produce strong images. To render text in the image, wrap the words to
be rendered in quotes. An optional prompt-expansion system prompt is available in
the upstream [krea-2-oss](https://github.com/krea-ai) repository.
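
As a small illustration of the quoting convention, the hypothetical helper below appends the text to render, wrapped in double quotes, to a scene description. The surrounding wording is an arbitrary choice, not a required template.

```python
def with_rendered_text(scene: str, text: str) -> str:
    """Build a prompt in which the words to render are wrapped in quotes."""
    return f'{scene}, with the text "{text}"'


prompt = with_rendered_text("a neon sign above a rainy street at night", "OPEN LATE")
print(prompt)
```
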
## License
The weights are released under the [Krea 2 community license](https://www.krea.ai/krea-2-licensing).
## Citation
```bibtex
@misc{krea2,
title = {Krea 2},
author = {Krea},
year = {2026},
url = {https://www.krea.ai/krea-2}
}
```