---
license: other
license_name: krea-2-research
license_link: https://huggingface.co/dataautogpt3/Krea2-weights-experiments/blob/main/LICENSE
language:
- en
library_name: diffusers
tags:
- krea-2
- turbo
- weight-editing
- diffusion
- dit
- mmdit
- safetensors
- comfyui
- experimental
pipeline_tag: text-to-image
---
# Krea 2 Turbo: Hand-Edited Weight Experiments
![Comparison Grid](comparison_ALL.png)
## Overview
This repository contains **weight-edited variants** of the Krea 2 Turbo diffusion model. Each variant was created by scaling the weights of specific transformer blocks in the 12.8B-parameter single-stream MMDiT, producing artistic and functional model variations without any retraining.
These are **research artifacts** from hand-editing diffusion model weights using the methodology described below. The base models (Krea 2 Turbo and Krea 2 Raw) are NOT included; only the edited variants are.
## Method
All variants use the core formula:
```
theta_new = theta_original * (1 - 2 * alpha)
```
Where `alpha` controls the inversion strength:
- `alpha=0.05` → scale 0.90 (subtle)
- `alpha=0.10` → scale 0.80 (artistic sweet spot)
- `alpha=0.15` → scale 0.70 (strong)
- `alpha=0.20` → scale 0.60 (aggressive but functional)
Full negation (`alpha=1.0`, scale=-1.0) **breaks the model** and is excluded from this repository. Under the formula above, `alpha=0.5` would zero the weights rather than negate them.
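The mapping from `alpha` to the scale factor can be checked in a few lines of Python. This is a minimal sketch, not the tooling used to build these files: `scale_for_alpha`, `scale_tensors`, and the `should_edit` predicate are illustrative names, the key names in the toy dictionary are hypothetical, and plain floats stand in for tensors.

```python
def scale_for_alpha(alpha: float) -> float:
    """Return the multiplicative factor (1 - 2 * alpha) from the formula above."""
    return 1.0 - 2.0 * alpha


def scale_tensors(tensors: dict, alpha: float, should_edit) -> dict:
    """Return a copy of `tensors` with the selected entries multiplied by the factor.

    `should_edit` is a hypothetical predicate on the tensor name. Values only
    need to support multiplication by a float (floats here, tensors in practice).
    """
    factor = scale_for_alpha(alpha)
    return {
        name: (value * factor if should_edit(name) else value)
        for name, value in tensors.items()
    }


# Reproduce the alpha -> scale list from this section.
for alpha in (0.05, 0.10, 0.15, 0.20):
    print(f"alpha={alpha:.2f} -> scale {scale_for_alpha(alpha):.2f}")

# Toy example with hypothetical key names: only the attention entry is scaled.
toy = {"blocks.12.attn.wq": 2.0, "blocks.12.mlp.up": 2.0}
edited = scale_tensors(toy, 0.10, lambda name: ".attn." in name)
```

Keeping the predicate separate from the scaling step means the same function covers block-range, layer-type, and gate-only edits.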
## Architecture: Krea 2 Turbo
- **Type**: Single-stream MMDiT (Diffusion Transformer)
- **Parameters**: 12.8B
- **File size**: ~25GB per variant (BF16 + F32 tensors)
- **Structure**: 28 uniform transformer blocks
- **Block sub-layers**:
  - `blocks.N.attn.*` (7 tensors): gate, qknorm, wq, wk, wv, wo
  - `blocks.N.mlp.*` (3 tensors): gate, up, down (SwiGLU)
  - `blocks.N.mod.lin` (1 tensor): conditioning modulation
  - `blocks.N.prenorm.scale` / `blocks.N.postnorm.scale`
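Given this naming scheme, picking the tensors for an edit comes down to matching the block index and the sub-layer prefix. The sketch below assumes keys of the form `blocks.<N>.<sublayer>...`; the example keys (including the `.weight` suffix and `final_layer.weight`) are hypothetical, and real checkpoint keys may differ.

```python
import re

# Assumed key layout: "blocks.<N>.<sublayer>...", following the patterns
# listed above. Real checkpoint keys may carry extra prefixes or suffixes.
BLOCK_KEY = re.compile(r"(?:^|\.)blocks\.(\d+)\.(.+)$")


def matches(key: str, blocks: range, prefixes: tuple = ("",)) -> bool:
    """True if `key` is in one of `blocks` and its in-block path starts
    with one of `prefixes` (e.g. "attn.", "mlp.", "attn.gate")."""
    m = BLOCK_KEY.search(key)
    if m is None:
        return False  # not a transformer-block tensor
    block_index, sub_path = int(m.group(1)), m.group(2)
    return block_index in blocks and sub_path.startswith(prefixes)


# Hypothetical keys, for illustration only.
keys = [
    "blocks.12.attn.wq.weight",
    "blocks.12.mlp.up.weight",
    "blocks.3.attn.gate.weight",
    "final_layer.weight",
]
mid = range(12, 15)  # blocks 12-14

all_mid = [k for k in keys if matches(k, mid)]
attn_mid = [k for k in keys if matches(k, mid, ("attn.",))]
gates_all = [k for k in keys if matches(k, range(28), ("attn.gate",))]
print(all_mid, attn_mid, gates_all, sep="\n")
```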
## Variants
### B1: Partial Inversion (Most Artistic)
| Property | Value |
|---|---|
| File | `Krea_2_turbo_inv_B1_partial10.safetensors` |
| Blocks | 12-14 (mid) |
| Layers | ALL (39 tensors per block group) |
| Alpha | 0.10 (scale=0.80) |
| Result | **Most artistic variant**: strong style/content shift while remaining coherent |
### B3: Attention-Only Partial Inversion
| Property | Value |
|---|---|
| File | `Krea_2_turbo_inv_B3_attn_p10.safetensors` |
| Blocks | 12-14 (mid) |
| Layers | attn only (21 tensors) |
| Alpha | 0.10 (scale=0.80) |
| Result | Functional, subtler than B1; attention-specific perturbation |
### D: Gate Scaling (All Blocks)
| Property | Value |
|---|---|
| File | `Krea_2_turbo_inv_D_gate_p20.safetensors` |
| Blocks | 0-27 (all) |
| Layers | attn.gate only (28 tensors) |
| Alpha | 0.20 (scale=0.60) |
| Result | Functional, moderate effect; gate weights are more tolerant of aggressive scaling |
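The tensor counts quoted for B1, B3, and D follow from the per-block layout in the Architecture section. A short arithmetic check, assuming "ALL" covers the attention, MLP, modulation, and both norm-scale tensors (`PER_BLOCK` and `edited_tensor_count` are illustrative names):

```python
# Tensors per block, from the Architecture section:
# 7 attn + 3 mlp + 1 mod + 2 norm scales (prenorm, postnorm).
PER_BLOCK = {"attn": 7, "mlp": 3, "mod": 1, "norm": 2}


def edited_tensor_count(num_blocks: int, tensors_per_block: int) -> int:
    """Number of tensors touched when editing `tensors_per_block` in each block."""
    return num_blocks * tensors_per_block


all_layers = sum(PER_BLOCK.values())  # 13 tensors per block

b1 = edited_tensor_count(3, all_layers)         # blocks 12-14, ALL layers
b3 = edited_tensor_count(3, PER_BLOCK["attn"])  # blocks 12-14, attn only
d = edited_tensor_count(28, 1)                  # all blocks, attn.gate only
print(b1, b3, d)  # 39 21 28
```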
### F: Early/Late Block Inversion
| File | Blocks | Layers | Alpha | Result |
|---|---|---|---|---|
| `Krea_2_turbo_F_early_a10.safetensors` | 0-2 (early) | ALL | 0.10 (scale=0.80) | Affects structure, composition, spatial layout |
| `Krea_2_turbo_F_late_a10.safetensors` | 25-27 (late) | ALL | 0.10 (scale=0.80) | Affects style, color, detail, texture refinement |
### G: Mid-Block Alpha Sweep
Three variants at different inversion strengths on the same block zone:
| File | Alpha | Scale | Notes |
|---|---|---|---|
| `Krea_2_turbo_G_mid_a05.safetensors` | 0.05 | 0.90 | Subtle |
| `Krea_2_turbo_G_mid_a15.safetensors` | 0.15 | 0.70 | Strong |
| `Krea_2_turbo_G_mid_a20.safetensors` | 0.20 | 0.60 | Aggressive but functional |
All target blocks 12-14, ALL layers.
### H: Layer-Selective Mid-Block
| File | Blocks | Layers | Alpha |
|---|---|---|---|
| `Krea_2_turbo_H_mid_attn_a10.safetensors` | 12-14 | attn only | 0.10 |
| `Krea_2_turbo_H_mid_mlp_a10.safetensors` | 12-14 | mlp only | 0.10 |
Isolates the effect of attention vs MLP perturbation on the same block zone.
### I: Gradient Alpha
| Property | Value |
|---|---|
| File | `Krea_2_turbo_I_gradient.safetensors` |
| Blocks | 0-27 (all) |
| Layers | ALL |
| Alpha | 0.03 → 0.17 (gradient across blocks) |
| Scale | 0.94 → 0.66 |
| Result | Smooth global perturbation: early blocks barely touched, late blocks aggressively inverted |
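One way to produce such a schedule is linear interpolation of `alpha` over the 28 blocks. The table gives only the endpoints, so the linear shape below is an assumption, and `gradient_alphas` is an illustrative name.

```python
NUM_BLOCKS = 28  # from the Architecture section


def gradient_alphas(start: float = 0.03, end: float = 0.17, n: int = NUM_BLOCKS) -> list:
    """Linearly interpolate alpha from `start` (block 0) to `end` (block n-1).

    The linear shape is an assumption; only the endpoints are documented.
    """
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


alphas = gradient_alphas()
scales = [1.0 - 2.0 * a for a in alphas]  # same formula as in the Method section

print(f"block 0:  alpha={alphas[0]:.2f}, scale={scales[0]:.2f}")
print(f"block 27: alpha={alphas[-1]:.2f}, scale={scales[-1]:.2f}")
```

With these endpoints the first block is scaled by 0.94 and the last by 0.66, matching the table.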
## Excluded Variants (Broken)
The following variants were created but are **broken** (model produces noise/garbage) and are NOT included:
| Variant | What was done | Why it broke |
|---|---|---|
| B2_attn_full | attn weights * -1.0 | Full negation destroys attention computation |
| D_wv_all | wv weights * -1.0 | Full negation of value projection |
| E_ties_mid | TIES-style sign flip on mid blocks | Full negation variant |
## Usage
### ComfyUI
1. Place `.safetensors` files in `ComfyUI/models/diffusion_models/`
2. Load via `UNETLoader` node
3. Use the same VAE, CLIP, and text encoder as Krea 2 Turbo
4. Generate with your standard Krea 2 workflow
### Diffusers
```python
import torch
from diffusers import DiffusionPipeline

# Load the pipeline in bfloat16 and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "dataautogpt3/Krea2-weights-experiments",
    torch_dtype=torch.bfloat16,
    variant="bf16",
).to("cuda")
```
> Note: These are diffusion model weights only. You need the corresponding VAE, text encoders, and tokenizer from the original Krea 2 Turbo release.
## Key Findings
1. **Scaling works, full negation breaks.** Partial inversion (scale 0.60-0.90) produces functional, artistic variants. Full negation (scale=-1.0) breaks the model.
2. **10% inversion is the sweet spot.** Alpha=0.10 (scale=0.80) on mid blocks 12-14 produces the most artistically interesting results.
3. **Mid blocks are safest to modify.** Blocks 12-14 are the most redundant and tolerate perturbation best.
4. **Gate weights are most tolerant.** Attention gate weights can be scaled to 0.60 across all blocks while remaining functional; other layers break sooner.
5. **The artistic effects come from compensation.** Partial perturbation triggers creative reorganization in unedited blocks (the compensatory masquerade effect).
## Research Context
This work draws on findings from:
- **Task Arithmetic** (Ilharco et al., ICLR 2023): formal basis for weight negation
- **weights2weights** (NeurIPS 2024): diffusion weight space as meta-latent
- **Unraveling MMDiT Blocks** (2025): per-block role mapping for MMDiT
- **C3: Creative Concept Catalyst** (CVPR 2025): low-frequency amplification in shallow blocks
- **ConceptPrune** (ICLR 2025): tiny weight changes shift semantic output
## Credits
- Base model: Krea 2 Turbo (Krea AI)
- Weight editing: DataPlusEngine
- Methodology: Hand-editing diffusion weights via mmap-based surgical tensor scaling