---
license: mit
datasets:
- SubMaroon/danbooru-lineart
base_model:
- cagliostrolab/animagine-xl-3.0
---

# Experimental ControlNet (Low Quality / Research Prototype)

> **Experimental model. Low quality. Not intended for production use.**  
> This ControlNet was trained as a research experiment to explore line-based conditioning and colorization behavior in SDXL anime models.

---

## Model Summary

This repository contains an **experimental ControlNet for SDXL**, trained on anime-style images.  
The model is **not stable**, shows **inconsistent color behavior**, and should be treated as a **research prototype** rather than a finished or polished solution.

The goal of this experiment was to understand:
- How SDXL ControlNet learns **colorization from line-based conditioning**
- How different conditioning types (Canny vs Lineart) affect **color consistency**

---

## Base Model

- **Base model:** `cagliostrolab/animagine-xl-3.0`
- **Architecture:** ControlNet SDXL
- **Training framework:** 🤗 Diffusers
- **Precision:** `bf16`
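
The settings above map onto a 🤗 Diffusers loading sketch roughly as follows. This is a minimal sketch, not a tested recipe from this experiment: `controlnet_path` is a placeholder for wherever these weights are stored (no repo ID is confirmed here), a CUDA device is assumed, and the step count and conditioning scale are illustrative defaults rather than tuned values.

```python
# Sketch only: `controlnet_path` is a placeholder, not a confirmed repo name.
BASE_MODEL = "cagliostrolab/animagine-xl-3.0"


def load_pipeline(controlnet_path, device="cuda"):
    """Build an SDXL ControlNet pipeline on top of the base model in bf16."""
    # Heavy dependencies are imported lazily, only when the pipeline is built.
    import torch
    from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        controlnet_path, torch_dtype=torch.bfloat16
    )
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        BASE_MODEL, controlnet=controlnet, torch_dtype=torch.bfloat16
    )
    return pipe.to(device)


def colorize(pipe, lineart, prompt, steps=28, scale=1.0):
    """Run one generation conditioned on a PIL lineart image (assumed settings)."""
    result = pipe(
        prompt=prompt,
        image=lineart,
        num_inference_steps=steps,
        controlnet_conditioning_scale=scale,
    )
    return result.images[0]
```

Given the instability described in this card, expect colors to vary noticeably between seeds even with identical lineart and prompt.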

---

## Conditioning Type

- Primary conditioning: **Lineart / Canny-like edges**
- Backgrounds are mostly white
- Line quality varies (mostly clean, some noisy samples)

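> Important limitation:  
> Lineart and Canny edge maps **contain no color information**, so many different colorings are consistent with the same conditioning image. The model has to guess colors from the prompt alone, which leads to unstable and drifting color predictions.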
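
The limitation can be illustrated with a toy example. The sketch below uses a deliberately simple neighbour-difference edge detector (an assumption for illustration, not Canny and not the preprocessing used for this dataset) to show that two differently colored images can produce the same edge map.

```python
def edge_map(image, threshold=64):
    """Toy edge detector (not Canny): mark a pixel when any RGB channel differs
    from its right or bottom neighbour by more than `threshold`."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < height and nx < width:
                    diff = max(abs(a - b) for a, b in zip(image[y][x], image[ny][nx]))
                    if diff > threshold:
                        edges[y][x] = 1
    return edges


def square_on_white(color, size=6):
    """White canvas with a filled 2x2 square of `color` in the middle."""
    white = (255, 255, 255)
    return [
        [color if 2 <= y < 4 and 2 <= x < 4 else white for x in range(size)]
        for y in range(size)
    ]


red = square_on_white((220, 30, 30))
blue = square_on_white((30, 30, 220))

# Different color images, identical conditioning signal.
same_edges = edge_map(red) == edge_map(blue)
```

Here `same_edges` is `True` although `red` and `blue` differ, so the conditioning image alone cannot tell the model which of the two to reproduce.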

---

## Dataset

- Size: ~**14,000 image pairs**
- Format:
  - Original image (color)
  - Conditioning image (lineart / canny)
  - Prompt (caption)

### Known dataset issues
- Some lineart images are **noisy or inconsistent**
- Images are resized to square resolution (possible cropping artifacts)
- No explicit color supervision
- No palette or region-level color constraints
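
To give a sense of how much content a square resize can discard, here is a small sketch. It assumes a center crop to the largest square before resizing; the card does not state which cropping strategy was actually used, and the helper names are hypothetical.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def lost_fraction(width, height):
    """Fraction of the original pixels discarded by the square crop."""
    side = min(width, height)
    return 1 - (side * side) / (width * height)


# Example: a 1024x1536 portrait illustration, cropped before resizing
# to a square training resolution.
box = center_crop_box(1024, 1536)   # (0, 256, 1024, 1280)
lost = lost_fraction(1024, 1536)    # one third of the pixels
```

For tall character illustrations this kind of crop can cut off heads or feet while the caption still describes them, which is one plausible source of the artifacts noted above.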

---

## Training Configuration

Typical training setup:

```yaml
resolution: 768
train_batch_size: 2
gradient_accumulation_steps: 2
effective_batch_size: 4  # derived: train_batch_size * gradient_accumulation_steps
learning_rate: 2e-5
lr_scheduler: cosine
max_train_steps: 6000–8000
mixed_precision: bf16