---
license: creativeml-openrail-m
datasets:
- SubMaroon/danbooru-colored
base_model:
- John6666/nsfw-anime-xl-v1-sdxl
tags:
- controlnet
- anime
- stable-diffusion-xl-diffusers
- text-to-image
- stable-diffusion-xl
---
# ControlNet for Manga Colorization

**Model Name:** `SubMaroon/ControlNet-manga-recolor`  
**Base model:** `John6666/nsfw-anime-xl-v1-sdxl`  
**Task:** Conditional image generation — Colorization  
**Conditioning:** Grayscale manga panel (lineart or filled)  
**Trained with:** [Hugging Face diffusers](https://github.com/huggingface/diffusers) ControlNet training pipeline

---

## Description

This is a custom-trained **ControlNet** model for **automatic colorization of grayscale anime-style images**.

The model takes a **black-and-white anime-style picture** (converted to RGB) as its conditioning input and generates a **colorized version** with Stable Diffusion XL.

It is trained to act as a ControlNet module and requires a compatible SDXL base model — such as `nsfw-anime-xl-v1-sdxl` or other anime/manga-focused SDXL models.

---

## Training details

- **Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
- **Dataset:** Custom dataset of ~6,000 image pairs from **Danbooru-based manga scans**, manually cleaned and resized to `768x768`
- **Inputs:**
  - `conditioning_image`: black-and-white manga scan (RGB)
  - `text prompt`: optional, Danbooru-style tags (e.g. `"1girl, blue_eyes, blue_hair"`)
- **Loss:** MSE
- **Precision:** FP16 mixed precision
- **Hardware:** 1× RTX 3090
- **Epochs:** 4
- **Resolution:** 768x768
- **Scheduler:** default diffusers setup
- **Learning rate:** `1.4e-4`
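
The conditioning input described above (a black-and-white image stored as RGB at the training resolution) can be prepared roughly as follows. This is a minimal sketch, not the preprocessing used for training: the helper name `make_conditioning_image` is hypothetical, and the plain square resize with Lanczos resampling is an assumption.

```python
from PIL import Image

TRAIN_RESOLUTION = 768  # resolution listed in the training details


def make_conditioning_image(img: Image.Image, size: int = TRAIN_RESOLUTION) -> Image.Image:
    """Hypothetical helper: drop color, keep three channels, resize to a square."""
    gray = img.convert("L")    # discard any color information
    rgb = gray.convert("RGB")  # copy the luminance into all three channels
    # Assumption: a direct resize; cropping or padding may suit non-square panels better.
    return rgb.resize((size, size), Image.LANCZOS)


if __name__ == "__main__":
    demo = Image.new("RGB", (1024, 1536), (200, 30, 30))  # stand-in for a scanned page
    cond = make_conditioning_image(demo)
    print(cond.mode, cond.size)
```

A direct resize distorts panels that are far from square, so cropping to a square region first may give better results.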

---

## Usage (Diffusers)

```python
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

# Load the ControlNet weights in half precision
controlnet = ControlNetModel.from_pretrained(
    "SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16
)

# Load the base pipeline; the base model is SDXL, so the XL pipeline class is required
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "John6666/nsfw-anime-xl-v1-sdxl", controlnet=controlnet, torch_dtype=torch.float16
)

pipe.to("cuda")

# Load the grayscale manga panel as a 3-channel RGB image
conditioning_image = load_image("bw_manga_panel.png").convert("RGB")

# Generate the colorized image, conditioned on the grayscale panel
image = pipe("manga colorization", image=conditioning_image, num_inference_steps=30).images[0]
image.save("colorized.png")
```

---

## Usage in ComfyUI / WebUI

- Place `diffusion_pytorch_model.safetensors` into your `ComfyUI/models/controlnet/` folder
- Make sure to also include the `config.json`
- Select this ControlNet in your workflow
- Use grayscale images as conditioning inputs
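
To confirm that both files from the steps above are in place before starting ComfyUI, a small check like the following may help. It is only a sketch: the helper name `missing_controlnet_files` is hypothetical, and the folder path assumes a default ComfyUI layout relative to the current directory.

```python
from pathlib import Path

# File names taken from the steps above
REQUIRED_FILES = ("diffusion_pytorch_model.safetensors", "config.json")


def missing_controlnet_files(folder):
    """Return the required file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumed install location; adjust to your own setup.
    missing = missing_controlnet_files("ComfyUI/models/controlnet")
    print("missing:", missing or "nothing")
```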

---

## Alternative training run (SDXL version)

This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:

```bash
accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
  --dataset_name="SubMaroon/danbooru-colored" \
  --image_column="image" \
  --conditioning_image_column="conditioning_image" \
  --caption_column="prompt" \
  --output_dir="./controlnet-colorization" \
  --resolution=768 \
  --train_batch_size=4 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1.4e-4 \
  --num_train_epochs=12 \
  --mixed_precision="fp16" \
  --gradient_checkpointing \
  --checkpointing_steps=1000 \
  --validation_steps=1000 \
  --report_to="tensorboard" \
  --tracker_project_name="controlnet-colorization" \
  --seed=42
```
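
The batch-related flags above imply a rough training schedule, which the sketch below works out. It assumes a single GPU (as listed under training details), takes the dataset size as the approximate 6,000 pairs, and assumes that `--checkpointing_steps` counts optimizer updates, so the resulting numbers are estimates rather than logged values.

```python
import math

# Values from the CLI command and the training details above
num_pairs = 6000                 # approximate dataset size
train_batch_size = 4
gradient_accumulation_steps = 4
num_train_epochs = 12
checkpointing_steps = 1000
num_gpus = 1                     # assumption: single-GPU run

# Samples consumed per optimizer update
effective_batch = train_batch_size * gradient_accumulation_steps * num_gpus

# Dataloader batches per epoch, then optimizer updates per epoch
batches_per_epoch = math.ceil(num_pairs / (train_batch_size * num_gpus))
updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)

total_updates = updates_per_epoch * num_train_epochs
num_checkpoints = total_updates // checkpointing_steps

print(f"effective batch size: {effective_batch}")
print(f"optimizer updates per epoch: {updates_per_epoch}")
print(f"total optimizer updates: {total_updates}")
print(f"checkpoints saved: {num_checkpoints}")
```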

---

## License

The model is released under the **CreativeML Open RAIL-M** license.  
You are free to use it for non-commercial and research purposes. Commercial use may require additional permission.

---

## Credits

Created by [SubMaroon](https://huggingface.co/SubMaroon)  
Trained with compute generously provided by [Flanayt Pulsar](https://huggingface.co/fisb)  
Based on the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) ControlNet training example  
Inspired by [lllyasviel's original ControlNet](https://github.com/lllyasviel/ControlNet)