---
license: creativeml-openrail-m
datasets:
- SubMaroon/danbooru-colored
base_model:
- John6666/nsfw-anime-xl-v1-sdxl
tags:
- controlnet
- anime
- stable-diffusion-xl-diffusers
- text-to-image
- stable-diffusion-xl
---
# ControlNet for Manga Colorization
**Model Name:** `SubMaroon/ControlNet-manga-recolor`
**Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
**Task:** Conditional image generation — Colorization
**Conditioning:** Grayscale manga panel (lineart or filled)
**Trained with:** [Hugging Face diffusers](https://github.com/huggingface/diffusers) ControlNet training pipeline
---
## Description
This is a custom-trained **ControlNet** model that performs **automatic colorization of grayscale anime-style images**.
The model takes a **black-and-white anime-style picture** (converted to RGB) as conditioning input and generates a **colorized version** using Stable Diffusion XL.
It acts as a ControlNet module and requires a compatible SDXL base model, such as `nsfw-anime-xl-v1-sdxl` or another anime/manga-focused SDXL model.
---
## Training details
- **Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
- **Dataset:** Custom dataset of ~6,000 image pairs from **Danbooru-based manga scans**, manually cleaned and resized to `768x768`
- **Inputs:**
- `conditioning_image`: black-and-white manga scan (RGB)
- `text prompt`: optional (e.g. "1girl, blue_eyes, blue_hair etc.")
- **Training:** MSE loss, mixed-precision FP16, 4 epochs on 1×RTX 3090
- **Resolution:** 768x768
- **Scheduler:** default diffusers setup
- **Learning rate:** `1.4e-4`
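The conditioning inputs above are grayscale scans re-expanded to three channels. A minimal preprocessing sketch using Pillow (the synthetic image below is only for illustration; real inputs would be manga scans):

```python
from PIL import Image

def to_conditioning(img: Image.Image) -> Image.Image:
    # Collapse to single-channel grayscale, then back to 3-channel RGB,
    # since the ControlNet expects an RGB conditioning image
    return img.convert("L").convert("RGB")

# Hypothetical example with a synthetic 768x768 image
src = Image.new("RGB", (768, 768), (200, 120, 50))
cond = to_conditioning(src)
print(cond.mode, cond.size)  # RGB (768, 768)
```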
---
## Usage (Diffusers)
```python
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

# Load the ControlNet weights
controlnet = ControlNetModel.from_pretrained(
    "SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16
)

# Load the base pipeline (the base model is SDXL, so use the XL pipeline class)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "John6666/nsfw-anime-xl-v1-sdxl", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.to("cuda")

# Load the grayscale manga panel and ensure it is 3-channel RGB
conditioning_image = load_image("bw_manga_panel.png").convert("RGB")

# Generate the colorized image
image = pipe(
    "manga colorization", image=conditioning_image, num_inference_steps=30
).images[0]
image.save("colorized.png")
```
---
## Usage in ComfyUI / WebUI
- Place `diffusion_pytorch_model.safetensors` into your `ComfyUI/models/controlnet/` folder
- Make sure to also include the `config.json`
- Select this ControlNet in your workflow
- Use grayscale images as conditioning inputs
---
## Alternative training run (SDXL version)
This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:
```bash
accelerate launch train_controlnet_sdxl.py \
--pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
--dataset_name="SubMaroon/danbooru-colored" \
--image_column="image" \
--conditioning_image_column="conditioning_image" \
--caption_column="prompt" \
--output_dir="./controlnet-colorization" \
--resolution=768 \
--train_batch_size=4 \
--gradient_accumulation_steps=4 \
--learning_rate=1.4e-4 \
--num_train_epochs=12 \
--mixed_precision="fp16" \
--gradient_checkpointing \
--checkpointing_steps=1000 \
--validation_steps=1000 \
--report_to="tensorboard" \
--tracker_project_name="controlnet-colorization" \
--seed=42
```
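The `--image_column`, `--conditioning_image_column`, and `--caption_column` flags assume each dataset row exposes three fields. A hypothetical row, just to illustrate the expected shape (file names are made up):

```python
# Hypothetical dataset row matching the column flags above
row = {
    "image": "colored/0001.png",                 # ground-truth colorized target
    "conditioning_image": "grayscale/0001.png",  # black-and-white input scan
    "prompt": "1girl, blue_eyes, blue_hair",     # optional Danbooru-style tags
}
print(sorted(row))  # ['conditioning_image', 'image', 'prompt']
```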
---
## License
The model is released under the **CreativeML Open RAIL-M** license.
You may use, redistribute, and adapt the model subject to the license's use-based restrictions; see the full license text for details.
---
## Credits
Created by [SubMaroon](https://huggingface.co/SubMaroon)
Trained with compute generously provided by [Flanayt Pulsar](https://huggingface.co/fisb)
Based on the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) ControlNet training example
Inspired by [lllyasviel's original ControlNet](https://github.com/lllyasviel/ControlNet)