---
license: creativeml-openrail-m
datasets:
- SubMaroon/danbooru-colored
base_model:
- John6666/nsfw-anime-xl-v1-sdxl
tags:
- controlnet
- anime
- stable-diffusion-xl-diffusers
- text-to-image
- stable-diffusion-xl
---

# ControlNet for Manga Colorization

**Model Name:** `SubMaroon/ControlNet-manga-recolor`
**Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
**Task:** Conditional image generation (colorization)
**Conditioning:** Grayscale manga panel (lineart or filled)
**Trained with:** [Hugging Face diffusers](https://github.com/huggingface/diffusers) ControlNet training pipeline


---

## Description

This is a custom-trained **ControlNet** model designed for **automatic colorization of grayscale anime-style images**.

The model takes a **black-and-white anime-style image** (converted to RGB) as conditioning input and generates a **colorized version** using Stable Diffusion XL.

It is trained to act as a ControlNet module and requires a compatible SDXL base model, such as `nsfw-anime-xl-v1-sdxl` or another anime/manga-focused SDXL model.

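
The conditioning input described above could be prepared along these lines. This is a minimal sketch, not the author's preprocessing: `snap_to_multiple`, `fit_within`, and `prepare_conditioning_image` are hypothetical helper names, Pillow is assumed to be installed, and scaling the longer side to about 768 px while keeping the aspect ratio is an assumption (the training images were square `768x768`).

```python
def snap_to_multiple(value, multiple=8):
    """Round down to the nearest multiple, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


def fit_within(width, height, target=768, multiple=8):
    """Scale (width, height) so the longer side is about `target` pixels.

    Both sides are snapped down to a multiple of 8, which diffusers
    pipelines generally expect for image dimensions.
    """
    scale = target / max(width, height)
    return (
        snap_to_multiple(round(width * scale), multiple),
        snap_to_multiple(round(height * scale), multiple),
    )


def prepare_conditioning_image(path, target=768):
    """Load a panel, drop any colour, and return a 3-channel RGB image."""
    # Pillow is imported lazily so the size helpers stay dependency-free.
    from PIL import Image

    panel = Image.open(path).convert("L")  # force single-channel grayscale
    panel = panel.resize(fit_within(*panel.size, target=target), Image.LANCZOS)
    return panel.convert("RGB")  # the ControlNet input has three channels
```

For example, a 1536x1024 scan would be resized to 768x512 before being passed to the pipeline as `image=`.
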

---

## Training details

- **Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
- **Dataset:** Custom dataset of ~6,000 image pairs from **Danbooru-based manga scans**, manually cleaned and resized to `768x768`
- **Inputs:**
  - `conditioning_image`: black-and-white manga scan (RGB)
  - `text prompt`: optional (e.g. "1girl, blue_eyes, blue_hair", etc.)
- **Loss:** MSE, trained in FP16 on 1× RTX 3090 for 4 epochs
- **Resolution:** 768x768
- **Scheduler:** default diffusers setup
- **Learning rate:** `1.4e-4`

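
The optional text prompt listed above uses comma-separated Danbooru-style tags. A small sketch of how such a prompt might be assembled follows; `build_prompt` is a hypothetical helper, and the normalization it applies (lowercasing, underscores for spaces, dropping duplicates) is an assumption about tag formatting rather than something the training setup is known to require.

```python
def build_prompt(tags):
    """Join Danbooru-style tags into one comma-separated prompt string."""
    cleaned = []
    for tag in tags:
        # Assumed normalization: "Blue Eyes" -> "blue_eyes"
        tag = tag.strip().lower().replace(" ", "_")
        if tag and tag not in cleaned:  # skip blanks and repeated tags
            cleaned.append(tag)
    return ", ".join(cleaned)


prompt = build_prompt(["1girl", "Blue Eyes", "blue_hair", "1girl"])
```

An empty tag list yields an empty string, which matches the prompt being optional.
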

---

## Usage (Diffusers)

```python
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

# Load ControlNet
controlnet = ControlNetModel.from_pretrained("SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16)

# Load base pipeline (the base model is SDXL, so the SDXL ControlNet pipeline is required)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "John6666/nsfw-anime-xl-v1-sdxl", controlnet=controlnet, torch_dtype=torch.float16
)

pipe.to("cuda")

# Load grayscale manga panel (converted to 3-channel RGB)
conditioning_image = load_image("bw_manga_panel.png").convert("RGB")

# Generate
image = pipe("manga colorization", image=conditioning_image, num_inference_steps=30).images[0]
image.save("colorized.png")
```

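
When results look washed out or drift from the line art, it may help to compare a few ControlNet strengths and seeds. The sketch below only builds the settings grid; `sweep_settings` is a hypothetical helper and the scale and seed values are illustrative, not tuned recommendations. `controlnet_conditioning_scale` and `generator` are keyword arguments of the diffusers SDXL ControlNet pipeline call.

```python
from itertools import product


def sweep_settings(scales=(0.6, 0.8, 1.0), seeds=(0, 1)):
    """Return one settings dict per (conditioning scale, seed) combination."""
    return [
        {"controlnet_conditioning_scale": scale, "seed": seed}
        for scale, seed in product(scales, seeds)
    ]


settings = sweep_settings()

# Usage with an already loaded SDXL ControlNet pipeline `pipe` and a
# prepared `conditioning_image` (requires torch and a GPU):
#
# for s in settings:
#     generator = torch.Generator("cuda").manual_seed(s["seed"])
#     image = pipe(
#         "manga colorization",
#         image=conditioning_image,
#         controlnet_conditioning_scale=s["controlnet_conditioning_scale"],
#         generator=generator,
#         num_inference_steps=30,
#     ).images[0]
#     image.save(f"colorized_s{s['controlnet_conditioning_scale']}_seed{s['seed']}.png")
```
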

---

## Usage in ComfyUI / WebUI

- Place `diffusion_pytorch_model.safetensors` into your `ComfyUI/models/controlnet/` folder
- Make sure to also include the `config.json`
- Select this ControlNet in your workflow
- Use grayscale images as conditioning inputs

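
The file placement in the steps above could be scripted roughly as follows. This is a sketch under assumptions: `install_controlnet` is a hypothetical helper, and the files are assumed to have been downloaded locally already (for instance with `hf_hub_download` from `huggingface_hub`).

```python
import shutil
from pathlib import Path


def install_controlnet(files, comfy_root):
    """Copy ControlNet files into <comfy_root>/models/controlnet/.

    Returns the list of destination paths, keeping the original file names.
    """
    dest_dir = Path(comfy_root) / "models" / "controlnet"
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return [Path(shutil.copy2(src, dest_dir)) for src in files]


# Example call (paths are placeholders for your own setup):
# install_controlnet(
#     ["downloads/diffusion_pytorch_model.safetensors", "downloads/config.json"],
#     "ComfyUI",
# )
```

Copying both the weights and `config.json` in one call mirrors the first two steps of the list.
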

---

## Alternative training run (SDXL version)

This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:

```bash
accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
  --dataset_name="SubMaroon/danbooru-colored" \
  --image_column="image" \
  --conditioning_image_column="conditioning_image" \
  --caption_column="prompt" \
  --output_dir="./controlnet-colorization" \
  --resolution=768 \
  --train_batch_size=4 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1.4e-4 \
  --num_train_epochs=12 \
  --mixed_precision="fp16" \
  --gradient_checkpointing \
  --checkpointing_steps=1000 \
  --validation_steps=1000 \
  --report_to="tensorboard" \
  --tracker_project_name="controlnet-colorization" \
  --seed=42
```

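
To get a feel for what these flags imply, the arithmetic below estimates the number of optimizer updates and which checkpoints would be written. It is a rough sketch: `training_schedule` is a hypothetical helper, the dataset size of 6,000 is the approximate figure quoted earlier rather than an exact count, a single GPU is assumed, and checkpoints are assumed to be counted in optimizer updates as in the diffusers example scripts.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum, epochs,
                      checkpoint_every, num_processes=1):
    """Estimate optimizer updates and checkpoint steps for a training run."""
    # Batches seen per epoch across all processes.
    batches_per_epoch = math.ceil(num_examples / (batch_size * num_processes))
    # One optimizer update happens every `grad_accum` batches.
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    total_updates = updates_per_epoch * epochs
    return {
        "effective_batch_size": batch_size * grad_accum * num_processes,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "checkpoints": list(range(checkpoint_every, total_updates + 1, checkpoint_every)),
    }


# Values taken from the command above; 6000 is an approximation.
schedule = training_schedule(
    num_examples=6000, batch_size=4, grad_accum=4, epochs=12, checkpoint_every=1000
)
print(schedule)
```

Under these assumptions the run has an effective batch size of 16, about 375 updates per epoch, and about 4,500 updates in total, so checkpoints would land at steps 1000, 2000, 3000, and 4000.
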

---

## License

The model is released under the **CreativeML Open RAIL-M** license.
You are free to use it for non-commercial and research purposes. Commercial use may require additional permission.


---

## Credits

Created by [SubMaroon](https://huggingface.co/SubMaroon)
Trained with compute generously provided by [Flanayt Pulsar](https://huggingface.co/fisb)
Based on the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) ControlNet training example
Inspired by [lllyasviel's original ControlNet](https://github.com/lllyasviel/ControlNet)