---
license: creativeml-openrail-m
datasets:
- SubMaroon/danbooru-colored
base_model:
- John6666/nsfw-anime-xl-v1-sdxl
tags:
- controlnet
- anime
- stable-diffusion-xl-diffusers
- text-to-image
- stable-diffusion-xl
---
# ControlNet for Manga Colorization
**Model Name:** `SubMaroon/ControlNet-manga-recolor`
**Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
**Task:** Conditional image generation — Colorization
**Conditioning:** Grayscale manga panel (lineart or filled)
**Trained with:** [Hugging Face diffusers](https://github.com/huggingface/diffusers) ControlNet training pipeline
---
## Description
This is a custom-trained **ControlNet** model for **automatic colorization of grayscale anime-style images**.
The model takes a **black-and-white anime-style picture** (converted to RGB) as its conditioning input and generates a **colorized version** with Stable Diffusion XL.
It is trained as a ControlNet module and requires a compatible SDXL base model, such as `nsfw-anime-xl-v1-sdxl` or another anime/manga-focused SDXL model.
---
## Training details
- **Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
- **Dataset:** Custom dataset of ~6,000 image pairs from **Danbooru-based manga scans**, manually cleaned and resized to `768x768`
- **Inputs:**
  - `conditioning_image`: black-and-white manga scan (RGB)
  - `text prompt`: optional (e.g. `1girl, blue_eyes, blue_hair`)
- **Loss:** MSE with FP16, trained on 1×RTX3090, 4 epochs
- **Resolution:** 768x768
- **Scheduler:** default diffusers setup
- **Optimizer:** learning rate `1.4e-4`
---
## Usage (Diffusers)
```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Load the ControlNet weights in half precision
controlnet = ControlNetModel.from_pretrained(
    "SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16
)

# Load the base pipeline; the base model is SDXL, so the XL ControlNet pipeline is needed
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "John6666/nsfw-anime-xl-v1-sdxl", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.to("cuda")

# Load the grayscale manga panel as a three-channel image
conditioning_image = load_image("bw_manga_panel.png").convert("RGB")

# Generate the colorized image
image = pipe("manga colorization", image=conditioning_image, num_inference_steps=30).images[0]
image.save("colorized.png")
```
---
## Usage in ComfyUI / WebUI
- Place `diffusion_pytorch_model.safetensors` into your `ComfyUI/models/controlnet/` folder
- Make sure to also include the `config.json`
- Select this ControlNet in your workflow
- Use grayscale images as conditioning inputs
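Before starting ComfyUI, it can help to confirm that both files named above are actually in place. The check below is a small sketch using only the standard library; the helper name `missing_controlnet_files` is hypothetical, and the folder path in the usage example is the one listed above and may differ in your install.

```python
from pathlib import Path

# The two files this card says must be present in the ControlNet folder.
REQUIRED_FILES = ("diffusion_pytorch_model.safetensors", "config.json")


def missing_controlnet_files(folder) -> list[str]:
    """Return the names of required files that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_controlnet_files("ComfyUI/models/controlnet")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All ControlNet files are in place.")
```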
---
## Alternative training run (SDXL version)
This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:
```bash
accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
  --dataset_name="SubMaroon/danbooru-colored" \
  --image_column="image" \
  --conditioning_image_column="conditioning_image" \
  --caption_column="prompt" \
  --output_dir="./controlnet-colorization" \
  --resolution=768 \
  --train_batch_size=4 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1.4e-4 \
  --num_train_epochs=12 \
  --mixed_precision="fp16" \
  --gradient_checkpointing \
  --checkpointing_steps=1000 \
  --validation_steps=1000 \
  --report_to="tensorboard" \
  --tracker_project_name="controlnet-colorization" \
  --seed=42
```
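To get a feel for what these flags imply, the arithmetic below estimates the effective batch size and the number of optimizer steps. It is a rough sketch under stated assumptions: the dataset size is the approximate figure of 6,000 pairs from the training details, a single GPU is assumed (the card mentions 1×RTX3090), and the helper name `training_schedule` is hypothetical.

```python
import math


def training_schedule(num_pairs, batch_size, grad_accum, epochs, checkpoint_every):
    """Estimate optimizer steps for a single-GPU run with gradient accumulation."""
    batches_per_epoch = math.ceil(num_pairs / batch_size)
    # One optimizer step happens every `grad_accum` batches.
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch_size": batch_size * grad_accum,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "checkpoints": total_steps // checkpoint_every,
    }


# Values from the command above; 6000 is the approximate dataset size (assumption).
schedule = training_schedule(
    num_pairs=6000, batch_size=4, grad_accum=4, epochs=12, checkpoint_every=1000
)
print(schedule)
# {'effective_batch_size': 16, 'steps_per_epoch': 375, 'total_steps': 4500, 'checkpoints': 4}
```

With the 4 epochs listed under the training details instead of 12, the same estimate gives about 1,500 optimizer steps.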
---
## License
The model is released under the **CreativeML Open RAIL-M** license.
You are free to use it for non-commercial and research purposes. Commercial use may require additional permission.
---
## Credits
Created by [SubMaroon](https://huggingface.co/SubMaroon)
Trained with compute generously provided by [Flanayt Pulsar](https://huggingface.co/fisb)
Based on the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) ControlNet training example
Inspired by [lllyasviel's original ControlNet](https://github.com/lllyasviel/ControlNet)