---
license: creativeml-openrail-m
datasets:
- SubMaroon/danbooru-colored
base_model:
- John6666/nsfw-anime-xl-v1-sdxl
tags:
- controlnet
- anime
- stable-diffusion-xl-diffusers
- text-to-image
- stable-diffusion-xl
---

# ControlNet for Manga Colorization

**Model Name:** `SubMaroon/ControlNet-manga-recolor`  
**Base model:** `John6666/nsfw-anime-xl-v1-sdxl`  
**Task:** Conditional image generation (colorization)  
**Conditioning:** Grayscale manga panel (lineart or filled)  
**Trained with:** [Hugging Face diffusers](https://github.com/huggingface/diffusers) ControlNet training pipeline

---

## Description

This is a custom-trained **ControlNet** model designed for **automatic colorization of grayscale anime-style images**.

The model takes a **black-and-white anime-style picture** (converted to RGB) as its conditioning input and generates a **colorized version** using Stable Diffusion XL.

It is trained to act as a ControlNet module and requires a compatible SDXL base model, such as `nsfw-anime-xl-v1-sdxl` or another anime/manga-focused SDXL model.

---

## Training details

- **Base model:** `John6666/nsfw-anime-xl-v1-sdxl`
- **Dataset:** Custom dataset of ~6,000 image pairs from **Danbooru-based manga scans**, manually cleaned and resized to `768x768`
- **Inputs:**
  - `conditioning_image`: black-and-white manga scan (RGB)
  - `text prompt`: optional (e.g. "1girl, blue_eyes, blue_hair", etc.)
- **Loss:** MSE, with FP16 mixed precision; trained on 1×RTX 3090 for 4 epochs
- **Resolution:** 768x768
- **Scheduler:** default diffusers setup
- **Optimizer:** learning rate `1.4e-4`

---

## Usage (Diffusers)

```python
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

# Load ControlNet
controlnet = ControlNetModel.from_pretrained(
    "SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16
)

# Load base pipeline (the base model is SDXL, so the SDXL ControlNet pipeline is needed)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "John6666/nsfw-anime-xl-v1-sdxl",
    controlnet=controlnet,
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Load grayscale manga panel (kept as a 3-channel RGB image)
conditioning_image = load_image("bw_manga_panel.png").convert("RGB")

# Generate
image = pipe("manga colorization", image=conditioning_image, num_inference_steps=30).images[0]
image.save("colorized.png")
```

---

## Usage in ComfyUI / WebUI

- Place `diffusion_pytorch_model.safetensors` into your `ComfyUI/models/controlnet/` folder
- Make sure to also include the `config.json`
- Select this ControlNet in your workflow
- Use grayscale images as conditioning inputs

---

## Alternative training run (SDXL version)

This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:

```bash
accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
  --dataset_name="SubMaroon/danbooru-colored" \
  --image_column="image" \
  --conditioning_image_column="conditioning_image" \
  --caption_column="prompt" \
  --output_dir="./controlnet-colorization" \
  --resolution=768 \
  --train_batch_size=4 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1.4e-4 \
  --num_train_epochs=12 \
  --mixed_precision="fp16" \
  --gradient_checkpointing \
  --checkpointing_steps=1000 \
  --validation_steps=1000 \
  --report_to="tensorboard" \
  --tracker_project_name="controlnet-colorization" \
  --seed=42
```

---

## License

The model is released under the **CreativeML
Open RAIL-M** license.
You are free to use it for non-commercial and research purposes. Commercial use may require additional permission.

---

## Credits

Created by [SubMaroon](https://huggingface.co/SubMaroon)  
Trained with compute generously provided by [Flanayt Pulsar](https://huggingface.co/fisb)  
Based on the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) ControlNet training example  
Inspired by [lllyasviel's original ControlNet](https://github.com/lllyasviel/ControlNet)
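
## Example: estimating training steps

As a rough aid for reading the CLI command in the alternative training run, the sketch below estimates how many optimizer updates those flags imply. It is an illustration only, not the training script's own code. It assumes a single GPU, a dataset of exactly 6,000 pairs (the card only says "~6,000"), and that one update is taken every `gradient_accumulation_steps` batches. The function name `training_schedule` is hypothetical.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum_steps, num_epochs, checkpoint_every):
    """Estimate optimizer updates and checkpoint steps for a single-GPU run."""
    # Batches the dataloader yields per epoch (the last partial batch counts).
    batches_per_epoch = math.ceil(num_examples / batch_size)
    # One optimizer update happens every `grad_accum_steps` batches.
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    total_updates = updates_per_epoch * num_epochs
    # Steps at which a checkpoint would be written.
    checkpoint_steps = list(range(checkpoint_every, total_updates + 1, checkpoint_every))
    return {
        "effective_batch_size": batch_size * grad_accum_steps,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "checkpoint_steps": checkpoint_steps,
    }


# Values taken from the CLI flags; 6000 is an assumed exact dataset size.
schedule = training_schedule(
    num_examples=6000,
    batch_size=4,
    grad_accum_steps=4,
    num_epochs=12,
    checkpoint_every=1000,
)
print(schedule)
```

Under these assumptions the effective batch size is 16 and the run takes about 4,500 optimizer updates, so `--checkpointing_steps=1000` would produce four intermediate checkpoints.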