	---
	license: creativeml-openrail-m
	datasets:
	- SubMaroon/danbooru-colored
	base_model:
	- John6666/nsfw-anime-xl-v1-sdxl
	tags:
	- controlnet
	- anime
	- stable-diffusion-xl-diffusers
	- text-to-image
	- stable-diffusion-xl
	---
	# ControlNet for Manga Colorization

	- Model name: `SubMaroon/ControlNet-manga-recolor`
	- Base model: `John6666/nsfw-anime-xl-v1-sdxl`
	- Task: Conditional image generation (colorization)
	- Conditioning: Grayscale manga panel (lineart or filled)
	- Trained with: [Hugging Face diffusers](https://github.com/huggingface/diffusers) ControlNet training pipeline

	---

	## Description

	This is a custom-trained ControlNet model for automatic colorization of grayscale anime-style images.

	The model takes a black-and-white anime-style image (converted to RGB) as its conditioning input and generates a colorized version with an SDXL base model.

	It acts as a ControlNet module and requires a compatible SDXL base model, such as `nsfw-anime-xl-v1-sdxl` or another anime/manga-focused SDXL model.
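
The conditioning input described above can be prepared with a few lines of Pillow. This is a minimal sketch, not the author's preprocessing: the helper name `prepare_conditioning` is hypothetical, and resizing straight to 768x768 (which squashes non-square panels) is an assumption based on the training resolution listed in this card.

```python
from PIL import Image


def prepare_conditioning(image, size=768):
    """Return a square, 3-channel grayscale version of `image`.

    Assumption: the panel is simply resized to `size` x `size`;
    cropping or padding may suit non-square panels better.
    """
    gray = image.convert("L")        # drop any remaining color information
    rgb = gray.convert("RGB")        # replicate the gray channel into R, G and B
    return rgb.resize((size, size))  # default resampling filter
```

For a file on disk, the result of `prepare_conditioning(Image.open("bw_manga_panel.png"))` can be passed to the pipeline as the conditioning image.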

	---

	## Training details

	- Base model: `John6666/nsfw-anime-xl-v1-sdxl`
	- Dataset: Custom dataset of ~6,000 image pairs from Danbooru-based manga scans, manually cleaned and resized to `768x768`
	- Inputs:
	  - `conditioning_image`: black-and-white manga scan (RGB)
	  - `text prompt`: optional (e.g. "1girl, blue_eyes, blue_hair")
	- Loss: MSE
	- Precision: FP16
	- Hardware: 1× RTX 3090
	- Epochs: 4
	- Resolution: 768x768
	- Scheduler: default diffusers setup
	- Learning rate: `1.4e-4`
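
As a rough illustration of the MSE loss listed above: in the diffusers ControlNet training example, the loss is the mean squared error between the noise the model predicts and the noise that was actually added to the latents. The sketch below assumes that noise-prediction setup and uses plain Python lists in place of tensors; the function name `mse_loss` is hypothetical.

```python
def mse_loss(predicted, target):
    """Mean squared error between two equally sized sequences of floats."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return sum(squared_errors) / len(squared_errors)


# Toy values standing in for flattened noise tensors (illustrative only).
predicted_noise = [1.0, 2.0]
true_noise = [1.0, 4.0]
print(mse_loss(predicted_noise, true_noise))
```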

	---

	## Usage (Diffusers)

	```python
	from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
	from diffusers.utils import load_image
	import torch

	# Load the ControlNet weights in half precision
	controlnet = ControlNetModel.from_pretrained(
	    "SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16
	)

	# Load the base pipeline (an SDXL checkpoint needs the XL pipeline class)
	pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
	    "John6666/nsfw-anime-xl-v1-sdxl", controlnet=controlnet, torch_dtype=torch.float16
	)
	pipe.to("cuda")

	# Load the grayscale manga panel as a 3-channel RGB image
	conditioning_image = load_image("bw_manga_panel.png").convert("RGB")

	# Generate the colorized image
	image = pipe("manga colorization", image=conditioning_image, num_inference_steps=30).images[0]
	image.save("colorized.png")
	```

	---

	## Usage in ComfyUI / WebUI

	- Place `diffusion_pytorch_model.safetensors` into your `ComfyUI/models/controlnet/` folder
	- Make sure to also include the `config.json`
	- Select this ControlNet in your workflow
	- Use grayscale images as conditioning inputs
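
Before opening ComfyUI, it can help to confirm that both files from the steps above are in place. The helper below is a small sketch; the function name `missing_controlnet_files` is hypothetical, and the folder path depends on where your ComfyUI checkout lives.

```python
from pathlib import Path

# File names taken from the steps above.
REQUIRED_FILES = ("diffusion_pytorch_model.safetensors", "config.json")


def missing_controlnet_files(folder):
    """Return the required file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

Calling `missing_controlnet_files("ComfyUI/models/controlnet")` returns an empty list when both files are present.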

	---

	## Alternative training run (SDXL version)

	This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:

	```bash
	accelerate launch train_controlnet_sdxl.py \
	--pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
	--dataset_name="SubMaroon/danbooru-colored" \
	--image_column="image" \
	--conditioning_image_column="conditioning_image" \
	--caption_column="prompt" \
	--output_dir="./controlnet-colorization" \
	--resolution=768 \
	--train_batch_size=4 \
	--gradient_accumulation_steps=4 \
	--learning_rate=1.4e-4 \
	--num_train_epochs=12 \
	--mixed_precision="fp16" \
	--gradient_checkpointing \
	--checkpointing_steps=1000 \
	--validation_steps=1000 \
	--report_to="tensorboard" \
	--tracker_project_name="controlnet-colorization" \
	--seed=42
	```
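
To get a feel for what the flags in the command above imply, the following sketch estimates the effective batch size, the number of optimizer steps, and how many checkpoints get written. It assumes a single GPU and treats the approximate dataset size quoted in this card (~6,000 pairs) as exact; the function name `training_schedule` is hypothetical.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum_steps, epochs, checkpoint_every):
    """Estimate optimizer steps and checkpoint count for a single-GPU run."""
    effective_batch = batch_size * grad_accum_steps
    batches_per_epoch = math.ceil(num_examples / batch_size)
    # One optimizer step happens every `grad_accum_steps` batches.
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "checkpoints": total_steps // checkpoint_every,
    }


# Assumption: exactly 6000 training pairs; the card only says "~6,000".
schedule = training_schedule(
    6000, batch_size=4, grad_accum_steps=4, epochs=12, checkpoint_every=1000
)
print(schedule)
```

Under these assumptions the run uses an effective batch of 16 and 375 optimizer steps per epoch, for 4,500 steps in total and 4 saved checkpoints.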

	---

	## License

	The model is released under the CreativeML Open RAIL-M license.
	You are free to use it for non-commercial and research purposes. Commercial use may require additional permission.

	---

	## Credits

	- Created by [SubMaroon](https://huggingface.co/SubMaroon)
	- Trained with compute generously provided by [Flanayt Pulsar](https://huggingface.co/fisb)
	- Based on the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) ControlNet training example
	- Inspired by [lllyasviel's original ControlNet](https://github.com/lllyasviel/ControlNet)