SubMaroon/ControlNet-anime-colorize

	---
	license: mit
	datasets:
	- SubMaroon/danbooru-lineart
	base_model:
	- cagliostrolab/animagine-xl-3.0
	---

	# Experimental ControlNet (Low Quality / Research Prototype)

	> Experimental model. Low quality. Not intended for production use.
	> This ControlNet was trained as a research experiment to explore line-based conditioning and colorization behavior in SDXL anime models.

	---

	## Model Summary

	This repository contains an experimental ControlNet for SDXL, trained on anime-style images.
	The model is not stable, shows inconsistent color behavior, and should be treated as a research prototype rather than a finished or polished solution.

	The goal of this experiment was to understand:
	- How SDXL ControlNet learns colorization from line-based conditioning
	- How different conditioning types (Canny vs Lineart) affect color consistency

	---

	## Base Model

	- Base model: `cagliostrolab/animagine-xl-3.0`
	- Architecture: ControlNet SDXL
	- Training framework: 🤗 Diffusers
	- Precision: `bf16`
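
The pieces listed above could be wired together roughly as follows. This is a minimal sketch, not a tested recipe for this repository: it assumes the ControlNet weights are stored in Diffusers format at the root of `SubMaroon/ControlNet-anime-colorize`, and the `load_pipeline` / `colorize` helpers, the step count, and the conditioning scale are illustrative choices that do not come from the card.

```python
BASE_MODEL = "cagliostrolab/animagine-xl-3.0"
# Assumption: this repo holds Diffusers-format ControlNet weights at its root.
CONTROLNET_REPO = "SubMaroon/ControlNet-anime-colorize"


def load_pipeline(device="cuda"):
    """Build an SDXL ControlNet pipeline in bf16 (hypothetical helper)."""
    # Imported lazily so the module can be inspected without torch/diffusers installed.
    import torch
    from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        CONTROLNET_REPO, torch_dtype=torch.bfloat16
    )
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        BASE_MODEL, controlnet=controlnet, torch_dtype=torch.bfloat16
    )
    return pipe.to(device)


def colorize(pipe, lineart, prompt, steps=28, scale=1.0):
    """Run one generation conditioned on a lineart image (PIL.Image).

    `steps` and `scale` are illustrative defaults, not values from the card.
    """
    result = pipe(
        prompt=prompt,
        image=lineart,
        num_inference_steps=steps,
        controlnet_conditioning_scale=scale,
    )
    return result.images[0]
```

Given the instability described in this card, output colors are likely to vary noticeably between seeds and prompts.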

	---

	## Conditioning Type

	- Primary conditioning: Lineart / Canny-like edges
	- Backgrounds are mostly white
	- Line quality varies (mostly clean, some noisy samples)

	> Important limitation:
	> Lineart and Canny edge maps contain no color information, which leads to unstable and drifting color predictions.
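
The limitation can be illustrated with a toy example. The sketch below uses a simple thresholded gradient-magnitude edge map (a stand-in, not actual Canny or the lineart extractor used for this dataset) and shows that two images with the same shapes but different colors produce the same conditioning image. The helper names `edge_map` and `square` and the threshold value are assumptions made for illustration.

```python
import numpy as np


def edge_map(rgb, threshold=0.1):
    """Binary edge map: dark lines (0) on a white background (255).

    Simple gradient-magnitude threshold on grayscale; not real Canny.
    """
    gray = rgb.astype(np.float32).mean(axis=2) / 255.0
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude > threshold, 0, 255).astype(np.uint8)


def square(color, size=32):
    """White canvas with a filled square of the given RGB color."""
    img = np.full((size, size, 3), 255, dtype=np.uint8)
    img[8:24, 8:24] = color
    return img


red_edges = edge_map(square((200, 30, 30)))
green_edges = edge_map(square((30, 120, 30)))

# Same outline, different fill color: the conditioning images are identical,
# so the edge map alone cannot tell the model which color to produce.
identical = np.array_equal(red_edges, green_edges)
```

Because the conditioning is the same for both inputs, any color the model produces has to come from the prompt or from what it learned during training.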

	---

	## Dataset

	- Size: ~14,000 image pairs
	- Format (each sample):
	  - Original image (color)
	  - Conditioning image (lineart / canny)
	  - Prompt (caption)

	### Known dataset issues
	- Some lineart images are noisy or inconsistent
	- Images are resized to square resolution (possible cropping artifacts)
	- No explicit color supervision
	- No palette or region-level color constraints
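
To give a sense of how much content squaring an image can discard, here is a small sketch. It assumes a center crop to the largest square before resizing. The card does not state which cropping method was used, so both helpers below are hypothetical.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def cropped_fraction(width, height):
    """Fraction of the original pixels discarded by the square crop."""
    side = min(width, height)
    return 1 - (side * side) / (width * height)


# Example: a portrait image of 832x1216 pixels.
box = center_crop_box(832, 1216)    # (0, 192, 832, 1024)
lost = cropped_fraction(832, 1216)  # about 0.32
```

Under this assumption, a tall portrait loses roughly a third of its pixels, which can cut off heads or feet in both the color image and its lineart.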

	---

	## Training Configuration

	Typical training setup:

	```yaml
	resolution: 768
	train_batch_size: 2
	gradient_accumulation_steps: 2
	effective_batch_size: 4  # train_batch_size x gradient_accumulation_steps
	learning_rate: 2e-5
	lr_scheduler: cosine
	max_train_steps: 6000–8000
	mixed_precision: bf16