---

license: apache-2.0
tags:
- lora
- manga
- coloring
- anime
- qwen
- dataset
- diffusers
- image-to-image
viewer: false
---


<p align="center">
  <img src="images/logo.png" alt="PanelPainter Logo" width="400">
</p>

# PanelPainter-Project

**PanelPainter-Project** is an open-source initiative to automate the coloring of black-and-white manga panels using fine-tuned LoRAs.

I am releasing all the files here, including datasets, training logs, and experimental versions, so others can see exactly how each model was trained.

## Showcase

Here are some examples comparing the original panel, the base Qwen Image Edit model, and the result with the PanelPainter V3 LoRA.

> [!NOTE]
> **Showcase Generation Settings:**
> * **LoRAs:** PanelPainter V3 (Weight: 1.0) + 4-Step Lighting (Weight: 1.0)
> * **Steps:** 4
> * **Sampler:** Euler
> * **Scheduler:** Simple
> * **Seed:** 1000
> * **CFG:** 1.0

<p align="center">
  <img src="images/Sample_Image_1.png" alt="Chainsaw Man Showcase">
  <br>
  <em>Chainsaw Man</em>
</p>

<p align="center">
  <img src="images/Sample_Image_2.png" alt="Frieren Showcase">
  <br>
  <em>Frieren</em>
</p>

<p align="center">
  <img src="images/Sample_Image_3.png" alt="Komi Showcase">
  <br>
  <em>Komi Can't Communicate</em>
</p>

<p align="center">
  <img src="images/Sample_Image_4.png" alt="Oshi no Ko Showcase">
  <br>
  <em>Oshi no Ko</em>
</p>

## Project Structure

This repository contains everything used to create the models:

### 1. LoRA Models (`/loras`)
This directory contains the model weights for all iterations of the project:

> [!TIP]
> **Trigger Word:** `Color this panelpainter` (Applicable for both V2 and V3)

* **V3 (Latest Release):** `PanelPainter_v3_Qwen2511.safetensors`
    * **Base:** Qwen Image Edit 2511
    * **Note:** The latest model trained on the expanded 903-image dataset.
* **V2 (Stable):** `PanelPainter_v2_Qwen2509.safetensors`
    * **Base:** Qwen Image Edit 2509 (Compatible with 2511).
    * **Note:** Standard release (high quality, low variety).
* **V1 (Legacy):** `PanelPainter_v1_Legacy.safetensors`
    * **Base:** Qwen Image Edit 2509
    * **Note:** Archived experimental version (synthetic data).

### 2. Training Logs (`/logs`)
**Content:** Tensorboard logs and charts from my training runs. You can check these to see how the loss converged and how the model learned over time for each version.

### 3. Workflows (`/workflows`)
**Content:** ComfyUI workflow JSON files to help you get started with PanelPainter.

### 4. Training Dataset
The datasets used for this project are hosted separately:

* **PanelPainter-Dataset**

> [!NOTE]
> **Coming Soon:** The V3 dataset was a good learning step for captioning, but it was assembled without streamlined curation: roughly 50% doujin and 50% mainstream colored manga, picked at random. We're refining it further. Expect handpicked panels, better captions, and reduced doujin content. Release coming once quality standards are met.

---

## Version History & Development Log

### Version 3.0 (Current Release)
* **Status:** Released.
* **Base Architecture:** Qwen 2511.
* **Strategy:** Scaling Up High-Quality Data.
* **Dataset:** Expanded to 903 images. Recreated from scratch, comprising 50% doujin and 50% SFW panels.
* **Summary:** This version combines the correct "real line art" training method discovered in V2 with a significantly larger dataset. This improves the model's ability to generalize across different manga styles while maintaining the color quality of V2.

### Version 2.0
* **Status:** Released / Stable.
* **Base Model:** Trained on Qwen Image Edit 2509; it also works on Qwen 2511.
* **The Breakthrough:** After V1 failed, this version switched to training on real line art instead of synthetic grayscale.
* **Dataset:** A tiny, hyper-curated set of 150 images (70% Doujin / 30% SFW).
* **Outcome:** Despite the small size, it proved that high-quality real line art outperforms massive synthetic datasets. It produces good colors but lacks variety due to the small sample size.

### Version 1.0
* **Status:** Archived / Deprecated.
* **Base Model:** Qwen Image Edit 2509.
* **The Mistake:** Trained on 7,000 images generated by simply desaturating colored pages (synthetic grayscale).
* **Outcome:** The model learned to color "perfect gray" inputs but failed on real, imperfect ink lines.
* **Lesson:** Quantity does not matter if the data distribution doesn't match real usage.
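
The distribution mismatch behind V1's failure is easy to see in code. Below is an illustrative sketch of synthetic desaturation; the source only says the pages were "desaturated", so the BT.601 luma weighting and the function name here are assumptions for demonstration:

```python
# Illustrative sketch: V1's inputs were made by desaturating colored pages.
# BT.601 luma weighting is one common desaturation method; the exact method
# the project used is not specified, so treat this as an assumption.
def desaturate(pixels):
    """Map RGB pixels to smooth gray values (synthetic grayscale)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

# Desaturation yields soft mid-grays everywhere:
print(desaturate([(200, 120, 80), (90, 90, 90)]))  # [139, 90]
# Real scanned line art is mostly hard black or white, so a model trained on
# these smooth grays never sees the distribution it meets at inference time.
```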

---

## Training Configuration (V3)

**Hardware:** Trained on an A40 GPU on Runpod.

Below is the exact accelerate command used to train the V3 model on Musubi Tuner:

```bash
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 \
  /workspace/musubi-tuner/src/musubi_tuner/qwen_image_train_network.py \
  --dataset_config dataset_edit.toml \
  --dit /workspace/Training_Models_Qwen/Qwen_Image_Edit_2511_BF16.safetensors \
  --vae /workspace/Training_Models_Qwen/qwen_train_vae.safetensors \
  --text_encoder /workspace/Training_Models_Qwen/qwen_2.5_vl_7b_bf16.safetensors \
  --model_version edit-2511 \
  --network_module networks.lora_qwen_image \
  --output_dir /workspace/output_panelpainter \
  --output_name panelpainter_v3_part1 \
  --mixed_precision bf16 \
  --max_data_loader_n_workers 0 \
  --learning_rate 3e-4 \
  --network_dim 128 \
  --network_alpha 128 \
  --optimizer_type adafactor \
  --optimizer_args "scale_parameter=False" "relative_step=False" "warmup_init=False" "weight_decay=0.01" \
  --lr_scheduler cosine \
  --lr_warmup_steps 150 \
  --timestep_sampling qinglong_qwen \
  --discrete_flow_shift 2.2 \
  --max_train_epochs 8 \
  --save_every_n_epochs 1 \
  --save_state \
  --gradient_checkpointing \
  --gradient_checkpointing_cpu_offload \
  --gradient_accumulation_steps 4 \
  --blocks_to_swap 20 \
  --sdpa
```

**Dataset Settings:** Training used a resolution of **1328x1328** with bucketing enabled (no upscaling) to handle varying aspect ratios, a batch size of 1, and `qwen_image_edit_no_resize_control` to preserve the original dimensions of the control images during processing.
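
The settings above can be sketched as a Musubi Tuner dataset config. The directory paths below are hypothetical placeholders, and the exact key names should be verified against the dataset configuration docs of your Musubi Tuner version:

```toml
[general]
resolution = [1328, 1328]        # train at 1328x1328
batch_size = 1
enable_bucket = true             # bucketing for varying aspect ratios
bucket_no_upscale = true         # no upscaling
caption_extension = ".txt"

[[datasets]]
image_directory = "/workspace/dataset/color"   # hypothetical path: colored target panels
control_directory = "/workspace/dataset/bw"    # hypothetical path: B&W control panels
qwen_image_edit_no_resize_control = true       # keep control images at original size
```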

## License

* **Project:** Apache 2.0
* **Dataset:** Hosted separately, contains copyrighted manga panels.
* **Copyright:** Original art belongs to the respective creators and publishers.

## Acknowledgements

Trained on Musubi Tuner. Thanks to kohya-ss.

**Dataset Contributors:** Thanks to @Rox_Jr & @lucifer_brine04 for their help with the dataset.

## External Links
* **Public Model Page:** [Civitai: PanelPainter](https://civitai.com/models/2103847/panelpainter-manga-coloring)