Fix README.md #1
by qpqpqpqpqpqp, opened

README.md CHANGED
[...]
base_model: [...]
library_name: diffusers
---


![](https://huggingface.co/AbstractPhil/noobai-vpred-1.0-rectified-flow/resolve/main/3GtLXWve-r.png)

## Model Details

[...]

### Model Description

The model is a continuation of NoobAI training on the same dataset, with a new diffusion target and a few improvements to the existing tag approach*. Given the scope of this undertaking, this is only an experimental version, utilizing only a subset of the full original data.

The current state of the model is acceptable for general and research purposes, like Image Generation, Finetuning, LoRA Training, and others. We provide example settings for a common style-training approach below.

[...]

- **License:** [fair-ai-public-license-1.0-sd](https://freedevproject.org/faipl-1.0-sd/)
- **Finetuned from model:** [Noobai V-pred 1.0](https://huggingface.co/Laxhar/noobai-XL-Vpred-1.0)

*Removed the massive keep token (in some cases over 6 tags) and introduced "protected tags", which allow for indiscriminate shuffling while keeping tokens undroppable.
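
To make the distinction concrete, here is a minimal sketch of caption shuffling with protected tags, in Python. This is an assumption-level illustration, not the trainer's actual code; the function and parameter names are hypothetical.

```python
import random

def shuffle_caption(tags: list[str], protected: set[str], drop_rate: float = 0.1) -> list[str]:
    # Hypothetical sketch: unlike a keep-token prefix pinned to the front,
    # protected tags can be shuffled anywhere, but are never dropped.
    kept = [t for t in tags if t in protected or random.random() > drop_rate]
    random.shuffle(kept)
    return kept

# Example: "1girl" may land anywhere in the caption but always survives dropout.
print(shuffle_caption(["1girl", "smile", "outdoors"], protected={"1girl"}))
```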

## Bias and Limitations

Due to a low budget (~$150 total), we have not been able to fully stabilize the model, so you can and will encounter some issues that we were not able to find in our tests or were not able to address. That is not too different from other base models, but your mileage will vary.

Most biases of the official dataset will apply (Blue Archive, etc.).

Some color biases were not reduced, or became more apparent, due to quirks in how rectified flow converges from the Noobai v-pred initialization. We did our best to mitigate this by training a bit further, but you will still encounter them in certain strong color prompts. Some colors are unstable and hard to achieve due to the unfortunate state of their convergence at the current step (black and dark in particular; for example, `dark` will not generate a dark image, you need to prompt `dark theme` for that).

## Model Output Examples

[...]

#### Comfy


(The workflow is available alongside the model in the repo.)

Same as your normal inference, but with the addition of an SD3 sampling node and an optional conv padding node, which is required for correct edges (the VAE and model were trained with padded convs in the VAE, to allow for easier edge-content learning).
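
For intuition, here is a minimal sketch of what a conv-padding toggle does to a VAE in PyTorch. This is an assumption-level illustration; the helper name and the choice of "replicate" mode are ours, not from the repo:

```python
import torch.nn as nn

def pad_vae_convs(vae: nn.Module, mode: str = "replicate") -> None:
    # Hypothetical helper: switch every Conv2d in the VAE from zero padding
    # to a non-zero padding mode, so pixels at the image border see plausible
    # context instead of implicit black, making edge content easier to learn.
    for m in vae.modules():
        if isinstance(m, nn.Conv2d):
            m.padding_mode = mode
```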
Recommended Parameters:
**Sampler**: Euler, Euler A, DPM++ SDE, etc.
[...]
**Negative Tags**: `worst quality, normal quality, bad anatomy`

#### A1111's WebUI

Recommended WebUI: [ReForge](https://github.com/Panchovix/stable-diffusion-webui-reForge) - has native support for both RF and conv padding.

Possible WebUIs: [...]

**How to use in ReForge**:

![](https://huggingface.co/AbstractPhil/noobai-vpred-1.0-rectified-flow/resolve/main/how-to-r.png)
(Ignore the Sigma max field at the top; it is not used in RF.)

Support for RF in ReForge is being implemented through a built-in extension:

![](https://huggingface.co/AbstractPhil/noobai-vpred-1.0-rectified-flow/resolve/main/rf-settings-r.png)

Set the parameters as shown, and you're good to go.

**How to turn on padding**:

![](https://huggingface.co/AbstractPhil/noobai-vpred-1.0-rectified-flow/resolve/main/padding-r.png)

Turn this on, save, and FULLY RELOAD the UI by closing the console and launching it again. This is required; the setting does not take effect until the UI is fully reloaded.
Recommended Parameters:
[...]

**ADETAILER FIX FOR RF**:
By default, Adetailer discards the Advanced Model Sampling extension, which breaks RF. You need to add AMS to this part of the settings:

![](https://huggingface.co/AbstractPhil/noobai-vpred-1.0-rectified-flow/resolve/main/image.png)

Add `advanced_model_sampling_script,advanced_model_sampling_script_backported` there.

If that does not work, go into the adetailer extension, find `args.py`, open it, and replace `_builtin_scripts` like this:

![](https://huggingface.co/AbstractPhil/noobai-vpred-1.0-rectified-flow/resolve/main/image-1.png)

Here is a copy-paste for easy copying:

```
[...]
```

[...]

### Training Details
(Base / quality-tuned)

**Samples seen** (unbatched steps): ~2M / ~400k
**Learning Rate**: 2e-5 / 2e-5
**Effective Batch size**: 1280 (40 real * 4 accum * 8 devices) / 1280 (40 * 4 * 8)
**Precision**: Full BF16
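
For reference, here are the listed base-run settings collected into a single sketch (field names are illustrative, not from the actual training scripts):

```python
# Hypothetical summary of the base run's hyperparameters as listed above.
base_run = dict(
    samples_seen=2_000_000,   # ~2M unbatched steps (quality tune: ~400k)
    learning_rate=2e-5,
    batch_per_device=40,
    grad_accum_steps=4,
    num_devices=8,            # effective batch = 40 * 4 * 8 = 1280
    precision="bf16",
)
```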

[...]

VAE: Changed, new VAE - [EQB7](https://huggingface.co/Anzhc/MS-LC-EQ-D-VR_VAE) w[...]

"Original" Noobai data subset of ~2 million samples, then a WAF* subset of ~20 thousand for quality tuning of this intermediate checkpoint. Tags were not changed; data was taken "as-is", as per the wishes of the community.

*WAF - Weighted Aesthetic Filter, our recent solution for filtering data based on the input of multiple scoring models at the same time (at varied weights, adapted to their specific prediction classes/ranges), including specialized models for specific content. A high general threshold was used, resulting in the top ~5% of data being selected for quality tuning.
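
To make the weighting idea concrete, here is a minimal sketch of multi-scorer filtering. The scorer names, weights, and the [0, 1] normalization are assumptions; WAF's actual components are not described beyond the paragraph above.

```python
# Hypothetical scorers and weights; not WAF's real configuration.
WEIGHTS = {"aesthetic_a": 0.5, "aesthetic_b": 0.3, "content_specialist": 0.2}

def waf_score(scores: dict[str, float]) -> float:
    # Weighted sum of per-model scores, each assumed normalized to [0, 1].
    return sum(w * scores[name] for name, w in WEIGHTS.items())

def select_top(samples: list[dict], frac: float = 0.05) -> list[dict]:
    # Keep the top `frac` of samples by combined score (~5% per the README).
    ranked = sorted(samples, key=lambda s: waf_score(s["scores"]), reverse=True)
    return ranked[: max(1, int(len(ranked) * frac))]
```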

### LoRA Training

Current base is highly trainable. We are mostly style trainers and finetuners, s[...]

My current style training settings (Anzhc):

**Learning Rate**: tested up to **7.5e-4**; LoRA is still stable at that, somehow. Prolonged training (300+ images for 50 epochs) at that LR did not result in degradation; it can likely be pushed even further, up to 1e-3, at least at the batch size I'm using.
**Batch Size**: 144 (6 real * 24 accum), using SGA (Stochastic Gradient Accumulation); without SGA I would probably lower accum to 4-8.
**Optimizer**: AdamW8bit with Kahan summation (see the compensated-summation sketch at the end of this section)
**Schedule**: ReREX (Use REX for simplicity)
**Precision**: Full BF16

[...]

**Optimal Transport**: True (see the pairing sketch after this list)

**Expected Dataset Size**: 100 images (can be even 10, but balance with repeats to roughly this target)
**Epochs**: 50 (yes, even with 10 repeats; 500 effective epochs works just fine and doesn't break in my tests)
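
"Optimal Transport" here plausibly refers to minibatch OT pairing of noise and latents in flow-matching training; the trainer's actual implementation is not shown in this README, so the sketch below is an assumption:

```python
import torch
from scipy.optimize import linear_sum_assignment

def ot_pair_noise(latents: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    # Reorder the noise batch so each latent is matched to its nearest noise
    # sample (a minibatch optimal-transport coupling), which tends to
    # straighten rectified-flow trajectories.
    cost = torch.cdist(latents.flatten(1), noise.flatten(1))
    _, cols = linear_sum_assignment(cost.cpu().numpy())
    return noise[torch.as_tensor(cols)]
```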
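
The optimizer line above mentions Kahan summation; since BF16 has few mantissa bits, tiny LR*grad updates can round away entirely. For intuition, here is the classic compensated-addition step it builds on (illustrative, not the optimizer's actual code):

```python
def kahan_add(total: float, delta: float, comp: float) -> tuple[float, float]:
    # Compensated addition: `comp` carries the low-order bits that a plain
    # `total + delta` would lose to rounding.
    y = delta - comp
    t = total + y
    comp = (t - total) - y
    return t, comp
```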