library_name: diffusers
---

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/6Ac5cvhD-KBTe9zyktmh8.png)

## Model Details

 
### Model Description

The model is a continuation of NoobAI training on the same dataset, with a new diffusion target and a few improvements to the existing tag approach*. Given the scope of this undertaking, this is only an experimental version, utilizing only a subset of the full original data.

The current state of the model is acceptable for general and research purposes, such as image generation, finetuning, LoRA training, and others. We provide example settings for a common style training approach below.

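The "new diffusion target" mentioned above is rectified flow. As a minimal sketch of the standard rectified-flow objective (a generic illustration, not the authors' actual training code), the model regresses a constant velocity along a straight-line interpolation between data and noise:

```python
import numpy as np

rng = np.random.default_rng(0)

x0 = rng.normal(size=(4, 8))   # stand-in "clean" latents
eps = rng.normal(size=(4, 8))  # Gaussian noise
t = rng.uniform(size=(4, 1))   # per-sample timestep in [0, 1]

# Rectified flow: straight-line interpolation between data and noise...
x_t = (1.0 - t) * x0 + t * eps

# ...and the regression target is the constant velocity along that line.
v_target = eps - x0

# Training would minimize || model(x_t, t) - v_target ||^2.
# Sanity check: following v_target from x_t reaches pure noise at t = 1.
assert np.allclose(x_t + (1.0 - t) * v_target, eps)
```

This differs from the v-prediction target of the base model, which is why the conversion required continued training rather than a simple reparameterization.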
- **License:** [fair-ai-public-license-1.0-sd](https://freedevproject.org/faipl-1.0-sd/)
- **Finetuned from model:** [Noobai V-pred 1.0](https://huggingface.co/Laxhar/noobai-XL-Vpred-1.0)

*Removed the massive keep-token group (in some cases over 6 tags) and introduced "protected tags", which allow for indiscriminate shuffling while keeping the tokens undroppable.

## Bias and Limitations

Due to a low budget (~$150 total), we have not been able to fully stabilize the model, so you can and will encounter issues that we did not find in our tests or were not able to address. That is not too different from the performance of other base models, but your mileage will vary.

Most biases of the official dataset will apply (Blue Archive, etc.).

Some color biases were not reduced, or became more apparent, due to quirks in how rectified flow converges from the NoobAI v-pred initialization. We did our best to mitigate this by training a bit further, but you will encounter it with certain strong color prompts. Some colors are unstable and hard to achieve at the current training step (black and dark tones in particular; for example, `dark` will not generate a dark image, you need to prompt `dark theme` instead).

  ## Model Output Examples
#### Comfy

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/IQFZizmP_NSbEMYE5LC7T.png)
(The workflow is available alongside the model in the repo.)

Inference is the same as usual, but with the addition of an SD3 sampling node and an optional conv-padding node, which is required for correct edges (the VAE and the model were trained with padded convs in the VAE, to make edge content easier to learn).

Recommended Parameters:
**Sampler**: Euler, Euler A, DPM++ SDE, etc.
**Negative Tags**: `worst quality, normal quality, bad anatomy`


#### A1111 WebUI

Recommended WebUI: [ReForge](https://github.com/Panchovix/stable-diffusion-webui-reForge) - has native support for both RF and conv padding.
**How to use in ReForge**:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/UV5Yp66H7YlccdQqborPf.png)
(Ignore the Sigma max field at the top; it is not used in RF.)

Support for RF in ReForge is implemented through a built-in extension:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/LpMF0lmC96X001Au9fFU_.png)

Set the parameters as shown, and you're good to go.

**How to turn on padding**:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/GmieYDa5l1C9sUiN363xt.png)

Turn this on, save, and FULLY RELOAD the UI by closing the console and launching it again. This is required; the setting does not take effect until the UI is fully reloaded.
Recommended Parameters:
**ADETAILER FIX FOR RF**:
By default, ADetailer discards the Advanced Model Sampling extension, which breaks RF. You need to add AMS to this part of the settings:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/RQMtfm5Xi3V7oNsqXoZJN.png)

Add `advanced_model_sampling_script,advanced_model_sampling_script_backported` there.

If that does not work, go into the ADetailer extension, find `args.py`, open it, and edit `_builtin_scripts` like this:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/rmnS-i_kciJzTZmeR-mGP.png)

Here is the text for easy copying:
```
VAE: Changed, new VAE - [EQB7](https://huggingface.co/Anzhc/MS-LC-EQ-D-VR_VAE)

### Training Details
(Base / quality-tuned)

**Samples seen** (unbatched steps): ~2M / ~400k
**Learning Rate**: 2e-5 / 2e-5
**Effective Batch size**: 1280 (40 real * 4 accum * 8 devices) / 1280 (40 * 4 * 8)
**Precision**: Full BF16
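The batch and sample figures above reduce to simple arithmetic, which is worth making explicit since "samples seen" here are unbatched:

```python
# Effective batch = per-device batch * gradient accumulation * device count.
real_batch, accum, devices = 40, 4, 8
effective_batch = real_batch * accum * devices
assert effective_batch == 1280

# "Samples seen" are unbatched, so optimizer steps = samples / effective batch.
base_samples, tune_samples = 2_000_000, 400_000
base_steps = base_samples // effective_batch  # ~1562 steps for the base run
tune_steps = tune_samples // effective_batch  # ~312 steps for quality tuning
```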

An "original" NoobAI data subset of ~2 million samples was used, followed by a WAF* subset of ~20 thousand samples for quality tuning of this intermediate checkpoint. Tags were not changed; the data was taken as-is, per the wishes of the community.

*WAF (Weighted Aesthetic Filter): our recent solution for filtering data based on the input of multiple scoring models at the same time (at varied weights, adapted to their specific prediction classes/ranges), including specialized models for specific content. A high general threshold was used, resulting in the top ~5% of the data being selected for quality tuning.

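A weighted multi-scorer filter of this kind can be sketched as follows. The scorer names, ranges, and weights below are purely hypothetical (the real WAF pipeline is not published here); the sketch only shows the mechanism: normalize each scorer to a common range, combine at per-scorer weights, and keep the top fraction.

```python
def waf_score(scores, weights, ranges):
    """Combine several aesthetic scorers into one weighted score.

    Each scorer's raw output is normalized to [0, 1] using its own
    prediction range before weighting, so scorers with different
    scales can be mixed.
    """
    total = 0.0
    for name, raw in scores.items():
        lo, hi = ranges[name]
        total += weights[name] * (raw - lo) / (hi - lo)
    return total / sum(weights.values())

# Hypothetical scorers with different output ranges and weights.
ranges = {"aesthetic": (0, 10), "anime_quality": (0, 1)}
weights = {"aesthetic": 0.7, "anime_quality": 0.3}

samples = [
    {"aesthetic": 9.1, "anime_quality": 0.95},
    {"aesthetic": 4.0, "anime_quality": 0.30},
]
scored = [waf_score(s, weights, ranges) for s in samples]

# Keep only the top fraction (the card describes a high threshold, top ~5%).
top_fraction = 0.05
keep = max(1, int(len(scored) * top_fraction))
cutoff = sorted(scored, reverse=True)[keep - 1]
selected = [s for s, sc in zip(samples, scored) if sc >= cutoff]
```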
### LoRA Training

My current style training settings (Anzhc):

**Learning Rate**: tested up to **7.5e-4**; the LoRA is still stable at that. Somehow. Prolonged training (300+ images for 50 epochs) at that LR did not result in degradation, so it can likely be pushed even further, perhaps up to 1e-3, at least at the batch size I'm using.
**Batch Size**: 144 (6 real * 24 accum), using SGA (Stochastic Gradient Accumulation); without SGA I would probably lower accum to 4-8.
**Optimizer**: Adamw8bit with Kahan summation
**Schedule**: ReREX (Use REX for simplicity)
**Precision**: Full BF16

**Optimal Transport**: True

**Expected Dataset Size**: 100 images (can be as few as 10, but balance with repeats to roughly this target).
**Epochs**: 50 (yes, even with 10 repeats; 500 effective epochs works fine and does not break in my tests).
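The dataset-size advice above comes down to balancing repeats against image count so the effective epoch count stays comparable. A quick sketch of how I'd compute that (the `balance_repeats` helper is hypothetical, shown only to make the arithmetic explicit):

```python
def balance_repeats(n_images, target=100):
    """Pick a repeat count so n_images * repeats lands near the target."""
    return max(1, round(target / n_images))

# 10 images balanced toward the ~100-image target -> 10 repeats.
repeats = balance_repeats(10)
assert repeats == 10

# 50 scheduled epochs with 10 repeats = 500 effective passes over the data.
epochs = 50
effective_epochs = epochs * repeats
assert effective_epochs == 500
```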
 
 