Update README.md
README.md
CHANGED
|
@@ -41,7 +41,7 @@ license: mit
|
|
| 41 |
<img src="./card_images/11.png" class="wide" alt="Sample Image 11">
|
| 42 |
</div>
|
| 43 |
|
| 44 |
-
**Momo XL** is an anime-style model based on SDXL, fine-tuned to produce high-quality anime-style images with detailed and vibrant aesthetics.
|
| 45 |
|
| 46 |
## Key Features:
|
| 47 |
|
|
@@ -66,3 +66,35 @@ This model may produce unexpected or unintended results. **Use with caution and
|
|
| 66 |
- **Data Sources**: The model was trained on publicly available datasets. While efforts have been made to filter and curate the training data, some undesirable content may remain.
|
| 67 |
|
| 68 |
Thank you! 😊
| 41 |
<img src="./card_images/11.png" class="wide" alt="Sample Image 11">
|
| 42 |
</div>
|
| 43 |
|
| 44 |
+
**Momo XL** is an anime-style model based on SDXL, fine-tuned to produce high-quality anime-style images with detailed and vibrant aesthetics. (Oct 6, 2024)
|
| 45 |
|
| 46 |
## Key Features:
|
| 47 |
|
|
|
|
| 66 |
- **Data Sources**: The model was trained on publicly available datasets. While efforts have been made to filter and curate the training data, some undesirable content may remain.
|
| 67 |
|
| 68 |
Thank you! 😊
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
------------------------------------------------------
|
| 72 |
+
## Momo XL - Training Details (Oct 15, 2024)
|
| 73 |
+
|
| 74 |
+
### Dataset
|
| 75 |
+
Momo XL was trained on a dataset of more than **400,000 images** sourced from Danbooru.
|
| 76 |
+
|
| 77 |
+
### Base Model
|
| 78 |
+
Momo XL was built on top of SDXL, incorporating knowledge from two fine-tuned models:
|
| 79 |
+
- Formula:
|
| 80 |
+
`SDXL_base + (Animagine 3.0 base - SDXL_base) * 1.0 + (Pony V6 - SDXL_base) * 0.5`
|
| 81 |
+
|
| 82 |
+
For more details:
|
| 83 |
+
- [Animagine 3.0 base](https://huggingface.co/Linaqruf/animagine-xl-3.0)
|
| 84 |
+
- [Pony V6](https://huggingface.co/LyliaEngine/Pony_Diffusion_V6_XL)
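The formula above is an add-difference merge: each fine-tuned model contributes its delta from the SDXL base, scaled by a weight. The following is a minimal sketch of that arithmetic, not the author's actual merge script. The function name `add_difference_merge` and the toy values are hypothetical, and plain Python lists stand in for the tensors a real checkpoint would hold.

```python
from typing import Dict, List, Sequence, Tuple

Weights = Dict[str, List[float]]


def add_difference_merge(
    base: Weights,
    finetuned: Sequence[Tuple[Weights, float]],
) -> Weights:
    """Return base + sum(scale * (model - base)) over all (model, scale) pairs."""
    merged: Weights = {}
    for name, base_values in base.items():
        out = list(base_values)
        for model, scale in finetuned:
            values = model.get(name)
            if values is None:
                # Assumption: a key missing from a fine-tuned model adds no delta.
                continue
            if len(values) != len(base_values):
                raise ValueError(f"shape mismatch for {name!r}")
            for i, (v, b) in enumerate(zip(values, base_values)):
                out[i] += scale * (v - b)
        merged[name] = out
    return merged


# Toy stand-ins for the three checkpoints (hypothetical values, one parameter each).
sdxl_base = {"w": [1.0, 2.0]}
animagine = {"w": [2.0, 2.0]}
pony = {"w": [1.0, 4.0]}

# SDXL_base + (Animagine - SDXL_base) * 1.0 + (Pony - SDXL_base) * 0.5
momo_init = add_difference_merge(sdxl_base, [(animagine, 1.0), (pony, 0.5)])
print(momo_init)  # {'w': [2.0, 3.0]}
```

With a weight of 1.0 the Animagine delta is applied in full, while the Pony delta is applied at half strength.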
|
| 85 |
+
|
| 86 |
+
### Training Process
|
| 87 |
+
Training was conducted on **A100 80GB GPUs** and took more than **2,000 GPU hours** in total. It was divided into three stages:
|
| 88 |
+
- **Finetuning - First Stage**: Trained on the entire dataset at a fixed 1024² resolution with the AdamW8bit optimizer (see the Hyperparameters table).
|
| 89 |
+
- **Finetuning - Second Stage**: Trained on the entire dataset again, with the resolution raised to a maximum of 1280² and the optimizer switched to AdamW.
|
| 90 |
+
- **Adjustment Stage**: A short run (0.25 epochs) with a larger batch size and higher learning rates, focused on aesthetic adjustments to improve overall visual quality.
|
| 91 |
+
|
| 92 |
+
The released model, **Momo XL**, combines the Text Encoder from the second finetuning stage with the UNet from the Adjustment Stage.
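Combining components from two checkpoints can be sketched as a key-prefix selection over their state dicts. This is an illustration only, not the author's procedure. The prefixes below are assumed to match a single-file SDXL checkpoint and should be checked against the real files, `combine_components` is a hypothetical helper, and the sketch further assumes that all remaining keys (for example the VAE) come from the Adjustment Stage checkpoint, which the source does not state.

```python
from typing import Any, Dict

# Assumed key prefix for the text encoders in a single-file SDXL checkpoint.
TEXT_ENCODER_PREFIX = "conditioner.embedders."


def combine_components(
    stage2: Dict[str, Any],
    adjustment: Dict[str, Any],
) -> Dict[str, Any]:
    """Take text-encoder weights from `stage2` and all other weights from `adjustment`."""
    # Keep everything from the adjustment checkpoint except its text encoders.
    merged = {
        key: value
        for key, value in adjustment.items()
        if not key.startswith(TEXT_ENCODER_PREFIX)
    }
    # Add the text-encoder weights from the second finetuning stage.
    merged.update(
        {
            key: value
            for key, value in stage2.items()
            if key.startswith(TEXT_ENCODER_PREFIX)
        }
    )
    return merged


# Toy checkpoints: strings stand in for tensors so the origin of each weight is visible.
stage2_ckpt = {
    "conditioner.embedders.0.w": "te_stage2",
    "model.diffusion_model.w": "unet_stage2",
}
adjustment_ckpt = {
    "conditioner.embedders.0.w": "te_adjustment",
    "model.diffusion_model.w": "unet_adjustment",
}

final_ckpt = combine_components(stage2_ckpt, adjustment_ckpt)
print(final_ckpt)
```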
|
| 93 |
+
|
| 94 |
+
### Hyperparameters

| Stage                    | Epochs | UNet LR | Text Encoder LR | Batch Size | Resolution | Noise Offset | Optimizer  | LR Scheduler |
|--------------------------|--------|---------|-----------------|------------|------------|--------------|------------|--------------|
| **Finetuning 1st Stage** | 10     | 2e-5    | 1e-5            | 256        | 1024²      | N/A          | AdamW8bit  | Constant     |
| **Finetuning 2nd Stage** | 10     | 2e-5    | 1e-5            | 256        | Max. 1280² | N/A          | AdamW      | Constant     |
| **Adjustment Stage**     | 0.25   | 8e-5    | 4e-5            | 1024       | Max. 1280² | 0.05         | AdamW      | Constant     |
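As a rough sense of scale, the epoch and batch-size figures for the two finetuning stages translate into an approximate optimizer step count. The sketch below assumes that the listed batch size is the effective global batch, that every image is seen once per epoch, and that the dataset holds exactly 400,000 images (the stated lower bound). The helper `optimizer_steps` is hypothetical. The Adjustment Stage is left out because the size of the data it used is not stated.

```python
import math

DATASET_SIZE = 400_000  # lower bound; the dataset is described as more than 400,000 images
BATCH_SIZE = 256        # Finetuning 1st and 2nd Stage
EPOCHS = 10             # Finetuning 1st and 2nd Stage


def optimizer_steps(num_images: int, batch_size: int, epochs: float) -> int:
    """Approximate optimizer steps, rounding each epoch up to whole batches."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return math.ceil(steps_per_epoch * epochs)


steps_per_stage = optimizer_steps(DATASET_SIZE, BATCH_SIZE, EPOCHS)
print(steps_per_stage)  # 15630
```

Under these assumptions each finetuning stage runs for roughly 15,600 optimizer steps; the true figure depends on the exact dataset size and on any bucketing or dropped batches.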