---
license: apache-2.0
private: false # public, but
unlisted: true # not shown in search
thumbnail: https://huggingface.co/mamadat/SHREK_ENM/resolve/main/SHREK_ENM.png
tags:
- diffusion
- text-to-image
---

# SHREK_ENM Diffusion Model v0.1
## Model Details
- **A diffusion model specialized in generating the Shrek character**
- **Full-weight retraining; the model architecture is Flux Krea**
- **Developed:** Jihun.Hong
- **Datasets:** Seungwoo.Kim, Jiyeon Lee
- **Model type:** Text-to-Image Diffusion Model
- **Base Model architecture:** Flux.1_Krea_dev
- **Training approach:** Full weight fine-tuning (Complete Retraining)
- **Release date:** September 19, 2025
- **Version:** v0.1
### Model Sources
- **Demo (coming soon):** End-to-end with ByteDance Waver 1.0; GIF sample below
<div align="center">
<img src="./SHREK_ENM_Video.gif" alt="SHREK Animation">
</div>
## Training Details
### Training Results
**[Comparison of 3 models]** From left to right, the images show how the output changes across three epochs (second-stage training for 4, 8, and 12 hours, respectively). As a test run, only 30 epochs were trained; reaching production level would require roughly 40 more hours of training.
<div align="center">
<img src="./training_progress.png" alt="Training Progress and Epoch Comparison" width="100%">
<p><em>Model progression by epoch, sample outputs, and performance metrics</em></p>
</div>
### Training Data
<div align="center">
<img src="./Dataset.png" alt="SHREK Animation">
</div>
- **Dataset:** Custom SHREK dataset
- **Dataset size:** 2.4 GB including augmentation, 820 images, 1024×1024, SAM2 segmentation and YOLO crop based on Shrek's face
- **Data preprocessing:** Image augmentation, resizing to 1024×1024, face-detection-based cropping (YOLO- and SAM2-based)
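The face-based cropping step can be illustrated with a small helper. This is only a sketch under stated assumptions, not the pipeline actually used for this dataset: the function name `square_crop_box`, the `margin` parameter, and the bounding-box format are hypothetical, and the face box is assumed to come from whatever detector is in use (for example YOLO).

```python
def square_crop_box(face_box, image_size, margin=0.5):
    """Return a square (left, top, right, bottom) crop centred on a face box.

    face_box   -- (x1, y1, x2, y2) from any detector (hypothetical input)
    image_size -- (width, height) of the source image
    margin     -- extra context around the face, as a fraction of its size
    """
    x1, y1, x2, y2 = face_box
    width, height = image_size
    # Side of the square: larger face dimension plus margin, capped by the image.
    side = int(max(x2 - x1, y2 - y1) * (1 + margin))
    side = min(side, width, height)
    # Centre the square on the face, then shift it back inside the image.
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    left = min(max(cx - side // 2, 0), width - side)
    top = min(max(cy - side // 2, 0), height - side)
    return left, top, left + side, top + side
```

The resulting box can be passed to any image library's crop call and the crop then resized to 1024×1024; Pillow's `Image.crop` and `Image.resize` are one option.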
### Training Configuration
<div align="center">
<img src="./Train.png" alt="SHREK Animation">
</div>
- **Hardware:** NVIDIA L40S GPU
- **Training time:** PR: 30 h 02 min, SC: 12 h 11 min, Total: 42 h 13 min
- **Batch size:** 7
- **Learning rate:** 2e-06, 4e-06, 6e-06
- **Training steps:** 256 × 40 / 7 ≈ 1480 steps (⌈256 / 7⌉ = 37 steps per pass, and 37 × 40 = 1480)
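The step count can be checked with a few lines of Python. The card does not label the factors, so reading 256 as samples per pass and 40 as the number of passes is an assumption, and `total_training_steps` is a hypothetical helper. The quoted 1480 comes out only if the last partial batch of each pass counts as a full step.

```python
import math

def total_training_steps(samples_per_pass, passes, batch_size):
    """Steps needed when each pass rounds its last partial batch up to one step."""
    steps_per_pass = math.ceil(samples_per_pass / batch_size)
    return steps_per_pass * passes

# Values from the card: 256, 40, and batch size 7.
print(total_training_steps(256, 40, 7))
```

Plain division gives 256 × 40 / 7 ≈ 1462.9, so the rounding per pass accounts for the difference.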
## Usage
### Compatibility with Various UI Applications
This model works smoothly in AI UI applications such as **ComfyUI, SwarmUI, Forge, and Automatic1111**.
**ComfyUI**
<div align="center">
<img src="./ComfyUI_Workflow.png" alt="SHREK Animation">
</div>
**SwarmUI**
<div align="center">
<img src="./SwarmUI.png" alt="SHREK Animation">
</div>
#### Installation Steps
1. **Download the model files:**
   - `SHREK_ENM.safetensors` - main model file
   - `ae.safetensors` - VAE model
   - `clip_l.safetensors` - CLIP text encoder
   - `t5xxl_enconly.safetensors` - T5 text encoder
2. **Place the files in the correct directories**
3. **Load in ComfyUI:**
   - Use the appropriate loader node for each component
   - Connect the nodes according to the workflow
   - Use the "Load Diffusion Model" node to load `SHREK_ENM.safetensors`
   - Use the corresponding loader nodes to load the text encoders and VAE
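The file placement in step 2 might look like the sketch below for a ComfyUI install. The target folders are an assumption based on a common ComfyUI layout and are not stated in this card (older installs may use `models/unet` and `models/clip` instead). The helpers `plan_placement` and `missing_files` are hypothetical names.

```python
from pathlib import Path

# Assumed ComfyUI layout; adjust if your install uses different folders.
TARGET_DIRS = {
    "SHREK_ENM.safetensors": "models/diffusion_models",
    "ae.safetensors": "models/vae",
    "clip_l.safetensors": "models/text_encoders",
    "t5xxl_enconly.safetensors": "models/text_encoders",
}

def plan_placement(comfy_root, download_dir):
    """Map each downloaded file to its assumed destination under comfy_root."""
    root, src = Path(comfy_root), Path(download_dir)
    return {src / name: root / sub / name for name, sub in TARGET_DIRS.items()}

def missing_files(plan):
    """Return the source files in the plan that do not exist on disk yet."""
    return [source for source in plan if not source.exists()]
```

Once `missing_files` returns an empty list, each source can be moved to its destination, for example with `shutil.move`, after creating the parent folder.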
#### Recommended Settings
- **CFG Scale:** 1.0 (keeping this value is strongly recommended)
- **Sampling Steps:** 35-45
- **Sampler:** iPNDM or Euler a
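For scripted runs, the recommended values can be kept in one place and checked before sampling. This is a minimal sketch: the dictionary keys and the `check_settings` helper are hypothetical and do not correspond to any particular UI's API.

```python
# Recommended values taken from the list above.
RECOMMENDED = {
    "cfg_scale": 1.0,
    "steps_range": (35, 45),
    "samplers": ("iPNDM", "Euler a"),
}

def check_settings(cfg_scale, steps, sampler):
    """Return warnings for values that fall outside the recommended settings."""
    warnings = []
    if cfg_scale != RECOMMENDED["cfg_scale"]:
        warnings.append(f"cfg_scale {cfg_scale} differs from the recommended 1.0")
    low, high = RECOMMENDED["steps_range"]
    if not low <= steps <= high:
        warnings.append(f"steps {steps} is outside the recommended {low}-{high}")
    if sampler not in RECOMMENDED["samplers"]:
        warnings.append(f"sampler {sampler!r} is not one of the recommended samplers")
    return warnings
```

An empty list means the values match the recommendations; anything else is advisory only.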