---
license: apache-2.0
private: false # Public, but
unlisted: true # does not appear in search
thumbnail: https://huggingface.co/mamadat/SHREK_ENM/resolve/main/SHREK_ENM.png
tags:
- diffusion
- text-to-image
---
![SHREK ENM Model](SHREK_ENM.png)
# SHREK_ENM Diffusion Model v0.1
## Model Details
- **A diffusion model specialized in generating the Shrek character**
- **All weights retrained; the model architecture is Flux Krea**
- **Developed:** Jihun.Hong
- **Datasets:** Seungwoo.Kim, Jiyeon Lee
- **Model type:** Text-to-Image Diffusion Model
- **Base Model architecture:** Flux.1_Krea_dev
- **Training approach:** Full weight fine-tuning (Complete Retraining)
- **Release date:** September 19, 2025
- **Version:** v0.1
### Model Sources
- **Demo (coming soon):** End-to-end with ByteDance Waver 1.0; GIF sample below
<div align="center">
<img src="./SHREK_ENM_Video.gif" alt="SHREK Animation">
</div>
## Training Details
### Training Results
**[Comparison of three models]** From left to right, the images show how the output changes across three epochs (second-stage training for 4, 8, and 12 hours, respectively). Only 30 epochs were trained as a test run; reaching production quality would require roughly 40 more hours of training.
<div align="center">
<img src="./training_progress.png" alt="Training Progress and Epoch Comparison" width="100%">
<p><em>Model progression by epoch, sample outputs, and performance metrics</em></p>
</div>
### Training Data
<div align="center">
<img src="./Dataset.png" alt="SHREK Training Dataset">
</div>
- **Dataset:** Custom SHREK dataset
- **Dataset size:** 2.4 GB including augmentation, 820 images, 1024×1024, segmented with SAM2 and cropped with YOLO around Shrek's face
- **Data preprocessing:** Image augmentation, resizing to 1024×1024, face-detection-based cropping (YOLO and SAM2)
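The face-based cropping step can be illustrated with a small helper. This is only a sketch under stated assumptions, not the authors' pipeline: `square_crop_box` and its `margin` parameter are hypothetical names, and the face bounding box is assumed to come from a detector such as YOLO. The helper expands the box to a square and clamps it to the image, so the crop can then be resized to 1024×1024 without distortion.

```python
def square_crop_box(bbox, img_w, img_h, margin=0.25):
    """Return a square (left, top, right, bottom) crop around a face box.

    bbox is (x0, y0, x1, y1) in pixels, assumed to come from a face detector.
    margin is the extra context added on each side, as a fraction of the
    larger box dimension (a hypothetical default, not a value from the source).
    """
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2

    # Square side: larger box dimension plus margin, limited by the image size.
    side = max(x1 - x0, y1 - y0) * (1 + 2 * margin)
    side = int(round(min(side, img_w, img_h)))

    # Center the square on the face, then shift it back inside the image.
    left = int(round(min(max(cx - side / 2, 0), img_w - side)))
    top = int(round(min(max(cy - side / 2, 0), img_h - side)))
    return (left, top, left + side, top + side)


box = square_crop_box((400, 300, 600, 500), img_w=1920, img_h=1080)
print(box)  # (350, 250, 650, 550)
```

The returned tuple uses the (left, top, right, bottom) convention accepted by most image libraries' crop functions; resizing the crop to 1024×1024 is left to whichever library is in use.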
### Training Configuration
<div align="center">
<img src="./Train.png" alt="SHREK Training Configuration">
</div>
- **Hardware:** NVIDIA L40S GPU
- **Training time:** PR: 30 h 02 min, SC: 12 h 11 min, Total: 42 h 13 min
- **Batch size:** 7
- **Learning rate:** 2e-06, 4e-06, 6e-06
- **Training steps:** 256 × 40 / 7 = 1480 steps
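The step count can be checked with a few lines of Python. Reading 256 as samples per epoch and 40 as the number of epochs is an assumption; the source gives only the numbers and the batch size of 7. Evaluated directly, the expression comes to about 1463 steps rather than the stated 1480. The source does not explain the difference.

```python
import math

num_samples = 256  # assumed meaning: samples seen per epoch
num_epochs = 40    # assumed meaning: number of epochs
batch_size = 7     # from the Training Configuration list

exact_steps = num_samples * num_epochs / batch_size
steps = math.ceil(exact_steps)  # a partial final batch still costs one step

print(f"{exact_steps:.2f} -> {steps} optimizer steps")
# 1462.86 -> 1463 optimizer steps
```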
## Usage
### Compatibility with Various UI Applications
This model works smoothly in AI UI applications such as **ComfyUI, SwarmUI, Forge, and Automatic1111**.
**ComfyUI**
<div align="center">
<img src="./ComfyUI_Workflow.png" alt="ComfyUI Workflow">
</div>
**SwarmUI**
<div align="center">
<img src="./SwarmUI.png" alt="SwarmUI">
</div>
#### Installation Steps
1. **Download the model files:**
   - `SHREK_ENM.safetensors` - main model file
   - `ae.safetensors` - VAE model
   - `clip_l.safetensors` - CLIP text encoder
   - `t5xxl_enconly.safetensors` - T5 text encoder
2. **Place the files in the correct directories**
3. **Load in ComfyUI:**
   - Use the appropriate loader node for each component
   - Connect the nodes according to the workflow
   - Load `SHREK_ENM.safetensors` with the "Load Diffusion Model" node
   - Load the text encoders and the VAE with their corresponding loader nodes
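Step 2 does not name the directories, so the following sketch fills them in as an assumption based on a common ComfyUI layout (`models/diffusion_models`, `models/vae`, `models/text_encoders`). Older installs may use `models/unet` and `models/clip` instead. The helper names `plan_placement` and `missing_files` are hypothetical. The script only computes target paths and reports which files are not yet in place; it does not download or move anything.

```python
from pathlib import Path

# Assumed ComfyUI sub-directories; adjust to match your installation.
TARGET_DIRS = {
    "SHREK_ENM.safetensors": "models/diffusion_models",
    "ae.safetensors": "models/vae",
    "clip_l.safetensors": "models/text_encoders",
    "t5xxl_enconly.safetensors": "models/text_encoders",
}


def plan_placement(comfy_root):
    """Map each model file name to its expected path under comfy_root."""
    root = Path(comfy_root)
    return {name: root / sub / name for name, sub in TARGET_DIRS.items()}


def missing_files(comfy_root):
    """List the model files that are not yet present at their expected path."""
    plan = plan_placement(comfy_root)
    return [name for name, path in plan.items() if not path.is_file()]


for name, path in plan_placement("ComfyUI").items():
    print(f"{name} -> {path}")
```

Calling `missing_files("ComfyUI")` before launching the UI gives a quick check that all four files are where the loader nodes will look for them.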
#### Recommended Settings
- **CFG Scale:** 1.0 (keeping this value is strongly recommended)
- **Sampling Steps:** 35-45
- **Sampler:** iPNDM or Euler a