# ACE-Step 1.5 LoRA Training Tutorial
## Hardware Requirements
| VRAM | Description |
|------|------|
| 16 GB (minimum) | Generally usable, but long songs may run out of memory |
| 20 GB or more (recommended) | Handles full-length songs. VRAM usage during training is typically around 17 GB |
> **Note:** During the preprocessing stages before training, you may need to restart Gradio several times to free VRAM. The specific points are called out in the steps below.
## Disclaimer
This tutorial uses the album *ナユタン星からの物体Y* (13 tracks in total) by **NayutalieN** as a demo, trained for 500 epochs with a batch size of 1. **This tutorial is intended solely for educational purposes, to help you understand LoRA fine-tuning. Please train LoRAs on your own original work.**
As a developer, I am very fond of NayutalieN's work, so I chose one of their albums as the example. If you are a rights holder and believe this tutorial infringes your legitimate rights, please contact us immediately. We will remove the relevant content after receiving a valid notice.
Technology should be used reasonably and lawfully. Respect artists' creative work, and do not do anything that **damages or harms** the reputation, rights, or interests of the original artists.
---
## Data Preparation
> **Tip:** If you are not comfortable with programming, you can hand this document to an AI coding tool such as Claude Code / Codex CLI / Cursor / Copilot and let it carry out the work for you.
### Overview
The training data for each song consists of the following:
1. **Audio file**: `.mp3`, `.wav`, `.flac`, `.ogg`, and `.opus` are supported
2. **Lyrics**: a `.lyrics.txt` file with the same name as the audio (`.txt` is also supported for backward compatibility)
3. **Annotation data**: metadata such as `caption`, `bpm`, `keyscale`, `timesignature`, and `language`
### Annotation Data Format
If you already have complete annotation data, you can create a JSON file and place it in the same directory as the audio and lyrics. The file layout is as follows:
```
dataset/
├── song1.mp3            # audio
├── song1.lyrics.txt     # lyrics
├── song1.json           # annotations (optional)
├── song1.caption.txt    # caption (optional, can also go in the JSON)
├── song2.mp3
├── song2.lyrics.txt
├── song2.json
└── ...
```
JSON file structure (all fields are optional):
```json
{
  "caption": "A high-energy J-pop track with synthesizer leads and fast tempo",
  "bpm": 190,
  "keyscale": "D major",
  "timesignature": "4",
  "language": "ja"
}
```
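As a rough sketch of how such a file could be produced, the helper below writes a `{filename}.json` next to an audio file. The name `write_annotation` is hypothetical and not part of the project; the only assumptions are the same-name, same-directory convention and the field names shown above.

```python
import json
from pathlib import Path


def write_annotation(audio_path, **fields):
    """Write `<audio stem>.json` next to the audio file and return its path.

    Hypothetical helper: every field is optional, so None values are skipped.
    """
    data = {key: value for key, value in fields.items() if value is not None}
    out_path = Path(audio_path).with_suffix(".json")
    # ensure_ascii=False keeps non-English captions readable in the file.
    out_path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_path
```

For example, `write_annotation("dataset/song1.mp3", bpm=190, keyscale="D major", language="ja")` would create `dataset/song1.json` containing just those three fields.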
If you have no annotation data, you can obtain it with the methods described in the sections below.
---
### Lyrics
Save the lyrics as a `.lyrics.txt` file with the same name as the audio file and place it in the same directory. Make sure the lyrics are accurate.
Lyrics file lookup priority during scanning:
1. `{filename}.lyrics.txt` (recommended)
2. `{filename}.txt` (backward compatible)
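The lookup order above can be sketched as follows. This is an illustration of the documented priority only, not the scanner's actual code; `find_lyrics_file` is a hypothetical name.

```python
from pathlib import Path


def find_lyrics_file(audio_path):
    """Return the lyrics file for an audio file, or None if there is none.

    Candidates are tried in the documented order:
    `<stem>.lyrics.txt` first, then `<stem>.txt`.
    """
    audio_path = Path(audio_path)
    for candidate in (f"{audio_path.stem}.lyrics.txt", f"{audio_path.stem}.txt"):
        path = audio_path.with_name(candidate)
        if path.is_file():
            return path
    return None
```

Running this over a dataset folder before scanning is a quick way to spot songs that would load without lyrics.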
#### Lyrics Transcription
If you do not have existing lyrics text, you can transcribe it with one of the following tools:
| Tool | Structure tags | Accuracy | Difficulty of use | Deployment |
|------|-----------|--------|-----------|----------|
| [acestep-transcriber](https://huggingface.co/ACE-Step/acestep-transcriber) | No | May contain errors | High (requires deploying the model) | Self-hosted |
| [Gemini](https://aistudio.google.com/) | Yes | May contain errors | Low | Paid API |
| [Whisper](https://github.com/openai/whisper) | No | May contain errors | Medium | Self-hosted / paid API |
| [ElevenLabs](https://elevenlabs.io/app/developers) | No | May contain errors | Medium | Paid API (free credits available) |
This project provides the corresponding transcription scripts in `scripts/lora_data_prepare/`:
- `whisper_transcription.py`: transcription via the OpenAI Whisper API
- `elevenlabs_transcription.py`: transcription via the ElevenLabs Scribe API
Both scripts support batch processing of a folder through the `process_folder()` method.
#### Review and Cleanup (Required)
Transcribed lyrics may contain errors and **must be reviewed and corrected by hand**.
If you use lyrics in LRC format, the timestamps must be removed. Here is a simple cleanup example:
```python
import re


def clean_lrc_content(lines):
    """Clean LRC file content and strip timestamps."""
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Remove timestamps: [mm:ss.x] [mm:ss.xx] [mm:ss.xxx]
        cleaned = re.sub(r"\[\d{2}:\d{2}\.\d{1,3}\]", "", line).strip()
        result.append(cleaned)
    # Drop trailing empty lines (left behind by timestamp-only lines)
    while result and not result[-1]:
        result.pop()
    return result
```
#### Structure Tags (Optional)
When lyrics include structure tags (`[Verse]`, `[Chorus]`, and so on), the model can learn song structure more effectively. Training still works normally without them.
> **Tip:** You can use [Gemini](https://aistudio.google.com/) to add structure tags to existing lyrics.
Example:
```
[Intro]
La la la...
[Verse 1]
Walking down the empty street
Echoes dancing at my feet
[Chorus]
We are the stars tonight
Shining through the endless sky
[Bridge]
Close your eyes and feel the sound
```
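When reviewing many lyrics files, a small check can help confirm which structure tags are present and whether any LRC timestamps were left behind. The sketch below is an assumption-laden illustration: the tag pattern (a bracketed name with an optional number, such as `[Verse 1]`) is inferred from the example above, and `summarize_lyrics` is a hypothetical helper, not a project function.

```python
import re

# Assumed tag shape: "[Name]" or "[Name 2]", alone on a line.
SECTION_TAG = re.compile(r"^\[([A-Za-z][A-Za-z\- ]*?)(?:\s+\d+)?\]$")
# LRC-style timestamps such as [01:23.45]; the fraction is optional here.
LRC_TIMESTAMP = re.compile(r"\[\d{1,2}:\d{2}(?:\.\d{1,3})?\]")


def summarize_lyrics(text):
    """Return (structure tags found, line numbers that still contain timestamps)."""
    tags, leftover = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if LRC_TIMESTAMP.search(stripped):
            leftover.append(number)
        elif SECTION_TAG.match(stripped):
            tags.append(stripped)
    return tags, leftover
```

An empty tag list is not an error, since structure tags are optional; a non-empty list of leftover line numbers means the cleanup step should be run again.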
---
### Automatic Annotation
#### 1. Obtaining BPM and Key
Use [Key-BPM-Finder](https://vocalremover.org/key-bpm-finder) to obtain BPM and key annotations online:
1. Open the web page and click **Browse my files** to select the audio files to process (processing too many at once may cause it to hang, so processing in batches and merging the CSVs afterwards is recommended). Processing runs locally; nothing is uploaded to a server.
![key-bpm-finder-0.jpg](../pics/key-bpm-finder-0.jpg)
2. When processing finishes, click **Export CSV** to download the CSV file.
![key-bpm-finder-1.jpg](../pics/key-bpm-finder-1.jpg)
3. Example CSV file contents:
```csv
File,Artist,Title,BPM,Key,Camelot
song1.wav,,,190,D major,10B
song2.wav,,,128,A minor,8A
```
4. Place the CSV file in the dataset folder. To add caption data, append a new column after the `Camelot` column.
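Step 1 suggests processing files in batches and merging the exported CSVs afterwards. A minimal sketch of that merge is shown here, assuming every export uses the header from step 3 and identifies songs by the `File` column; `merge_key_bpm_csvs` is a hypothetical helper, not one of the project's scripts.

```python
import csv


def merge_key_bpm_csvs(csv_paths, out_path):
    """Merge several exported CSVs into one file and return the row count.

    Rows are keyed by the `File` column, so a song that appears in more
    than one export is written once (the last occurrence wins).
    """
    rows = {}
    fieldnames = []
    for csv_path in csv_paths:
        # utf-8-sig tolerates a byte-order mark at the start of the file.
        with open(csv_path, newline="", encoding="utf-8-sig") as handle:
            reader = csv.DictReader(handle)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                rows[row["File"]] = row
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows.values())
    return len(rows)
```

Columns that exist in only some of the exports (for example an added caption column) are kept and left empty for rows that lack them.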
#### 2. Obtaining Captions
You can obtain captions for your songs in the following ways:
- **Using acestep-5Hz-lm** (0.6B / 1.7B / 4B): invoked from the Auto Label feature in the Gradio UI (see the later steps)
- **Using the Gemini API**: see the script `scripts/lora_data_prepare/gemini_caption.py`. It supports batch processing through `process_folder()` and generates the following for each audio file:
  - `{filename}.lyrics.txt`: lyrics
  - `{filename}.caption.txt`: caption description
---
## Data Preprocessing
Once the data is ready, use the Gradio UI to review and preprocess it.
> **Important:** If you use the startup script, you must edit its launch parameters to disable service pre-initialization:
>
> - **Windows** (`start_gradio_ui.bat`): change `if not defined INIT_SERVICE set INIT_SERVICE=--init_service true` to `if not defined INIT_SERVICE set INIT_SERVICE=--init_service false`
> - **Linux/macOS** (`start_gradio_ui.sh`): change `: "${INIT_SERVICE:=--init_service true}"` to `: "${INIT_SERVICE:=--init_service false}"`
Start the Gradio UI (either with the startup script or by running `acestep/acestep_v15_pipeline.py` directly).
### Step 1: Load the Model
- **If you need the LM to generate captions:** select the LM model to use at initialization (acestep-5Hz-lm-0.6B / 1.7B / 4B).
![](../pics/00_select_model_to_load.jpg)
- **If you do not need the LM:** do not select an LM model.
![](../pics/00_select_model_to_load_1.jpg)
### Step 2: Load the Data
Switch to the **LoRA Training** tab, enter the dataset directory path, and click **Scan**.
The scanner automatically recognizes the following files:
| File | Description |
|------|------|
| `*.mp3` / `*.wav` / `*.flac` / ... | Audio files |
| `{filename}.lyrics.txt` (or `{filename}.txt`) | Lyrics |
| `{filename}.caption.txt` | Caption description |
| `{filename}.json` | Annotation metadata (caption / bpm / keyscale / timesignature / language) |
| `*.csv` | Batch BPM / Key annotations (exported from Key-BPM-Finder) |
![](../pics/01_load_dataset_path.jpg)
### Step 3: Preview and Adjust the Dataset
- **Duration**: read automatically from the audio file
- **Lyrics**: requires a `.lyrics.txt` file with the same name (`.txt` is also supported)
- **Labeled**: shows ✅ if a caption exists, ❌ if not
- **BPM / Key / Caption**: loaded from the JSON or CSV file
- If the dataset is not entirely instrumental, uncheck **All Instrumental**
- The **Format Lyrics** and **Transcribe Lyrics** features are currently disabled ([acestep-transcriber](https://huggingface.co/ACE-Step/acestep-transcriber) is not integrated yet, and using the LM directly can produce hallucinations)
- Enter a **Custom Trigger Tag** (its effect is currently limited; any option other than `Replace Caption` is fine)
- **Genre Ratio** controls the proportion of samples that use the genre instead of the caption. The genre descriptions the LM currently generates are weaker than its captions, so keep this at 0
![](../pics/02_preview_dataset.jpg)
### Step 4: Auto Label Data
- If you already have captions, you can skip this step
- If your data has no captions, you can generate them through LM inference
- If BPM / Key values are missing, obtain them with [Key-BPM-Finder](https://vocalremover.org/key-bpm-finder) first. Generating them directly with the LM produces hallucinations
![](../pics/03_label_data.jpg)
### Step 5: Preview and Edit the Data
If needed, you can review and correct the data entry by entry. **Be sure to click Save after editing each entry.**
![](../pics/04_edit_data.jpg)
### Step 6: Save the Dataset
Enter a save path and save the dataset as a JSON file.
![](../pics/05_save_dataset.jpg)
### Step 7: Preprocess to Generate Tensor Files
> **Warning:** If you generated captions with the LM earlier and are short on VRAM, restart Gradio first to free VRAM. When restarting, **do not select an LM model**. After the restart, enter the path of the saved JSON file and load it.
Enter the save path for the tensor files, start preprocessing, and wait for it to finish.
![](../pics/06_preprocess_tensor.jpg)
---
## Training
> **Warning:** Restarting Gradio again after generating the tensor files is recommended, to free VRAM.
1. Switch to the **Train LoRA** tab, enter the tensor file path, and load the dataset.
2. If you are not familiar with the training parameters, the defaults are fine.
### Parameter Reference
| Parameter | Description | Recommended value |
|---------|------|--------|
| **Max Epochs** | Adjust according to dataset size | About 100 songs → 500 epochs; 10-20 songs → 800 epochs (for reference only) |
| **Batch Size** | Can be increased if VRAM allows | 1 (default); 2 or 4 if you have enough VRAM |
| **Save Every N Epochs** | Checkpoint save interval | Shorter when Max Epochs is small, longer when it is large |
> The values above are for reference only. Adjust them to your actual situation.
3. Click **Start Training** and wait for training to finish.
![](../pics/07_train.jpg)
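To get a feel for how the parameters in the table interact, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate under stated assumptions, not how the trainer necessarily counts: one training sample per song, no gradient accumulation, the last partial batch kept, and one checkpoint every N epochs. The function name and the save interval of 50 used in the example are hypothetical.

```python
import math


def training_plan(num_samples, max_epochs, batch_size=1, save_every_n_epochs=50):
    """Estimate optimizer steps and checkpoint count for a training run."""
    # Assumption: each epoch sees every sample once, in batches of batch_size.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * max_epochs,
        "checkpoints": max_epochs // save_every_n_epochs,
    }
```

With the demo configuration from the disclaimer (13 songs, 500 epochs, batch size 1), `training_plan(13, 500)` gives 13 steps per epoch and 6500 steps in total; the checkpoint count depends entirely on the interval you choose.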
---
## Using the LoRA
1. After training finishes, **restart Gradio** and reload the model (do not select an LM model).
2. Once the model has finished initializing, load the trained LoRA weights.
![](../pics/08_load_lora.jpg)
3. Start generating music.
Congratulations! You have completed the full LoRA training workflow.
---
## Advanced Training: Side-Step
If you want finer control over LoRA training (corrected timestep sampling, LoKR adapters, a CLI-based workflow, VRAM optimization, gradient sensitivity analysis, and more), the community-developed **[Side-Step](https://github.com/koda-dernet/Side-Step)** toolkit offers an advanced alternative. Its documentation is included in this repository under `docs/sidestep/`.
| Topic | Description |
|------|------|
| [Getting Started](../sidestep/Getting%20Started.md) | Installation, prerequisites, first-run setup |
| [End-to-End Tutorial](../sidestep/End-to-End%20Tutorial.md) | Full walkthrough from raw audio to generation |
| [Dataset Preparation](../sidestep/Dataset%20Preparation.md) | JSON schema, audio formats, metadata fields, custom tags |
| [Training Guide](../sidestep/Training%20Guide.md) | LoRA vs LoKR, corrected vs vanilla mode, hyperparameter guide |
| [Using Your Adapter](../sidestep/Using%20Your%20Adapter.md) | Output directory layout, loading in Gradio, LoKR limitations |
| [VRAM Optimization Guide](../sidestep/VRAM%20Optimization%20Guide.md) | VRAM optimization strategies and settings per GPU tier |
| [Estimation Guide](../sidestep/Estimation%20Guide.md) | Gradient sensitivity analysis for targeted training |
| [Shift and Timestep Sampling](../sidestep/Shift%20and%20Timestep%20Sampling.md) | How training timesteps work and how Side-Step differs |
| [Preset Management](../sidestep/Preset%20Management.md) | Built-in presets, save/load/import/export |
| [The Settings Wizard](../sidestep/The%20Settings%20Wizard.md) | Complete reference for the wizard settings |
| [Model Management](../sidestep/Model%20Management.md) | Checkpoint structure and fine-tuned model support |
| [Windows Notes](../sidestep/Windows%20Notes.md) | Windows-specific setup and workarounds |