FRANKENSTALLM: Project Progress Status
Updated: 2026-03-06 (21:00)
Goal: Train a Korean 3B LLM from scratch and deploy it with Ollama
Overall progress: about 78%
| # | Phase | Weight | Status | Completion | Contribution |
|---|---|---|---|---|---|
| 0 | Foundation setup & FP8 validation | 5% | ✅ Complete | 100% | 5.0% |
| 1 | Model architecture implementation | 5% | ✅ Complete | 100% | 5.0% |
| 2 | Data pipeline | 10% | ✅ Complete | 100% | 10.0% |
| 3 | 3B pretraining (Pretrain) | 25% | ✅ Complete | 100% | 25.0% |
| 4 | SFT (Supervised Fine-Tuning) | 15% | ✅ Complete | 100% | 15.0% |
| 5 | SFT comprehensive evaluation | 5% | ✅ Complete | 100% | 5.0% |
| 6 | ORPO (preference alignment) | 15% | 🔄 Ready | 0% | 0% |
| 7 | Final evaluation | 5% | ⏳ Pending | 0% | 0% |
| 8 | GGUF conversion & Ollama deployment | 10% | ⏳ Pending | 0% | 0% |
| 9 | HuggingFace release | 5% | ⏳ Pending | 0% | 0% |
Total: 5.0 + 5.0 + 10.0 + 25.0 + 15.0 + 5.0 = 65.0% from completed phases (about 78% with the 13.0 counted for ORPO)
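The contribution column is each phase's weight multiplied by its completion rate. A minimal sketch of that arithmetic follows; the English phase labels and the `weighted_progress` helper are illustrative and not part of the project. The extra 13.0 points that bring the headline figure to about 78% cannot be derived from the completion column, so the sketch leaves them out.

```python
# (label, weight in %, completion fraction), copied from the progress table.
PHASES = [
    ("Foundation setup & FP8 validation", 5, 1.0),
    ("Model architecture implementation", 5, 1.0),
    ("Data pipeline", 10, 1.0),
    ("3B pretraining", 25, 1.0),
    ("SFT", 15, 1.0),
    ("SFT comprehensive evaluation", 5, 1.0),
    ("ORPO", 15, 0.0),
    ("Final evaluation", 5, 0.0),
    ("GGUF conversion & Ollama deployment", 10, 0.0),
    ("HuggingFace release", 5, 0.0),
]


def weighted_progress(phases):
    """Sum weight * completion over all phases, in percentage points."""
    return sum(weight * done for _, weight, done in phases)


total_weight = sum(weight for _, weight, _ in PHASES)
completed = weighted_progress(PHASES)
print(f"weights sum to {total_weight}%, completed contribution {completed:.1f}%")
```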
Detailed status by phase
✅ Phase 0: Foundation setup & FP8 validation (complete, Feb 25 to Mar 2)
- Validated the 8x B200 environment; the 125M FP8 pipeline ran successfully
- GQA FlashAttention native → VRAM 60.4 → 48.3 GB (-20%)
- DDP gradient_as_bucket_view, NCCL NVLS, triple-layer SIGHUP protection
- torch.compile test → no effect (TE opaque kernel)
✅ Phase 1: 3B Pretrain (complete, Mar 2-5)
| Item | Value |
|---|---|
| Training steps | 57,000 (100%) |
| Final loss | 1.466 |
| Total tokens | ~41.12B (38.5B unique + repeats) |
| Training time | 62.94 hours |
| Throughput | 38.5K tok/s per GPU |
| VRAM | 48.3 GB (26.4%) |
| Incidents | 0 |
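A few derived figures follow directly from the Phase 1 table. The sketch below only divides numbers already reported there; the variable names are illustrative, and the results are back-of-the-envelope values rather than measured ones.

```python
# Values copied from the Phase 1 table.
total_tokens = 41.12e9      # ~41.12B tokens seen during training
unique_tokens = 38.5e9      # 38.5B unique tokens
steps = 57_000
vram_used_gb = 48.3
vram_fraction = 0.264       # 26.4% of device memory

# Average tokens consumed per optimizer step (global, across all GPUs).
tokens_per_step = total_tokens / steps

# How many passes over the unique data the run amounts to.
passes_over_unique = total_tokens / unique_tokens

# Per-GPU memory capacity implied by the reported usage percentage.
implied_vram_capacity_gb = vram_used_gb / vram_fraction

print(f"{tokens_per_step:,.0f} tokens/step")
print(f"{passes_over_unique:.3f} passes over the unique data")
print(f"{implied_vram_capacity_gb:.1f} GB implied capacity per GPU")
```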
✅ Phase 2: SFT (complete, Mar 5-6)
| Item | Value |
|---|---|
| Final step | 25,500 / 33,000 (77.3%, early stopping) |
| Best val_loss | 1.8851 (step 23,000) |
| Training time | ~15 hours 41 minutes |
| Data | 24 sources → 2,439,397 samples (7.48 GB) |
| VRAM | 24.2 GB (13.2%) |
| Incidents | 0 |
Val loss trend:
Step 500: 2.0732
Step 2,000: 1.9558
Step 5,000: 1.9107
Step 10,000: 1.8917
Step 15,000: 1.8864
Step 20,000: 1.8853
Step 23,000: 1.8851 ← BEST
Step 25,500: 1.8851 ← Early Stop (patience 5/5)
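The stop at step 25,500 with patience 5/5 and a best step of 23,000 is consistent with evaluating every 500 steps and halting after five evaluations without improvement. The sketch below shows that patience logic under this assumption. The `EarlyStopping` class is hypothetical, not the project's trainer code, and every loss in `history` other than the values at steps 23,000 and 25,500 is a made-up placeholder.

```python
class EarlyStopping:
    """Stop when val loss has not improved for `patience` consecutive evaluations."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_step = None
        self.bad_evals = 0

    def update(self, step, val_loss):
        """Record one evaluation; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_step = step
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Placeholder history at an assumed 500-step evaluation interval.
# Only the 23,000 and 25,500 entries match the reported trend.
history = [
    (22_500, 1.8852),
    (23_000, 1.8851),
    (23_500, 1.8852),
    (24_000, 1.8853),
    (24_500, 1.8852),
    (25_000, 1.8852),
    (25_500, 1.8851),
]

stopper = EarlyStopping(patience=5)
stopped_at = None
for step, loss in history:
    if stopper.update(step, loss):
        stopped_at = step
        break

print(f"best step {stopper.best_step}, stopped at {stopped_at}")
```

Note that a loss equal to the current best does not reset the counter in this sketch, which is why a 1.8851 at step 25,500 still triggers the stop.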
✅ Phase 2.5: SFT comprehensive evaluation (complete, Mar 6)
6-dimension evaluation result: 4/6 PASS
| Dimension | Result | Key metric |
|---|---|---|
| Perplexity (knowledge retention) | PASS | forgetting 0.9% |
| Generation quality | FAIL | Greedy repetition rate 72.97% |
| Korean benchmarks | FAIL | KoBEST average 43.26% |
| English benchmarks | PASS | All tasks above the lower bound |
| Calibration | PASS | Top-1 68.59% |
| SFT chat ability | PASS | EOS termination rate 60% (Base 0%) |
Verdict: proceed with ORPO (knowledge retention is good; the repetition rate still needs to be fixed)
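This summary does not say how the greedy repetition rate is computed. One common proxy is the fraction of n-grams in a generation that duplicate an earlier n-gram; the sketch below uses that definition as an assumption, and the `repetition_rate` function and the choice of `n=4` are illustrative rather than the project's evaluation code.

```python
def repetition_rate(tokens, n=4):
    """Fraction of n-grams in `tokens` that duplicate an earlier n-gram."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n: nothing to compare
    return 1.0 - len(set(ngrams)) / len(ngrams)


# A degenerate looping output versus a sentence with no repeated n-grams.
looping = "the model says the model says the model says the model says".split()
varied = "each word in this short sentence appears exactly once".split()

print(f"looping: {repetition_rate(looping):.2%}")
print(f"varied:  {repetition_rate(varied):.2%}")
```

Under this definition a model stuck in a loop scores high while varied text scores zero, which matches the direction of the FAIL result above even if the exact formula differs.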
🔄 Phase 3: ORPO (ready, not yet run)
| Item | Value |
|---|---|
| Base model | checkpoints/korean_3b_sft_v1/checkpoint-best/ |
| Data | 795,468 preference pairs (7.9 GB) |
| Config | configs/korean_3b_orpo.yaml |
| Launcher | scripts/launch_3b_orpo.sh |
| Targets | Greedy repetition rate < 5%, EOS > 90% |
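For orientation, ORPO adds an odds-ratio penalty to the usual SFT loss so that the chosen response in each preference pair becomes more likely than the rejected one. The sketch below computes that objective for a single pair in plain Python. It is a simplified illustration, not the project's training code: the function names are hypothetical, and `lam=0.1` is a placeholder weight, since the contents of `configs/korean_3b_orpo.yaml` are not shown here.

```python
import math


def log_odds(avg_logprob):
    """log(p / (1 - p)) with p = exp(avg_logprob), the length-normalized likelihood."""
    # log1p(-exp(x)) evaluates log(1 - p); requires avg_logprob < 0.
    return avg_logprob - math.log1p(-math.exp(avg_logprob))


def orpo_loss(chosen_avg_logprob, rejected_avg_logprob, lam=0.1):
    """SFT negative log-likelihood on the chosen response plus lam * odds-ratio term."""
    nll = -chosen_avg_logprob
    log_odds_ratio = log_odds(chosen_avg_logprob) - log_odds(rejected_avg_logprob)
    # -log(sigmoid(z)) written as log(1 + exp(-z)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * odds_ratio_term


# Same chosen likelihood, different margins over the rejected response.
well_separated = orpo_loss(-0.5, -2.0)
barely_separated = orpo_loss(-0.5, -0.6)
print(f"well separated:   {well_separated:.4f}")
print(f"barely separated: {barely_separated:.4f}")
```

The penalty shrinks as the chosen response pulls ahead of the rejected one, so the second pair yields the larger loss.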
⏳ Phase 4: GGUF conversion & Ollama deployment (pending)
- scripts/convert_3b_gguf.sh ready
- scripts/deploy_3b_ollama.sh ready
- Modelfile.3b written
Key file paths
| File | Description |
|---|---|
| checkpoints/korean_3b_fp8_run1/checkpoint-0057000/ | 3B Base model (Phase 1 final) |
| checkpoints/korean_3b_sft_v1/checkpoint-best/ | 3B SFT model (Phase 2 final) |
| configs/korean_3b_orpo.yaml | ORPO config |
| data/preference/combined_preference.jsonl | ORPO training data (795K pairs) |
| reports/2026-03-06_3B_SFT_COMPLETION_AND_EVAL_SUMMARY.md | SFT completion + evaluation summary |
| reports/2026-03-06_3B_SFT_EVALUATION_REPORT.md | SFT 6-dimension evaluation details |
Timeline
Feb 25 Phase 0 started (foundation setup, 125M FP8 validation)
Feb 25-26 1B Pretrain (34K steps, loss 1.904)
Feb 26 1B SFT v1 failed (label off-by-one)
Feb 27 1B SFT v2 succeeded (val_loss 2.206, repetition rate 18%)
Feb 27 "Justice League" discussion → decision to switch to 3B
Feb 27 640GB+ of data assembled
Mar 02 Phase 0 complete (GQA FA, DDP, NCCL optimization)
Mar 02 Phase 1 started (3B Pretrain)
Mar 05 Phase 1 complete (57K steps, loss 1.466, 63 hours)
Mar 05 Phase 2 started (SFT, 2.44M samples)
Mar 06 Phase 2 complete (25.5K steps, val_loss 1.8851, early stopping)
Mar 06 SFT 6-dimension evaluation complete (4/6 PASS)
Mar 06 → Decision to proceed with ORPO (Phase 3 ready)