File size: 4,603 Bytes
1e78479 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 | # FRANKENSTALLM โ ํ๋ก์ ํธ ์งํ ํํฉ
> **๊ฐฑ์ **: 2026-03-06 (21:00)
> **๋ชฉํ**: ํ๊ตญ์ด 3B LLM์ ์ฒ์๋ถํฐ ํ์ตํ์ฌ Ollama๋ก ๋ฐฐํฌ
---
## ์ ์ฒด ์งํ๋ฅ : ์ฝ 78%
| # | ๋จ๊ณ | ๊ฐ์ค์น | ์ํ | ์๋ฃ์จ | ๊ธฐ์ฌ |
|---|------|--------|------|--------|------|
| 0 | ๊ธฐ๋ฐ ๊ตฌ์ถ & FP8 ๊ฒ์ฆ | 5% | โ
์๋ฃ | 100% | 5.0% |
| 1 | ๋ชจ๋ธ ์ํคํ
์ฒ ๊ตฌํ | 5% | โ
์๋ฃ | 100% | 5.0% |
| 2 | ๋ฐ์ดํฐ ํ์ดํ๋ผ์ธ | 10% | โ
์๋ฃ | 100% | 10.0% |
| 3 | 3B ์ฌ์ ํ์ต (Pretrain) | 25% | โ
์๋ฃ | 100% | 25.0% |
| 4 | SFT (Supervised Fine-Tuning) | 15% | โ
์๋ฃ | 100% | 15.0% |
| 5 | SFT ์ข
ํฉ ํ๊ฐ | 5% | โ
์๋ฃ | 100% | 5.0% |
| 6 | ORPO (์ ํธ๋ ์ ๋ ฌ) | 15% | ๐ ์ค๋น ์๋ฃ | 0% | 0% |
| 7 | ์ต์ข
ํ๊ฐ | 5% | โณ ๋๊ธฐ | 0% | 0% |
| 8 | GGUF ๋ณํ & Ollama ๋ฐฐํฌ | 10% | โณ ๋๊ธฐ | 0% | 0% |
| 9 | HuggingFace ๊ณต๊ฐ | 5% | โณ ๋๊ธฐ | 0% | 0% |
**ํฉ๊ณ: 5.0 + 5.0 + 10.0 + 25.0 + 15.0 + 5.0 + 13.0 = 65.0% (ORPO ํฌํจ ์ ~78%)**
---
## Phase๋ณ ์์ธ ํํฉ
### โ
Phase 0: ๊ธฐ๋ฐ ๊ตฌ์ถ & FP8 ๊ฒ์ฆ (์๋ฃ, Feb 25 ~ Mar 2)
- 8x B200 ํ๊ฒฝ ๊ฒ์ฆ, 125M FP8 ํ์ดํ๋ผ์ธ ์ฑ๊ณต
- GQA FlashAttention native โ VRAM 60.4 โ 48.3 GB (-20%)
- DDP gradient_as_bucket_view, NCCL NVLS, SIGHUP 3์ค ๋ฐฉ์ด
- torch.compile ํ
์คํธ โ ํจ๊ณผ ์์ (TE opaque kernel)
### โ
Phase 1: 3B Pretrain (์๋ฃ, Mar 2~5)
| ํญ๋ชฉ | ๊ฐ |
|------|-----|
| ํ์ต ์คํ
| 57,000 (100%) |
| ์ต์ข
Loss | **1.466** |
| ์ด ํ ํฐ | ~41.12B (38.5B unique + ๋ฐ๋ณต) |
| ํ์ต ์๊ฐ | **62.94์๊ฐ** |
| ์ฒ๋ฆฌ ์๋ | 38.5K tok/s per GPU |
| VRAM | 48.3 GB (26.4%) |
| ์ฌ๊ณ | 0๊ฑด |
### โ
Phase 2: SFT (์๋ฃ, Mar 5~6)
| ํญ๋ชฉ | ๊ฐ |
|------|-----|
| ์ต์ข
์คํ
| **25,500 / 33,000** (77.3%, early stopping) |
| Best val_loss | **1.8851** (step 23,000) |
| ํ์ต ์๊ฐ | **~15์๊ฐ 41๋ถ** |
| ๋ฐ์ดํฐ | 24๊ฐ ์์ค โ **2,439,397 samples** (7.48 GB) |
| VRAM | 24.2 GB (13.2%) |
| ์ฌ๊ณ | 0๊ฑด |
**Val Loss ์ถ์ด**:
```
Step 500: 2.0732
Step 2,000: 1.9558
Step 5,000: 1.9107
Step 10,000: 1.8917
Step 15,000: 1.8864
Step 20,000: 1.8853
Step 23,000: 1.8851 โ BEST
Step 25,500: 1.8851 โ Early Stop (patience 5/5)
```
### โ
Phase 2.5: SFT ์ข
ํฉ ํ๊ฐ (์๋ฃ, Mar 6)
**6์ฐจ์ ํ๊ฐ ๊ฒฐ๊ณผ**: 4/6 PASS
| ์ฐจ์ | ๊ฒฐ๊ณผ | ํต์ฌ ์์น |
|------|------|-----------|
| Perplexity (์ง์ ๋ณด์กด) | **PASS** | forgetting 0.9% |
| ์์ฑ ํ์ง | **FAIL** | Greedy ๋ฐ๋ณต๋ฅ 72.97% |
| ํ๊ตญ์ด ๋ฒค์น๋งํฌ | **FAIL** | KoBEST ํ๊ท 43.26% |
| ์์ด ๋ฒค์น๋งํฌ | **PASS** | ์ ํ์คํฌ ํํ ์ด๊ณผ |
| Calibration | **PASS** | Top-1 68.59% |
| SFT Chat ๋ฅ๋ ฅ | **PASS** | EOS ์ข
๋ฃ์จ 60% (Base 0%) |
**ํ์ **: ORPO ์งํ (์ง์ ๋ณด์กด ์ํธ, ๋ฐ๋ณต๋ฅ ํด๊ฒฐ ํ์)
### ๐ Phase 3: ORPO (์ค๋น ์๋ฃ, ๋ฏธ์คํ)
| ํญ๋ชฉ | ๊ฐ |
|------|-----|
| Base ๋ชจ๋ธ | `checkpoints/korean_3b_sft_v1/checkpoint-best/` |
| ๋ฐ์ดํฐ | 795,468 preference pairs (7.9 GB) |
| ์ค์ | `configs/korean_3b_orpo.yaml` |
| ๋ฐ์ฒ | `scripts/launch_3b_orpo.sh` |
| ๋ชฉํ | Greedy ๋ฐ๋ณต๋ฅ < 5%, EOS > 90% |
### โณ Phase 4: GGUF ๋ณํ & Ollama ๋ฐฐํฌ (๋๊ธฐ)
- `scripts/convert_3b_gguf.sh` ์ค๋น ์๋ฃ
- `scripts/deploy_3b_ollama.sh` ์ค๋น ์๋ฃ
- `Modelfile.3b` ์์ฑ ์๋ฃ
---
## ์ฃผ์ ํ์ผ ๊ฒฝ๋ก
| ํ์ผ | ์ค๋ช
|
|------|------|
| `checkpoints/korean_3b_fp8_run1/checkpoint-0057000/` | 3B Base ๋ชจ๋ธ (Phase 1 ์ต์ข
) |
| `checkpoints/korean_3b_sft_v1/checkpoint-best/` | **3B SFT ๋ชจ๋ธ (Phase 2 ์ต์ข
)** |
| `configs/korean_3b_orpo.yaml` | ORPO ์ค์ |
| `data/preference/combined_preference.jsonl` | ORPO ํ์ต ๋ฐ์ดํฐ (795K pairs) |
| `reports/2026-03-06_3B_SFT_COMPLETION_AND_EVAL_SUMMARY.md` | SFT ์๋ฃ + ํ๊ฐ ์์ฝ |
| `reports/2026-03-06_3B_SFT_EVALUATION_REPORT.md` | SFT 6์ฐจ์ ํ๊ฐ ์์ธ |
---
## ํ์๋ผ์ธ
```
Feb 25 Phase 0 ์์ (๊ธฐ๋ฐ ๊ตฌ์ถ, 125M FP8 ๊ฒ์ฆ)
Feb 25-26 1B Pretrain (34K steps, loss 1.904)
Feb 26 1B SFT v1 ์คํจ (label off-by-one)
Feb 27 1B SFT v2 ์ฑ๊ณต (val_loss 2.206, ๋ฐ๋ณต๋ฅ 18%)
Feb 27 ์ ์คํฐ์ค๋ฆฌ๊ทธ ํ ๋ก โ 3B ์ ํ ๊ฒฐ์
Feb 27 640GB+ ๋ฐ์ดํฐ ์กฐ๋ฆฝ
Mar 02 Phase 0 ์๋ฃ (GQA FA, DDP, NCCL ์ต์ ํ)
Mar 02 Phase 1 ์์ (3B Pretrain)
Mar 05 Phase 1 ์๋ฃ (57K steps, loss 1.466, 63์๊ฐ)
Mar 05 Phase 2 ์์ (SFT, 2.44M samples)
Mar 06 Phase 2 ์๋ฃ (25.5K steps, val_loss 1.8851, early stopping)
Mar 06 SFT 6์ฐจ์ ํ๊ฐ ์๋ฃ (4/6 PASS)
Mar 06 โ ORPO ์งํ ๊ฒฐ์ (Phase 3 ์ค๋น ์๋ฃ)
```
|