frankenstallm / source /PROGRESS.md
pathcosmos's picture
Upload source/PROGRESS.md with huggingface_hub (#33)
1e78479
# FRANKENSTALLM โ€” ํ”„๋กœ์ ํŠธ ์ง„ํ–‰ ํ˜„ํ™ฉ
> **๊ฐฑ์‹ **: 2026-03-06 (21:00)
> **๋ชฉํ‘œ**: ํ•œ๊ตญ์–ด 3B LLM์„ ์ฒ˜์Œ๋ถ€ํ„ฐ ํ•™์Šตํ•˜์—ฌ Ollama๋กœ ๋ฐฐํฌ
---
## ์ „์ฒด ์ง„ํ–‰๋ฅ : ์•ฝ 78%
| # | ๋‹จ๊ณ„ | ๊ฐ€์ค‘์น˜ | ์ƒํƒœ | ์™„๋ฃŒ์œจ | ๊ธฐ์—ฌ |
|---|------|--------|------|--------|------|
| 0 | ๊ธฐ๋ฐ˜ ๊ตฌ์ถ• & FP8 ๊ฒ€์ฆ | 5% | โœ… ์™„๋ฃŒ | 100% | 5.0% |
| 1 | ๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜ ๊ตฌํ˜„ | 5% | โœ… ์™„๋ฃŒ | 100% | 5.0% |
| 2 | ๋ฐ์ดํ„ฐ ํŒŒ์ดํ”„๋ผ์ธ | 10% | โœ… ์™„๋ฃŒ | 100% | 10.0% |
| 3 | 3B ์‚ฌ์ „ํ•™์Šต (Pretrain) | 25% | โœ… ์™„๋ฃŒ | 100% | 25.0% |
| 4 | SFT (Supervised Fine-Tuning) | 15% | โœ… ์™„๋ฃŒ | 100% | 15.0% |
| 5 | SFT ์ข…ํ•ฉ ํ‰๊ฐ€ | 5% | โœ… ์™„๋ฃŒ | 100% | 5.0% |
| 6 | ORPO (์„ ํ˜ธ๋„ ์ •๋ ฌ) | 15% | ๐Ÿ“‹ ์ค€๋น„ ์™„๋ฃŒ | 0% | 0% |
| 7 | ์ตœ์ข… ํ‰๊ฐ€ | 5% | โณ ๋Œ€๊ธฐ | 0% | 0% |
| 8 | GGUF ๋ณ€ํ™˜ & Ollama ๋ฐฐํฌ | 10% | โณ ๋Œ€๊ธฐ | 0% | 0% |
| 9 | HuggingFace ๊ณต๊ฐœ | 5% | โณ ๋Œ€๊ธฐ | 0% | 0% |
**ํ•ฉ๊ณ„: 5.0 + 5.0 + 10.0 + 25.0 + 15.0 + 5.0 + 13.0 = 65.0% (ORPO ํฌํ•จ ์‹œ ~78%)**
---
## Phase๋ณ„ ์ƒ์„ธ ํ˜„ํ™ฉ
### โœ… Phase 0: ๊ธฐ๋ฐ˜ ๊ตฌ์ถ• & FP8 ๊ฒ€์ฆ (์™„๋ฃŒ, Feb 25 ~ Mar 2)
- 8x B200 ํ™˜๊ฒฝ ๊ฒ€์ฆ, 125M FP8 ํŒŒ์ดํ”„๋ผ์ธ ์„ฑ๊ณต
- GQA FlashAttention native โ†’ VRAM 60.4 โ†’ 48.3 GB (-20%)
- DDP gradient_as_bucket_view, NCCL NVLS, SIGHUP 3์ค‘ ๋ฐฉ์–ด
- torch.compile ํ…Œ์ŠคํŠธ โ†’ ํšจ๊ณผ ์—†์Œ (TE opaque kernel)
### โœ… Phase 1: 3B Pretrain (์™„๋ฃŒ, Mar 2~5)
| ํ•ญ๋ชฉ | ๊ฐ’ |
|------|-----|
| ํ•™์Šต ์Šคํ… | 57,000 (100%) |
| ์ตœ์ข… Loss | **1.466** |
| ์ด ํ† ํฐ | ~41.12B (38.5B unique + ๋ฐ˜๋ณต) |
| ํ•™์Šต ์‹œ๊ฐ„ | **62.94์‹œ๊ฐ„** |
| ์ฒ˜๋ฆฌ ์†๋„ | 38.5K tok/s per GPU |
| VRAM | 48.3 GB (26.4%) |
| ์‚ฌ๊ณ  | 0๊ฑด |
### โœ… Phase 2: SFT (์™„๋ฃŒ, Mar 5~6)
| ํ•ญ๋ชฉ | ๊ฐ’ |
|------|-----|
| ์ตœ์ข… ์Šคํ… | **25,500 / 33,000** (77.3%, early stopping) |
| Best val_loss | **1.8851** (step 23,000) |
| ํ•™์Šต ์‹œ๊ฐ„ | **~15์‹œ๊ฐ„ 41๋ถ„** |
| ๋ฐ์ดํ„ฐ | 24๊ฐœ ์†Œ์Šค โ†’ **2,439,397 samples** (7.48 GB) |
| VRAM | 24.2 GB (13.2%) |
| ์‚ฌ๊ณ  | 0๊ฑด |
**Val Loss ์ถ”์ด**:
```
Step 500: 2.0732
Step 2,000: 1.9558
Step 5,000: 1.9107
Step 10,000: 1.8917
Step 15,000: 1.8864
Step 20,000: 1.8853
Step 23,000: 1.8851 โ† BEST
Step 25,500: 1.8851 โ†’ Early Stop (patience 5/5)
```
### โœ… Phase 2.5: SFT ์ข…ํ•ฉ ํ‰๊ฐ€ (์™„๋ฃŒ, Mar 6)
**6์ฐจ์› ํ‰๊ฐ€ ๊ฒฐ๊ณผ**: 4/6 PASS
| ์ฐจ์› | ๊ฒฐ๊ณผ | ํ•ต์‹ฌ ์ˆ˜์น˜ |
|------|------|-----------|
| Perplexity (์ง€์‹ ๋ณด์กด) | **PASS** | forgetting 0.9% |
| ์ƒ์„ฑ ํ’ˆ์งˆ | **FAIL** | Greedy ๋ฐ˜๋ณต๋ฅ  72.97% |
| ํ•œ๊ตญ์–ด ๋ฒค์น˜๋งˆํฌ | **FAIL** | KoBEST ํ‰๊ท  43.26% |
| ์˜์–ด ๋ฒค์น˜๋งˆํฌ | **PASS** | ์ „ ํƒœ์Šคํฌ ํ•˜ํ•œ ์ดˆ๊ณผ |
| Calibration | **PASS** | Top-1 68.59% |
| SFT Chat ๋Šฅ๋ ฅ | **PASS** | EOS ์ข…๋ฃŒ์œจ 60% (Base 0%) |
**ํŒ์ •**: ORPO ์ง„ํ–‰ (์ง€์‹ ๋ณด์กด ์–‘ํ˜ธ, ๋ฐ˜๋ณต๋ฅ  ํ•ด๊ฒฐ ํ•„์š”)
### ๐Ÿ“‹ Phase 3: ORPO (์ค€๋น„ ์™„๋ฃŒ, ๋ฏธ์‹คํ–‰)
| ํ•ญ๋ชฉ | ๊ฐ’ |
|------|-----|
| Base ๋ชจ๋ธ | `checkpoints/korean_3b_sft_v1/checkpoint-best/` |
| ๋ฐ์ดํ„ฐ | 795,468 preference pairs (7.9 GB) |
| ์„ค์ • | `configs/korean_3b_orpo.yaml` |
| ๋Ÿฐ์ฒ˜ | `scripts/launch_3b_orpo.sh` |
| ๋ชฉํ‘œ | Greedy ๋ฐ˜๋ณต๋ฅ  < 5%, EOS > 90% |
### โณ Phase 4: GGUF ๋ณ€ํ™˜ & Ollama ๋ฐฐํฌ (๋Œ€๊ธฐ)
- `scripts/convert_3b_gguf.sh` ์ค€๋น„ ์™„๋ฃŒ
- `scripts/deploy_3b_ollama.sh` ์ค€๋น„ ์™„๋ฃŒ
- `Modelfile.3b` ์ž‘์„ฑ ์™„๋ฃŒ
---
## ์ฃผ์š” ํŒŒ์ผ ๊ฒฝ๋กœ
| ํŒŒ์ผ | ์„ค๋ช… |
|------|------|
| `checkpoints/korean_3b_fp8_run1/checkpoint-0057000/` | 3B Base ๋ชจ๋ธ (Phase 1 ์ตœ์ข…) |
| `checkpoints/korean_3b_sft_v1/checkpoint-best/` | **3B SFT ๋ชจ๋ธ (Phase 2 ์ตœ์ข…)** |
| `configs/korean_3b_orpo.yaml` | ORPO ์„ค์ • |
| `data/preference/combined_preference.jsonl` | ORPO ํ•™์Šต ๋ฐ์ดํ„ฐ (795K pairs) |
| `reports/2026-03-06_3B_SFT_COMPLETION_AND_EVAL_SUMMARY.md` | SFT ์™„๋ฃŒ + ํ‰๊ฐ€ ์š”์•ฝ |
| `reports/2026-03-06_3B_SFT_EVALUATION_REPORT.md` | SFT 6์ฐจ์› ํ‰๊ฐ€ ์ƒ์„ธ |
---
## ํƒ€์ž„๋ผ์ธ
```
Feb 25 Phase 0 ์‹œ์ž‘ (๊ธฐ๋ฐ˜ ๊ตฌ์ถ•, 125M FP8 ๊ฒ€์ฆ)
Feb 25-26 1B Pretrain (34K steps, loss 1.904)
Feb 26 1B SFT v1 ์‹คํŒจ (label off-by-one)
Feb 27 1B SFT v2 ์„ฑ๊ณต (val_loss 2.206, ๋ฐ˜๋ณต๋ฅ  18%)
Feb 27 ์ €์Šคํ‹ฐ์Šค๋ฆฌ๊ทธ ํ† ๋ก  โ†’ 3B ์ „ํ™˜ ๊ฒฐ์ •
Feb 27 640GB+ ๋ฐ์ดํ„ฐ ์กฐ๋ฆฝ
Mar 02 Phase 0 ์™„๋ฃŒ (GQA FA, DDP, NCCL ์ตœ์ ํ™”)
Mar 02 Phase 1 ์‹œ์ž‘ (3B Pretrain)
Mar 05 Phase 1 ์™„๋ฃŒ (57K steps, loss 1.466, 63์‹œ๊ฐ„)
Mar 05 Phase 2 ์‹œ์ž‘ (SFT, 2.44M samples)
Mar 06 Phase 2 ์™„๋ฃŒ (25.5K steps, val_loss 1.8851, early stopping)
Mar 06 SFT 6์ฐจ์› ํ‰๊ฐ€ ์™„๋ฃŒ (4/6 PASS)
Mar 06 โ†’ ORPO ์ง„ํ–‰ ๊ฒฐ์ • (Phase 3 ์ค€๋น„ ์™„๋ฃŒ)
```