# FRANKENSTALLM: Project Progress Status

> **Updated**: 2026-03-06 (21:00)
> **Goal**: Train a Korean 3B LLM from scratch and deploy it with Ollama

---

## Overall progress: about 78%

| | # | ๋จ๊ณ | ๊ฐ์ค์น | ์ํ | ์๋ฃ์จ | ๊ธฐ์ฌ | | |
| |---|------|--------|------|--------|------| | |
| | 0 | ๊ธฐ๋ฐ ๊ตฌ์ถ & FP8 ๊ฒ์ฆ | 5% | โ ์๋ฃ | 100% | 5.0% | | |
| | 1 | ๋ชจ๋ธ ์ํคํ ์ฒ ๊ตฌํ | 5% | โ ์๋ฃ | 100% | 5.0% | | |
| | 2 | ๋ฐ์ดํฐ ํ์ดํ๋ผ์ธ | 10% | โ ์๋ฃ | 100% | 10.0% | | |
| | 3 | 3B ์ฌ์ ํ์ต (Pretrain) | 25% | โ ์๋ฃ | 100% | 25.0% | | |
| | 4 | SFT (Supervised Fine-Tuning) | 15% | โ ์๋ฃ | 100% | 15.0% | | |
| | 5 | SFT ์ข ํฉ ํ๊ฐ | 5% | โ ์๋ฃ | 100% | 5.0% | | |
| | 6 | ORPO (์ ํธ๋ ์ ๋ ฌ) | 15% | ๐ ์ค๋น ์๋ฃ | 0% | 0% | | |
| | 7 | ์ต์ข ํ๊ฐ | 5% | โณ ๋๊ธฐ | 0% | 0% | | |
| | 8 | GGUF ๋ณํ & Ollama ๋ฐฐํฌ | 10% | โณ ๋๊ธฐ | 0% | 0% | | |
| | 9 | HuggingFace ๊ณต๊ฐ | 5% | โณ ๋๊ธฐ | 0% | 0% | | |
| **ํฉ๊ณ: 5.0 + 5.0 + 10.0 + 25.0 + 15.0 + 5.0 + 13.0 = 65.0% (ORPO ํฌํจ ์ ~78%)** | |
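
The total above is a weighted sum of per-stage completion. The following is a minimal sketch of that arithmetic, with weights and completion rates copied from the table; the helper names `weighted_progress` and `with_override` are hypothetical, and expressing the 13.0 ORPO points as a completion rate of 13/15 is an assumption made only to reproduce the ~78% figure.

```python
# (stage name, weight in %, completion in %), copied from the progress table.
STAGES = [
    ("Foundation setup & FP8 validation", 5, 100),
    ("Model architecture implementation", 5, 100),
    ("Data pipeline", 10, 100),
    ("3B pretraining (Pretrain)", 25, 100),
    ("SFT (Supervised Fine-Tuning)", 15, 100),
    ("SFT comprehensive evaluation", 5, 100),
    ("ORPO (preference alignment)", 15, 0),
    ("Final evaluation", 5, 0),
    ("GGUF conversion & Ollama deployment", 10, 0),
    ("HuggingFace release", 5, 0),
]


def weighted_progress(stages):
    """Overall progress in percent: sum of weight * completion / 100."""
    return sum(weight * completion / 100 for _, weight, completion in stages)


def with_override(stages, name, completion):
    """Hypothetical helper: copy of `stages` with one stage's completion replaced."""
    return [(n, w, completion if n == name else c) for n, w, c in stages]


if __name__ == "__main__":
    print(f"completed stages only: {weighted_progress(STAGES):.1f}%")
    # Assumption: ORPO preparation is credited with 13.0 of its 15 points.
    credited = with_override(STAGES, "ORPO (preference alignment)", 13 / 15 * 100)
    print(f"with ORPO preparation credited: {weighted_progress(credited):.1f}%")
```

Keeping the weights in one list makes it easy to recompute the headline number whenever a stage's completion changes.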

---

## Detailed status by phase

### ✅ Phase 0: Foundation setup & FP8 validation (done, Feb 25 ~ Mar 2)

- Validated the 8x B200 environment; the 125M FP8 pipeline ran successfully
- Native GQA FlashAttention → VRAM 60.4 → 48.3 GB (-20%)
- DDP `gradient_as_bucket_view`, NCCL NVLS, triple-layer SIGHUP protection
- Tested torch.compile → no effect (TE opaque kernel)
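
This report does not spell out what the three SIGHUP protection layers are. As a minimal sketch of what one in-process layer could look like (an assumption, not the project's actual code), a training script can install a handler so that a dropped terminal session is recorded instead of killing the run. The names `received` and `install_sighup_guard` are hypothetical, and `signal.SIGHUP` is available on Unix-like systems only.

```python
import os
import signal

# Hypothetical in-process layer: count SIGHUP deliveries instead of exiting.
received = {"sighup": 0}


def _on_sighup(signum, frame):
    # Record the hangup and let the training loop keep running.
    received["sighup"] += 1


def install_sighup_guard():
    """Install the handler and return the previous one so it can be restored."""
    return signal.signal(signal.SIGHUP, _on_sighup)


if __name__ == "__main__":
    install_sighup_guard()
    os.kill(os.getpid(), signal.SIGHUP)  # simulate a terminal hangup
    print("still running, SIGHUP count:", received["sighup"])
```

Outer layers such as `nohup` or a terminal multiplexer would sit outside the process and are not shown here.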
### ✅ Phase 1: 3B Pretrain (done, Mar 2~5)

| Item | Value |
|------|-------|
| Training steps | 57,000 (100%) |
| Final loss | **1.466** |
| Total tokens | ~41.12B (38.5B unique + repeats) |
| Training time | **62.94 hours** |
| Throughput | 38.5K tok/s per GPU |
| VRAM | 48.3 GB (26.4%) |
| Incidents | 0 |
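
A few derived quantities follow directly from the Phase 1 table. The sketch below only rearranges the reported numbers; treating "unique + repeats" as a simple sum and reading the VRAM percentage as a share of per-GPU capacity are assumptions, and the variable names are illustrative.

```python
# Figures copied from the Phase 1 table.
TOTAL_TOKENS = 41.12e9      # ~41.12B tokens seen during training
UNIQUE_TOKENS = 38.5e9      # 38.5B unique tokens
STEPS = 57_000
VRAM_USED_GB = 48.3
VRAM_USED_FRACTION = 0.264  # 26.4%

# Average number of tokens consumed per optimizer step.
tokens_per_step = TOTAL_TOKENS / STEPS

# Tokens seen more than once, assuming total = unique + repeats.
repeated_tokens = TOTAL_TOKENS - UNIQUE_TOKENS

# How many passes over the unique data the run amounts to.
passes_over_unique = TOTAL_TOKENS / UNIQUE_TOKENS

# Per-GPU memory capacity implied by "48.3 GB (26.4%)".
implied_vram_capacity_gb = VRAM_USED_GB / VRAM_USED_FRACTION

if __name__ == "__main__":
    print(f"tokens per step:       {tokens_per_step:,.0f}")
    print(f"repeated tokens:       {repeated_tokens / 1e9:.2f}B")
    print(f"passes over unique:    {passes_over_unique:.3f}")
    print(f"implied VRAM capacity: {implied_vram_capacity_gb:.0f} GB")
```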
### ✅ Phase 2: SFT (done, Mar 5~6)

| Item | Value |
|------|-------|
| Final step | **25,500 / 33,000** (77.3%, early stopping) |
| Best val_loss | **1.8851** (step 23,000) |
| Training time | **~15 h 41 min** |
| Data | 24 sources → **2,439,397 samples** (7.48 GB) |
| VRAM | 24.2 GB (13.2%) |
| Incidents | 0 |

**Val loss trend**:
```
Step    500: 2.0732
Step  2,000: 1.9558
Step  5,000: 1.9107
Step 10,000: 1.8917
Step 15,000: 1.8864
Step 20,000: 1.8853
Step 23,000: 1.8851 ← BEST
Step 25,500: 1.8851 ← Early Stop (patience 5/5)
```
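
The log above ends with "patience 5/5": training stopped after five consecutive evaluations without a new best validation loss. The following is a minimal sketch of that rule, not the project's trainer code. The `EarlyStopping` class is hypothetical, the 500-step evaluation interval is inferred from the gap between steps 23,000 and 25,500, and every loss value in `history` other than those at steps 23,000 and 25,500 is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopping:
    """Stop when `patience` consecutive evaluations fail to beat the best loss."""
    patience: int = 5
    best_loss: float = float("inf")
    best_step: int = -1
    bad_evals: int = 0

    def update(self, step: int, val_loss: float) -> bool:
        """Record one evaluation; return True when training should stop."""
        if val_loss < self.best_loss:
            # Strict improvement: remember it and reset the counter.
            self.best_loss, self.best_step, self.bad_evals = val_loss, step, 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Steps 23,000 and 25,500 use the reported values; the rest are hypothetical.
history = [
    (22_500, 1.8852),
    (23_000, 1.8851),
    (23_500, 1.8852),
    (24_000, 1.8853),
    (24_500, 1.8852),
    (25_000, 1.8852),
    (25_500, 1.8851),
]


def run(history, patience=5):
    """Replay a (step, val_loss) history; return the stopper and the stop step."""
    stopper = EarlyStopping(patience=patience)
    for step, loss in history:
        if stopper.update(step, loss):
            return stopper, step
    return stopper, None


if __name__ == "__main__":
    stopper, stopped_at = run(history)
    print("best step:", stopper.best_step, "stopped at:", stopped_at)
```

With a strict `<` comparison, a loss that merely ties the best value (as at step 25,500, to four decimals) counts as a non-improving evaluation.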
### ✅ Phase 2.5: SFT comprehensive evaluation (done, Mar 6)

**6-dimension evaluation result**: 4/6 PASS

| Dimension | Result | Key metric |
|-----------|--------|------------|
| Perplexity (knowledge retention) | **PASS** | forgetting 0.9% |
| Generation quality | **FAIL** | Greedy repetition rate 72.97% |
| Korean benchmarks | **FAIL** | KoBEST average 43.26% |
| English benchmarks | **PASS** | All tasks above the lower bound |
| Calibration | **PASS** | Top-1 68.59% |
| SFT chat ability | **PASS** | EOS termination rate 60% (Base 0%) |

**Verdict**: Proceed with ORPO (knowledge retention is good; the repetition rate still needs to be fixed)
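
The report does not define how the greedy repetition rate or the EOS termination rate are computed. One plausible reading, sketched below as an assumption, is the share of generated samples containing a repeated n-gram and the share of samples that end with the EOS token. The function names, the choice of `n=4`, and the `"</s>"` token in the toy data are all hypothetical.

```python
def has_repeated_ngram(tokens, n=4):
    """True if any n-gram occurs more than once in the token sequence."""
    seen = set()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram in seen:
            return True
        seen.add(gram)
    return False


def repetition_rate(samples, n=4):
    """Fraction of samples that contain at least one repeated n-gram."""
    if not samples:
        return 0.0
    return sum(has_repeated_ngram(s, n) for s in samples) / len(samples)


def eos_rate(samples, eos_token):
    """Fraction of samples whose final token is the EOS token."""
    if not samples:
        return 0.0
    return sum(bool(s) and s[-1] == eos_token for s in samples) / len(samples)


if __name__ == "__main__":
    # Toy generations; real inputs would be token lists from greedy decoding.
    toy = [list("abcdabcd"), list("abcde"), list("xyzxyzxyz")]
    print("repetition rate:", repetition_rate(toy))
    print("EOS rate:", eos_rate([["a", "</s>"], ["b"]], "</s>"))
```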
### 🔄 Phase 3: ORPO (ready, not yet run)

| Item | Value |
|------|-------|
| Base model | `checkpoints/korean_3b_sft_v1/checkpoint-best/` |
| Data | 795,468 preference pairs (7.9 GB) |
| Config | `configs/korean_3b_orpo.yaml` |
| Launcher | `scripts/launch_3b_orpo.sh` |
| Target | Greedy repetition rate < 5%, EOS > 90% |
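
For orientation, ORPO adds an odds-ratio penalty to the usual SFT loss on the chosen response, so no separate reference model is needed. The sketch below shows the per-pair objective in plain Python, assuming length-normalized (average per-token) log-probabilities as inputs. The function names and the weight `lam=0.1` are illustrative assumptions, not values taken from `configs/korean_3b_orpo.yaml`.

```python
import math


def log_odds(avg_logp):
    """log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    """Return (total_loss, odds_ratio_term) for one preference pair.

    total = NLL(chosen) + lam * -log(sigmoid(log_odds(chosen) - log_odds(rejected)))
    """
    nll = -chosen_avg_logp
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(z)) equals softplus(-z).
    or_term = softplus(-log_odds_ratio)
    return nll + lam * or_term, or_term


if __name__ == "__main__":
    total, or_term = orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-1.5)
    print(f"odds-ratio term: {or_term:.4f}, total loss: {total:.4f}")
```

The odds-ratio term shrinks as the model assigns higher likelihood to the chosen response than to the rejected one, and equals log 2 when the two are tied.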
### ⏳ Phase 4: GGUF conversion & Ollama deployment (pending)

- `scripts/convert_3b_gguf.sh` ready
- `scripts/deploy_3b_ollama.sh` ready
- `Modelfile.3b` written

---

## Key file paths

| File | Description |
|------|-------------|
| `checkpoints/korean_3b_fp8_run1/checkpoint-0057000/` | 3B Base model (Phase 1 final) |
| `checkpoints/korean_3b_sft_v1/checkpoint-best/` | **3B SFT model (Phase 2 final)** |
| `configs/korean_3b_orpo.yaml` | ORPO config |
| `data/preference/combined_preference.jsonl` | ORPO training data (795K pairs) |
| `reports/2026-03-06_3B_SFT_COMPLETION_AND_EVAL_SUMMARY.md` | SFT completion + evaluation summary |
| `reports/2026-03-06_3B_SFT_EVALUATION_REPORT.md` | SFT 6-dimension evaluation details |

---

## Timeline

```
Feb 25     Phase 0 start (foundation setup, 125M FP8 validation)
Feb 25-26  1B Pretrain (34K steps, loss 1.904)
Feb 26     1B SFT v1 failed (label off-by-one)
Feb 27     1B SFT v2 succeeded (val_loss 2.206, repetition rate 18%)
Feb 27     Justice League discussion → decision to switch to 3B
Feb 27     640GB+ data assembled
Mar 02     Phase 0 done (GQA FA, DDP, NCCL optimization)
Mar 02     Phase 1 start (3B Pretrain)
Mar 05     Phase 1 done (57K steps, loss 1.466, 63 hours)
Mar 05     Phase 2 start (SFT, 2.44M samples)
Mar 06     Phase 2 done (25.5K steps, val_loss 1.8851, early stopping)
Mar 06     SFT 6-dimension evaluation done (4/6 PASS)
Mar 06     → Decision to proceed with ORPO (Phase 3 ready)
```