somebody-to-love committed on
Commit b85cbc9 · verified · 1 Parent(s): 3d85abb

Upload source/PROGRESS.md with huggingface_hub

Files changed (1)
  1. source/PROGRESS.md +133 -0
source/PROGRESS.md ADDED
@@ -0,0 +1,133 @@
+ # FRANKENSTALLM: Project Progress
+
+ > **Updated**: 2026-03-06 (21:00)
+ > **Goal**: Train a Korean 3B LLM from scratch and deploy it with Ollama
+
+ ---
+
+ ## Overall Progress: about 78%
+
+ | # | Stage | Weight | Status | Completion | Contribution |
+ |---|-------|--------|--------|------------|--------------|
+ | 0 | Foundation setup & FP8 validation | 5% | ✅ Done | 100% | 5.0% |
+ | 1 | Model architecture implementation | 5% | ✅ Done | 100% | 5.0% |
+ | 2 | Data pipeline | 10% | ✅ Done | 100% | 10.0% |
+ | 3 | 3B pretraining (Pretrain) | 25% | ✅ Done | 100% | 25.0% |
+ | 4 | SFT (Supervised Fine-Tuning) | 15% | ✅ Done | 100% | 15.0% |
+ | 5 | SFT comprehensive evaluation | 5% | ✅ Done | 100% | 5.0% |
+ | 6 | ORPO (preference alignment) | 15% | 📋 Ready | 0% | 0% |
+ | 7 | Final evaluation | 5% | ⏳ Pending | 0% | 0% |
+ | 8 | GGUF conversion & Ollama deployment | 10% | ⏳ Pending | 0% | 0% |
+ | 9 | HuggingFace release | 5% | ⏳ Pending | 0% | 0% |
+
+ **Total: 5.0 + 5.0 + 10.0 + 25.0 + 15.0 + 5.0 = 65.0% (about 78% when 13.0 is counted for ORPO)**
+
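The overall figure is a weighted sum of the per-stage completion rates in the table above. The sketch below reproduces that arithmetic. The weights and completion values are copied from the table; the function and variable names are illustrative assumptions, and the 13.0-point ORPO credit is taken from the total line as-is, since the table itself lists ORPO at 0%.

```python
# Stage weights (percent) and completion fractions, copied from the overview table.
stages = [
    ("Foundation setup & FP8 validation", 5, 1.0),
    ("Model architecture implementation", 5, 1.0),
    ("Data pipeline", 10, 1.0),
    ("3B pretraining", 25, 1.0),
    ("SFT", 15, 1.0),
    ("SFT comprehensive evaluation", 5, 1.0),
    ("ORPO", 15, 0.0),
    ("Final evaluation", 5, 0.0),
    ("GGUF conversion & Ollama deployment", 10, 0.0),
    ("HuggingFace release", 5, 0.0),
]


def weighted_progress(rows):
    """Overall progress in percent: sum of weight * completion fraction."""
    return sum(weight * done for _, weight, done in rows)


total_weight = sum(weight for _, weight, _ in stages)  # should be 100
completed = weighted_progress(stages)                  # completed stages only

# Assumption: the total line counts 13.0 points for ORPO preparation,
# even though the table lists ORPO at 0% completion.
with_orpo_prep = completed + 13.0
```

Under these inputs the completed stages contribute 65.0 points and the ORPO credit brings the headline figure to 78.0.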
+ ---
+
+ ## Detailed Status by Phase
+
+ ### ✅ Phase 0: Foundation Setup & FP8 Validation (done, Feb 25 ~ Mar 2)
+
+ - Validated the 8x B200 environment; the 125M FP8 pipeline ran successfully
+ - Native GQA FlashAttention: VRAM 60.4 → 48.3 GB (-20%)
+ - DDP gradient_as_bucket_view, NCCL NVLS, triple-layer SIGHUP protection
+ - torch.compile test → no effect (TE opaque kernel)
+
+ ### ✅ Phase 1: 3B Pretrain (done, Mar 2~5)
+
+ | Item | Value |
+ |------|-------|
+ | Training steps | 57,000 (100%) |
+ | Final loss | **1.466** |
+ | Total tokens | ~41.12B (38.5B unique + repeats) |
+ | Training time | **62.94 hours** |
+ | Throughput | 38.5K tok/s per GPU |
+ | VRAM | 48.3 GB (26.4%) |
+ | Incidents | 0 |
+
+ ### ✅ Phase 2: SFT (done, Mar 5~6)
+
+ | Item | Value |
+ |------|-------|
+ | Final step | **25,500 / 33,000** (77.3%, early stopping) |
+ | Best val_loss | **1.8851** (step 23,000) |
+ | Training time | **~15 hours 41 minutes** |
+ | Data | 24 sources → **2,439,397 samples** (7.48 GB) |
+ | VRAM | 24.2 GB (13.2%) |
+ | Incidents | 0 |
+
+ **Val loss trend**:
+ ```
+ Step    500: 2.0732
+ Step  2,000: 1.9558
+ Step  5,000: 1.9107
+ Step 10,000: 1.8917
+ Step 15,000: 1.8864
+ Step 20,000: 1.8853
+ Step 23,000: 1.8851 ← BEST
+ Step 25,500: 1.8851 → Early Stop (patience 5/5)
+ ```
+
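The "patience 5/5" stop can be read as ordinary patience-based early stopping: training ends once five consecutive evaluations fail to beat the best validation loss. The sketch below is a minimal illustration of that rule, not the project's training code. It assumes evaluation every 500 steps (consistent with 23,000 → 25,500 being five evaluations) and a strict-improvement criterion; the four losses between steps 23,000 and 25,500 are hypothetical placeholders, since the log above does not list them.

```python
def early_stop_step(val_losses, patience=5, min_delta=0.0):
    """Return (best_step, stop_step) for a list of (step, val_loss) pairs.

    stop_step is None if the patience budget is never exhausted.
    """
    best_loss = float("inf")
    best_step = None
    bad_evals = 0  # consecutive evaluations without improvement
    for step, loss in val_losses:
        if loss < best_loss - min_delta:
            best_loss, best_step, bad_evals = loss, step, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return best_step, step
    return best_step, None


history = [
    (500, 2.0732), (2000, 1.9558), (5000, 1.9107), (10000, 1.8917),
    (15000, 1.8864), (20000, 1.8853), (23000, 1.8851),
    # Hypothetical placeholder values: not reported in the log above.
    (23500, 1.8852), (24000, 1.8852), (24500, 1.8852), (25000, 1.8852),
    (25500, 1.8851),  # reported value; ties the best, so it does not reset patience
]

best_step, stop_step = early_stop_step(history)
```

With these inputs the best step is 23,000 and the stop lands on 25,500, matching the log.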
+ ### ✅ Phase 2.5: SFT Comprehensive Evaluation (done, Mar 6)
+
+ **6-dimension evaluation result**: 4/6 PASS
+
+ | Dimension | Result | Key metric |
+ |-----------|--------|------------|
+ | Perplexity (knowledge retention) | **PASS** | forgetting 0.9% |
+ | Generation quality | **FAIL** | Greedy repetition rate 72.97% |
+ | Korean benchmarks | **FAIL** | KoBEST average 43.26% |
+ | English benchmarks | **PASS** | All tasks above the lower bound |
+ | Calibration | **PASS** | Top-1 68.59% |
+ | SFT chat ability | **PASS** | EOS termination rate 60% (Base 0%) |
+
+ **Verdict**: Proceed to ORPO (knowledge retention is good; the repetition rate must be fixed)
+
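The report does not define how the greedy repetition rate is computed. As a rough illustration only, one common proxy is the fraction of n-grams in a generated token sequence that duplicate another n-gram in the same sequence. The function name, the choice of `n=4`, and the definition itself are assumptions, not the project's evaluation code.

```python
def ngram_repetition_rate(tokens, n=4):
    """Fraction of n-grams that are duplicates of another n-gram in the sequence.

    Assumed proxy metric: 0.0 means every n-gram is unique; values near 1.0
    mean the output is dominated by repeated n-grams.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n: nothing to compare
    return 1.0 - len(set(ngrams)) / len(ngrams)


# Toy input: "a b c d" repeated twice yields 5 four-grams, 4 of them distinct.
looped = "a b c d a b c d".split()
rate = ngrams_rate = ngram_repetition_rate(looped, n=4)
```

Averaging such a per-sample rate over greedy-decoded outputs would give a single percentage comparable in spirit to the figure in the table, though the actual number depends on the tokenizer and the n-gram size used.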
+ ### 📋 Phase 3: ORPO (ready, not yet run)
+
+ | Item | Value |
+ |------|-------|
+ | Base model | `checkpoints/korean_3b_sft_v1/checkpoint-best/` |
+ | Data | 795,468 preference pairs (7.9 GB) |
+ | Config | `configs/korean_3b_orpo.yaml` |
+ | Launcher | `scripts/launch_3b_orpo.sh` |
+ | Target | Greedy repetition rate < 5%, EOS > 90% |
+
+ ### ⏳ Phase 4: GGUF Conversion & Ollama Deployment (pending)
+
+ - `scripts/convert_3b_gguf.sh` is ready
+ - `scripts/deploy_3b_ollama.sh` is ready
+ - `Modelfile.3b` is written
+
+ ---
+
+ ## Key File Paths
+
+ | File | Description |
+ |------|-------------|
+ | `checkpoints/korean_3b_fp8_run1/checkpoint-0057000/` | 3B base model (Phase 1 final) |
+ | `checkpoints/korean_3b_sft_v1/checkpoint-best/` | **3B SFT model (Phase 2 final)** |
+ | `configs/korean_3b_orpo.yaml` | ORPO config |
+ | `data/preference/combined_preference.jsonl` | ORPO training data (795K pairs) |
+ | `reports/2026-03-06_3B_SFT_COMPLETION_AND_EVAL_SUMMARY.md` | SFT completion + evaluation summary |
+ | `reports/2026-03-06_3B_SFT_EVALUATION_REPORT.md` | SFT 6-dimension evaluation details |
+
+ ---
+
+ ## Timeline
+
+ ```
+ Feb 25     Phase 0 started (foundation setup, 125M FP8 validation)
+ Feb 25-26  1B Pretrain (34K steps, loss 1.904)
+ Feb 26     1B SFT v1 failed (label off-by-one)
+ Feb 27     1B SFT v2 succeeded (val_loss 2.206, repetition rate 18%)
+ Feb 27     Justice League discussion → decided to switch to 3B
+ Feb 27     Assembled 640GB+ of data
+ Mar 02     Phase 0 done (GQA FA, DDP, NCCL optimization)
+ Mar 02     Phase 1 started (3B Pretrain)
+ Mar 05     Phase 1 done (57K steps, loss 1.466, 63 hours)
+ Mar 05     Phase 2 started (SFT, 2.44M samples)
+ Mar 06     Phase 2 done (25.5K steps, val_loss 1.8851, early stopping)
+ Mar 06     SFT 6-dimension evaluation done (4/6 PASS)
+ Mar 06     → Decided to proceed with ORPO (Phase 3 ready)
+ ```