midorin-Linux committed
Commit d337ae3 · verified · Parent: dfbd05c

Update README.md

Files changed (1): README.md (+3 −3)
@@ -36,17 +36,17 @@ GPT-OSS-20B Base Model
 │ ├─ Layers: MLP + Attention
 │ └─ Goal: Establish coding + reasoning foundation
 
-├─── Phase 1.5: Knowledge Consolidation (Week 5)
+├─── Phase 1.5: Knowledge Consolidation
 │ ├─ Data: Mixed replay of Phase 1 data
 │ ├─ Layers: MLP + Attention
 │ └─ Goal: Prevent early forgetting
 
-├─── Phase 2: Specialization Training (Weeks 6-8)
+├─── Phase 2: Specialization Training
 │ ├─ Data: Claude Sonnet (250) + GPT-5.2 high (250) + Replay (150)
 │ ├─ Layers: MLP + Adapter
 │ └─ Goal: Integrate balanced reasoning + maintain coding
 
-└─── Phase 2.5: Gradual Unfreezing (Week 8, Optional)
+└─── Phase 2.5: Gradual Unfreezing
 ├─ Data: Full mixed dataset
 ├─ Layers: Upper Attention layers + MLP + Adapter
 └─ Goal: Fine-tune attention patterns if needed
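The phase plan in the diff implies a per-phase choice of which parameter groups train (MLP + Attention early, MLP + Adapter later, upper attention layers only in the optional final phase). A minimal, framework-agnostic sketch of that selection logic — the parameter-name patterns, the `n_layers` default, and the "top third of layers counts as upper attention" rule are illustrative assumptions, not taken from the actual training code:

```python
# Illustrative sketch: which parameter groups are trainable in each phase.
# Phase names follow the README; name-matching rules below are assumptions.

PHASE_RULES = {
    "phase_1":   ("mlp", "attn"),                    # MLP + Attention
    "phase_1.5": ("mlp", "attn"),                    # same layers, replay data
    "phase_2":   ("mlp", "adapter"),                 # MLP + Adapter
    "phase_2.5": ("mlp", "adapter", "upper_attn"),   # + upper attention layers
}

def trainable(name: str, layer_idx: int, phase: str, n_layers: int = 24) -> bool:
    """Return True if the parameter `name` at `layer_idx` trains in `phase`."""
    groups = PHASE_RULES[phase]
    if "attn" in name:
        if "attn" in groups:
            return True
        # "upper_attn": unfreeze attention only in the top third of layers
        # (the cutoff fraction is an assumption for illustration)
        return "upper_attn" in groups and layer_idx >= n_layers * 2 // 3
    return any(g in name for g in groups if g != "upper_attn")
```

In a real training loop this predicate would set `requires_grad` on each parameter at the start of a phase, e.g. adapters stay frozen through Phase 1/1.5 and lower-layer attention stays frozen in Phase 2.5.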