tvastr committed (verified)
Commit f6ba325 · Parent: f54708a

docs: correct training curriculum (6-phase 1500-step sprint), wall-clock ~7 days, Raccoon 6.1B

Files changed (1): README.md (+16 -13)
README.md CHANGED
@@ -61,22 +61,25 @@ tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
 > guha@rtaforge.in for access). This model uses a custom SSM architecture
 > not compatible with standard HuggingFace `AutoModel`.
 
-## Training Curriculum
+## Training
 
-One epoch, single NVIDIA L4, ~15,000 steps across 8 phases + 1,500-step Scholar Sprint.
-Phases 1–5 (pretraining corpus progression) not shown.
+Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
+where Sisya proposes weight deltas and Guru applies them after validation.
+SFT imprint applied using surface-only gate-layer fine-tuning (65 examples, 3 epochs).
 
-| Phase | Steps | Dataset | Focus |
-|-------|-------|---------|-------|
-| 6 | 2,000 | Glaive alignment | Alignment |
-| 7 | 1,500 | Glaive alignment | Alignment |
+**1,500 accepted proposals across 6 phases on a single AceCloud L4 (24GB VRAM).
+~7 days of effective training time (total elapsed higher due to crash recovery and VRAM leak debugging).**
 
-Final Scholar Sprint: 1,500 steps, Phase 5 saturation (Logic Giants corpus).
-**Final checkpoint: Step 1,500.**
+| Phase | Proposals | Dataset | Focus |
+|-------|-----------|---------|-------|
+| 0 | 125 | CAMEL Physics | Physical reasoning |
+| 1 | 125 | CAMEL Chemistry | Chemical reasoning |
+| 2 | 125 | CAMEL Biology | Biological reasoning |
+| 3 | 250 | Raccoon Phase 1 | General reasoning |
+| 4 | 500 | Rabbit E2 Phase 4 | Extended curriculum |
+| 5 | 375 | Raccoon Phase 3 (consolidation re-run) | Pattern consolidation |
 
-Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
-where Sisya proposes weight deltas and Guru applies them after validation.
-SFT imprint applied using surface-only gate-layer fine-tuning.
+**Final checkpoint: Step 1,500.** seq_len=64, batch_size=3, optimizer=Lion, lr=1e-5.
 
 ## Evaluation Results (Step 1,500)
 
@@ -118,7 +121,7 @@ capability.
 | Model | Params | seq_len | Status |
 |-------|--------|---------|--------|
 | **Rabbit** | ~2.7B | 64 | ✅ This model — v0.1 Alpha |
-| **Raccoon** | ~2.7B | 512 | In training — reasoning curriculum (math ×2, logic ×2) |
+| **Raccoon** | ~6.1B | 512 | In training — reasoning curriculum (math ×2, logic ×2) |
 | **Polar Bear** | ~13B | 512 | Planned — STEM + AEVA anti-hallucination layer |
 
 The delta between Rabbit and Raccoon is the story. One epoch → two epochs,
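The new Training section describes the Anvaya Gurukul loop only at a high level: Sisya proposes a weight delta, Guru validates it, and only accepted proposals are applied (1,500 acceptances total, with gate-layer-only SFT on top). A minimal sketch of that accept/rollback pattern follows; every name in it (`gurukul_step`, `freeze_all_but_gates`, the `"gate"` name filter, `loss_fn`, `validate`) is hypothetical rather than the project's actual API, and plain PyTorch stands in for the custom SSM stack.

```python
# Hypothetical sketch of the propose/validate/apply loop the README describes.
# Only lr=1e-5 comes from the commit; the structure here is an assumption.
import copy
import torch

def freeze_all_but_gates(model: torch.nn.Module) -> None:
    """Surface-only gate-layer SFT: train nothing except gate-layer weights.
    The name-based filter is a guess at what 'gate-layer' means in practice."""
    for name, param in model.named_parameters():
        param.requires_grad = "gate" in name

def gurukul_step(model, loss_fn, batch, validate, lr=1e-5) -> bool:
    """One Sisya proposal: compute a candidate weight delta, tentatively apply
    it, and keep it only if the Guru-side validation accepts the result."""
    loss = loss_fn(model, batch)                   # Sisya's forward pass
    loss.backward()                                # gradients define the delta

    snapshot = copy.deepcopy(model.state_dict())   # Guru keeps a rollback copy
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.add_(p.grad, alpha=-lr)          # tentatively apply the delta
    model.zero_grad()

    if validate(model):                            # Guru's constitutional check
        return True                                # accepted: the delta stays
    model.load_state_dict(snapshot)                # rejected: roll it back
    return False
```

Read this way, the Proposals column in the phase table counts accepted deltas, and the phase totals (125 × 3 + 250 + 500 + 375) sum to exactly the 1,500 accepted proposals of the final checkpoint step.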
 
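The checkpoint line lists `optimizer=Lion, lr=1e-5`. Lion (Chen et al., 2023, "Symbolic Discovery of Optimization Algorithms") steps in the direction of the sign of an interpolated momentum rather than the raw gradient, keeping optimizer state to one buffer per parameter. A minimal single-tensor sketch, using the paper's default betas; only `lr=1e-5` is taken from the commit's config line:

```python
import torch

@torch.no_grad()
def lion_update(param, grad, momentum, lr=1e-5,
                beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion step: the sign of the beta1-interpolated gradient drives the
    update, while the momentum buffer is tracked separately with beta2."""
    update = (beta1 * momentum + (1 - beta1) * grad).sign()
    param.mul_(1 - lr * wd)                            # decoupled weight decay
    param.add_(update, alpha=-lr)                      # sign-based step
    momentum.mul_(beta2).add_(grad, alpha=1 - beta2)   # momentum update
    return param, momentum
```

Because the sign makes every coordinate's step the same magnitude, Lion is typically run with a noticeably smaller learning rate than AdamW, which is consistent with the lr=1e-5 used here.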