docs: correct training curriculum (6-phase 1500-step sprint), wall-clock ~7 days, Raccoon 6.1B
README.md
```diff
@@ -61,22 +61,25 @@ tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
 > guha@rtaforge.in for access). This model uses a custom SSM architecture
 > not compatible with standard HuggingFace `AutoModel`.
 
-## Training
 
-
-
 
-
-
-| 6 | 2,000 | Glaive alignment | Alignment |
-| 7 | 1,500 | Glaive alignment | Alignment |
 
-
-
 
-Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
-where Sisya proposes weight deltas and Guru applies them after validation.
-SFT imprint applied using surface-only gate-layer fine-tuning.
+## Training
+
+Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
+where Sisya proposes weight deltas and Guru applies them after validation.
+SFT imprint applied using surface-only gate-layer fine-tuning (65 examples, 3 epochs).
+
+**1,500 accepted proposals across 6 phases on a single AceCloud L4 (24GB VRAM).
+~7 days of effective training time (total elapsed higher due to crash recovery and VRAM leak debugging).**
+
+| Phase | Proposals | Dataset | Focus |
+|-------|-----------|---------|-------|
+| 0 | 125 | CAMEL Physics | Physical reasoning |
+| 1 | 125 | CAMEL Chemistry | Chemical reasoning |
+| 2 | 125 | CAMEL Biology | Biological reasoning |
+| 3 | 250 | Raccoon Phase 1 | General reasoning |
+| 4 | 500 | Rabbit E2 Phase 4 | Extended curriculum |
+| 5 | 375 | Raccoon Phase 3 (consolidation re-run) | Pattern consolidation |
+
+**Final checkpoint: Step 1,500.** seq_len=64, batch_size=3, optimizer=Lion, lr=1e-5.
 
 ## Evaluation Results (Step 1,500)
 
@@ -118,7 +121,7 @@ capability.
 | Model | Params | seq_len | Status |
 |-------|--------|---------|--------|
 | **Rabbit** | ~2.7B | 64 | ✅ This model – v0.1 Alpha |
-| **Raccoon** | ~
+| **Raccoon** | ~6.1B | 512 | In training – reasoning curriculum (math ×2, logic ×2) |
 | **Polar Bear** | ~13B | 512 | Planned – STEM + AEVA anti-hallucination layer |
 
 The delta between Rabbit and Raccoon is the story. One epoch → two epochs,
```
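The new Training text describes the Anvaya Gurukul protocol only as a propose/validate/apply loop. As a rough illustration, one accepted-or-rejected iteration could look like the sketch below; `gurukul_step`, the gradient-step proposer, and the validation-loss gate are all assumptions made for illustration, not the project's actual implementation.

```python
import copy

import torch
import torch.nn as nn


def gurukul_step(model: nn.Module, train_batch: dict, val_batch: dict,
                 loss_fn, lr: float = 1e-5) -> bool:
    """One hypothetical Sisya/Guru iteration: Sisya proposes a weight
    delta; Guru applies it only if validation loss does not regress."""
    # Sisya: propose a delta as a plain gradient step on a trial copy.
    trial = copy.deepcopy(model)
    loss_fn(trial(train_batch["x"]), train_batch["y"]).backward()
    with torch.no_grad():
        for p in trial.parameters():
            if p.grad is not None:
                p -= lr * p.grad
    # Guru: compare validation loss before and after the proposal.
    with torch.no_grad():
        before = loss_fn(model(val_batch["x"]), val_batch["y"]).item()
        after = loss_fn(trial(val_batch["x"]), val_batch["y"]).item()
    if after <= before:
        model.load_state_dict(trial.state_dict())  # delta accepted
        return True
    return False  # delta rejected: weights unchanged
```

Under this reading, the "1,500 accepted proposals" figure counts only the iterations that passed the Guru gate.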
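"Surface-only gate-layer fine-tuning" is likewise described but not shown. Assuming the SSM blocks expose gate projections whose parameter names contain `gate` (true of Mamba-style blocks, but a guess for this custom architecture), the freeze could be as simple as:

```python
import torch.nn as nn


def freeze_all_but_gates(model: nn.Module) -> list[str]:
    """Freeze every parameter except those in gate layers, so SFT only
    touches the model's surface. The '"gate" in name' match is an
    assumption; the Rabbit SSM's real module names are not public."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = "gate" in name
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

At 65 examples over 3 epochs the imprint is only 195 example-passes; restricting training to the gate surface is what keeps such a small SFT set from disturbing the base weights.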
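The final-checkpoint hyperparameters (seq_len=64, batch_size=3, Lion, lr=1e-5) translate directly into a training-step skeleton. The sketch below uses the community `lion-pytorch` package and a stand-in model, since the real SSM stack is not loadable through `AutoModel`; every module and size here is a placeholder.

```python
import torch
import torch.nn as nn
from lion_pytorch import Lion  # pip install lion-pytorch

SEQ_LEN, BATCH_SIZE, LR = 64, 3, 1e-5  # final-checkpoint settings
VOCAB = 1_000                          # toy vocabulary for the sketch

# Stand-in next-token model; the real Rabbit SSM stack is not public.
model = nn.Sequential(
    nn.Embedding(VOCAB, 32),
    nn.Flatten(start_dim=1),
    nn.Linear(32 * SEQ_LEN, VOCAB),
)
optimizer = Lion(model.parameters(), lr=LR)

tokens = torch.randint(0, VOCAB, (BATCH_SIZE, SEQ_LEN))   # dummy batch
targets = torch.randint(0, VOCAB, (BATCH_SIZE,))          # dummy labels
loss = nn.functional.cross_entropy(model(tokens), targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The small learning rate is consistent with the Lion paper's guidance to run Lion several times lower than an equivalent AdamW setting.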