tvastr committed
Commit 0f8f1bf · verified · 1 parent: f6ba325

docs: improved model card with training narrative and polaris-revival link

Files changed (1):
  1. README.md +32 -16
README.md CHANGED
@@ -14,7 +14,8 @@ base_model: RtaForge/Anvaya-Rabbit-2.7B
 
 # Anvaya-Rabbit 2.7B – v0.1 Alpha
 
- **The architecture, training protocol, and infrastructure are the story.**
 Rabbit is the first model in the Anvaya series – a proof of concept demonstrating
 that a fully custom State-Space Model (SSM) can be trained from scratch, on a
 single consumer-grade GPU, with no dependence on attention or transformer
@@ -61,14 +62,25 @@ tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
 > guha@rtaforge.in for access). This model uses a custom SSM architecture
 > not compatible with standard HuggingFace `AutoModel`.
 
 ## Training
 
- Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
- where Sisya proposes weight deltas and Guru applies them after validation.
- SFT imprint applied using surface-only gate-layer fine-tuning (65 examples, 3 epochs).
 
- **1,500 accepted proposals across 6 phases on a single AceCloud L4 (24GB VRAM).
- ~7 days of effective training time (total elapsed higher due to crash recovery and VRAM leak debugging).**
 
 | Phase | Proposals | Dataset | Focus |
 |-------|-----------|---------|-------|
@@ -81,6 +93,9 @@ SFT imprint applied using surface-only gate-layer fine-tuning (65 examples, 3 ep
 
 **Final checkpoint: Step 1,500.** seq_len=64, batch_size=3, optimizer=Lion, lr=1e-5.
 
 ## Evaluation Results (Step 1,500)
 
 ### Internal – Scale-Invariant Metrics
@@ -103,18 +118,19 @@ capability.
 
 ### Commercial Benchmarks (lm-eval harness)
 
- > **Important caveat**: Rabbit was trained at seq_len=64. Standard lm-eval prompts
- > (few-shot examples + question) typically run 150–400 tokens. The scores below
- > reflect inference at context lengths the model was never trained on.
- > Raccoon (seq_len=512) will be evaluated without this constraint.
 
 | Benchmark | Score | Notes |
 |-----------|-------|-------|
- | HellaSwag | 25.89% | Near-random; context length exceeds training seq_len |
- | ARC-Challenge | 26.71% | Near-random; context length exceeds training seq_len |
- | MMLU | 26.89% | Near-random; 5-shot prompts well beyond training seq_len |
- | WinoGrande | 48.62% | Near-random |
- | TruthfulQA MC1 | 21.91% | – |
 
 ## What Comes Next
 
@@ -125,5 +141,5 @@ capability.
 | **Polar Bear** | ~13B | 512 | Planned – STEM + AEVA anti-hallucination layer |
 
 The delta between Rabbit and Raccoon is the story. One epoch → two epochs,
- seq_len 64 → 512. Same pipeline, same hardware philosophy.
 **Give us more resources and watch what happens.**
 
 
 # Anvaya-Rabbit 2.7B – v0.1 Alpha
 
+ > **The architecture, training protocol, and infrastructure are the story.**
+
 Rabbit is the first model in the Anvaya series – a proof of concept demonstrating
 that a fully custom State-Space Model (SSM) can be trained from scratch, on a
 single consumer-grade GPU, with no dependence on attention or transformer
 
 > guha@rtaforge.in for access). This model uses a custom SSM architecture
 > not compatible with standard HuggingFace `AutoModel`.
 
+ **Training infrastructure**: [`Rta-Forge/polaris-revival`](https://github.com/Rta-Forge/polaris-revival) –
+ patched ROCm 7.2 runtime restoring native HIP dispatch on gfx803 (RX 560X), with
+ fused SSM recurrence kernels. MIT licensed.
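The fused recurrence itself is not documented in this card. Purely as a reference for what such kernels compute, an unfused diagonal linear state-space recurrence looks roughly like the sketch below; the shapes, names, and diagonal parameterisation are illustrative assumptions, not Anvaya-Rabbit's actual layer:

```python
import torch

def ssm_scan_reference(x, A, B, C):
    """Unfused reference for a diagonal linear state-space recurrence.

    x: (T, D) input sequence, A: (N,) diagonal state transition,
    B: (N, D) input projection, C: (D, N) output projection.
    A fused kernel computes the same sweep, h_t = A*h_{t-1} + B@x_t and
    y_t = C@h_t, in a single device-side scan instead of a Python loop.
    """
    T, D = x.shape
    h = torch.zeros(A.shape[0], dtype=x.dtype)   # h_0 = 0
    ys = []
    for t in range(T):
        h = A * h + B @ x[t]                     # state update
        ys.append(C @ h)                         # readout
    return torch.stack(ys)                       # (T, D)

# smoke test with arbitrary sizes
y = ssm_scan_reference(torch.randn(8, 16), torch.rand(32) * 0.9,
                       torch.randn(32, 16), torch.randn(16, 32))
print(y.shape)  # torch.Size([8, 16])
```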
+
 ## Training
 
+ Two proprietary components make this training regime possible:
+
+ - **Subsuminator** – migrates learned weights across architectures without
+   retraining from scratch, enabling efficient curriculum transfer.
+ - **Gurukul** – a constitutional Sisya/Guru proposal-validation loop. Sisya
+   proposes weight deltas; Guru validates them against constitutional constraints
+   before applying them. Strong learning signals are extracted from limited data
+   and compute.
 
+ Together they are why Rabbit trained in 7 days on a single consumer GPU.
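Gurukul's implementation is proprietary and its interface is not published with this card; the sketch below only illustrates the propose/validate/apply pattern described above. Every name (`gurukul_step`, `sisya_propose`, `guru_validate`) and the shape of the constraint check are assumptions, not the real API:

```python
import torch

def gurukul_step(model, batch, sisya_propose, guru_validate):
    """One illustrative proposal/validation step (hypothetical interface).

    sisya_propose(model, batch) -> {param_name: delta_tensor}
    guru_validate(model, deltas, batch) -> bool  # constitutional constraints
    Returns True if the proposal was accepted and applied.
    """
    deltas = sisya_propose(model, batch)              # Sisya proposes weight deltas
    if not guru_validate(model, deltas, batch):       # Guru vets them before any update
        return False                                  # rejected: weights stay untouched
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in deltas:
                param.add_(deltas[name].to(param))    # accepted: apply the delta
    return True
```

In this reading, the "1,500 accepted proposals" figure below would count the steps where validation passes.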
+
+ **1,500 accepted Gurukul proposals across 6 phases on a single AceCloud L4 (24GB VRAM).
+ ~7 days effective training time (total elapsed higher due to crash recovery and VRAM
+ leak debugging).**
 
 | Phase | Proposals | Dataset | Focus |
 |-------|-----------|---------|-------|
 
 
 **Final checkpoint: Step 1,500.** seq_len=64, batch_size=3, optimizer=Lion, lr=1e-5.
 
+ An SFT imprint was applied via surface-only gate-layer fine-tuning (65 examples,
+ 3 epochs), run under the Anvaya Gurukul protocol.
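The gate-layer parameter naming is not published, so the following is only a minimal sketch of what surface-only gate-layer fine-tuning typically means in PyTorch: freeze every parameter that does not look like a gate parameter and train the rest. The `"gate"` keyword and the optimizer shown in the comment are assumptions:

```python
import torch

def select_gate_parameters(model, gate_keyword="gate"):
    """Freeze every parameter except those whose name contains `gate_keyword`.

    Illustrative only: the real Anvaya-Rabbit gate-layer names are not public.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = gate_keyword in name
        if param.requires_grad:
            trainable.append(param)
    return trainable

# SFT pass over the gate parameters only; the optimizer and lr for this phase
# are not documented in the card, so AdamW here is just a placeholder.
# optimizer = torch.optim.AdamW(select_gate_parameters(model), lr=1e-5)
```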
+
 ## Evaluation Results (Step 1,500)
 
 ### Internal – Scale-Invariant Metrics
 
 
 ### Commercial Benchmarks (lm-eval harness)
 
+ > **Standard academic benchmarks are not yet meaningful here.** Rabbit was
+ > deliberately trained at seq_len=64 as a pure architecture proof. Standard
+ > lm-eval prompts (few-shot examples + question) run 150–400 tokens, well
+ > beyond Rabbit's training context. Raccoon (seq_len=512) removes this
+ > constraint entirely.
 
 | Benchmark | Score | Notes |
 |-----------|-------|-------|
+ | HellaSwag | 25.89% | Prompt exceeds training seq_len |
+ | ARC-Challenge | 26.71% | Prompt exceeds training seq_len |
+ | MMLU | 26.89% | Prompt exceeds training seq_len |
+ | WinoGrande | 48.62% | Prompt exceeds training seq_len |
+ | TruthfulQA MC1 | 21.91% | Prompt exceeds training seq_len |
 
 ## What Comes Next
 
 
 | **Polar Bear** | ~13B | 512 | Planned – STEM + AEVA anti-hallucination layer |
 
 The delta between Rabbit and Raccoon is the story. One epoch → two epochs,
+ seq_len 64 → 512, 2.7B → 6.1B. Same pipeline, same hardware philosophy.
 **Give us more resources and watch what happens.**