tvastr committed on
Commit 3cec8f2 · verified · 1 Parent(s): 27b5ab4

docs: reframe as PoC, add step-1500 internal evals, honest lm-eval disclaimer

Files changed (1)
  1. README.md +71 -42
README.md CHANGED
@@ -19,6 +19,28 @@ tags:

 ---

 ## Model Lineage

 ```
@@ -40,35 +62,24 @@ representations from a source architecture into a structurally distinct target m

 ---

- ## Model Description
-
- Rabbit-RtaSSM is a 2.7B parameter State Space Model (SSM) trained by [RtaForge](https://rtaforge.in)
- as part of the **Anvaya** small language model series. It uses the proprietary **Durga fu-64**
- architecture, a custom SSM variant with fortress layers and constitutional governance via the
- Gurukul training framework.
-
- Rabbit is the fast, general-purpose runner of the Anvaya trio (Rabbit · Raccoon · Polar Bear),
- optimised for high-throughput instruction following, logic, math, STEM, and tool dispatch.
-
- ### Architecture

 | Property | Value |
 |----------|-------|
 | Architecture | Durga fu-64 (custom SSM) |
 | Base lineage | Mamba2 2.7B (weight subsumination) |
- | Parameters | ~2.7B |
 | Tokenizer | EleutherAI/gpt-neox-20b (vocab 50,280) |
- | Sequence length | 512 |
 | Optimizer | Lion (lr 1e-5) |
 | Training framework | Gurukul Phase 2 Hardened |

 ---

 ## Training Curriculum

- Two campaigns on an NVIDIA L4 GPU (Ace Cloud):
-
- ### Campaign 1: 8 phases, ~15,000 steps

 | Phase | Steps | Dataset | Focus |
 |-------|-------|---------|-------|
@@ -81,46 +92,64 @@ Two campaigns on an NVIDIA L4 GPU (Ace Cloud):
 | 6 | 2,000 | Glaive alignment | Alignment |
 | 7 | 1,500 | Glaive alignment | Alignment |

- ### Campaign 2: Scholar Sprint, 1,500 steps
-
- Phase 5 saturation (Logic Giants corpus), Lion lr=1e-5.
- Final base checkpoint: **Step 1,500**.

 ---

- ## Evaluation Results

- Evaluated using scale-invariant metrics (Top-K accuracy, Mean Reciprocal Rank)
- vs. random-initialised baseline. 100 samples per corpus, seq_len=512.
-
- | Corpus | Metric | Random Init | Trained | Gain |
- |--------|--------|-------------|---------|------|
- | Biology | Top-1 Accuracy | baseline | **10× baseline** | +10× |
- | Chemistry | Top-1 Accuracy | baseline | **10× baseline** | +10× |
- | Deep Math | MRR | 0.008 | **0.186** | **+22×** |
-
- *Full Step 1,500 evaluation results will be added upon final publication.*

 ---

- ## Repository Structure

- ```
- RtaForge/Anvaya-Raccoon2.7B
- ├── base/
- │   └── pytorch_model.bin   ← base model weights (step 1,500)
- ├── imprint/
- │   └── pytorch_model.bin   ← base + Rabbit personality SFT
- └── logs/
-     └── training_logs_1500.zip
- ```

 ---

 ## Usage

- This model uses a custom SSM architecture and requires the RtaForge inference stack.
- Standard HuggingFace `AutoModel` is not supported.

 ```python
 # Requires: rtaforge-substrates + torch, transformers
@@ -155,7 +184,7 @@ The model weights in this repository are licensed under

 ```
 @misc{rtaforge2026rabbit,
- title = {Rabbit-RtaSSM: Anvaya 2.7B State Space Model},
 author = {RtaForge},
 year = {2026},
 url = {https://huggingface.co/RtaForge/Anvaya-Raccoon2.7B}
 

 ---

+ ## ⚠️ This is a Proof of Concept
+
+ **Rabbit is not a finished product. It is not meant to be.**
+
+ This is the first public model in the Anvaya family: a single-epoch run on a single NVIDIA L4 GPU, trained to validate the architecture, the training pipeline, and the weight subsumination technique. It is a flag planted, not a summit reached.
+
+ What this model demonstrates:
+ - The **Durga fu-64** SSM architecture trains and converges
+ - **Weight subsumination** from Mamba2 works (patent pending)
+ - The **Gurukul** constitutional training framework functions at scale
+ - A 2.6B SSM can learn meaningful representations on a single L4 in one epoch
+
+ What this model is not:
+ - A competitor to GPT-4, Claude, or Gemini
+ - A production-ready assistant
+ - The best we can do, not even close
+
+ **Raccoon (6.1B, seq_len=512, reasoning-heavy curriculum) and Polar Bear are in training.**
+ The benchmark story gets told there.
+
+ ---
+
 ## Model Lineage

 ```
 

 ---

+ ## Architecture

 | Property | Value |
 |----------|-------|
 | Architecture | Durga fu-64 (custom SSM) |
 | Base lineage | Mamba2 2.7B (weight subsumination) |
+ | Parameters | ~2.6B |
 | Tokenizer | EleutherAI/gpt-neox-20b (vocab 50,280) |
+ | Training seq length | 64 |
 | Optimizer | Lion (lr 1e-5) |
+ | Training hardware | Single NVIDIA L4 (24GB) |
 | Training framework | Gurukul Phase 2 Hardened |
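
As a point of reference for the optimizer row above: Lion (Chen et al., 2023) updates weights with the sign of an interpolated momentum. The sketch below is the generic published update rule using the table's learning rate; it is not RtaForge's Gurukul implementation, and the betas and weight decay shown are assumed defaults.

```python
import torch

@torch.no_grad()
def lion_step(param, grad, momentum, lr=1e-5, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """One Lion update: take the sign of an interpolation between momentum and gradient."""
    update = (beta1 * momentum + (1 - beta1) * grad).sign()
    param.mul_(1 - lr * weight_decay).add_(update, alpha=-lr)   # decoupled weight decay + signed step
    momentum.mul_(beta2).add_(grad, alpha=1 - beta2)            # momentum tracks a gradient EMA
    return param, momentum
```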

 ---

 ## Training Curriculum

+ One epoch on a single L4: ~15,000 steps across 8 phases, plus a 1,500-step Scholar Sprint.

 | Phase | Steps | Dataset | Focus |
 |-------|-------|---------|-------|

 | 6 | 2,000 | Glaive alignment | Alignment |
 | 7 | 1,500 | Glaive alignment | Alignment |

+ Final Scholar Sprint: 1,500 steps, Phase 5 saturation (Logic Giants corpus).
+ **Final checkpoint: Step 1,500.**

 ---

+ ## Evaluation Results (Step 1,500)
+
+ ### Internal: Scale-Invariant Metrics
+
+ Evaluated using Top-K accuracy and Mean Reciprocal Rank vs. random-initialised baseline.
+ 50 samples per corpus, seq_len=64.
+
+ | Metric | Random Init | Trained (Step 1,500) | Gain |
+ |--------|-------------|----------------------|------|
+ | Top-1 Accuracy (aggregate) | 0.24% | **1.90%** | **~8×** |
+ | Top-10 Accuracy (aggregate) | 0.24% | **35.84%** | **~149×** |
+ | MRR (aggregate) | 0.0026 | **0.1724** | **~66×** |
+ | MRR (Deep Math) | 0.0084 | **0.186** | **22×** |
+ | Top-10 (Biology) | ~1.3% | **~12%** | **~10×** |
+ | Top-10 (Chemistry) | ~1.3% | **~13%** | **~10×** |
+
+ These gains are measured against a randomly initialised model of identical architecture;
+ they reflect what the training curriculum taught, not absolute capability.
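
The card does not publish its evaluation harness, so the following is only a minimal sketch of how Top-K accuracy and MRR over next-token predictions can be computed; the function name, tensor shapes, and the random-logits demo data are illustrative assumptions, not RtaForge code.

```python
import torch

def next_token_topk_and_mrr(logits: torch.Tensor, targets: torch.Tensor, k: int = 10):
    """Scale-invariant next-token metrics.

    logits:  (N, vocab_size) model logits at each evaluated position
    targets: (N,) index of the gold next token
    Returns (top-1 accuracy, top-k accuracy, mean reciprocal rank).
    """
    order = logits.argsort(dim=-1, descending=True)                        # tokens sorted best-first
    ranks = (order == targets.unsqueeze(-1)).float().argmax(dim=-1) + 1    # 1-based rank of gold token
    top1 = (ranks == 1).float().mean().item()
    topk = (ranks <= k).float().mean().item()
    mrr = (1.0 / ranks.float()).mean().item()
    return top1, topk, mrr

# Demo with random logits, roughly what a random-init baseline looks like:
logits = torch.randn(50, 50280)                 # 50 evaluated positions, vocab 50,280
targets = torch.randint(0, 50280, (50,))
print(next_token_topk_and_mrr(logits, targets))
```

Running the same function on the trained checkpoint and on a randomly initialised copy of the same architecture yields the two columns compared in the table above.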

+ ### Commercial Benchmarks (lm-eval)

+ > **Important caveat**: Rabbit was trained at seq_len=64. Standard lm-eval prompts
+ > (few-shot examples + question) typically run 150–400 tokens. Scores below reflect
+ > inference at context lengths the model was not trained on (a quick prompt-length
+ > check is sketched below). Raccoon (seq_len=512) will be evaluated without this constraint.
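
To make the caveat concrete, here is one way to count prompt tokens with the tokenizer listed in the Architecture table; the few-shot prompt text is invented for illustration and is not taken from lm-eval.

```python
from transformers import AutoTokenizer

# Tokenizer from the Architecture table; downloads on first use.
tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Hypothetical 5-shot prompt in the usual question/answer benchmark shape.
example = "Question: Which planet is known as the Red Planet?\nAnswer: Mars\n\n"
prompt = example * 5 + "Question: Which gas do plants absorb during photosynthesis?\nAnswer:"

n_tokens = len(tok(prompt)["input_ids"])
print(n_tokens)  # well above the 64-token training context
```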
+
+ | Benchmark | Score | Notes |
+ |-----------|-------|-------|
+ | HellaSwag | TBD | |
+ | ARC-Challenge | TBD | |
+ | MMLU | TBD | Expect near-random due to long prompts |
+ | WinoGrande | TBD | |
+ | TruthfulQA | TBD | Alignment corpus benefit expected |
+
+ *lm-eval in progress; scores will be updated upon completion.*

 ---

+ ## What Comes Next

+ | Model | Params | seq_len | Status |
+ |-------|--------|---------|--------|
+ | **Rabbit** | 2.6B | 64 | ✅ This model |
+ | **Raccoon** | 6.1B | 512 | In training: reasoning-heavy curriculum (math ×2, logic ×2) |
+ | **Polar Bear** | ~13B | 512 | Planned: STEM + AEVA anti-hallucination layer |
+
+ The delta between Rabbit and Raccoon is the story. One epoch → two epochs, seq_len 64 → 512, 2.6B → 6.1B. Same pipeline, same hardware philosophy. **Give us more resources and watch what happens.**

 ---

 ## Usage

+ This model uses a custom SSM architecture. Standard HuggingFace `AutoModel` is not supported.

 ```python
 # Requires: rtaforge-substrates + torch, transformers

 ```
 @misc{rtaforge2026rabbit,
+ title = {Rabbit-RtaSSM: Anvaya 2.7B State Space Model (Proof of Concept)},
 author = {RtaForge},
 year = {2026},
 url = {https://huggingface.co/RtaForge/Anvaya-Raccoon2.7B}