docs: reframe as PoC, add step-1500 internal evals, honest lm-eval disclaimer
README.md (CHANGED)
@@ -19,6 +19,28 @@ tags:

---

## Model Lineage

```
@@ -40,35 +62,24 @@ representations from a source architecture into a structurally distinct target m

---

- ##
-
- Rabbit-RtaSSM is a 2.7B parameter State Space Model (SSM) trained by [RtaForge](https://rtaforge.in)
- as part of the **Anvaya** small language model series. It uses the proprietary **Durga fu-64**
- architecture – a custom SSM variant with fortress layers and constitutional governance via the
- Gurukul training framework.
-
- Rabbit is the fast, general-purpose runner of the Anvaya trio (Rabbit · Raccoon · Polar Bear),
- optimised for high-throughput instruction following, logic, math, STEM, and tool dispatch.
-
- ### Architecture

| Property | Value |
|----------|-------|
| Architecture | Durga fu-64 (custom SSM) |
| Base lineage | Mamba2 2.7B (weight subsumination) |
- | Parameters | ~2.
| Tokenizer | EleutherAI/gpt-neox-20b (vocab 50,280) |
-
| Optimizer | Lion (lr 1e-5) |
| Training framework | Gurukul Phase 2 Hardened |

---

## Training Curriculum

-
-
- ### Campaign 1 – 8 phases, ~15,000 steps

| Phase | Steps | Dataset | Focus |
|-------|-------|---------|-------|
@@ -81,46 +92,64 @@ Two campaigns on an NVIDIA L4 GPU (Ace Cloud):
| 6 | 2,000 | Glaive alignment | Alignment |
| 7 | 1,500 | Glaive alignment | Alignment |

-
-
- Phase 5 saturation (Logic Giants corpus), Lion lr=1e-5.
- Final base checkpoint: **Step 1,500**.

---

- ## Evaluation Results

-
-

-
- |--------|--------|-------------|---------|------|
- | Biology | Top-1 Accuracy | baseline | **10× baseline** | +10× |
- | Chemistry | Top-1 Accuracy | baseline | **10× baseline** | +10× |
- | Deep Math | MRR | 0.008 | **0.186** | **+22×** |

- *

---

- ##

-
-
-
-
-
-
-
- └── training_logs_1500.zip
- ```

---

## Usage

- This model uses a custom SSM architecture
- Standard HuggingFace `AutoModel` is not supported.

```python
# Requires: rtaforge-substrates + torch, transformers
@@ -155,7 +184,7 @@ The model weights in this repository are licensed under

```
@misc{rtaforge2026rabbit,
- title = {Rabbit-RtaSSM: Anvaya 2.7B State Space Model},
author = {RtaForge},
year = {2026},
url = {https://huggingface.co/RtaForge/Anvaya-Raccoon2.7B}

---

+ ## ⚠️ This is a Proof of Concept
+
+ **Rabbit is not a finished product. It is not meant to be.**
+
+ This is the first public model in the Anvaya family – a single-epoch run on a single NVIDIA L4 GPU, trained to validate the architecture, the training pipeline, and the weight subsumination technique. It is a flag planted, not a summit reached.
+
+ What this model demonstrates:
+ - The **Durga fu-64** SSM architecture trains and converges
+ - **Weight subsumination** from Mamba2 works (patent pending)
+ - The **Gurukul** constitutional training framework functions at scale
+ - A 2.6B SSM can learn meaningful representations on a single L4 in one epoch
+
+ What this model is not:
+ - A competitor to GPT-4, Claude, or Gemini
+ - A production-ready assistant
+ - The best we can do – not even close
+
+ **Raccoon (6.1B, seq_len=512, reasoning-heavy curriculum) and Polar Bear are in training.**
+ The benchmark story gets told there.
+
+ ---
+

## Model Lineage

```

---

+ ## Architecture

| Property | Value |
|----------|-------|
| Architecture | Durga fu-64 (custom SSM) |
| Base lineage | Mamba2 2.7B (weight subsumination) |
+ | Parameters | ~2.6B |
| Tokenizer | EleutherAI/gpt-neox-20b (vocab 50,280) |
+ | Training seq length | 64 |
| Optimizer | Lion (lr 1e-5) |
+ | Training hardware | Single NVIDIA L4 (24GB) |
| Training framework | Gurukul Phase 2 Hardened |

---

## Training Curriculum

+ One epoch, single L4, ~15,000 steps across 8 phases + 1,500-step Scholar Sprint.

| Phase | Steps | Dataset | Focus |
|-------|-------|---------|-------|
| 6 | 2,000 | Glaive alignment | Alignment |
| 7 | 1,500 | Glaive alignment | Alignment |

+ Final Scholar Sprint: 1,500 steps, Phase 5 saturation (Logic Giants corpus).
+ **Final checkpoint: Step 1,500.**
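
For readers who want to see how a phased schedule like the one above composes into a single run, here is a minimal, hypothetical sketch: an ordered list of (steps, corpus) phases consumed by one loop with the Lion optimizer at lr 1e-5. The toy model, the synthetic batches, and the corpus labels mirroring the table are stand-ins, not the Gurukul framework's actual API; `lion-pytorch` is an assumed dependency.

```python
# Hypothetical curriculum sketch - NOT the Gurukul training code.
# Phases are consumed in order by a single loop; the model and batches are toy
# stand-ins so the sketch runs end to end.
import torch
from torch import nn
from torch.nn import functional as F
from lion_pytorch import Lion  # assumed dependency (pip install lion-pytorch)

PHASES = [                         # (steps, corpus) - later rows of the table above
    (2_000, "glaive-alignment"),   # phase 6
    (1_500, "glaive-alignment"),   # phase 7
    (1_500, "logic-giants"),       # Scholar Sprint (Phase 5 saturation)
]

VOCAB, SEQ_LEN = 50_280, 64
model = nn.Sequential(nn.Embedding(VOCAB, 64), nn.Linear(64, VOCAB))  # stand-in LM
opt = Lion(model.parameters(), lr=1e-5)

for steps, corpus in PHASES:
    for _ in range(steps):
        batch = torch.randint(0, VOCAB, (4, SEQ_LEN))    # stand-in for a `corpus` batch
        logits = model(batch)                            # [B, T, VOCAB]
        loss = F.cross_entropy(logits[:, :-1].reshape(-1, VOCAB),
                               batch[:, 1:].reshape(-1)) # next-token LM loss
        opt.zero_grad(); loss.backward(); opt.step()
```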

---

+ ## Evaluation Results (Step 1,500)
+
+ ### Internal – Scale-Invariant Metrics
+
+ Evaluated using Top-K accuracy and Mean Reciprocal Rank vs. random-initialised baseline.
+ 50 samples per corpus, seq_len=64.
+
+ | Metric | Random Init | Trained (Step 1,500) | Gain |
+ |--------|-------------|----------------------|------|
+ | Top-1 Accuracy (aggregate) | 0.24% | **1.90%** | **~8×** |
+ | Top-10 Accuracy (aggregate) | 0.24% | **35.84%** | **~149×** |
+ | MRR (aggregate) | 0.0026 | **0.1724** | **~66×** |
+ | MRR – Deep Math | 0.0084 | **0.186** | **22×** |
+ | Top-10 – Biology | ~1.3% | **~12%** | **~10×** |
+ | Top-10 – Chemistry | ~1.3% | **~13%** | **~10×** |

+ These gains are measured against a randomly initialised model of identical architecture –
+ they reflect what the training curriculum taught, not absolute capability.
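
As a rough illustration of how such numbers can be produced, the sketch below computes Top-1/Top-K accuracy and MRR over next-token predictions from model logits. The metric definitions are standard; the `model` interface (token ids in, logits out) and the commented object names are assumptions, not the project's actual evaluation harness.

```python
# Sketch of the scale-invariant metrics: Top-K accuracy and MRR over next-token
# predictions. Assumes `model(input_ids)` returns logits of shape
# [batch, seq_len, vocab]; this is NOT the project's own evaluation code.
import torch

@torch.no_grad()
def next_token_metrics(model, input_ids, k=10):
    logits = model(input_ids)            # [B, T, V]
    preds = logits[:, :-1, :]            # predict token t+1 from the prefix up to t
    targets = input_ids[:, 1:]           # [B, T-1]

    # 1-based rank of the true next token under the predicted distribution
    order = preds.argsort(dim=-1, descending=True)                 # [B, T-1, V]
    ranks = (order == targets.unsqueeze(-1)).float().argmax(-1) + 1

    return {
        "top1": (ranks == 1).float().mean().item(),
        f"top{k}": (ranks <= k).float().mean().item(),
        "mrr": (1.0 / ranks.float()).mean().item(),
    }

# Usage sketch: running the same function on a trained checkpoint and on a freshly
# initialised copy of the same architecture yields the "Gain" column above.
#   trained_scores = next_token_metrics(trained_model, batch)      # hypothetical objects
#   random_scores  = next_token_metrics(random_init_model, batch)
```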

+ ### Commercial Benchmarks (lm-eval)

+ > **Important caveat**: Rabbit was trained at seq_len=64. Standard lm-eval prompts
+ > (few-shot examples + question) typically run 150–400 tokens. Scores below reflect
+ > inference at context lengths the model was not trained on.
+ > Raccoon (seq_len=512) will be evaluated without this constraint.
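
To make the context-length caveat concrete, the snippet below counts the tokens in a small few-shot style prompt using the tokenizer listed in the architecture table. The prompt text is invented for illustration; realistic lm-eval prompts with more shots run far longer than the 64-token training window.

```python
# Illustration of the seq_len=64 caveat: how many tokens a small few-shot prompt
# occupies under the model's tokenizer. The prompt text is made up; real
# few-shot lm-eval prompts are typically much longer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

few_shot_prompt = (
    "Question: Which gas do plants absorb during photosynthesis?\nAnswer: Carbon dioxide\n\n"
    "Question: What part of the cell produces most of its ATP?\nAnswer: The mitochondrion\n\n"
    "Question: What force keeps the planets in orbit around the Sun?\nAnswer:"
)

n_tokens = len(tok(few_shot_prompt)["input_ids"])
print(f"{n_tokens} prompt tokens vs. a training seq_len of 64")
```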
+
+ | Benchmark | Score | Notes |
+ |-----------|-------|-------|
+ | HellaSwag | TBD | |
+ | ARC-Challenge | TBD | |
+ | MMLU | TBD | Expect near-random due to long prompts |
+ | WinoGrande | TBD | |
+ | TruthfulQA | TBD | Alignment corpus benefit expected |
+
+ *lm-eval in progress – scores will be updated upon completion.*

---

+ ## What Comes Next

+ | Model | Params | seq_len | Status |
+ |-------|--------|---------|--------|
+ | **Rabbit** | 2.6B | 64 | ✅ This model |
+ | **Raccoon** | 6.1B | 512 | In training – reasoning-heavy curriculum (math ×2, logic ×2) |
+ | **Polar Bear** | ~13B | 512 | Planned – STEM + AEVA anti-hallucination layer |
+
+ The delta between Rabbit and Raccoon is the story. One epoch → two epochs, seq_len 64 → 512, 2.6B → 6.1B. Same pipeline, same hardware philosophy. **Give us more resources and watch what happens.**

---

## Usage

+ This model uses a custom SSM architecture. Standard HuggingFace `AutoModel` is not supported.

```python
# Requires: rtaforge-substrates + torch, transformers

```
@misc{rtaforge2026rabbit,
+ title = {Rabbit-RtaSSM: Anvaya 2.7B State Space Model (Proof of Concept)},
author = {RtaForge},
year = {2026},
url = {https://huggingface.co/RtaForge/Anvaya-Raccoon2.7B}