RAE Training Methodology

Recursive Abstraction Engine as Training-Time Cognitive Installation

The handwriting principle: slow, multi-representational, generative reconstruction during training is intended to install richer internal representations, which should in turn make retrieval fast and effortless at inference. The hand was slow so the mind could be fast later.

Core Thesis

Standard fine-tuning trains models on flat input → output pairs. In the analogy, this is typing: discriminative lookup from heavy context. RAE Training instead forces models through multi-phase generative reconstruction, which is meant to be the neural equivalent of handwriting:

Property	Handwriting	RAE Training
Forced sequential reconstruction	Must regenerate each letter from memory	Must generate each cognitive phase from internal state
Multi-pathway co-firing	Motor + visual + spatial + linguistic	Saturation + abstraction + descent + integration
Temporal bottleneck	Slowness forces deeper encoding	Multi-phase chain forces richer weight geometry
Variability	No two handwritten letters identical	Stochastic phase generation prevents rote memorization
Closed-loop embodiment	Proprioceptive error correction	Phase-to-phase coherence loss creates self-correction

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                  RAE TRAINING PIPELINE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐  │
│  │SATURATION│───►│ABSTRACTION│───►│ DESCENT  │───►│INTEGRATE │  │
│  │  tokens  │    │  tokens   │    │  tokens  │    │  tokens  │  │
│  └──────────┘    └──────────┘    └──────────┘    └──────────┘  │
│       ▲                                               │         │
│       └───────────────────────────────────────────────┘         │
│                                                                  │
│  Loss = λ₁·L_sat + λ₂·L_abs + λ₃·L_desc + λ₄·L_int           │
│       + λ_coh·L_coherence + λ_comp·L_compression                │
│                                                                  │
│  Key: ALL phases contribute to loss, not just final answer      │
└─────────────────────────────────────────────────────────────────┘
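
The weighted sum in the diagram can be sketched in a few lines. This is a minimal illustration, not the contents of src/rae_loss.py: the function name, the dictionary keys, and the default weight values are all assumptions chosen for the example, and the per-term losses are taken as already computed.

```python
# Hypothetical names and weights; the real rae_loss.py may differ.
DEFAULT_WEIGHTS = {
    "sat": 1.0,          # lambda_1, saturation phase
    "abs": 1.0,          # lambda_2, abstraction phase
    "desc": 1.0,         # lambda_3, descent phase
    "int": 1.0,          # lambda_4, integration phase
    "coherence": 0.5,    # lambda_coh (assumed value)
    "compression": 0.1,  # lambda_comp (assumed value)
}


def combine_rae_losses(losses, weights=DEFAULT_WEIGHTS):
    """Return the weighted sum of all six loss terms.

    Every term named in `weights` must be present in `losses`, so no
    phase can be silently dropped from the objective.
    """
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)


# Illustrative per-term values, not measured results.
example = {
    "sat": 2.0, "abs": 1.0, "desc": 1.5, "int": 0.5,
    "coherence": 1.0, "compression": 2.0,
}
total = combine_rae_losses(example)
```

Raising on a missing term, rather than defaulting it to zero, mirrors the key point in the diagram: every phase contributes to the loss.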

Training Objectives (Multi-Objective Co-Training)

Phase Generation Loss: each RAE phase must be generated correctly
Cross-Phase Coherence Loss: the abstraction must follow logically from the saturation phase
Compression Loss: the abstraction phase is penalized for being longer than the saturation phase
Prediction Accuracy Loss: descent-phase predictions are evaluated against ground truth
Integration Quality Loss: the final synthesis must incorporate the outputs of the earlier phases
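
One way to read the compression objective is as a hinge on token counts: zero while the abstraction is no longer than the saturation, and growing linearly once it is. The sketch below assumes that reading; the function name and the normalization by saturation length are illustrative choices, not the repository's implementation.

```python
def compression_penalty(n_sat_tokens, n_abs_tokens):
    """Hinge penalty on abstraction length relative to saturation length.

    Returns 0.0 when the abstraction is no longer than the saturation,
    otherwise the excess length as a fraction of the saturation length.
    Hypothetical helper; the normalization is an assumption.
    """
    if n_sat_tokens <= 0:
        raise ValueError("saturation phase must contain at least one token")
    excess = max(0, n_abs_tokens - n_sat_tokens)
    return excess / n_sat_tokens


shorter = compression_penalty(n_sat_tokens=100, n_abs_tokens=40)
longer = compression_penalty(n_sat_tokens=100, n_abs_tokens=150)
```

A raw token count is not differentiable with respect to model weights, so a penalty of this shape would more plausibly act as a data filter or reward signal than as a gradient term, unless a differentiable surrogate is substituted.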

Quick Start

Option A: AutoTrain (No-Code)

pip install autotrain-advanced
autotrain --config configs/autotrain_rae_sft.yaml

Option B: Custom Trainer (Full Control)

pip install -r requirements.txt
python src/train_rae.py --config configs/rae_training_config.json

Option C: HuggingFace Spaces

Upload to a Space with GPU — see scripts/deploy_to_hf_space.sh

Dataset Format

RAE training data uses JSONL with structured multi-phase reasoning:

{
  "messages": [
    {"role": "system", "content": "You are an RAE-trained reasoner..."},
    {"role": "user", "content": "<problem>"},
    {"role": "assistant", "content": "<SATURATION>...</SATURATION><ABSTRACTION>...</ABSTRACTION><DESCENT>...</DESCENT><INTEGRATION>...</INTEGRATION>"}
  ]
}
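
A record in this format can be checked before training with a small parser. The sketch below uses only the standard library and assumes each phase appears exactly once, in pipeline order, inside a single assistant message; the helper names `split_phases` and `parse_record` and the sample text are hypothetical and do not come from src/rae_data_formatter.py.

```python
import json
import re

PHASES = ("SATURATION", "ABSTRACTION", "DESCENT", "INTEGRATION")


def split_phases(content):
    """Return {phase_name: text}, requiring all four tags in pipeline order."""
    phases = {}
    pos = 0
    for name in PHASES:
        pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
        # Search only after the previous phase so ordering is enforced.
        match = pattern.search(content, pos)
        if match is None:
            raise ValueError(f"phase {name} is missing or out of order")
        phases[name] = match.group(1).strip()
        pos = match.end()
    return phases


def parse_record(line):
    """Parse one JSONL line and split its assistant message into phases."""
    record = json.loads(line)
    assistant = [m for m in record["messages"] if m["role"] == "assistant"]
    if len(assistant) != 1:
        raise ValueError("expected exactly one assistant message")
    return split_phases(assistant[0]["content"])


# Illustrative record; the phase text is made up for the example.
sample_line = json.dumps({
    "messages": [
        {"role": "system", "content": "You are an RAE-trained reasoner..."},
        {"role": "user", "content": "Why does ice float?"},
        {"role": "assistant", "content": (
            "<SATURATION>Ice is less dense than liquid water.</SATURATION>"
            "<ABSTRACTION>Density decides buoyancy.</ABSTRACTION>"
            "<DESCENT>A denser solid would sink.</DESCENT>"
            "<INTEGRATION>Ice floats because it is less dense.</INTEGRATION>"
        )},
    ]
})
parsed = parse_record(sample_line)
```

Running a check like this over the whole JSONL file before training would catch records with a missing or misordered phase, which would otherwise leave a per-phase loss term without a target.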

Files

rae-training/
├── README.md                          # This file
├── requirements.txt                   # Python dependencies
├── configs/
│   ├── autotrain_rae_sft.yaml        # AutoTrain config (no-code path)
│   ├── rae_training_config.json      # Custom trainer config
│   └── base_models.json              # Tested base model registry
├── src/
│   ├── dataset_generator.py          # Generates RAE-structured training data
│   ├── rae_data_formatter.py         # Formats raw data into RAE phases
│   ├── train_rae.py                  # Custom RAE trainer with multi-phase loss
│   ├── rae_loss.py                   # Multi-objective loss functions
│   └── rae_tokenizer_utils.py        # Phase-aware tokenization
├── evaluation/
│   ├── eval_rae_model.py             # Evaluation harness
│   └── benchmarks.json               # Test problems for before/after comparison
├── data/
│   └── seed_problems.jsonl           # Seed problems for dataset generation
└── scripts/
    ├── generate_dataset.sh           # End-to-end dataset generation
    ├── run_training.sh               # Training launcher
    └── deploy_to_hf_space.sh         # HF Spaces deployment

Theory: Why This Works

See the companion document THEORY.md for the full neuroscience-to-ML mapping.

TL;DR: The premise is that handwriting activates widespread brain connectivity because it forces generative reconstruction through multiple representational modalities simultaneously under a temporal bottleneck. RAE training aims to replicate this by forcing the model to traverse the Saturation → Abstraction → Descent → Integration phases, with loss computed on ALL phases, so the model cannot shortcut to the answer. The multi-phase structure is intended to install richer weight geometry that persists as faster, more capable inference after training.
