# RAE Training Methodology

**Recursive Abstraction Engine as Training-Time Cognitive Installation**

**The handwriting principle:** Slow, multi-representational, generative reconstruction during training installs richer internal representations → producing fast, effortless retrieval at inference. The hand was slow so the mind could be fast later.
## Core Thesis

Standard fine-tuning trains models on flat input → output pairs. This is typing: discriminative lookup from heavy context. RAE Training forces models through multi-phase generative reconstruction, creating the neural equivalent of handwriting:
| Property | Handwriting | RAE Training |
|---|---|---|
| Forced sequential reconstruction | Must regenerate each letter from memory | Must generate each cognitive phase from internal state |
| Multi-pathway co-firing | Motor + visual + spatial + linguistic | Saturation + abstraction + descent + integration |
| Temporal bottleneck | Slowness forces deeper encoding | Multi-phase chain forces richer weight geometry |
| Variability | No two handwritten letters identical | Stochastic phase generation prevents rote memorization |
| Closed-loop embodiment | Proprioceptive error correction | Phase-to-phase coherence loss creates self-correction |
## Architecture

```
┌───────────────────────────────────────────────────────────────────┐
│                       RAE TRAINING PIPELINE                       │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  ┌──────────┐    ┌───────────┐    ┌──────────┐    ┌──────────┐    │
│  │SATURATION│───►│ABSTRACTION│───►│ DESCENT  │───►│INTEGRATE │    │
│  │  tokens  │    │  tokens   │    │  tokens  │    │  tokens  │    │
│  └──────────┘    └───────────┘    └──────────┘    └──────────┘    │
│       ▲                                                │          │
│       └────────────────────────────────────────────────┘          │
│                                                                   │
│  Loss = λ₁·L_sat + λ₂·L_abs + λ₃·L_desc + λ₄·L_int                │
│       + λ_coh·L_coherence + λ_comp·L_compression                  │
│                                                                   │
│  Key: ALL phases contribute to loss, not just final answer        │
└───────────────────────────────────────────────────────────────────┘
```
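The "all phases contribute to loss" idea can be sketched as a phase-weighted average of per-token losses, where each λ from the formula above scales the tokens of its phase. This is a minimal illustration in plain Python; the function name, data layout, and specific weight values are our assumptions, not the repository's actual API:

```python
# Illustrative sketch: weight per-token losses by RAE phase before averaging.
# The weights correspond to the lambdas in the loss formula above; the exact
# values and this helper's signature are assumptions, not the repo's API.

PHASE_WEIGHTS = {
    "saturation": 1.0,   # λ₁
    "abstraction": 1.0,  # λ₂
    "descent": 1.5,      # λ₃ (e.g. emphasizing prediction quality)
    "integration": 1.0,  # λ₄
}

def phase_weighted_loss(token_losses, token_phases, weights=PHASE_WEIGHTS):
    """Weighted mean of per-token losses, scaled by each token's phase weight.

    token_losses: per-token cross-entropy values for the target sequence.
    token_phases: phase label for each token, aligned with token_losses.
    """
    if len(token_losses) != len(token_phases):
        raise ValueError("losses and phase labels must align")
    weighted_sum = sum(weights[p] * l for l, p in zip(token_losses, token_phases))
    weight_total = sum(weights[p] for p in token_phases)
    return weighted_sum / weight_total
```

Because every phase has a nonzero weight, the model cannot drive the loss down by getting only the final integration tokens right.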
## Training Objectives (Multi-Objective Co-Training)

- **Phase Generation Loss** – each RAE phase must be generated correctly
- **Cross-Phase Coherence Loss** – abstractions must logically follow from saturation
- **Compression Loss** – abstraction phase penalized for being longer than saturation
- **Prediction Accuracy Loss** – descent-phase predictions evaluated against ground truth
- **Integration Quality Loss** – final synthesis must incorporate phase outputs
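The compression objective above could be realized as a simple hinge on phase token counts: zero penalty when the abstraction is no longer than the saturation, growing linearly with the relative overshoot. A minimal sketch under that assumption (the actual penalty in `src/rae_loss.py` may differ):

```python
def compression_loss(saturation_len, abstraction_len):
    """Hinge penalty for an abstraction that outgrows its saturation phase.

    saturation_len, abstraction_len: token counts for each phase.
    Returns 0.0 when abstraction_len <= saturation_len, otherwise the
    relative overshoot. Illustrative formulation, not the repo's exact loss.
    """
    if saturation_len <= 0:
        raise ValueError("saturation phase must be non-empty")
    overshoot = abstraction_len - saturation_len
    return max(0.0, overshoot / saturation_len)
```

A hinge (rather than a symmetric penalty) avoids rewarding degenerate, over-short abstractions: only failing to compress is penalized.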
## Quick Start

### Option A: AutoTrain (No-Code)

```bash
pip install autotrain-advanced
autotrain --config configs/autotrain_rae_sft.yaml
```

### Option B: Custom Trainer (Full Control)

```bash
pip install -r requirements.txt
python src/train_rae.py --config configs/rae_training_config.json
```

### Option C: HuggingFace Spaces

Upload to a Space with a GPU; see `scripts/deploy_to_hf_space.sh`.
## Dataset Format

RAE training data uses JSONL with structured multi-phase reasoning:

```json
{
  "messages": [
    {"role": "system", "content": "You are an RAE-trained reasoner..."},
    {"role": "user", "content": "<problem>"},
    {"role": "assistant", "content": "<SATURATION>...</SATURATION><ABSTRACTION>...</ABSTRACTION><DESCENT>...</DESCENT><INTEGRATION>...</INTEGRATION>"}
  ]
}
```
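A record in this format can be validated by checking that the assistant turn contains each of the four phase tags. A minimal sketch; the helper name is ours for illustration, not part of the repository's formatter API:

```python
import json
import re

PHASES = ["SATURATION", "ABSTRACTION", "DESCENT", "INTEGRATION"]

def parse_rae_phases(assistant_content):
    """Extract the four RAE phase bodies from an assistant message.

    Returns a dict mapping phase name -> inner text. Raises ValueError
    if any phase tag is missing. Illustrative helper, not the repo's API.
    """
    phases = {}
    for name in PHASES:
        match = re.search(rf"<{name}>(.*?)</{name}>", assistant_content, re.DOTALL)
        if match is None:
            raise ValueError(f"missing phase: {name}")
        phases[name] = match.group(1)
    return phases

# Usage with a (toy) JSONL record in the format shown above:
record = json.loads(
    '{"messages": [{"role": "assistant", "content": '
    '"<SATURATION>facts</SATURATION><ABSTRACTION>rules</ABSTRACTION>'
    '<DESCENT>prediction</DESCENT><INTEGRATION>answer</INTEGRATION>"}]}'
)
phases = parse_rae_phases(record["messages"][-1]["content"])
```

Running a check like this over a dataset before training catches records where a phase was dropped or a closing tag was malformed.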
## Files

```
rae-training/
├── README.md                      # This file
├── requirements.txt               # Python dependencies
├── configs/
│   ├── autotrain_rae_sft.yaml     # AutoTrain config (no-code path)
│   ├── rae_training_config.json   # Custom trainer config
│   └── base_models.json           # Tested base model registry
├── src/
│   ├── dataset_generator.py       # Generates RAE-structured training data
│   ├── rae_data_formatter.py      # Formats raw data into RAE phases
│   ├── train_rae.py               # Custom RAE trainer with multi-phase loss
│   ├── rae_loss.py                # Multi-objective loss functions
│   └── rae_tokenizer_utils.py     # Phase-aware tokenization
├── evaluation/
│   ├── eval_rae_model.py          # Evaluation harness
│   └── benchmarks.json            # Test problems for before/after comparison
├── data/
│   └── seed_problems.jsonl        # Seed problems for dataset generation
└── scripts/
    ├── generate_dataset.sh        # End-to-end dataset generation
    ├── run_training.sh            # Training launcher
    └── deploy_to_hf_space.sh      # HF Spaces deployment
```
## Theory: Why This Works

See the companion document `THEORY.md` for the full neuroscience-to-ML mapping.

**TL;DR:** Handwriting activates widespread brain connectivity because it forces generative reconstruction through multiple representational modalities simultaneously under a temporal bottleneck. RAE training replicates this by forcing the model to traverse Saturation → Abstraction → Descent → Integration phases, with loss computed on ALL phases, meaning the model cannot shortcut to the answer. The multi-phase structure installs richer weight geometry that persists as faster, more capable inference after training.