---
license: agpl-3.0
language:
- en
tags:
- text-generation
- smol
- permacomputer
- bandit-curriculum
pipeline_tag: text-generation
---
# ANDREA-12M
**A**utonomous **N**eural **D**ata **R**ecipe for **E**ducation and **A**gency
A 12.8M parameter language model grown on a single RTX 4090 using a bandit-controlled curriculum.
Part of the permacomputer project – open source, open data, open weights.
## Model Details
| Property | Value |
|----------|-------|
| Parameters | 12.8M |
| Architecture | Transformer decoder, 384d/12h/6L |
| Embedding dim | 384 |
| Heads | 12 |
| Layers | 6 |
| Context | 1024 tokens |
| Tokenizer | Harris morpheme (2048 segments, 2305 vocab) |
| Training steps | 43,587 |
| Final SMMA loss | 2.0 |
| Best single-step loss | 0.21 |
| Training time | ~72 hours |
| Hardware | Single NVIDIA RTX 4090 (24GB VRAM, 1.4GB used) |
| CUDA engine | microgpt_cuda.cu (custom, FP32) |
| Born | 2026-03-21 12:53 UTC / 08:53 EDT |
| License | AGPL-3.0 |
## Files
| File | Step | Description |
|------|------|-------------|
| `ANDREA-12M.bin` | 43,587 | Final checkpoint (SMMA 2.0) |
| `ANDREA-12M-best.bin` | 42,300 | Best checkpoint (lowest loss during training) |
| `harris_segments.json` | – | Harris tokenizer segments (required for inference and fine-tuning) |
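The exact segmentation algorithm lives in the microgpt code; as an illustration only, here is a minimal greedy longest-match segmenter over a fixed inventory like the one in `harris_segments.json` (the JSON structure and `max_len` cutoff are assumptions, not the documented format):

```python
import json

def load_segments(path):
    """Load a segment inventory (assumed here to be a JSON list of strings)."""
    with open(path) as f:
        return set(json.load(f))

def segment_greedy(text, segments, max_len=16):
    """Greedy longest-match segmentation over a fixed inventory.
    Characters not covered by any segment fall back to single-char tokens."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in segments:
                tokens.append(piece)
                i += length
                break
    return tokens
```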
### Checkpoint format
Binary, little-endian: `[int32 step][int32 n_params][n_params Γ float32 weights][n_params Γ float32 m][n_params Γ float32 v]`
- **Weights**: model parameters (12.8M floats, ~49MB)
- **m**: Adam first moment (same size)
- **v**: Adam second moment (same size)
- Total: ~147MB per checkpoint
Use either checkpoint to resume fine-tuning (weights + optimizer state preserved)
or extract weights only for inference (first `n_params` floats after the 8-byte header).
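The layout above can be parsed directly with the standard library. A minimal reader sketch (the function name and `weights_only` flag are illustrative, not part of microgpt):

```python
import struct
import numpy as np

def read_checkpoint(path, weights_only=False):
    """Parse an ANDREA checkpoint:
    [int32 step][int32 n_params][weights f32][m f32][v f32], little-endian."""
    with open(path, 'rb') as f:
        step, n_params = struct.unpack('<ii', f.read(8))
        weights = np.fromfile(f, dtype='<f4', count=n_params)
        if weights_only:
            return step, weights            # inference: skip optimizer state
        m = np.fromfile(f, dtype='<f4', count=n_params)  # Adam first moment
        v = np.fromfile(f, dtype='<f4', count=n_params)  # Adam second moment
        return step, weights, m, v
```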
## Training Data
Trained on a curated mix of open conversational and educational data:
- **NousResearch/Hermes-3-Dataset** (general, creative, roleplay) – 590K conversations
- **Dictionary** – 88K word definitions distilled from Hermes 3 8B
- **Gutenberg** – public domain literature (Project Gutenberg)
- Additional: chat, smoltalk, oasst, dolly, IRC, repo-docs
The data mix is controlled by a UCB1 multi-armed bandit with dice-based phase control.
The bandit dynamically adjusts source weights during training based on per-source
loss trajectories. The full curriculum specification is in the white paper.
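For intuition, the arm-selection rule is standard UCB1: pick the source with the best mean reward plus an exploration bonus. This is a generic sketch, not ANDREA's actual reward attribution (which, per the recipe below, also includes epoch penalties and source floors):

```python
import math

def ucb1_pick(counts, rewards, c=2.0):
    """UCB1 arm selection over data sources.
    counts[i]: times source i was sampled; rewards[i]: cumulative reward.
    Unplayed arms are tried first; otherwise maximize mean + sqrt(c*ln(N)/n)."""
    total = sum(counts)
    best_arm, best_score = 0, float('-inf')
    for arm, (n, r) in enumerate(zip(counts, rewards)):
        if n == 0:
            return arm  # explore every arm at least once
        score = r / n + math.sqrt(c * math.log(total) / n)
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm
```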
## Training Recipe
- Harris morpheme tokenizer (2048 segments)
- Cosine LR schedule with warm restart at step 25K (0.0004 peak)
- Phase-based bandit: 2 focus arms, 1d3 dice, source floors
- Checkpoints every 100 steps, SIGTERM-safe
- Per-source reward attribution, epoch penalty, coverage tracking
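A sketch of the LR schedule described above: cosine decay from a 0.0004 peak with a single warm restart at step 25K, over 43,587 total steps. The warmup length and LR floor are illustrative assumptions, not values from the recipe:

```python
import math

def lr_at(step, peak=4e-4, warmup=1000, restart=25_000, total=43_587, floor=0.0):
    """Cosine LR schedule with one warm restart at `restart`.
    Each segment ramps linearly to `peak`, then decays by cosine to `floor`."""
    start, end = (restart, total) if step >= restart else (0, restart)
    s = step - start
    if s < warmup:
        return peak * s / warmup                   # linear warmup
    frac = (s - warmup) / max(1, end - start - warmup)
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * min(1.0, frac)))
```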
## Capabilities
ANDREA-12M learns patterns, not facts. At 12.8M parameters it produces:
- Correct Q&A turn structure (`> question / < answer`)
- Definition-style responses
- Multi-sentence outputs with plausible grammar
- Instruction-following scaffolding ("explain", "define", "describe")
It does NOT produce factually accurate content – it's a pattern machine.
Factual accuracy requires scaling to ANDREA-120M (planned).
## Usage
```python
# Inference via microgpt
from microgpt import load_model, generate_fast

model = load_model('ANDREA-12M.json')
# Positional args: embedding dim, heads, layers, context length
results = generate_fast(model['state_dict'], model['uchars'], model['bos'],
                        384, 12, 6, 1024, prefix='> what is an apple? / <')
print(results[0][0])
```
## White Paper
[ANDREA-12M-WHITEPAPER.pdf](ANDREA-12M-WHITEPAPER.pdf) – full technical paper covering architecture, bandit curriculum, data sources, training recipe, and results.
Source: `whitepaper/ANDREA/WHITEPAPER.rst` in the [uncloseai-cli repository](https://git.unturf.com/engineering/unturf/uncloseai-cli).
## Citation
```
ANDREA: Autonomous Neural Data Recipe for Education and Agency
TimeHexOn, foxhop, russell@unturf
March 2026, permacomputer.com
```
## License
AGPL-3.0. Code outlasts authors. Infrastructure outlasts builders.