---
license: agpl-3.0
language:
- en
tags:
- text-generation
- smol
- permacomputer
- bandit-curriculum
pipeline_tag: text-generation
---

# ANDREA-12M
|
|
**A**utonomous **N**eural **D**ata **R**ecipe for **E**ducation and **A**gency
|
|
A 12.8M-parameter language model grown on a single RTX 4090 using a bandit-controlled curriculum.
Part of the permacomputer project – open source, open data, open weights.
|
|
## Model Details
|
|
| Property | Value |
|----------|-------|
| Parameters | 12.8M |
| Architecture | Transformer decoder, 384d / 12h / 6L |
| Embedding dim | 384 |
| Heads | 12 |
| Layers | 6 |
| Context | 1024 tokens |
| Tokenizer | Harris morpheme (2048 segments, 2305 vocab) |
| Training steps | 43,587 |
| Final SMMA loss | 2.0 |
| Best single-step loss | 0.21 |
| Training time | ~72 hours |
| Hardware | Single NVIDIA RTX 4090 (24 GB VRAM, 1.4 GB used) |
| CUDA engine | microgpt_cuda.cu (custom, FP32) |
| Born | 2026-03-21 12:53 UTC / 08:53 EDT |
| License | AGPL-3.0 |
| |
## Files

| File | Step | Description |
|------|------|-------------|
| `ANDREA-12M.bin` | 43,587 | Final checkpoint (SMMA 2.0) |
| `ANDREA-12M-best.bin` | 42,300 | Best checkpoint (lowest loss during training) |
| `harris_segments.json` | – | Harris tokenizer segments (required for inference and fine-tuning) |
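
The segment inventory in `harris_segments.json` is what the model's tokenizer runs over. As a rough sketch (the file's schema and the exact matching rule are assumptions here, not taken from the card), applying such an inventory with greedy longest-match looks like:

```python
import json

def load_segments(path='harris_segments.json'):
    # Assumed schema: a JSON list of segment strings. The real file may differ.
    with open(path) as f:
        return json.load(f)

def tokenize(text, segments):
    """Greedy longest-match over a fixed segment inventory (matching rule assumed)."""
    segs = sorted(segments, key=len, reverse=True)  # prefer longer segments
    out, i = [], 0
    while i < len(text):
        for s in segs:
            if text.startswith(s, i):
                out.append(s)
                i += len(s)
                break
        else:
            out.append(text[i])  # unknown character: fall back to a single char
            i += 1
    return out
```

Harris-style segmentation refers to deriving the inventory itself from successor-frequency statistics; the sketch above only covers applying a finished inventory.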
|
|
### Checkpoint format
|
|
Binary, little-endian: `[int32 step][int32 n_params][n_params × float32 weights][n_params × float32 m][n_params × float32 v]`
|
|
- **Weights**: model parameters (12.8M floats, ~49 MB)
- **m**: Adam first moment (same size)
- **v**: Adam second moment (same size)
- Total: ~147 MB per checkpoint
|
|
Use either checkpoint to resume fine-tuning (weights + optimizer state preserved)
or extract weights only for inference (the first `n_params` floats after the 8-byte header).
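
Following that layout, pulling the weights out for inference takes only a few lines. This is a sketch (`load_weights` is an illustrative helper, not a microgpt API):

```python
import struct

import numpy as np

def load_weights(path):
    """Read a checkpoint and return (step, weights), skipping the Adam state."""
    with open(path, 'rb') as f:
        step, n_params = struct.unpack('<ii', f.read(8))  # two little-endian int32s
        weights = np.fromfile(f, dtype='<f4', count=n_params)
        # To resume training instead, read two more n_params-float blocks (m, v).
    return step, weights
```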
|
|
## Training Data
|
|
Trained on a curated mix of open conversational and educational data:
|
|
- **NousResearch/Hermes-3-Dataset** (general, creative, roleplay) – 590K conversations
- **Dictionary** – 88K word definitions distilled from Hermes 3 8B
- **Gutenberg** – public-domain literature (Project Gutenberg)
- Additional: chat, smoltalk, oasst, dolly, IRC, repo-docs
|
|
Data mix controlled by a UCB1 multi-armed bandit with dice-based phase control.
The bandit dynamically adjusts source weights during training based on per-source
loss trajectories. Full curriculum specification in the white paper.
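
The UCB1 core is small; a textbook sketch (per-source reward attribution, source floors, and the dice-based phase logic are deliberately left out) might look like:

```python
import math

class UCB1:
    """Pick the arm whose mean reward plus exploration bonus is highest."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}  # running mean reward per arm
        self.t = 0

    def select(self):
        self.t += 1
        for a in self.arms:            # play every arm once before using the bonus
            if self.counts[a] == 0:
                return a
        return max(self.arms, key=lambda a: self.values[a]
                   + math.sqrt(2 * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):     # reward could be, e.g., loss improvement
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In training terms each arm is a data source, and the reward would be derived from that source's loss trajectory.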
|
|
## Training Recipe
|
|
- Harris morpheme tokenizer (2048 segments)
- Cosine LR schedule with warm restart at step 25K (0.0004 peak)
- Phase-based bandit: 2 focus arms, 1d3 dice, source floors
- Checkpoints every 100 steps, SIGTERM-safe
- Per-source reward attribution, epoch penalty, coverage tracking
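
The schedule's shape can be sketched as follows; the 0.0004 peak and the 25K restart come from the list above, while the warmup length and exact cycle boundaries are assumptions:

```python
import math

PEAK = 4e-4       # peak learning rate (from the recipe)
RESTART = 25_000  # warm-restart step (from the recipe)
TOTAL = 43_587    # total training steps (from the model card)
WARMUP = 500      # warmup length: an assumption, not stated in the card

def lr_at(step):
    """Linear warmup, cosine decay, and one warm restart at RESTART."""
    if step >= RESTART:
        step, total = step - RESTART, TOTAL - RESTART  # second cycle
    else:
        total = RESTART                                # first cycle
    if step < WARMUP:
        return PEAK * step / WARMUP                    # linear warmup
    progress = (step - WARMUP) / max(1, total - WARMUP)
    return 0.5 * PEAK * (1 + math.cos(math.pi * min(1.0, progress)))
```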
|
|
## Capabilities
|
|
ANDREA-12M learns patterns, not facts. At 12.8M parameters it produces:

- Correct Q&A turn structure (`> question / < answer`)
- Definition-style responses
- Multi-sentence outputs with plausible grammar
- Instruction-following scaffolding ("explain", "define", "describe")
|
|
It does NOT produce factually accurate content – it's a pattern machine.
Factual accuracy requires scaling to ANDREA-120M (planned).
|
|
## Usage
|
|
```python
# Inference via microgpt
from microgpt import load_model, generate_fast

model = load_model('ANDREA-12M.json')
results = generate_fast(model['state_dict'], model['uchars'], model['bos'],
                        384, 12, 6, 1024, prefix='> what is an apple? / <')
print(results[0][0])
```
|
|
## White Paper
|
|
[ANDREA-12M-WHITEPAPER.pdf](ANDREA-12M-WHITEPAPER.pdf) – full technical paper covering architecture, bandit curriculum, data sources, training recipe, and results.
|
|
Source: `whitepaper/ANDREA/WHITEPAPER.rst` in the [uncloseai-cli repository](https://git.unturf.com/engineering/unturf/uncloseai-cli).
|
|
## Citation

```
ANDREA: Autonomous Neural Data Recipe for Education and Agency
TimeHexOn, foxhop, russell@unturf
March 2026, permacomputer.com
```
|
|
## License

AGPL-3.0. Code outlasts authors. Infrastructure outlasts builders.
|
|
|
|