---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- gpt2
- from-scratch
- tinystories
- text-generation
- causal-lm
---
RecursiveComplete
A small GPT-2-style language model (~18.3M parameters) trained from scratch by an AI agent, end to end. The agent wrote and ran the architecture, training code, tokenizer, data prep, and training run, with no pre-existing weights and no fine-tuning from another model.
This is a text-completion model, not an instruction-tuned chatbot. It's good at continuing short prose and simple stories. It is not good at answering questions, following instructions, or factual recall.
Note: this is a custom-format model, not a transformers model. Load it with the included scripts (gpt2.py + chat.py), not AutoModelForCausalLM. The file names say GPT-2 only because the model uses the same architecture style.
Model details
| Type | Decoder-only transformer (GPT-2 style) |
| Parameters | ~18.3M |
| Embedding dim (n_embd) | 448 |
| Heads (n_head) | 7 |
| Layers (n_layer) | 6 |
| Context length (block_size) | 256 |
| Vocab size | 8192 |
| Tokenizer | Byte-level BPE (<eot> id = 0) |
| Dropout | 0.1 |
| Final train loss | ~1.86 |
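As a sanity check on the ~18.3M figure, the parameter count can be worked out from the table. This is a minimal sketch that assumes the standard GPT-2 layout: learned position embeddings, biases on every linear layer, a 4x MLP expansion, and an output head tied to the token embedding. None of these are confirmed by this card, and the function name `gpt2_param_count` is hypothetical; gpt2.py is the authority on the real layout.

```python
def gpt2_param_count(n_embd, n_layer, vocab_size, block_size,
                     mlp_ratio=4, tied_head=True):
    """Count parameters for an ASSUMED standard GPT-2 layout (not read from gpt2.py)."""
    wte = vocab_size * n_embd   # token embedding
    wpe = block_size * n_embd   # learned position embedding

    # Attention: fused QKV projection plus output projection, both with biases.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)

    # MLP: expand to mlp_ratio * n_embd, then project back, both with biases.
    hidden = mlp_ratio * n_embd
    mlp = (n_embd * hidden + hidden) + (hidden * n_embd + n_embd)

    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * (2 * n_embd)

    per_layer = attn + mlp + norms
    final_norm = 2 * n_embd
    # A tied head reuses wte, so it adds no parameters.
    head = 0 if tied_head else vocab_size * n_embd
    return wte + wpe + n_layer * per_layer + final_norm + head


total = gpt2_param_count(n_embd=448, n_layer=6, vocab_size=8192, block_size=256)
head_dim = 448 // 7  # n_embd / n_head
print(f"{total:,} parameters (~{total / 1e6:.1f}M), head dim {head_dim}")
# prints: 18,271,232 parameters (~18.3M), head dim 64
```

Under these assumptions the total is consistent with the ~18.3M in the table. An untied output head would add another 8192 x 448 weights and push the count to about 21.9M, so the match suggests, but does not prove, that the head is tied.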
Training data
Trained primarily on TinyStories (~90M tokens) with a small amount of Alpaca-style data. The model learned general English sentence structure and simple narrative flow, not world knowledge.
Files in this repo
| File | What it is |
|---|---|
| model.safetensors | The model weights |
| config.json | Architecture config (custom format) |
| gpt2.py | Model definition (the GPT-2-style architecture) |
| chat.py | Run / generate from the model |
| tokenizer_bpe/vocab.json, tokenizer_bpe/merges.txt | Byte-level BPE tokenizer |
| big.pt | Full training checkpoint (model + optimizer), for resuming training only |
| train_big.py, prep_bpe.py | Training and data-prep scripts |
Intended use
- Story / prose continuation
- Experimentation and education (a clean, fully-from-scratch small LM)
How to use
This model uses its own minimal code, not the transformers library.
# 1. Install deps
pip install torch tokenizers safetensors numpy
# 2. Download this repo (gives you the scripts + weights + tokenizer)
pip install huggingface_hub
hf download Gentraxyz/RecursiveComplete --local-dir RecursiveComplete
cd RecursiveComplete
# 3. Generate
python chat.py
chat.py loads gpt2.py (the architecture), the weights from model.safetensors, and the BPE tokenizer in tokenizer_bpe/, then lets you prompt the model for completions.
Tip: it's a completion model — give it the start of something ("Once upon a time there was a small robot who") rather than a question.
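To make "completion model" concrete, here is a rough sketch of the kind of autoregressive loop a script like chat.py typically runs. It is not the actual code in chat.py. The names `generate`, `sample_from_logits`, and `next_token_logits` are hypothetical, and the model is abstracted as any callable that maps a list of token ids to next-token logits. Only the context length (256) and the `<eot>` id (0) come from this card.

```python
import math
import random

BLOCK_SIZE = 256  # context length (block_size) from the model details
EOT_ID = 0        # <eot> token id from the model details


def sample_from_logits(logits, temperature, rng):
    """Sample one token id from raw logits with a temperature-scaled softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(next_token_logits, prompt_ids, max_new_tokens=50,
             temperature=0.8, seed=None):
    """Extend prompt_ids one token at a time until <eot> or the token budget.

    next_token_logits is a stand-in for the real model: it takes a list of
    token ids and returns one logit per vocabulary entry.
    """
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model only sees the most recent BLOCK_SIZE tokens.
        window = ids[-BLOCK_SIZE:]
        token = sample_from_logits(next_token_logits(window), temperature, rng)
        if token == EOT_ID:
            break  # the model signalled end of text
        ids.append(token)
    return ids
```

The loop only ever appends likely next tokens to what it was given, which is why a story opening works better as a prompt than a question. Cropping to the last 256 tokens is one common way to handle long inputs; whether chat.py crops, truncates, or rejects long prompts is not stated here.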
Limitations
- Completion only — will not reliably answer questions or follow instructions.
- No factual reliability; it will confidently make things up.
- Small context (256 tokens) and small vocab (8192).
- English only.
License
Apache 2.0.
Note
This model was trained entirely by an AI — including writing the model code, the tokenizer, the data pipeline, and running the training. It is shared as a small from-scratch experiment.