RecursiveComplete — 18M GPT-2 trained from scratch

Everything you need to upload, train, or use this model is in this folder.

One-time setup:

pip install -r requirements.txt

A) USE IT (generate text)

python chat.py "Once upon a time"
python chat.py "The little robot wanted to"

Needs: chat.py, gpt2.py, big.pt, tokenizer_bpe/
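
Before running chat.py, it can help to confirm that everything on the "Needs" line is present. The helper below is a minimal sketch, not part of this repo: the name missing_files and the convention that a trailing "/" marks a directory are assumptions made for illustration.

```python
from pathlib import Path

# Entries copied from the "Needs:" line above; a trailing "/" marks a directory.
GENERATION_FILES = ["chat.py", "gpt2.py", "big.pt", "tokenizer_bpe/"]


def missing_files(required, root="."):
    """Return the entries of `required` that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    missing = []
    for name in required:
        path = root / name
        # Check directories and regular files separately so a stray file
        # named like the tokenizer folder does not count as present.
        present = path.is_dir() if name.endswith("/") else path.is_file()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_files(GENERATION_FILES)
    print("ready" if not gaps else "missing: " + ", ".join(gaps))
```

The same function works for the training step if you pass it that section's "Needs" list instead.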

B) CONTINUE TRAINING

Training resumes automatically from the checkpoint in big.pt (stopped at around iteration 7000):

python train_big.py

To train longer, raise max_iters inside train_big.py (currently 12000).

Needs: train_big.py, gpt2.py, big.pt, data/train.bin, data/meta.json

C) UPLOAD TO HUGGING FACE (repo: Gentraxyz/RecursiveComplete)

Clean inference files are already generated for you (model.safetensors + config.json).

Windows / PowerShell:

powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
hf auth login
copy HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .

Mac/Linux:

pip install -U huggingface_hub
hf auth login
cp HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .

What to upload vs skip

UPLOAD: model.safetensors, config.json, gpt2.py, tokenizer_bpe/, README.md, chat.py
SKIP (optional/large): big.pt (training checkpoint, includes optimizer state), data/train.bin (180MB corpus; only upload it if you want others to be able to retrain)
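
The 180MB figure for data/train.bin lines up with the 90M-token corpus if each token is stored as a 2-byte integer (90,000,000 x 2 = 180,000,000 bytes). That storage format is an assumption here: it is plausible because a vocabulary of 8192 fits in 16 bits, but check prep_bpe.py to confirm. A small sketch for estimating the token count from the file size, where token_count is a hypothetical helper name:

```python
import os


def token_count(path, bytes_per_token=2):
    """Estimate how many tokens a flat binary token file holds.

    Assumes fixed-width token ids (2 bytes each by default, i.e. uint16).
    """
    size = os.path.getsize(path)
    if size % bytes_per_token:
        raise ValueError(f"{path}: size {size} is not a multiple of {bytes_per_token}")
    return size // bytes_per_token


if __name__ == "__main__":
    corpus = "data/train.bin"
    if os.path.exists(corpus):
        print(f"{corpus}: about {token_count(corpus):,} tokens")
```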

To upload individual files instead of the whole folder, run one command per file (repeat for each item in the UPLOAD list you want on the Hub):

hf upload Gentraxyz/RecursiveComplete model.safetensors
hf upload Gentraxyz/RecursiveComplete config.json
hf upload Gentraxyz/RecursiveComplete gpt2.py
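
The per-file commands above can also be expressed as one call through the huggingface_hub Python API, which accepts allow and ignore patterns. This is a sketch rather than a script shipped with this repo: upload_essentials is a hypothetical name, and the pattern lists are simply transcribed from the UPLOAD and SKIP lists in this README.

```python
# Patterns transcribed from the UPLOAD / SKIP lists in this README.
ALLOW = [
    "model.safetensors",
    "config.json",
    "gpt2.py",
    "chat.py",
    "README.md",
    "tokenizer_bpe/*",
]
# Redundant with ALLOW, but kept explicit so the large files are never sent by accident.
IGNORE = ["big.pt", "data/*"]


def upload_essentials(repo_id="Gentraxyz/RecursiveComplete", folder="."):
    """Upload only the files matching ALLOW from `folder` (hypothetical helper)."""
    # Imported lazily so this module can be read or tested without the package installed.
    from huggingface_hub import HfApi

    api = HfApi()
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=folder,
        allow_patterns=ALLOW,
        ignore_patterns=IGNORE,
    )
```

Run hf auth login first, copy HF_README.md to README.md, then call upload_essentials() from the model folder.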

File guide

File                Used for       Notes
big.pt              train / use    full checkpoint (weights + optimizer + iter), ~210MB
model.safetensors   upload / use   clean weights only, HF-standard
config.json         upload / use   model dimensions
gpt2.py             all            model architecture
chat.py             use            generation script
train_big.py        train          auto-resumes from big.pt
prep_bpe.py         (optional)     rebuild train.bin from raw text
tokenizer_bpe/      all            BPE vocab.json + merges.txt
data/train.bin      train          tokenized 90M-token corpus
data/meta.json      train          vocab size + eot id
HF_README.md        upload         rename to README.md on HF
requirements.txt    all            torch, tokenizers, numpy, safetensors

Model: 18.3M parameters, 448-dim embeddings, 7 attention heads, 6 layers, 256-token context, BPE vocabulary of 8192 tokens. It is a base completion model (not instruction-tuned) and works best on short story-style English.
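
The 18.3M figure can be sanity-checked from the dimensions above. The sketch below assumes the standard GPT-2 layout: learned position embeddings, a 4x-wide MLP, biases on every linear layer and LayerNorm, and an output head tied to the token embedding. Those are assumptions about gpt2.py, not facts taken from it, and gpt2_param_count is a name made up for this example.

```python
def gpt2_param_count(vocab, ctx, d_model, n_layer, tied_head=True):
    """Count parameters for a standard GPT-2 style stack (assumed layout)."""
    embed = vocab * d_model + ctx * d_model              # token + position embeddings
    attn = (d_model * 3 * d_model + 3 * d_model)         # fused q, k, v projection
    attn += (d_model * d_model + d_model)                # attention output projection
    mlp = (d_model * 4 * d_model + 4 * d_model)          # MLP expand
    mlp += (4 * d_model * d_model + d_model)             # MLP contract
    norms = 2 * (2 * d_model)                            # two LayerNorms, weight + bias each
    block = attn + mlp + norms
    final_norm = 2 * d_model
    head = 0 if tied_head else vocab * d_model           # a tied head adds no new weights
    return embed + n_layer * block + final_norm + head


if __name__ == "__main__":
    total = gpt2_param_count(vocab=8192, ctx=256, d_model=448, n_layer=6)
    print(total, round(total / 1e6, 2))  # 18271232 18.27
```

Under these assumptions the total is 18,271,232, which rounds to the stated 18.3M. With an untied output head the count would be about 21.9M, so the stated size is consistent with tied embeddings. Each of the 7 heads works on 448 / 7 = 64 dimensions.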