RecursiveComplete — 18M GPT-2 trained from scratch
Everything you need to upload, train, or use this model is in this folder.
First, install the dependencies (one-time setup):
pip install -r requirements.txt
A) USE IT (generate text)
python chat.py "Once upon a time"
python chat.py "The little robot wanted to"
Needs: chat.py, gpt2.py, big.pt, tokenizer_bpe/
B) CONTINUE TRAINING
Training resumes automatically from the checkpoint in big.pt, which stopped at roughly iteration 7000:
python train_big.py
To train longer, raise max_iters in train_big.py (currently 12000).
Needs: train_big.py, gpt2.py, big.pt, data/train.bin, data/meta.json
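A quick way to confirm the folder is complete before running either task is to check the "Needs:" lists programmatically. This is a minimal sketch, not part of the shipped scripts: the REQUIRED table and the missing_files helper are hypothetical names, and the file lists are copied from the "Needs:" lines above.

```python
from pathlib import Path

# Required paths per task, copied from the "Needs:" lines in sections A and B.
REQUIRED = {
    "use": ["chat.py", "gpt2.py", "big.pt", "tokenizer_bpe"],
    "train": ["train_big.py", "gpt2.py", "big.pt", "data/train.bin", "data/meta.json"],
}


def missing_files(task, root="."):
    """Return the required paths for `task` that do not exist under `root`."""
    base = Path(root)
    return [name for name in REQUIRED[task] if not (base / name).exists()]


if __name__ == "__main__":
    for task in REQUIRED:
        missing = missing_files(task)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{task}: {status}")
```

Run it from this folder; any path reported as missing has to be restored before chat.py or train_big.py will work.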
C) UPLOAD TO HUGGING FACE (repo: Gentraxyz/RecursiveComplete)
Clean inference files are already generated for you (model.safetensors + config.json).
Windows / PowerShell:
powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
hf auth login
copy HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .
Mac/Linux:
pip install -U huggingface_hub
hf auth login
cp HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .
What to upload vs skip
- UPLOAD: model.safetensors, config.json, gpt2.py, tokenizer_bpe/, README.md, chat.py
- SKIP (optional/large): big.pt (training checkpoint, has optimizer state), data/train.bin (180MB corpus — only upload if you want others to retrain)
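The "hf upload ... ." commands above send the whole folder, SKIP items included. To upload just the essentials instead, upload each item on its own (a folder needs its destination path repeated, otherwise its contents land in the repo root):
hf upload Gentraxyz/RecursiveComplete model.safetensors
hf upload Gentraxyz/RecursiveComplete config.json
hf upload Gentraxyz/RecursiveComplete gpt2.py
hf upload Gentraxyz/RecursiveComplete chat.py
hf upload Gentraxyz/RecursiveComplete README.md
hf upload Gentraxyz/RecursiveComplete tokenizer_bpe tokenizer_bpe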
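The same selective upload can be done in one commit from Python. The sketch below assumes the huggingface_hub package is installed and that you have already run hf auth login. ESSENTIALS mirrors the UPLOAD list above, and preview and upload are hypothetical helper names. The preview uses fnmatch-style matching and should be treated as an approximation of what the library will select.

```python
from fnmatch import fnmatch
from pathlib import Path

REPO_ID = "Gentraxyz/RecursiveComplete"

# Patterns for the UPLOAD list; anything not matched (big.pt, data/) stays local.
ESSENTIALS = [
    "model.safetensors",
    "config.json",
    "gpt2.py",
    "chat.py",
    "README.md",
    "tokenizer_bpe/*",
]


def preview(root="."):
    """List the files under `root` that match ESSENTIALS, without uploading."""
    base = Path(root)
    paths = (p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file())
    return sorted(p for p in paths if any(fnmatch(p, pat) for pat in ESSENTIALS))


def upload(root="."):
    """Push only the matching files in a single commit."""
    # Imported lazily so preview() works even without huggingface_hub installed.
    from huggingface_hub import HfApi

    HfApi().upload_folder(repo_id=REPO_ID, folder_path=root, allow_patterns=ESSENTIALS)


if __name__ == "__main__":
    print("\n".join(preview()))
    # upload()  # uncomment once the preview looks right
```

Printing the preview first makes it easy to spot a stray large file before anything leaves your machine.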
File guide
| File | Used for | Notes |
|---|---|---|
| big.pt | train / use | full checkpoint (weights + optimizer + iter) ~210MB |
| model.safetensors | upload / use | clean weights only, HF-standard |
| config.json | upload / use | model dimensions |
| gpt2.py | all | model architecture |
| chat.py | use | generation script |
| train_big.py | train | auto-resumes from big.pt |
| prep_bpe.py | (optional) | rebuild train.bin from raw text |
| tokenizer_bpe/ | all | BPE vocab.json + merges.txt |
| data/train.bin | train | tokenized 90M-token corpus |
| data/meta.json | train | vocab size + eot id |
| HF_README.md | upload | copy to README.md before uploading |
| requirements.txt | all | torch, tokenizers, numpy, safetensors |
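The 180MB figure for data/train.bin and the 90M-token figure in the table agree if each token is stored in 2 bytes. That storage format is an assumption (uint16 ids, which is enough for an 8192-entry vocabulary); this README does not state it. The helper names below are hypothetical.

```python
import os

BYTES_PER_TOKEN = 2  # assumption: uint16 token ids, not confirmed by the README


def expected_bytes(n_tokens, bytes_per_token=BYTES_PER_TOKEN):
    """Size on disk of a flat token file holding `n_tokens` ids."""
    return n_tokens * bytes_per_token


def token_count(path, bytes_per_token=BYTES_PER_TOKEN):
    """Infer the number of tokens in a flat binary file from its size on disk."""
    return os.path.getsize(path) // bytes_per_token


print(expected_bytes(90_000_000) / 1e6, "MB")  # 180.0 MB
```

Calling token_count("data/train.bin") on the real file gives a quick check of whether the 2-byte assumption holds: the result should be close to 90 million.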
Model: 18.3M params, 448d / 7 heads / 6 layers / 256 ctx, BPE 8192 vocab. Base completion model (not instruction-tuned). Best at short story-style English.
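The 18.3M figure can be reproduced from the dimensions above. This sketch assumes the standard GPT-2 layout: fused QKV projection, 4x MLP expansion, biases on every linear layer, two LayerNorms per block plus a final one, and an output head tied to the token embedding. gpt2.py may differ in detail, so treat this as a sanity check rather than a description of that file.

```python
def gpt2_param_count(d_model, n_layers, n_ctx, vocab_size):
    """Count parameters of a standard GPT-2 stack with tied input/output embeddings."""
    tok_emb = vocab_size * d_model
    pos_emb = n_ctx * d_model
    # Attention: fused QKV projection plus output projection, both with biases.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back, both with biases.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * 2 * d_model
    final_norm = 2 * d_model
    return tok_emb + pos_emb + n_layers * (attn + mlp + norms) + final_norm


total = gpt2_param_count(d_model=448, n_layers=6, n_ctx=256, vocab_size=8192)
print(f"{total:,} parameters, head dim {448 // 7}")
# prints: 18,271,232 parameters, head dim 64
```

Under these assumptions the total rounds to 18.3M, and 448 split across 7 heads gives 64 dimensions per head.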