# RecursiveComplete: 18M GPT-2 trained from scratch

Everything you need to **upload**, **train**, or **use** this model is in this folder.

First, install the dependencies (once):
```bash
pip install -r requirements.txt
```

---

## A) USE IT (generate text)
```bash
python chat.py "Once upon a time"
python chat.py "The little robot wanted to"
```
Needs: chat.py, gpt2.py, big.pt, tokenizer_bpe/

## B) CONTINUE TRAINING
Resumes automatically from where it stopped (~iter 7000):
```bash
python train_big.py
```
Bump `max_iters` inside train_big.py (currently 12000) to train longer.

Needs: train_big.py, gpt2.py, big.pt, data/train.bin, data/meta.json

## C) UPLOAD TO HUGGING FACE (repo: Gentraxyz/RecursiveComplete)
Clean inference files are already generated for you (model.safetensors + config.json).

Windows / PowerShell:
```powershell
powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
hf auth login
copy HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .
```
Mac/Linux:
```bash
pip install -U huggingface_hub
hf auth login
cp HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .
```
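
Before uploading, it can help to confirm what model.safetensors actually contains. The sketch below uses only the standard library and relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names `read_safetensors_header` and `summarize` are illustrative and not part of this repo.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(path):
    """Return ([(name, dtype, shape), ...], total_element_count)."""
    header = read_safetensors_header(path)
    rows, total = [], 0
    for name, info in sorted(header.items()):
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        count = 1
        for dim in info["shape"]:
            count *= dim
        rows.append((name, info["dtype"], tuple(info["shape"])))
        total += count
    return rows, total
```

Calling `summarize("model.safetensors")` from this folder should list every stored tensor. If the file holds weights only, the total element count should land near the 18.3M parameters quoted in this README, though that depends on how shared weights were saved.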
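
### What to upload vs skip
- UPLOAD: model.safetensors, config.json, gpt2.py, tokenizer_bpe/, README.md, chat.py
- SKIP (optional/large): big.pt (training checkpoint, has optimizer state),
  data/train.bin (180MB corpus; only upload if you want others to retrain)

`hf upload Gentraxyz/RecursiveComplete .` uploads the whole folder, including the large files. To upload just the essentials instead, send each item separately (for the tokenizer_bpe/ folder, the last argument names the destination path in the repo):
```bash
hf upload Gentraxyz/RecursiveComplete model.safetensors
hf upload Gentraxyz/RecursiveComplete config.json
hf upload Gentraxyz/RecursiveComplete gpt2.py
hf upload Gentraxyz/RecursiveComplete chat.py
hf upload Gentraxyz/RecursiveComplete README.md
hf upload Gentraxyz/RecursiveComplete tokenizer_bpe tokenizer_bpe
```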
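
The 180MB figure for data/train.bin is consistent with the 90M-token corpus listed in the file guide if each token id takes 2 bytes. This README does not state the on-disk layout, so the sketch below assumes a flat little-endian uint16 array (enough for a vocabulary of 8192); the helper names are hypothetical.

```python
import os
import struct

BYTES_PER_TOKEN = 2  # assumption: uint16 token ids, not confirmed by this README


def expected_size_mb(num_tokens, bytes_per_token=BYTES_PER_TOKEN):
    """Size in MB (10^6 bytes) of a flat token file."""
    return num_tokens * bytes_per_token / 1e6


def count_tokens(path, bytes_per_token=BYTES_PER_TOKEN):
    """Number of token ids in the file, inferred from its size."""
    return os.path.getsize(path) // bytes_per_token


def peek_tokens(path, n=8):
    """Read the first n token ids (assumed little-endian uint16)."""
    with open(path, "rb") as f:
        raw = f.read(n * BYTES_PER_TOKEN)
    return list(struct.unpack(f"<{len(raw) // BYTES_PER_TOKEN}H", raw))


print(expected_size_mb(90_000_000))  # 180.0
```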

---

## File guide
| File | Used for | Notes |
|---|---|---|
| big.pt | train / use | full checkpoint (weights + optimizer + iter) ~210MB |
| model.safetensors | upload / use | clean weights only, HF-standard |
| config.json | upload / use | model dimensions |
| gpt2.py | all | model architecture |
| chat.py | use | generation script |
| train_big.py | train | auto-resumes from big.pt |
| prep_bpe.py | (optional) | rebuild train.bin from raw text |
| tokenizer_bpe/ | all | BPE vocab.json + merges.txt |
| data/train.bin | train | tokenized 90M-token corpus |
| data/meta.json | train | vocab size + eot id |
| HF_README.md | upload | rename to README.md on HF |
| requirements.txt | all | torch, tokenizers, numpy, safetensors |

Model: 18.3M params, 448d / 7 heads / 6 layers / 256 ctx, BPE 8192 vocab.
Base completion model (not instruction-tuned). Best at short story-style English.

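
The 18.3M figure can be reproduced from the listed dimensions. The sketch below assumes the standard GPT-2 layout (biases on every linear layer, a 4x MLP, learned position embeddings, and an output head tied to the token embedding); gpt2.py may differ in these details, and `gpt2_param_count` is an illustrative helper, not a function from this repo.

```python
def gpt2_param_count(d_model, n_layer, n_ctx, vocab, mlp_ratio=4, tied=True):
    """Count parameters of a standard GPT-2 style decoder (assumed layout)."""
    attn = d_model * 3 * d_model + 3 * d_model   # fused q, k, v projection
    attn += d_model * d_model + d_model          # attention output projection
    hidden = mlp_ratio * d_model
    mlp = d_model * hidden + hidden              # MLP up-projection
    mlp += hidden * d_model + d_model            # MLP down-projection
    norms = 2 * 2 * d_model                      # two LayerNorms (weight + bias)
    block = attn + mlp + norms
    embed = vocab * d_model + n_ctx * d_model    # token + position embeddings
    final_norm = 2 * d_model
    head = 0 if tied else vocab * d_model        # separate output head if untied
    return n_layer * block + embed + final_norm + head


total = gpt2_param_count(d_model=448, n_layer=6, n_ctx=256, vocab=8192)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")  # 18,271,232 parameters (~18.3M)
print("head dim:", 448 // 7)                          # head dim: 64
```

Under these assumptions the count is 18,271,232, which rounds to the stated 18.3M. An untied output head would add another 3.67M, so the quoted size fits tied embeddings better. The number of heads does not change the total; it only sets the per-head width (448 / 7 = 64).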