RecursiveComplete — 18M GPT-2 trained from scratch

Everything you need to upload, train, or use this model is in this folder.

First, install the dependencies (one time only):

pip install -r requirements.txt

A) USE IT (generate text)

python chat.py "Once upon a time"
python chat.py "The little robot wanted to"

Needs: chat.py, gpt2.py, big.pt, tokenizer_bpe/

B) CONTINUE TRAINING

Training resumes automatically from the checkpoint in big.pt, where it last stopped (around iteration 7000):

python train_big.py

To train longer, raise max_iters inside train_big.py (currently 12000).

Needs: train_big.py, gpt2.py, big.pt, data/train.bin, data/meta.json
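
Before running either script, it can help to confirm that the files on the "Needs:" lines are actually present. The following is a minimal sketch, not part of this repo: the `NEEDS` table and the `missing_files` helper are hypothetical names, and the file lists are simply copied from sections A and B.

```python
from pathlib import Path

# File lists copied from the "Needs:" lines in sections A and B.
# NEEDS and missing_files are illustrative names, not part of the repo.
NEEDS = {
    "use": ["chat.py", "gpt2.py", "big.pt", "tokenizer_bpe"],
    "train": ["train_big.py", "gpt2.py", "big.pt", "data/train.bin", "data/meta.json"],
}


def missing_files(task, root="."):
    """Return the required paths for `task` that do not exist under `root`."""
    base = Path(root)
    return [name for name in NEEDS[task] if not (base / name).exists()]


if __name__ == "__main__":
    # Report the status of each task for the current directory.
    for task in NEEDS:
        missing = missing_files(task)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{task}: {status}")
```

Run it from the folder that contains this file; an "ok" line means every file listed for that task was found.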

C) UPLOAD TO HUGGING FACE (repo: Gentraxyz/RecursiveComplete)

Clean inference files are already generated for you (model.safetensors + config.json).

Windows / PowerShell:

powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
hf auth login
copy HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .

Mac/Linux:

pip install -U huggingface_hub
hf auth login
cp HF_README.md README.md
hf upload Gentraxyz/RecursiveComplete .

What to upload vs skip

  • UPLOAD: model.safetensors, config.json, gpt2.py, tokenizer_bpe/, README.md, chat.py
  • SKIP (optional/large): big.pt (training checkpoint, has optimizer state), data/train.bin (180MB corpus — only upload if you want others to retrain)

To upload just the essentials instead of the whole folder, pass each file by name (the other files on the UPLOAD list can be added the same way):

hf upload Gentraxyz/RecursiveComplete model.safetensors
hf upload Gentraxyz/RecursiveComplete config.json
hf upload Gentraxyz/RecursiveComplete gpt2.py
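
Another way to keep the large files out of a whole-folder upload is to filter by pattern. The sketch below only previews which local files would remain after applying the SKIP list. The `SKIP_PATTERNS` list and the `upload_candidates` helper are hypothetical names used for illustration, and the patterns are taken from the SKIP bullet above.

```python
from fnmatch import fnmatch
from pathlib import Path

# Patterns taken from the SKIP list above; extend as needed.
# SKIP_PATTERNS and upload_candidates are illustrative names, not part of the repo.
SKIP_PATTERNS = ["big.pt", "data/train.bin"]


def upload_candidates(root=".", skip_patterns=SKIP_PATTERNS):
    """List files under `root` (POSIX-style relative paths) that match no skip pattern."""
    base = Path(root)
    rel_paths = sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )
    return [p for p in rel_paths if not any(fnmatch(p, pat) for pat in skip_patterns)]


if __name__ == "__main__":
    for path in upload_candidates():
        print(path)
```

If you prefer to upload from Python, the same patterns could be passed as `ignore_patterns` to `huggingface_hub.upload_folder`; treat that as a suggestion to verify against the library documentation, since this folder's instructions only use the `hf` command line.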

File guide

| File | Used for | Notes |
|---|---|---|
| big.pt | train / use | full checkpoint (weights + optimizer + iter), ~210MB |
| model.safetensors | upload / use | clean weights only, HF-standard |
| config.json | upload / use | model dimensions |
| gpt2.py | all | model architecture |
| chat.py | use | generation script |
| train_big.py | train | auto-resumes from big.pt |
| prep_bpe.py | (optional) | rebuild train.bin from raw text |
| tokenizer_bpe/ | all | BPE vocab.json + merges.txt |
| data/train.bin | train | tokenized 90M-token corpus |
| data/meta.json | train | vocab size + eot id |
| HF_README.md | upload | rename to README.md on HF |
| requirements.txt | all | torch, tokenizers, numpy, safetensors |
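
The two large file sizes in the table can be sanity-checked with simple arithmetic. This is a rough sketch under assumptions that the folder does not state: weights stored as 32-bit floats, an Adam-style optimizer that keeps two extra buffers per parameter, and tokens stored as 16-bit integers (an 8192-entry vocabulary fits in 16 bits). The function names are hypothetical.

```python
def checkpoint_bytes(n_params, bytes_per_value=4, optimizer_buffers=2):
    """Weights plus `optimizer_buffers` same-sized optimizer tensors.

    Assumes fp32 values and an Adam-style optimizer (two moment buffers);
    neither is confirmed by this folder.
    """
    return n_params * bytes_per_value * (1 + optimizer_buffers)


def corpus_bytes(n_tokens, bytes_per_token=2):
    """Size of a flat token file, assuming 2 bytes (uint16) per token."""
    return n_tokens * bytes_per_token


if __name__ == "__main__":
    # 18.3M parameters and 90M tokens are the figures quoted in this document.
    print(f"big.pt    ~ {checkpoint_bytes(18_300_000) / 1e6:.1f} MB")
    print(f"train.bin ~ {corpus_bytes(90_000_000) / 1e6:.1f} MB")
```

Under these assumptions the checkpoint comes to about 219.6 million bytes (roughly 209 MiB), close to the ~210MB listed, and the corpus comes to 180 million bytes, matching the 180MB quoted for data/train.bin.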

Model: 18.3M params, 448d / 7 heads / 6 layers / 256 ctx, BPE 8192 vocab. Base completion model (not instruction-tuned). Best at short story-style English.
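
The parameter count can be reproduced from the listed dimensions. The sketch below assumes the standard GPT-2 layout, which gpt2.py may or may not follow exactly: learned position embeddings, biases on every linear layer, a 4x MLP expansion, two LayerNorms per block plus a final one, and an output head that shares weights with the token embedding. `gpt2_param_count` is a hypothetical helper written for this illustration.

```python
def gpt2_param_count(vocab, ctx, d_model, n_layers, mlp_ratio=4, tied_head=True):
    """Count parameters for a standard GPT-2 style model (assumed layout)."""
    # Token embedding table plus learned position table.
    embeddings = vocab * d_model + ctx * d_model
    # Attention: fused QKV projection and output projection, both with bias.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to mlp_ratio * d_model, then project back, both with bias.
    hidden = mlp_ratio * d_model
    mlp = (d_model * hidden + hidden) + (hidden * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a bias vector.
    norms = 2 * 2 * d_model
    final_norm = 2 * d_model
    # A tied head reuses the token embedding, so it adds no parameters.
    head = 0 if tied_head else vocab * d_model
    return embeddings + n_layers * (attn + mlp + norms) + final_norm + head


if __name__ == "__main__":
    total = gpt2_param_count(vocab=8192, ctx=256, d_model=448, n_layers=6)
    print(f"{total:,} parameters, head size {448 // 7}")
```

With these assumptions the total is 18,271,232, which rounds to the 18.3M quoted above. The number of heads does not change the count; it only sets the per-head width, here 448 / 7 = 64.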