
	# RecursiveComplete — 18M GPT-2 trained from scratch

	Everything you need to upload, train, or use this model is in this folder.

	First, install the dependencies (one time only):
	```bash
	pip install -r requirements.txt
	```

	---

	## A) USE IT (generate text)
	```bash
	python chat.py "Once upon a time"
	python chat.py "The little robot wanted to"
	```
	Needs: chat.py, gpt2.py, big.pt, tokenizer_bpe/

	## B) CONTINUE TRAINING
	Resumes automatically from where it stopped (~iter 7000):
	```bash
	python train_big.py
	```
	Bump `max_iters` inside train_big.py (currently 12000) to train longer.
	Needs: train_big.py, gpt2.py, big.pt, data/train.bin, data/meta.json
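
The "Needs:" lists in sections A and B can be checked before running anything. The sketch below is a hypothetical helper, not a script shipped in this folder: `REQUIRED` and `missing` are illustrative names, and the file lists are copied from the two sections above.

```python
from pathlib import Path

# File lists copied from the "Needs:" lines of sections A and B.
REQUIRED = {
    "use": ["chat.py", "gpt2.py", "big.pt", "tokenizer_bpe"],
    "train": ["train_big.py", "gpt2.py", "big.pt", "data/train.bin", "data/meta.json"],
}


def missing(task, root="."):
    """Return the required files or folders for `task` that are absent under `root`."""
    return [name for name in REQUIRED[task] if not (Path(root) / name).exists()]


if __name__ == "__main__":
    for task in REQUIRED:
        absent = missing(task)
        status = "ready" if not absent else "missing: " + ", ".join(absent)
        print(f"{task}: {status}")
```

Run it from this folder; an empty result for a task means every file that section lists is present.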

	## C) UPLOAD TO HUGGING FACE (repo: Gentraxyz/RecursiveComplete)
	Clean inference files are already generated for you (model.safetensors + config.json).

	Windows / PowerShell:
	```powershell
	powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
	hf auth login
	copy HF_README.md README.md
	hf upload Gentraxyz/RecursiveComplete .
	```
	Mac/Linux:
	```bash
	pip install -U huggingface_hub  # the hf command ships only with recent releases
	hf auth login
	cp HF_README.md README.md
	hf upload Gentraxyz/RecursiveComplete .
	```

	### What to upload vs skip
	- UPLOAD: model.safetensors, config.json, gpt2.py, tokenizer_bpe/, README.md, chat.py
	- SKIP (optional/large): big.pt (training checkpoint, has optimizer state),
	data/train.bin (180MB corpus — only upload if you want others to retrain)

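	To upload just the essentials instead of everything, upload each item from the UPLOAD list on its own. Pass the folder name twice for tokenizer_bpe so it keeps its directory in the repo:
	```bash
	hf upload Gentraxyz/RecursiveComplete model.safetensors
	hf upload Gentraxyz/RecursiveComplete config.json
	hf upload Gentraxyz/RecursiveComplete gpt2.py
	hf upload Gentraxyz/RecursiveComplete chat.py
	hf upload Gentraxyz/RecursiveComplete README.md
	hf upload Gentraxyz/RecursiveComplete tokenizer_bpe tokenizer_bpe
	```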
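
The same selection can also be done from Python in a single call. This is a minimal sketch, assuming `huggingface_hub` is installed and you are already logged in; `ALLOW_PATTERNS`, `select_files`, and `upload_essentials` are illustrative names, and the patterns are taken from the UPLOAD list above. The dry run uses the standard-library `fnmatch` module to preview which local files the patterns would pick up.

```python
from fnmatch import fnmatch
from pathlib import Path

# Patterns taken from the UPLOAD list; everything else (big.pt, data/) is left out.
ALLOW_PATTERNS = [
    "model.safetensors",
    "config.json",
    "gpt2.py",
    "chat.py",
    "README.md",
    "tokenizer_bpe/*",
]


def select_files(paths, patterns=ALLOW_PATTERNS):
    """Return the relative POSIX-style paths that match at least one pattern."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


def upload_essentials(folder=".", repo_id="Gentraxyz/RecursiveComplete"):
    """Upload only the files matching ALLOW_PATTERNS (needs network access and a login)."""
    # Imported here so the dry run below works without huggingface_hub installed.
    from huggingface_hub import HfApi

    HfApi().upload_folder(
        repo_id=repo_id,
        folder_path=folder,
        allow_patterns=ALLOW_PATTERNS,
    )


if __name__ == "__main__":
    # Dry run: list what would be uploaded without contacting the Hub.
    root = Path(".")
    local = [p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()]
    print("\n".join(select_files(local)))
```

Call `upload_essentials()` only after the dry run prints the files you expect.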

	---

	## File guide
	| File | Used for | Notes |
	|---|---|---|
	| big.pt | train / use | full checkpoint (weights + optimizer + iter) ~210MB |
	| model.safetensors | upload / use | clean weights only, HF-standard |
	| config.json | upload / use | model dimensions |
	| gpt2.py | all | model architecture |
	| chat.py | use | generation script |
	| train_big.py | train | auto-resumes from big.pt |
	| prep_bpe.py | (optional) | rebuild train.bin from raw text |
	| tokenizer_bpe/ | all | BPE vocab.json + merges.txt |
	| data/train.bin | train | tokenized 90M-token corpus |
	| data/meta.json | train | vocab size + eot id |
	| HF_README.md | upload | rename to README.md on HF |
	| requirements.txt | all | torch, tokenizers, numpy, safetensors |
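
The file sizes quoted in this guide can be sanity-checked with a little arithmetic. The sketch below rests on assumptions this guide does not confirm: weights stored as 32-bit floats, an Adam-style optimizer that keeps two extra buffers per parameter, and token ids stored as 16-bit integers (enough for an 8192-entry vocabulary). The function names are illustrative.

```python
PARAMS = 18.3e6  # parameter count quoted in this guide
TOKENS = 90e6    # corpus size quoted in this guide


def checkpoint_bytes(params, bytes_per_value=4, optimizer_buffers=2):
    """Weights plus `optimizer_buffers` same-sized optimizer tensors (assumed Adam-style)."""
    return params * bytes_per_value * (1 + optimizer_buffers)


def corpus_bytes(tokens, bytes_per_token=2):
    """Token ids stored as fixed-width integers (assumed 16-bit)."""
    return tokens * bytes_per_token


if __name__ == "__main__":
    print(f"big.pt    ~ {checkpoint_bytes(PARAMS) / 2**20:.0f} MiB")
    print(f"train.bin ~ {corpus_bytes(TOKENS) / 1e6:.0f} MB")
```

Under these assumptions the estimates land close to the ~210MB and 180MB figures above, which is also why big.pt is the file to skip when uploading: roughly two thirds of it would be optimizer state.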

	Model: 18.3M params, 448d / 7 heads / 6 layers / 256 ctx, BPE 8192 vocab.
	Base completion model (not instruction-tuned). Best at short story-style English.
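
The 18.3M figure can be reproduced from the dimensions listed above. This is a sketch under assumptions, not a readout of gpt2.py: it assumes the standard GPT-2 block layout (fused q/k/v projection, 4x MLP expansion, biases everywhere, two LayerNorms per block plus a final one) and an output head that shares the token embedding table. `gpt2_param_count` is an illustrative name.

```python
def gpt2_param_count(d_model, n_layer, n_ctx, vocab, tied_head=True):
    """Count parameters of a GPT-2-style decoder (assumed layout, biases included)."""
    embeddings = vocab * d_model + n_ctx * d_model      # token + position tables
    attention = d_model * 3 * d_model + 3 * d_model     # fused q, k, v projection
    attention += d_model * d_model + d_model            # attention output projection
    mlp = d_model * 4 * d_model + 4 * d_model           # expand to 4 * d_model
    mlp += 4 * d_model * d_model + d_model              # project back to d_model
    layer_norms = 2 * 2 * d_model                       # two LayerNorms per block
    block = attention + mlp + layer_norms
    final_norm = 2 * d_model
    head = 0 if tied_head else vocab * d_model          # an untied head adds a second table
    return embeddings + n_layer * block + final_norm + head


if __name__ == "__main__":
    total = gpt2_param_count(d_model=448, n_layer=6, n_ctx=256, vocab=8192)
    print(f"{total:,} parameters, {448 // 7} dimensions per head")
```

With these assumptions the count is 18,271,232, which rounds to the quoted 18.3M. The number of heads does not change the total; 7 heads over 448 dimensions gives 64 dimensions per head.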