# My GPT — Text Generation from Scratch

A 30M-parameter GPT-style transformer built from scratch in PyTorch, trained on Shakespeare + Alpaca + OpenWebText, with a Flask streaming chat interface.

## Project Structure

```
ai-model-by-me/
├── model.py          # GPT architecture (multi-head attention, transformer blocks)
├── tokenizer.py      # BPE tokenizer (GPT-2/tiktoken) + char-level fallback
├── train.py          # Training script (Apple M1/MPS optimized, checkpoint resume)
├── data_loader.py    # Dataset loaders (Shakespeare, Alpaca, OpenWebText, custom)
├── generate.py       # CLI text generation
├── app.py            # Flask streaming chat interface
└── upload_to_hf.py   # Upload to Hugging Face Hub
```

## Setup

```bash
conda create -n slm-env python=3.11
conda activate slm-env
pip install torch numpy flask tiktoken datasets huggingface_hub
```

## Step 1 — Train

```bash
python train.py --datasets shakespeare,alpaca,openwebtext \
  --max_iters 15000 --batch_size 16 --n_layer 6 --n_head 6 --n_embd 384
```

Resume from a checkpoint:

```bash
python train.py --datasets shakespeare,alpaca,openwebtext \
  --max_iters 15000 --lr 1e-4 --resume
```

Saves best checkpoint to `checkpoints/best_model.pt`.

## Step 2 — Generate Text (CLI)

```bash
python generate.py --prompt "To be or not to be" --max_new_tokens 300
```

Alpaca instruction-style:

```bash
python generate.py --instruction "Write a poem about the sea"
```

## Step 3 — Run Chat Interface

```bash
python app.py
```

Open [http://127.0.0.1:5000](http://127.0.0.1:5000) in your browser (use incognito if your browser blocks localhost).
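
`app.py` is not reproduced in this README. As a rough illustration of how a Flask endpoint can stream generated text to the browser, here is a minimal sketch that assumes Server-Sent Events framing. The route name `/stream`, the helpers `sse_events` and `fake_generate`, and the `[DONE]` sentinel are all hypothetical and may differ from what `app.py` actually does.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each generated text piece in a Server-Sent Events frame."""
    for piece in tokens:
        # json.dumps escapes newlines, so each frame stays on one "data:" line
        payload = json.dumps({"token": piece})
        yield f"data: {payload}\n\n"
    # Sentinel frame (a convention chosen for this sketch) marking the end
    yield "data: [DONE]\n\n"


def fake_generate(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for the model: echoes the prompt word by word."""
    for word in prompt.split():
        yield word + " "


def create_app():
    # Imported lazily so the helpers above can be used without Flask installed
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/stream", methods=["POST"])  # hypothetical route name
    def stream():
        body = request.get_json(silent=True) or {}
        prompt = body.get("prompt", "")
        # Passing a generator to Response makes Flask send chunks as they are produced
        return Response(sse_events(fake_generate(prompt)),
                        mimetype="text/event-stream")

    return app
```

In a real app, `fake_generate` would be replaced by a loop that samples one token at a time from the trained model and yields its decoded text.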

## Model Architecture

| Parameter       | Value                                         |
|-----------------|-----------------------------------------------|
| Type            | GPT (decoder-only transformer)                |
| Tokenizer       | BPE — GPT-2 encoding (50,257 vocab)           |
| Layers          | 6 transformer blocks                          |
| Attention heads | 6                                             |
| Embedding dim   | 384                                           |
| Context length  | 256 tokens                                    |
| Parameters      | ~30M                                          |
| Training data   | Shakespeare + Alpaca 52K + OpenWebText sample |
| Best val loss   | 3.4163                                        |

## Hardware

Optimized for Apple M1 via PyTorch MPS backend. Falls back to CUDA or CPU automatically.

## Upload to Hugging Face

```bash
export HF_TOKEN=your_token_here
python upload_to_hf.py --username YOUR_HF_USERNAME --repo_name my-gpt-from-scratch
```
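
As a sanity check on the ~30M figure in the Model Architecture table, the parameter count can be estimated by hand. The sketch below assumes a standard GPT-2-style layout (learned position embeddings, a 4x-wide MLP, biases on all linear layers, two LayerNorms per block plus a final one). That layout is an assumption, not something read from `model.py`, and `gpt_param_count` is a hypothetical helper.

```python
def gpt_param_count(vocab_size, block_size, n_layer, n_embd, tied_head=True):
    """Estimate parameters for an assumed GPT-2-style decoder."""
    tok_emb = vocab_size * n_embd   # token embedding table
    pos_emb = block_size * n_embd   # learned position embeddings

    # Attention: Q, K, V projections plus the output projection (with biases).
    # The number of heads only splits n_embd, so it does not change the count.
    attn = (3 * n_embd * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back (with biases)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a weight and a bias vector
    norms = 2 * (2 * n_embd)

    blocks = n_layer * (attn + mlp + norms)
    final_norm = 2 * n_embd
    # If the output head shares weights with the token embedding, it adds nothing
    head = 0 if tied_head else vocab_size * n_embd
    return tok_emb + pos_emb + blocks + final_norm + head


tied = gpt_param_count(50257, 256, n_layer=6, n_embd=384, tied_head=True)
untied = gpt_param_count(50257, 256, n_layer=6, n_embd=384, tied_head=False)
print(f"tied head:   {tied / 1e6:.1f}M")    # 30.0M
print(f"untied head: {untied / 1e6:.1f}M")  # 49.3M
```

Under these assumptions, the token embedding table alone accounts for about 19.3M parameters, and the ~30M total matches the case where the output head shares weights with it. This is consistent with weight tying but does not confirm how `model.py` is implemented.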