# IB-Physics-Mini-GPT (from scratch)
- **Model type:** small GPT-2–style decoder-only LM
- **Params:** ~30M (n_layer=6, n_head=6, n_embed=384)
- **Context length:** 256 tokens
- **Training:** tiny pretrain on physics notes → SFT on instruction pairs
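
For concreteness, here is what that configuration looks like as a nanoGPT-style hyperparameter dataclass. This is a sketch only: `GPTConfig` and its field names are illustrative, not this repo's actual code.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Values from the spec above; the class and field names are
    # assumptions modeled on nanoGPT, not taken from this repo.
    vocab_size: int = 16_000   # BPE vocab (see "How It Was Trained")
    block_size: int = 256      # context length in tokens
    n_layer: int = 6           # transformer blocks
    n_head: int = 6            # attention heads per block
    n_embd: int = 384          # 384 / 6 heads = 64-dim heads

config = GPTConfig()
```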
## Intended Use
Educational demo and concept explainer for IB Physics HL topics.
## Limitations
Small context window (256 tokens), tiny training dataset, and no guarantee of factual accuracy; this is not a fact oracle. Double-check its answers.
## How It Was Trained
1) Tokenizer: BPE (16k vocab) trained on `corpus_raw.txt` (sketch below).
2) Pretrain: next-token prediction over the tokenized corpus (sketch below).
3) Finetune: short instruction-style Q&A pairs (sketch below).
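
A minimal sketch of step 1 using the Hugging Face `tokenizers` library. The library choice and the `<unk>`/`<eos>` special tokens are assumptions; only the 16k vocab size and `corpus_raw.txt` come from the list above.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import ByteLevel

# Byte-level BPE, nanoGPT/GPT-2 style; special tokens are assumed, not
# taken from this repo.
tokenizer = Tokenizer(BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = ByteLevel()
trainer = BpeTrainer(vocab_size=16_000, special_tokens=["<unk>", "<eos>"])
tokenizer.train(files=["corpus_raw.txt"], trainer=trainer)
tokenizer.save("tokenizer.json")
```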
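
Step 2's objective in a few lines of PyTorch: shift the sequence by one position and minimize cross-entropy between the model's predictions and the next token. The `model` interface (returning `(B, T, vocab)` logits) is an assumption.

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, batch):
    # batch: (B, T+1) token ids; predict token t+1 from tokens <= t.
    inputs, targets = batch[:, :-1], batch[:, 1:]
    logits = model(inputs)                        # (B, T, vocab_size)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),      # (B*T, vocab_size)
        targets.reshape(-1),                      # (B*T,)
    )
```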
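
For step 3, a common recipe (assumed here, not confirmed by this repo) is to concatenate the question and answer into one sequence and mask the question tokens, so the loss only scores the answer. The `Q:`/`A:` template and `eos_id` parameter are hypothetical.

```python
import torch

def build_sft_example(tokenizer, question, answer, eos_id):
    # Prompt template and tokenizer API (HF `tokenizers` Encoding.ids)
    # are assumptions; -100 matches F.cross_entropy's default
    # ignore_index, so masked positions contribute no loss.
    q_ids = tokenizer.encode(f"Q: {question}\nA: ").ids
    a_ids = tokenizer.encode(answer).ids + [eos_id]
    input_ids = torch.tensor(q_ids + a_ids)
    labels = torch.tensor([-100] * len(q_ids) + a_ids)
    return input_ids, labels
```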
## Eval
- Perplexity on held-out notes (see the `eval/` scripts; sketch below)
- Manual Q&A sanity checks
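
Perplexity is just the exponential of the mean next-token cross-entropy. A minimal sketch of the held-out evaluation, reusing `next_token_loss` from the training section; the actual `eval/` scripts may differ.

```python
import math
import torch

@torch.no_grad()
def perplexity(model, batches):
    # batches: iterable of (B, T+1) token-id tensors of held-out notes.
    losses = [next_token_loss(model, b).item() for b in batches]
    return math.exp(sum(losses) / len(losses))
```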
## License
MIT for code. Dataset licensing is your responsibility.