tiny-gpt-lab-v0.2-broadened-lightfineweb
Summary
This is the current best broadened checkpoint from the tiny GPT home-lab project.
It continues training from a strong TinyStories + Cosmopedia base, mixing in a lighter share of FineWeb-Edu. The goal of this stage was to widen the model's general knowledge without giving up too much story quality.
Source
- Source run: runs/tinystories-cosmopedia-fineweb-light-vocab-8192-gpu
- Dataset blend: TinyStories + Cosmopedia + FineWeb-Edu at a 6:2:1 ratio (see the sampling sketch below)
- Tokenizer: BPE, vocab 8192
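As an illustration of how a 6:2:1 blend can be realized, here is a minimal sampling sketch in Python. The corpus contents, the `weights` mapping, and the `sample_blend` helper are all hypothetical; the actual pipeline likely mixes pre-tokenized shards rather than raw strings.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical stand-ins for the three corpora; the real run streams
# pre-tokenized shards rather than holding raw text in memory.
corpora = {
    "TinyStories": ["tinystories example"] * 100,
    "Cosmopedia": ["cosmopedia example"] * 100,
    "FineWeb-Edu": ["fineweb-edu example"] * 100,
}

# The 6:2:1 blend expressed as sampling weights over the three sources.
weights = {"TinyStories": 6, "Cosmopedia": 2, "FineWeb-Edu": 1}
names = list(corpora)
probs = [weights[name] for name in names]

def sample_blend(n_examples):
    """Yield (source, example) pairs approximating the 6:2:1 mix."""
    for source in random.choices(names, weights=probs, k=n_examples):
        yield source, random.choice(corpora[source])

# Roughly 6 out of every 9 draws should come from TinyStories.
counts = Counter(source for source, _ in sample_blend(9000))
print(counts)
```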
Model
- n_layers=12
- n_heads=8
- n_embd=512
- block_size=512
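For reference, these hyperparameters map onto a GPT-style config roughly as follows. The `GPTConfig` dataclass and its field names are an assumption borrowed from common nanoGPT-style conventions, not this repo's actual code, and the size figure is the standard rough estimate, not a measured count.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Values from the list above; field names follow common nanoGPT-style
    # conventions and are an assumption, not this repo's API.
    n_layers: int = 12      # transformer blocks
    n_heads: int = 8        # attention heads per block
    n_embd: int = 512       # residual / embedding width
    block_size: int = 512   # maximum context length in tokens
    vocab_size: int = 8192  # BPE vocabulary from the tokenizer above

cfg = GPTConfig()
assert cfg.n_embd % cfg.n_heads == 0  # head dim = 512 / 8 = 64

# Rough size estimate: ~12 * n_layers * n_embd**2 for attention + MLP
# weights (~37.7M) plus vocab_size * n_embd token embeddings (~4.2M).
params_approx = 12 * cfg.n_layers * cfg.n_embd**2 + cfg.vocab_size * cfg.n_embd
print(f"~{params_approx / 1e6:.1f}M parameters (rough estimate)")
```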
Training Result
- Best validation loss: 1.3451 at step 95000
- Final validation loss: 1.3644
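The gap between the best and final loss is why the best checkpoint is reported separately from the last one. A minimal sketch of that bookkeeping follows; the loss history is made up apart from the two values above.

```python
import math

def track_best(history):
    """Return (best_loss, best_step) from (step, val_loss) pairs: the usual
    bookkeeping behind reporting a separate best-validation checkpoint."""
    best_loss, best_step = math.inf, None
    for step, loss in history:
        if loss < best_loss:
            best_loss, best_step = loss, step
            # A real training loop would save weights here, e.g.
            # torch.save(model.state_dict(), "best.pt")
    return best_loss, best_step

# Illustrative history: only the best (1.3451 at step 95000) and final
# (1.3644) values come from the run above; the first point is invented.
history = [(90000, 1.3600), (95000, 1.3451), (100000, 1.3644)]
print(track_best(history))  # -> (1.3451, 95000)
```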
Known Limitations
- Output quality is still inconsistent on longer or more expository prompts.
- The model can drift into templated educational language.
- This is still an experimental local model, not a production assistant.