tiny-gpt-lab-v0.2-broadened-lightfineweb

Summary

This is the current best broadened checkpoint from the tiny GPT home-lab project.

It continues training from a strong TinyStories + Cosmopedia base, adding a light FineWeb-Edu component to the data mix. The goal of this stage was to broaden the model's general knowledge without sacrificing too much story quality.

Source

  • Source run: runs/tinystories-cosmopedia-fineweb-light-vocab-8192-gpu
  • Dataset blend: TinyStories + Cosmopedia + FineWeb-Edu, mixed 6:2:1 (see the sampling sketch after this list)
  • Tokenizer: BPE, vocab 8192
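
For reference, here is a minimal sketch of how a 6:2:1 blend can be sampled during training. The stream names and placeholder batches are assumptions for illustration, not this project's actual data pipeline:

    import random
    from itertools import cycle

    # Placeholder streams standing in for the three pre-tokenized corpora;
    # a real run would read token shards from disk instead.
    sources = {
        "tinystories": cycle(["<tinystories batch>"]),
        "cosmopedia": cycle(["<cosmopedia batch>"]),
        "fineweb_edu": cycle(["<fineweb-edu batch>"]),
    }
    # 6:2:1 mixing ratio from the dataset blend above.
    weights = {"tinystories": 6, "cosmopedia": 2, "fineweb_edu": 1}

    names = list(sources)
    probs = [weights[n] for n in names]

    def next_batch():
        """Pick a corpus with probability proportional to its blend weight."""
        name = random.choices(names, weights=probs, k=1)[0]
        return name, next(sources[name])

Weighted per-batch sampling like this keeps the expected ratio correct even when the corpora differ greatly in size.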

Model

  • n_layers=12
  • n_heads=8
  • n_embd=512
  • block_size=512
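
For orientation, a minimal config sketch matching these hyperparameters; the field names follow the common nanoGPT-style convention and are an assumption about this project's code:

    from dataclasses import dataclass

    @dataclass
    class GPTConfig:
        n_layer: int = 12       # transformer blocks
        n_head: int = 8         # attention heads per block
        n_embd: int = 512       # embedding / residual width
        block_size: int = 512   # maximum context length in tokens
        vocab_size: int = 8192  # BPE vocabulary size from the tokenizer above

    config = GPTConfig()
    assert config.n_embd % config.n_head == 0  # head dimension = 512 / 8 = 64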

Training Result

  • Best validation loss: 1.3451 at step 95000
  • Final validation loss: 1.3644
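
The gap between the best and final loss implies the usual keep-the-best checkpoint selection. A minimal sketch of that logic, with the saving helper left as a hypothetical stand-in:

    import math

    best_val = math.inf

    def maybe_save_best(step, val_loss):
        """Keep only the checkpoint with the lowest validation loss so far."""
        global best_val
        if val_loss < best_val:
            best_val = val_loss
            print(f"new best {val_loss:.4f} at step {step}; saving checkpoint")
            # save_checkpoint(model, f"best-step{step}.pt")  # hypothetical helper

    # e.g. maybe_save_best(95000, 1.3451) records this card's best loss.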

Known Limitations

  • Output quality is still inconsistent on longer or more expository prompts.
  • The model can drift into templated educational language.
  • This is still an experimental local model, not a production assistant.