tiny-gpt-lab-v0.2-broadened-lightfineweb

Summary

This is the current best broadened checkpoint from the tiny GPT home-lab project.

It continues training from a strong TinyStories + Cosmopedia base, adding a light FineWeb-Edu component to the data mix. The goal of this stage was to broaden the model's general knowledge without sacrificing too much story quality.

Source

  • Source run: runs/tinystories-cosmopedia-fineweb-light-vocab-8192-gpu
  • Dataset blend: TinyStories + Cosmopedia + FineWeb-Edu, mixed 6:2:1 (see the sampling sketch after this list)
  • Tokenizer: BPE, vocab 8192
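
For reference, here is a minimal sketch of how a 6:2:1 blend can be sampled during training. The stream names and placeholder batches are assumptions for illustration, not this project's actual data pipeline:

    import random
    from itertools import cycle

    # Placeholder streams standing in for the three pre-tokenized corpora;
    # a real run would read token shards from disk instead.
    sources = {
        "tinystories": cycle(["<tinystories batch>"]),
        "cosmopedia": cycle(["<cosmopedia batch>"]),
        "fineweb_edu": cycle(["<fineweb-edu batch>"]),
    }
    # 6:2:1 mixing ratio from the dataset blend above.
    weights = {"tinystories": 6, "cosmopedia": 2, "fineweb_edu": 1}

    names = list(sources)
    probs = [weights[n] for n in names]

    def next_batch():
        """Pick a corpus with probability proportional to its blend weight."""
        name = random.choices(names, weights=probs, k=1)[0]
        return name, next(sources[name])

Weighted per-batch sampling like this keeps the expected ratio correct even when the corpora differ greatly in size.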

Model

  • n_layers=12
  • n_heads=8
  • n_embd=512
  • block_size=512
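
For orientation, a minimal config sketch matching these hyperparameters; the field names follow the common nanoGPT-style convention and are an assumption about this project's code:

    from dataclasses import dataclass

    @dataclass
    class GPTConfig:
        n_layer: int = 12       # transformer blocks
        n_head: int = 8         # attention heads per block
        n_embd: int = 512       # embedding / residual width
        block_size: int = 512   # maximum context length in tokens
        vocab_size: int = 8192  # BPE vocabulary size from the tokenizer above

    config = GPTConfig()
    assert config.n_embd % config.n_head == 0  # head dimension = 512 / 8 = 64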

Training Result

  • Best validation loss: 1.3451 at step 95000
  • Final validation loss: 1.3644
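
The gap between the best and final loss implies the usual keep-the-best checkpoint selection. A minimal sketch of that logic, with the saving helper left as a hypothetical stand-in:

    import math

    best_val = math.inf

    def maybe_save_best(step, val_loss):
        """Keep only the checkpoint with the lowest validation loss so far."""
        global best_val
        if val_loss < best_val:
            best_val = val_loss
            print(f"new best {val_loss:.4f} at step {step}; saving checkpoint")
            # save_checkpoint(model, f"best-step{step}.pt")  # hypothetical helper

    # e.g. maybe_save_best(95000, 1.3451) records this card's best loss.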

Known Limitations

  • Output quality is still inconsistent on longer or more expository prompts.
  • The model can drift into templated educational language.
  • This is still an experimental local model, not a production assistant.