pikoGPT-16M

The first and smallest version of GPT-2 that I trained, for fun and education.

It is not intended to be a usable model, just a pathfinder project for me to learn from.

Training

Trained on a single RTX 3090 for ~20k steps using the train_gpt2.py script from Karpathy's llm.c. The dataset is edu_fineweb10B, as prepared by the same repo.
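For a sense of scale, the parameter count of a GPT-2-style model follows directly from its configuration. The sketch below is a minimal illustration, assuming the standard GPT-2 layout (learned positional embeddings, a 4x-wide MLP, and output weights tied to the token embeddings). The function name gpt2_param_count is hypothetical, and the exact configuration of pikoGPT-16M is not stated in this card, so the example only checks the formula against the well-known 124M GPT-2 configuration.

```python
def gpt2_param_count(vocab_size: int, block_size: int, n_embd: int, n_layer: int) -> int:
    """Count parameters of a GPT-2-style model with tied input/output embeddings."""
    wte = vocab_size * n_embd          # token embedding table (shared with the output head)
    wpe = block_size * n_embd          # learned positional embeddings
    layer_norm = 2 * n_embd            # scale and bias
    # Attention: fused QKV projection plus output projection, each with a bias.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back, each with a bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    per_block = 2 * layer_norm + attn + mlp
    # The trailing layer_norm term is the final LayerNorm after the last block.
    return wte + wpe + n_layer * per_block + layer_norm


# Sanity check against the smallest official GPT-2 configuration.
print(gpt2_param_count(vocab_size=50257, block_size=1024, n_embd=768, n_layer=12))  # 124439808
```

Shrinking n_embd and n_layer reduces the per-block term quickly, while the token embedding table shrinks only linearly with n_embd, so in very small models the embeddings tend to dominate the total.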

