HuggingFaceFW/fineweb-edu
The first and smallest version of GPT-2, trained for fun and education.
It is not intended to be a proper model, just a pathfinder for me to learn.
Trained on a single RTX 3090 for ~20k steps with the train_gpt2.py script from Karpathy's llm.c. The dataset is edu_fineweb10B from the aforementioned repo.