HuggingFaceFW/fineweb
Following https://github.com/KellerJordan/modded-nanogpt for fun and learning.
baseline/
nvidia-smi)
To experimentally check the neural scaling law:
(Fitted line: log y = -0.11 * log x + 0.9 where x is step (0 to 3200) and y is the training loss)
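A line of this form can be obtained by ordinary least squares in log-log space. The sketch below is only an illustration, not the script used for this run: the helper names `fit_loglog` and `predicted_loss` are hypothetical, the logarithm is assumed to be base 10 (the card does not state the base), and the data is synthetic, generated from the fitted line itself rather than taken from the actual training log. Step 0 has to be skipped because log 0 is undefined.

```python
import math

def fit_loglog(steps, losses):
    """Least-squares fit of log10(y) = a * log10(x) + b. Returns (a, b)."""
    # Drop non-positive values: log is undefined there (e.g. step 0).
    pts = [(math.log10(x), math.log10(y))
           for x, y in zip(steps, losses) if x > 0 and y > 0]
    n = len(pts)
    mean_x = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in pts)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts)
    a = sxy / sxx              # slope (scaling exponent)
    b = mean_y - a * mean_x    # intercept
    return a, b

def predicted_loss(step, a=-0.11, b=0.9):
    """Loss implied by the fitted line, assuming base-10 logs."""
    return 10 ** (a * math.log10(step) + b)

# Synthetic data generated from the fitted line (NOT the real training log).
steps = list(range(0, 3201, 100))
losses = [predicted_loss(s) if s > 0 else float("nan") for s in steps]

# Skip step 0 explicitly before fitting.
a, b = fit_loglog(steps[1:], losses[1:])
print(f"slope={a:.3f}, intercept={b:.3f}")
print(f"predicted loss at step 3200: {predicted_loss(3200):.2f}")
```

Under the base-10 assumption, the line corresponds to the power law y = 10^0.9 * x^(-0.11), which gives a loss of roughly 3.27 at step 3200. With natural logs the same coefficients would imply a much lower loss, so the base matters when reading the fit.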
Demo available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo
(WIP)
Base model
openai-community/gpt2