---
library_name: transformers
license: mit
datasets:
- HuggingFaceFW/fineweb-edu
---
This is a GPT-2 model trained from scratch on FineWeb-Edu for 330K steps at a batch size of 1M, i.e. around 300B tokens.
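
As a quick sanity check on the token count, the sketch below multiplies the step count by the batch size. It assumes the 1M batch size is measured in tokens, which this card does not state explicitly; the function and constant names are illustrative only.

```python
# Rough token budget implied by the training figures in this card.
STEPS = 330_000              # training steps stated in this card
TOKENS_PER_STEP = 1_000_000  # assumption: the "1M batch size" is counted in tokens


def total_tokens(steps: int, tokens_per_step: int) -> int:
    """Return the number of tokens seen after `steps` training steps."""
    return steps * tokens_per_step


if __name__ == "__main__":
    total = total_tokens(STEPS, TOKENS_PER_STEP)
    print(f"{total / 1e9:.0f}B tokens")  # prints "330B tokens"
```

Under that assumption the product is 330B tokens, which is in line with the "around 300B" figure quoted above.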
### Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub.
- Developed by: Ameer H
- Shared by: Ameer H
- Model type: GPT-2
- Language(s) (NLP): English
- License: MIT
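
Since this is a 🤗 transformers model, it can probably be used through the standard text-generation pipeline. The following is a minimal sketch, assuming the repository contains GPT-2 weights and a tokenizer in the usual transformers layout; `generate_text` is a hypothetical helper, and `<user>/<model-name>` is a placeholder because this card does not state the repository id.

```python
def generate_text(repo_id: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a continuation of `prompt` with the Hub model at `repo_id`."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    # Assumption: the repo loads with the standard text-generation pipeline.
    generator = pipeline("text-generation", model=repo_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


# Example call; "<user>/<model-name>" is a placeholder, not the real repo id.
# print(generate_text("<user>/<model-name>", "The water cycle begins when"))
```

Sampling is enabled here because greedy decoding from a base language model tends to repeat itself; adjust `max_new_tokens` to taste.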
### Bias, Risks, and Limitations
The model may produce incoherent text and unintended offensive output, including racial or other slurs. Please do not hold the author responsible; this is just an experiment.
Forked from Andrej Karpathy's original model.