| | --- |
| | library_name: transformers |
| | license: mit |
| | datasets: |
| | - HuggingFaceFW/fineweb-edu |
| | --- |
| | |
| | This is a GPT-2 model trained from scratch on FineWeb-EDU for 330K steps (batch size of 1M), i.e. around 300B tokens. |
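As a rough sanity check on the token count, the arithmetic can be sketched as below. It assumes the "1M batch size" means exactly 1,000,000 tokens per optimizer step, which the card does not state; the variable names are illustrative only.

```python
# Assumption: "1M batch size" is read as 1,000,000 tokens per step.
steps = 330_000           # training steps reported in the card
batch_size = 1_000_000    # tokens per step (assumed unit)

# Total tokens seen during training is steps times tokens per step.
total_tokens = steps * batch_size

print(f"{total_tokens / 1e9:.0f}B tokens")  # prints "330B tokens"
```

Under this reading the run covers 330B tokens, which is consistent with the "around 300B" figure quoted above.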
| |
|
| | ### Model Description |
| | This is the model card of a 🤗 transformers model that has been pushed to the Hub. |
| |
|
| | Developed by: Ameer H |
| | Shared by [optional]: Ameer H |
| | Model type: GPT2 |
| | Language(s) (NLP): English |
| | License: MIT |
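Since the card lists `transformers` as the library and GPT2 as the model type, loading the checkpoint might look like the sketch below. The card does not give the Hub repo id, so `repo_id` is left as a parameter you must supply, and the helper name `generate_sample` is hypothetical. The sketch also assumes the repo ships tokenizer files and that PyTorch is installed.

```python
def generate_sample(repo_id: str, prompt: str, max_new_tokens: int = 30) -> str:
    """Generate a short continuation of `prompt` with a causal LM from the Hub.

    `repo_id` is a placeholder for this model's Hub id, which the card
    does not state.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo contains both tokenizer files and model weights.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Sample a continuation; GPT-2 has no pad token, so reuse the EOS id.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_sample("<your-repo-id>", "The water cycle is")` downloads the weights on first use. Sampling is random, so the output differs between runs.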
| |
|
| | ### Bias, Risks, and Limitations |
| | The model can produce incoherent text and unintended offensive output, including racial slurs. Please do not blame the author: this is just an experiment. |
| |
|
| | Forked from Andrej Karpathy's original model. |