Update README.md
README.md
CHANGED
@@ -1,6 +1,10 @@
 ---
 library_name: transformers
-
+license: mit
+datasets:
+- HuggingFaceFW/fineweb-edu
 ---
 
 This is a GPT-2 model trained for 330K steps from scratch (of 1M batch size) on FineWeb-EDU i.e around 300B Tokens.
+
+Forked from Andrej Karparthy's original model.
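Note on the token count in the README text: the figure of "around 300B Tokens" can be sanity-checked from the step count. The sketch below assumes that "1M batch size" means roughly one million tokens per optimizer step; the README does not state the exact value, so `tokens_per_step` is an illustrative assumption and the result is only an estimate.

```python
# Rough estimate of training tokens from the numbers quoted in the README.
steps = 330_000              # "330K steps", from the README
tokens_per_step = 1_000_000  # ASSUMPTION: "1M batch size" read as ~1M tokens per step

total_tokens = steps * tokens_per_step
print(f"{total_tokens / 1e9:.0f}B tokens")  # prints "330B tokens"
```

Under that assumption the product is 330B tokens, which is in line with the README's "around 300B". If the true batch size differs (for example, a power of two close to 1M), the total shifts proportionally.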