# Training a Nano GPT from scratch

This repo contains code for training a nano GPT from scratch on any dataset. The implementation is taken from Andrej Karpathy's [nanoGPT repo](https://github.com/karpathy/nanoGPT/tree/master). The GitHub repo with the notebooks used for model training can be found [here](https://github.com/mkthoma/nanoGPT).

## Model Architecture

The Bigram Language Model is based on the Transformer architecture, which has been widely adopted in natural language processing because of its ability to capture long-range dependencies in sequential data. Here's a detailed explanation of each component in the model:
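Before walking through the components, it helps to see the simplest version of the model in code. The sketch below follows the bigram baseline from Karpathy's nanoGPT lectures: a single embedding table of shape `(vocab_size, vocab_size)` maps each token directly to the logits for the next token, with no attention involved. Treat it as a minimal illustration, not the full model trained in this repo; the `vocab_size` used below is an arbitrary placeholder.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F


class BigramLanguageModel(nn.Module):
    """Baseline: next-token logits depend only on the current token."""

    def __init__(self, vocab_size):
        super().__init__()
        # Each token reads off the logits for the next token from a lookup table.
        self.token_embedding_table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        # idx: (B, T) tensor of token indices -> logits: (B, T, vocab_size)
        logits = self.token_embedding_table(idx)
        loss = None
        if targets is not None:
            B, T, C = logits.shape
            # Cross-entropy expects (N, C) logits and (N,) targets.
            loss = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))
        return logits, loss

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        # Autoregressively sample one token at a time from the model.
        for _ in range(max_new_tokens):
            logits, _ = self(idx)
            probs = F.softmax(logits[:, -1, :], dim=-1)  # last time step only
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx


if __name__ == "__main__":
    vocab_size = 65  # placeholder; e.g. the character vocabulary of the dataset
    model = BigramLanguageModel(vocab_size)
    xb = torch.randint(0, vocab_size, (4, 8))  # a dummy (batch, time) input
    logits, loss = model(xb, xb)
    print(logits.shape, loss.item())
```

The full model replaces this lookup table with token and positional embeddings followed by stacked Transformer blocks, but the `forward`/`generate` interface stays the same.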