Update README.md
README.md
CHANGED
@@ -50,7 +50,7 @@ The model was trained on the TinyStories dataset, which consist of synthetic sho

 ### Training Procedure

-The model was trained from scratch on a **NVIDIA T4** GPU for around 3 hours to achieve a loss of
+The model was trained from scratch on a **NVIDIA T4** GPU for around 3 hours to achieve a loss of `2.17`. The model was trained for `0.22` epochs estimating around `55K` steps.
 We used **EleutherAI/gpt-neo-125M** tokenizer model training and inference.

 #### Training Hyperparameters
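
The figures in the added line (`0.22` epochs, around `55K` steps, around 3 hours) can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function name `implied_training_stats` is hypothetical, the inputs are the rounded values quoted in the README, and the outputs are rough estimates rather than measured numbers.

```python
def implied_training_stats(steps: float, epochs: float, hours: float) -> dict:
    """Derive rough per-epoch and throughput figures from reported totals."""
    if epochs <= 0 or hours <= 0:
        raise ValueError("epochs and hours must be positive")
    return {
        # Steps a full pass over the dataset would take at the same batch size.
        "steps_per_epoch": steps / epochs,
        # Average optimizer steps per second over the whole run.
        "steps_per_second": steps / (hours * 3600),
    }


# Rounded values taken from the README line; all three are approximate.
stats = implied_training_stats(steps=55_000, epochs=0.22, hours=3)
print(f"~{stats['steps_per_epoch']:,.0f} steps per full epoch")
print(f"~{stats['steps_per_second']:.1f} steps per second")
```

Under these assumptions, a full epoch would take roughly 250K steps, and the run averaged about 5 steps per second on the T4.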