Update README.md
README.md
CHANGED
@@ -764,11 +764,11 @@ if __name__ == "__main__":
 
 # 5. Our training results
 We did the pretraining on a single RTX 5060 Ti 16GB for 30,000 iterations for ~3 days.
-Out final `val loss` value was **
+Out final `val loss` value was **2.8175** and our final `train loss` was **2.8008**.
 
 # 6. Thanks to...
 1. Andrej Karpathy for his nanoGPT Code and his YouTube Videos in the make-mode-series
-2.
+2. HuggingfaceTW for the Fineweb-Edu-10BT-Sample Training Dataset
 3. Yahma for the alpaca-cleaned dataset for the finetuning
 4. My dad for his support <3
 5. My GPU for training and running my new model ;-)