```

# 3. Finetuning

To finetune your model to answer your questions, run this code to prepare the finetuning data:
```python
import os
import numpy as np
# … (the remainder of the script is elided in this excerpt)
```
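Since the rest of the data-preparation script is elided here, the following is only a hypothetical sketch of what nanoGPT-style finetuning data prep can look like: instruction/answer pairs are rendered into a prompt template, tokenized, and written as flat `uint16` binary files. The two-example dataset, the prompt template, the byte-level `encode` stand-in (a real script would more likely use the GPT-2 BPE tokenizer), and the file names are all illustrative assumptions, not the repository's actual code.

```python
import os
import numpy as np

# Illustrative stand-in for an instruction dataset such as yahma/alpaca-cleaned.
pairs = [
    {"instruction": "What is 2 + 2?", "output": "4"},
    {"instruction": "Name a primary color.", "output": "Red"},
]

def encode(text):
    # Byte-level encoding as a dependency-free placeholder for a real tokenizer.
    return list(text.encode("utf-8"))

tokens = []
for p in pairs:
    # Hypothetical prompt template; the actual finetuning format is not shown above.
    sample = f"### Question:\n{p['instruction']}\n### Answer:\n{p['output']}\n"
    tokens.extend(encode(sample))

# nanoGPT stores token ids as uint16 and memory-maps these files during training.
arr = np.array(tokens, dtype=np.uint16)
split = int(0.9 * len(arr))  # simple 90/10 train/val split
arr[:split].tofile("train.bin")
arr[split:].tofile("val.bin")
print(f"wrote {split} train tokens, {len(arr) - split} val tokens")
```

The resulting `train.bin`/`val.bin` pair can then be read back with `np.fromfile(..., dtype=np.uint16)` during training.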

# 5. Our training results
## 5.1 Pretraining results
We did the pretraining on a single RTX 5060 Ti 16GB for 30,000 iterations over ~3 days.
Our final `val loss` was **3.0450** and our final `train loss` was **3.0719**.
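For a sense of scale, a quick back-of-the-envelope calculation on the numbers above (assuming ~3 days means roughly 3 days of continuous wall-clock training):

```python
iterations = 30_000
wall_clock_s = 3 * 24 * 3600  # ~3 days in seconds
per_iter = wall_clock_s / iterations
print(f"{per_iter:.2f} s/iteration")  # roughly 8.6 seconds per iteration
```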

We tested our finetuned model a lot:

…
--> Answer:
2. ...

# 7. Thanks to...

1. Andrej Karpathy for his nanoGPT code and his YouTube videos in the makemore series
2. HuggingFaceFW for the FineWeb-Edu-10BT-Sample training dataset
3. Yahma for the alpaca-cleaned dataset for the finetuning
4. My dad for his support
5. My GPU for training and running my new model ;-)

---
license: apache-2.0
datasets: