Update README.md
README.md (CHANGED)
@@ -7,6 +7,6 @@ Contains files for a Transformer model that answers 6-digit subtraction question
 This subtraction model has 3 layers, 4 attention heads, d-model = 510, d-head = 170.
 The subtraction model was initialised with a very-low-loss Addition model (2 layers, 3 attention heads, 9e-9 loss), before being trained for 45K epochs.
 
-The CoLab used to train the model is here: https://github.com/
+The CoLab used to train the model is here: https://github.com/apartresearch/Verified_addition/blob/main/assets/Accurate_Math_Train.ipynb
 
-The CoLab used to analyse the model is here: https://github.com/
+The CoLab used to analyse the model is here: https://github.com/apartresearch/Verified_addition/blob/main/assets/Accurate_Math_Analyse.ipynb
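The README line above fixes the model's key hyperparameters. A minimal sketch of that configuration follows (a hypothetical dataclass for illustration, not code from the repository). One detail worth noting: n_heads × d_head = 4 × 170 = 680, which exceeds d_model = 510; this is legal in a standard Transformer because the output projection maps the concatenated head outputs back down to d_model.

```python
from dataclasses import dataclass


@dataclass
class SubtractionModelConfig:
    """Hyperparameters as stated in the README (class and field names are illustrative)."""
    n_layers: int = 3
    n_heads: int = 4
    d_model: int = 510
    d_head: int = 170


cfg = SubtractionModelConfig()

# Per-layer attention weight count under the usual parameterisation:
# W_Q, W_K, W_V each map d_model -> n_heads * d_head, and W_O maps
# n_heads * d_head back to d_model (biases omitted).
attn_params_per_layer = 4 * cfg.d_model * cfg.n_heads * cfg.d_head
print(attn_params_per_layer)  # 1387200 attention weights per layer
```

Across the 3 layers this gives roughly 4.2M attention parameters, before counting embeddings and MLPs.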