---
license: apache-2.0
---

Contains the file for a Transformer model that answers 5-digit addition questions (e.g. 12345+67890=) with near-zero loss.

The model has answered 1 million addition questions without any errors.
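
The question format and the error count above can be illustrated with a small sketch. It is not the code from the notebooks: `make_question`, `expected_answer`, `count_errors`, and `reference_answer_fn` are hypothetical names, and the zero-padded 6-character answer is an assumption made here for illustration. The stand-in answer function simply adds the operands exactly, so a real model's prediction function would need to be substituted for it.

```python
import random

N_DIGITS = 5                 # operand width stated in this model card
ANSWER_WIDTH = N_DIGITS + 1  # assumption: answers are zero-padded to 6 characters


def make_question(a: int, b: int) -> str:
    """Format two operands as a prompt such as '12345+67890='."""
    return f"{a:0{N_DIGITS}d}+{b:0{N_DIGITS}d}="


def expected_answer(a: int, b: int) -> str:
    """Ground-truth sum as a zero-padded string (padding is an assumption)."""
    return f"{a + b:0{ANSWER_WIDTH}d}"


def count_errors(answer_fn, n_questions: int, seed: int = 0) -> int:
    """Ask n_questions random questions and count wrong answers."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_questions):
        a = rng.randrange(10 ** N_DIGITS)
        b = rng.randrange(10 ** N_DIGITS)
        if answer_fn(make_question(a, b)) != expected_answer(a, b):
            errors += 1
    return errors


def reference_answer_fn(question: str) -> str:
    """Hypothetical stand-in for the model: parses the prompt and adds exactly."""
    left, right = question.rstrip("=").split("+")
    return f"{int(left) + int(right):0{ANSWER_WIDTH}d}"


print(make_question(12345, 67890))               # 12345+67890=
print(expected_answer(12345, 67890))             # 080235
print(count_errors(reference_answer_fn, 1000))   # 0
```

Passing a function that wraps the trained model in place of `reference_answer_fn`, with `n_questions=1_000_000`, would reproduce the kind of check described above.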

The model has 2 layers, 3 attention heads, d-model = 510, d-head = 170, and was trained for 30K epochs.
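
As a quick consistency check on these hyperparameters, the sketch below records them in a small config object and confirms that the three heads together span the full model width (3 × 170 = 510). The class name `AdditionModelConfig` and its methods are hypothetical, and the weight count assumes the standard Q, K, V, and output projections per layer without biases; the MLP width is not stated here, so it is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdditionModelConfig:
    """Hyperparameters as stated in this model card (class name is hypothetical)."""
    n_layers: int = 2
    n_heads: int = 3
    d_model: int = 510
    d_head: int = 170

    def heads_fill_residual(self) -> bool:
        """True when the concatenated heads match the model width."""
        return self.n_heads * self.d_head == self.d_model

    def attention_weight_count(self) -> int:
        """Attention weights only, assuming Q, K, V, O projections and no biases."""
        per_layer = 4 * self.d_model * self.n_heads * self.d_head
        return self.n_layers * per_layer


cfg = AdditionModelConfig()
print(cfg.heads_fill_residual())     # True
print(cfg.attention_weight_count())  # 2080800
```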

The Colab notebook used to train the model is here:

https://github.com/PhilipQuirke/transformer-maths/blob/main/assets/Accurate_Addition_Train.ipynb

The Colab notebook used to analyse the model is here:

https://github.com/PhilipQuirke/transformer-maths/blob/main/assets/Accurate_Addition_Analyse.ipynb