---
license: apache-2.0
---


# 🧮 NanoCalc-1M

**NanoCalc-1M** is an ultra-compact, character-level Seq2Seq Transformer based on the T5 architecture, trained specifically on the four basic arithmetic operations (`+`, `-`, `*`, `/`) with operands of up to three digits.

* **Architecture:** T5-based Encoder-Decoder
* **Parameters:** 0.99M
* **Precision:** Mixed Precision (BF16/FP16)
* **Vocab:** Character-level (0-9, +, -, *, /, =)
* **Training Data:** 2,000,000 synthetic samples (3-digit arithmetic)
* **Max Input Length:** 20 tokens
* **Performance:** ~97% Accuracy on 4-operation math (Validation Set)
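
To illustrate the character-level vocabulary listed above, here is a minimal sketch of an encoder and decoder. The `<pad>` and `<eos>` tokens, the ID order, and the `encode`/`decode` names are assumptions made for illustration; the tokenizer actually shipped in `use.py` may differ.

```python
# Hypothetical character-level tokenizer; special tokens and ID order are assumptions.
SPECIALS = ["<pad>", "<eos>"]
CHARS = list("0123456789+-*/=")  # the vocabulary listed in the model card
VOCAB = SPECIALS + CHARS
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
MAX_INPUT_LEN = 20  # "Max Input Length" from the model card


def encode(text: str, max_len: int = MAX_INPUT_LEN) -> list[int]:
    """Map each character to an ID, append <eos>, and right-pad to max_len."""
    ids = [TOKEN_TO_ID[ch] for ch in text] + [TOKEN_TO_ID["<eos>"]]
    if len(ids) > max_len:
        raise ValueError(f"input longer than {max_len} tokens")
    return ids + [TOKEN_TO_ID["<pad>"]] * (max_len - len(ids))


def decode(ids: list[int]) -> str:
    """Invert encode(): stop at <eos> and skip padding."""
    chars = []
    for i in ids:
        tok = VOCAB[i]
        if tok == "<eos>":
            break
        if tok != "<pad>":
            chars.append(tok)
    return "".join(chars)
```

With only 15 characters plus two special tokens, the embedding table stays tiny, which is one reason a model of this kind can fit in under a million parameters.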

## Performance Chart
| Epoch | Training Loss | Val Accuracy | Status |
| :--- | :--- | :--- | :--- |
| 1 | 1.1420 | 54.89% | 🔴 Learnt Format |
| 2 | 0.3931 | 78.79% | 🟡 Learnt Digits |
| 5 | 0.1638 | 91.91% | 🟢 Learning subtleties |
| 9 | 0.1051 | 97.15% | 🔵 High Precision |
| **10** | **0.1004** | **97.73%** | 🚀 **Near Perfect** |

## How to use
To use this model, download `model.pt` and `use.py` into the same directory, then run `use.py` with Python 3 on any device.

## Examples
Model loaded (Accuracy: 97.73% from epoch 10)

--- Mini Math Model interactive ---
Enter an arithmetic task (e.g. 15*15) or type 'exit' to quit this.

Task >  0*567
Model: 0 | Correct: 0 ✅

Task >  999+999
Model: 1998 | Correct: 1998 ✅

Task >  1/1
Model: 1 | Correct: 1 ✅

Task >  1684*8787
Model: 6398 | Correct: 14797308 ❌

Task >  124*598
Model: 2452 | Correct: 74152 ❌

Task >  12/68
Model: 4 | Correct: 0 ❌

Task >  123*123
Model: 499 | Correct: 15129 ❌

Task >  47*5
Model: 235 | Correct: 235 ✅

Task >  456+125
Model: 581 | Correct: 581 ✅

Task >  957-234
Model: 723 | Correct: 723 ✅

Task >  120-7650
Model: -550 | Correct: -7530 ❌

Task >  450-750
Model: -300 | Correct: -300 ✅

Task >  453-97
Model: 356 | Correct: 356 ✅

Task >  129-462
Model: -333 | Correct: -333 ✅

Task >  8*8
Model: 64 | Correct: 64 ✅

Task >  54*54
Model: 2916 | Correct: 2916 ✅

Task >  102*78
Model: 748 | Correct: 7956 ❌

Task >  74*9
Model: 666 | Correct: 666 ✅

Task >  103-34
Model: 69 | Correct: 69 ✅

## Overall accuracy
The overall validation accuracy after 10 epochs of training is ~97% (97.73% at epoch 10) on tasks whose operands have at most 3 digits each, such as `74*9` or `103-34`.
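
The model card does not say how accuracy is scored. A minimal sketch, assuming an exact string match between the model output and the correct answer (the function name `exact_match_accuracy` is hypothetical):

```python
def exact_match_accuracy(predictions: list[str], targets: list[str]) -> float:
    """Fraction of predictions that equal their target exactly (assumed metric)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return correct / len(targets)


# Three pairs taken from the Examples section: 47*5, 456+125, 102*78.
preds = ["235", "581", "748"]
golds = ["235", "581", "7956"]
print(f"{exact_match_accuracy(preds, golds):.2%}")
```

Under this metric an answer that is off by a single digit counts as fully wrong, so it is a strict measure for arithmetic.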

## Limitations
The model can't do:
- Tasks with an operand of more than 3 digits, like `3984-125`
- Multiplication tasks with an operand above 99, like `293*21`
- Complex tasks
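
The first two limits above can be checked before a task is sent to the model. The following is a minimal sketch of such a guard; `check_task` and its messages are hypothetical and not part of `use.py`.

```python
import re

# One binary operation on two non-negative integers, e.g. "74*9".
TASK_RE = re.compile(r"^(\d+)([+\-*/])(\d+)$")


def check_task(task: str) -> list[str]:
    """Return the reasons a task falls outside the documented range (empty if none)."""
    m = TASK_RE.match(task.strip())
    if m is None:
        return ["not a single binary operation on non-negative integers"]
    a, op, b = m.group(1), m.group(2), m.group(3)
    problems = []
    if len(a) > 3 or len(b) > 3:
        problems.append("operand with more than 3 digits")
    if op == "*" and (int(a) > 99 or int(b) > 99):
        problems.append("multiplication operand above 99")
    return problems


print(check_task("74*9"))
print(check_task("293*21"))
```

This is a conservative heuristic: some flagged inputs, such as `0*567` in the Examples section, are still answered correctly.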

## Training
We trained for 10 epochs (~20 minutes on Kaggle with 2x T4 GPUs) on 2 million randomly generated training samples.
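
The generator used for training is not published here. As a rough sketch of how such random samples could be produced, the code below assumes uniform sampling of operands from 0 to 999, integer division (suggested by `12/68` having the correct answer `0` in the Examples section), and a non-zero divisor. The names `make_sample` and `make_dataset` are hypothetical.

```python
import random


def make_sample(rng: random.Random, max_operand: int = 999) -> tuple[str, str]:
    """Return one (task, answer) pair such as ("47*5", "235")."""
    op = rng.choice("+-*/")
    a = rng.randint(0, max_operand)
    # Assumption: avoid division by zero by drawing the divisor from 1..max_operand.
    b = rng.randint(1 if op == "/" else 0, max_operand)
    if op == "+":
        result = a + b
    elif op == "-":
        result = a - b
    elif op == "*":
        result = a * b
    else:
        result = a // b  # assumption: integer (floor) division
    return f"{a}{op}{b}", str(result)


def make_dataset(n: int, seed: int = 0) -> list[tuple[str, str]]:
    """Generate n samples reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [make_sample(rng) for _ in range(n)]


for task, answer in make_dataset(3):
    print(task, "=", answer)
```

Calling `make_dataset(2_000_000)` would match the sample count stated above, though the real data distribution may differ, for example in how multiplication operands were sampled.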

## Final thoughts
We may release an improved version that can solve more complex tasks and much more. Stay tuned!