---
license: apache-2.0
---

# 🧮 NanoCalc-1M

**NanoCalc-1M** is an ultra-compact, character-level Seq2Seq Transformer based on the T5 architecture, trained specifically to perform arithmetic on operands of up to three digits.

* **Architecture:** T5-based Encoder-Decoder
* **Parameters:** 0.99M
* **Precision:** Mixed Precision (BF16/FP16)
* **Vocab:** Character-level (`0`-`9`, `+`, `-`, `*`, `/`, `=`)
* **Training Data:** 2,000,000 synthetic samples (3-digit arithmetic)
* **Max Input Length:** 20 tokens
* **Performance:** ~97% Accuracy on 4-operation math (Validation Set)

## Performance Chart

| Epoch | Training Loss | Val Accuracy | Status |
| :--- | :--- | :--- | :--- |
| 1 | 1.1420 | 54.89% | 🔴 Learnt Format |
| 2 | 0.3931 | 78.79% | 🟡 Learnt Digits |
| 5 | 0.1638 | 91.91% | 🟢 Learning subtleties |
| 9 | 0.1051 | 97.15% | 🔵 High Precision |
| **10** | **0.1004** | **97.73%** | 🚀 **Near Perfect** |

## How to use

Download `model.pt` and `use.py`, then run `use.py` on any device with Python 3.

## Examples

Sample interactive session:

`Model loaded (Accuracy: 97.73% from epoch 10)`

`--- Mini Math Model interactive ---`

`Enter an arithmetic task (e.g. 15*15) or type 'exit' to quit this.`


Tasks entered in the session, with the model's answer and the correct result:

| Task | Model | Correct | Result |
| :--- | :--- | :--- | :--- |
| `0*567` | 0 | 0 | ✅ |
| `999+999` | 1998 | 1998 | ✅ |
| `1/1` | 1 | 1 | ✅ |
| `1684*8787` | 6398 | 14797308 | ❌ |
| `124*598` | 2452 | 74152 | ❌ |
| `12/68` | 4 | 0 | ❌ |
| `123*123` | 499 | 15129 | ❌ |
| `47*5` | 235 | 235 | ✅ |
| `456+125` | 581 | 581 | ✅ |
| `957-234` | 723 | 723 | ✅ |
| `120-7650` | -550 | -7530 | ❌ |
| `450-750` | -300 | -300 | ✅ |
| `453-97` | 356 | 356 | ✅ |
| `129-462` | -333 | -333 | ✅ |
| `8*8` | 64 | 64 | ✅ |
| `54*54` | 2916 | 2916 | ✅ |
| `102*78` | 748 | 7956 | ❌ |
| `74*9` | 666 | 666 | ✅ |
| `103-34` | 69 | 69 | ✅ |

## Overall accuracy

After 10 epochs of training, the overall accuracy is ~97% (97.73% on the validation set) for tasks whose operands have at most 3 digits each, such as `74*9` or `103-34`.

## Limitations

The model can't handle:

- Tasks with operands of more than 3 digits, like `3984-125`
- Multiplication tasks with numbers above 99, like `293*21`
- Complex tasks

## Training

We trained for 10 epochs (about 20 minutes on Kaggle with 2x T4 GPUs) on 2 million randomly generated training samples.

## Final thoughts

We may release an improved version that can solve much more complex tasks. Stay tuned!
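
## Appendix: data and vocabulary sketch

To make the character-level vocabulary and the synthetic training data described above more concrete, here is a minimal sketch in Python. It is not the code used to train NanoCalc-1M. The special tokens (`<pad>`, `<eos>`), the helper names (`encode`, `decode`, `reference_answer`, `make_sample`), the uniform sampling of operands, and the task format without a trailing `=` are all assumptions made for illustration. Integer division is also an assumption, inferred from the example session where the correct result of `12/68` is listed as 0.

```python
import random

# Assumed vocabulary: the model card lists 0-9, +, -, *, / and =.
# The special tokens are hypothetical and only serve this sketch.
SPECIALS = ["<pad>", "<eos>"]
CHARS = list("0123456789+-*/=")
VOCAB = SPECIALS + CHARS
STOI = {tok: i for i, tok in enumerate(VOCAB)}
ITOS = {i: tok for tok, i in STOI.items()}


def encode(text):
    """Map each character to its id and append an end-of-sequence id."""
    return [STOI[ch] for ch in text] + [STOI["<eos>"]]


def decode(ids):
    """Map ids back to characters, stopping at the first <eos>."""
    chars = []
    for i in ids:
        tok = ITOS[i]
        if tok == "<eos>":
            break
        if tok != "<pad>":
            chars.append(tok)
    return "".join(chars)


def reference_answer(a, op, b):
    """Compute the expected result for one task."""
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "/":
        return a // b  # assumption: integer division (12/68 -> 0)
    raise ValueError(f"unsupported operator: {op}")


def make_sample(rng, max_value=999):
    """Draw one random task with operands of at most 3 digits (assumed sampling)."""
    op = rng.choice("+-*/")
    a = rng.randint(0, max_value)
    # Avoid division by zero by drawing the divisor from 1..max_value.
    b = rng.randint(1 if op == "/" else 0, max_value)
    return f"{a}{op}{b}", str(reference_answer(a, op, b))


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        task, answer = make_sample(rng)
        print(task, "->", answer, "| ids:", encode(task))
```

With operands capped at 999, the longest task string (for example `999*999`) is 7 characters, which stays well inside the 20-token input limit stated above even after the hypothetical `<eos>` token is appended.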