---
license: apache-2.0
---
# 🧮 NanoCalc-1M

**NanoCalc-1M** is an ultra-compact, character-level Seq2Seq Transformer based on the T5 architecture, trained specifically to perform arithmetic operations with high precision.

* **Architecture:** T5-based Encoder-Decoder
* **Parameters:** 0.99M
* **Precision:** Mixed Precision (BF16/FP16)
* **Vocab:** Character-level (0-9, +, -, *, /, =)
* **Training Data:** 2,000,000 synthetic samples (3-digit arithmetic)
* **Max Input Length:** 20 tokens
* **Performance:** ~97% accuracy on 4-operation math (validation set)
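The vocabulary and training-data bullets above imply a simple generator. A minimal sketch of how such 3-digit samples could be produced (the card ships no generation script; `make_sample` and the integer-division convention are our assumptions, the latter inferred from the example session, where `12/68` is scored as `0`):

```python
import random

def make_sample(rng: random.Random) -> tuple[str, str]:
    """One synthetic task: two operands of up to 3 digits, one of + - * /."""
    a, b = rng.randint(0, 999), rng.randint(0, 999)
    op = rng.choice("+-*/")
    if op == "/":
        b = max(b, 1)  # avoid division by zero
    # Integer division is assumed here, matching the example session
    # where 12/68 is scored as 0.
    result = {"+": a + b, "-": a - b, "*": a * b, "/": a // b}[op]
    return f"{a}{op}{b}", str(result)

# Build a small demo batch; the real dataset used 2,000,000 samples.
rng = random.Random(0)
samples = [make_sample(rng) for _ in range(5)]
print(samples)
```

Every generated character stays inside the card's stated vocabulary, and the longest possible task (`999*999`) is 7 characters, comfortably under the 20-token input limit.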
## Performance Chart

| Epoch | Training Loss | Val Accuracy | Status |
| :--- | :--- | :--- | :--- |
| 1 | 1.1420 | 54.89% | 🔴 Learnt format |
| 2 | 0.3931 | 78.79% | 🟡 Learnt digits |
| 5 | 0.1638 | 91.91% | 🟢 Learning subtleties |
| 9 | 0.1051 | 97.15% | 🔵 High precision |
| **10** | **0.1004** | **97.73%** | 🚀 **Near perfect** |
## How to use
Download `model.pt` and `use.py`, then run `use.py` with Python 3. The model is small enough to run on virtually any device.
## Examples

```
Model loaded (Accuracy: 97.73% from epoch 10)

--- Mini Math Model interactive ---
Enter an arithmetic task (e.g. 15*15) or type 'exit' to quit.

Task > 0*567
Model: 0 | Correct: 0 ✅

Task > 999+999
Model: 1998 | Correct: 1998 ✅

Task > 1/1
Model: 1 | Correct: 1 ✅

Task > 1684*8787
Model: 6398 | Correct: 14797308 ❌

Task > 124*598
Model: 2452 | Correct: 74152 ❌

Task > 12/68
Model: 4 | Correct: 0 ❌

Task > 123*123
Model: 499 | Correct: 15129 ❌

Task > 47*5
Model: 235 | Correct: 235 ✅

Task > 456+125
Model: 581 | Correct: 581 ✅

Task > 957-234
Model: 723 | Correct: 723 ✅

Task > 120-7650
Model: -550 | Correct: -7530 ❌

Task > 450-750
Model: -300 | Correct: -300 ✅

Task > 453-97
Model: 356 | Correct: 356 ✅

Task > 129-462
Model: -333 | Correct: -333 ✅

Task > 8*8
Model: 64 | Correct: 64 ✅

Task > 54*54
Model: 2916 | Correct: 2916 ✅

Task > 102*78
Model: 748 | Correct: 7956 ❌

Task > 74*9
Model: 666 | Correct: 666 ✅

Task > 103-34
Model: 69 | Correct: 69 ✅
```
## Overall accuracy
After 10 epochs of training, overall accuracy is ~97% on tasks whose operands have at most 3 digits each, such as `74*9` or `103-34`.
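For reference, the "Correct" answers in the example session follow integer arithmetic (`12/68` is scored as `0`). A sketch of a scoring helper consistent with that convention (`reference_answer` and `is_correct` are our illustration, not code shipped with the model):

```python
def reference_answer(task: str) -> int:
    """Expected answer for a task like '74*9'; '/' is integer division,
    matching the session above where 12/68 is scored as 0."""
    for sym in "+*/":  # subtraction is handled last (see below)
        if sym in task:
            a, b = task.split(sym)
            return {"+": int(a) + int(b),
                    "*": int(a) * int(b),
                    "/": int(a) // int(b)}[sym]
    # Split on the last '-' so a leading minus sign would not confuse
    # the parser, even though operands here are non-negative.
    a, b = task.rsplit("-", 1)
    return int(a) - int(b)

def is_correct(task: str, model_output: str) -> bool:
    return model_output == str(reference_answer(task))

print(is_correct("74*9", "666"), is_correct("102*78", "748"))  # True False
```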
## Limitations
The model can't handle:
- Tasks with operands of more than 3 digits, like `3984-125`
- Multiplication tasks with numbers above 99, like `293*21`
- Complex, multi-step expressions
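Given those limits, callers may want to reject out-of-range input before querying the model. A hypothetical pre-check (the regex and thresholds simply mirror the limitations above; nothing like this ships with `use.py`):

```python
import re

# Tasks the card reports as reliable: two operands of at most 3 digits,
# and at most 2 digits each when multiplying.
TASK_RE = re.compile(r"^(\d{1,3})([+\-*/])(\d{1,3})$")

def in_supported_range(task: str) -> bool:
    m = TASK_RE.match(task)
    if not m:
        return False  # malformed, >3 digits, or a multi-step expression
    a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
    return not (op == "*" and (a > 99 or b > 99))

print(in_supported_range("74*9"), in_supported_range("293*21"))  # True False
```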
## Training
We trained for 10 epochs (~20 minutes on Kaggle's 2× T4 GPUs) with 2 million randomly generated training samples.
## Final thoughts
We may release an improved version that can solve far more complex tasks and much more... stay tuned!