---
license: apache-2.0
---
# TrueMath

A 1-layer TrueACT model trained to generate chain-of-thought reasoning for arithmetic expressions with `+`, `-`, `*`, and parentheses.

The model was trained from scratch exclusively on randomly generated binary-tree expressions. Every non-root subexpression is parenthesized, which forces the model to learn step-by-step reduction of nested arithmetic.
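The training-data format described above can be illustrated with a small generator. This is a minimal sketch, not the project's actual data pipeline: the function name `random_expr`, the operand range 0 to 99, the maximum depth, and the 30% early-leaf probability are all assumptions chosen for illustration.

```python
import random

OPS = "+-*"  # operators named in the model description


def random_expr(depth: int, rng: random.Random, is_root: bool = True) -> str:
    """Build a random binary-tree expression as a string.

    Every non-root subexpression is wrapped in parentheses; the root is
    left bare. Operand range and leaf probability are assumed values.
    """
    # Stop at a leaf when the depth budget is spent, or (below the root)
    # with some probability so that trees are not always full.
    if depth == 0 or (not is_root and rng.random() < 0.3):
        return str(rng.randint(0, 99))
    left = random_expr(depth - 1, rng, is_root=False)
    right = random_expr(depth - 1, rng, is_root=False)
    body = f"{left}{rng.choice(OPS)}{right}"
    return body if is_root else f"({body})"


# A prompt is the expression followed by "=".
print(random_expr(3, random.Random(0)) + "=")
```

Passing an explicit `random.Random` instance keeps generation reproducible, which helps when a fixed evaluation set has to be rebuilt later.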

## Performance

| Test | Accuracy |
|------|----------|
| Fixed 12-case benchmark | 12/12 (100%) |
| 500 random expressions | 458/500 (91.6%) |

All observed errors are multi-digit arithmetic mistakes (e.g., the model output `42×88=3524`, whereas the correct product is 3696). The model's structural reasoning (parenthesis resolution, operator precedence, chain-of-thought decomposition) is near-perfect.

## Files

| File | Description |
|------|-------------|
| `model.pt` | TorchScript-traced model for inference (no Python source needed) |
| `checkpoint.pt` | Original PyTorch checkpoint (requires the TrueACT architecture code to load) |
| `infer.py` | Standalone inference script |
| `plot.py` | Generates the training-curve figure from the training log |

## Usage

```bash
# Single prompt
python infer.py model.pt '((5*5)+(10*2))='

# Interactive mode
python infer.py model.pt
Prompt > (1+2)*3=
```

The model expects a prompt ending with `=` and generates the chain-of-thought steps:

```
Input: ((5*5)+(10*2))=
Output: ((5*5)+(10*2))=(25+(10*2))=(25+20)=45
```
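
To check a generated chain against ground truth, a reference reducer can be useful. The sketch below is not part of this repository: `reference_chain` is a hypothetical helper, and it assumes the chain reduces the leftmost innermost parenthesized pair at each step (which matches the example above) and that negative intermediate results are written with a leading `-`. The model's actual step order and formatting for other inputs may differ.

```python
import re

# Innermost parenthesized operation on two integer literals, e.g. "(10*2)".
_INNER = re.compile(r"\((-?\d+)([+\-*])(-?\d+)\)")
# A bare operation with no parentheses left, e.g. "3*3" at the root.
_ROOT = re.compile(r"(-?\d+)([+\-*])(-?\d+)")

_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def reference_chain(prompt: str) -> str:
    """Return the expression followed by each reduction step, joined by '='."""
    expr = prompt.rstrip("=")
    steps = [expr]
    while True:
        # Prefer the leftmost innermost parenthesized pair; fall back to a
        # bare root operation once no parentheses remain.
        m = _INNER.search(expr) or _ROOT.fullmatch(expr)
        if m is None:
            break
        a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
        expr = expr[:m.start()] + str(_OPS[op](a, b)) + expr[m.end():]
        steps.append(expr)
    return "=".join(steps)


print(reference_chain("((5*5)+(10*2))="))
```

Comparing only the text after the last `=` gives a final-answer accuracy, while comparing the full string also checks every intermediate step.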

## Requirements

- Python 3.10+
- PyTorch 2.0+

No other dependencies.