---
license: mit
tags:
- AI
- AnveshAI
---
# AnveshAI Edge

**A fully offline, hybrid AI assistant with chain-of-thought reasoning and a symbolic mathematics engine.**

AnveshAI Edge is a terminal-based AI assistant designed to run entirely on-device (CPU only). It combines a rule-based symbolic math engine, a local knowledge base, and a compact large language model into a unified hierarchical pipeline. A dedicated **chain-of-thought reasoning engine** decomposes every problem before the LLM is invoked, dramatically improving answer quality and reducing hallucinations.

---

## Table of Contents

1. [Architecture Overview](#architecture-overview)
2. [Benchmark Test & Evaluation Experiments](#benchmark-test--evaluation-experiments)
3. [Components](#components)
4. [Advanced Math Engine](#advanced-math-engine)
5. [Reasoning Engine](#reasoning-engine)
6. [Getting Started](#getting-started)
7. [Usage Examples](#usage-examples)
8. [Commands](#commands)
9. [Design Principles](#design-principles)
10. [File Structure](#file-structure)
11. [Technical Details](#technical-details)
12. [Links](#link)

---

## Architecture Overview

AnveshAI Edge uses a **hierarchical fallback pipeline** with reasoning at every non-trivial stage:

```
User Input
    β”‚
    β”œβ”€β”€ [/command]      ──► System Handler (instant)
    β”‚
    β”œβ”€β”€ [arithmetic]    ──► Math Engine (AST safe-eval, instant)
    β”‚
    β”œβ”€β”€ [advanced math] ──► Reasoning Engine: analyze()
    β”‚                          β”‚  (problem decomposition, strategy selection)
    β”‚                          β–Ό
    β”‚                       Advanced Math Engine (SymPy)
    β”‚                          β”‚  EXACT symbolic answer computed
    β”‚                          β–Ό
    β”‚                       Reasoning Engine: build_math_prompt()
    β”‚                          β”‚  (CoT plan embedded in LLM prompt)
    β”‚                          β–Ό
    β”‚                       LLM (Qwen2.5-0.5B)
    β”‚                          β””β–Ί Step-by-step explanation β†’ User
    β”‚
    β”œβ”€β”€ [knowledge]     ──► Knowledge Engine (local KB)
    β”‚                          β”œβ”€β”€ match found β†’ User
    β”‚                          └── no match:
    β”‚                               Reasoning Engine: analyze() + build_general_prompt()
    β”‚                                   β””β–Ί LLM with CoT context β†’ User
    β”‚
    └── [conversation]  ──► Conversation Engine (pattern rules)
                               β”œβ”€β”€ matched β†’ User
                               └── no match:
                                    Reasoning Engine: analyze() + build_general_prompt()
                                        β””β–Ί LLM with CoT context β†’ User
```
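The priority ordering above can be sketched as a toy classifier. This is purely illustrative: the function and regex patterns here are hypothetical stand-ins, and the real `router.py` rules are more extensive.

```python
# Hypothetical sketch of the priority-ordered intent check described above.
# Note the ordering: advanced math is tested BEFORE plain arithmetic.
import re

def classify(text: str) -> str:
    if text.startswith("/"):
        return "system"                       # instant system handler
    if re.search(r"\b(integrate|derivative|solve|limit|eigenvalue)\b", text):
        return "advanced_math"                # SymPy + CoT + LLM path
    if re.fullmatch(r"[\d\s+\-*/().^]+", text):
        return "math"                         # AST safe-eval path
    if text.rstrip("?").lower().startswith(("what is", "who is", "define")):
        return "knowledge"                    # local KB lookup first
    return "conversation"                     # pattern rules, then CoT+LLM

print(classify("integrate x^2"))      # advanced_math
print(classify("2 + 3 * (4^2)"))      # math
```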

## Benchmark Test & Evaluation Experiments

<img src="https://raw.githubusercontent.com/AnveshAI/AnveshAI-Edge/refs/heads/main/diagram/download%20(1).png" alt="Benchmark and evaluation results (1)">
<img src="https://raw.githubusercontent.com/AnveshAI/AnveshAI-Edge/refs/heads/main/diagram/download.png" alt="Benchmark and evaluation results (2)">


### Key Design Principle

> **Correctness-first:** For mathematics, the symbolic engine computes the exact answer *before* the LLM is called. The LLM's only task is to explain the working β€” it cannot invent a wrong answer.

---

## Components

| Module | File | Role |
|--------|------|------|
| **Intent Router** | `router.py` | Keyword + regex classifier. Outputs: `system`, `advanced_math`, `math`, `knowledge`, `conversation`. Checked in priority order β€” advanced math is always detected before simple arithmetic. |
| **Math Engine** | `math_engine.py` | Safe AST-based evaluator for plain arithmetic (`2 + 3 * (4^2)`). No `eval()` β€” uses a whitelist of allowed AST node types. |
| **Advanced Math Engine** | `advanced_math_engine.py` | SymPy symbolic computation engine. 31+ operation types. Returns `(success, result_str, latex_str)`. |
| **Reasoning Engine** | `reasoning_engine.py` | Chain-of-thought decomposer. Identifies problem type, selects strategy, generates ordered sub-steps, assigns confidence, flags warnings. Builds structured LLM prompts. |
| **Knowledge Engine** | `knowledge_engine.py` | Local knowledge-base lookup from `knowledge.txt`. Returns `(response, found: bool)`. |
| **Conversation Engine** | `conversation_engine.py` | Pattern-matching response rules from `conversation.txt`. Returns `(response, matched: bool)`. |
| **LLM Engine** | `llm_engine.py` | Lazy-loading `Qwen2.5-0.5B-Instruct` (GGUF, Q4_K_M, ~350 MB) via `llama-cpp-python`. CPU-only, no GPU required. |
| **Memory** | `memory.py` | SQLite-backed conversation history. Powers the `/history` command. |
| **Main** | `main.py` | Terminal REPL loop. Orchestrates all engines. Displays colour-coded output. |
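
The Math Engine row mentions AST safe-eval with a node whitelist. A minimal version of that pattern looks like the sketch below; the whitelist and error handling in the real `math_engine.py` may differ.

```python
# Sketch of an AST-based safe evaluator (hypothetical; illustrates the
# whitelist idea, not the actual math_engine.py implementation).
import ast
import operator

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate plain arithmetic without eval(); '^' is treated as power."""
    tree = ast.parse(expr.replace("^", "**"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed node: {type(node).__name__}")

    return walk(tree)

print(safe_eval("2 + 3 * (4^2)"))  # 50
```

Anything outside the whitelist (names, calls, attribute access) raises `ValueError`, which is what makes this safe against code injection.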

---

## Advanced Math Engine

The engine supports **31+ symbolic mathematics operations** across 11 categories:

### Calculus
| Operation | Example Input |
|-----------|--------------|
| Indefinite integration | `integrate x^2 sin(x)` |
| Definite integration | `definite integral of x^2 from 0 to 3` |
| Differentiation (any order) | `second derivative of sin(x) * e^x` |
| Limits (including ±∞) | `limit of sin(x)/x as x approaches 0` |

### Algebra & Equations
| Operation | Example Input |
|-----------|--------------|
| Equation solving | `solve x^2 - 5x + 6 = 0` |
| Factorisation | `factor x^3 - 8` |
| Expansion | `expand (x + y)^4` |
| Simplification | `simplify (x^2 - 1)/(x - 1)` |
| Partial fractions | `partial fraction 1/(x^2 - 1)` |
| Trig simplification | `simplify trig sin^2(x) + cos^2(x)` |

### Differential Equations
| Operation | Example Input |
|-----------|--------------|
| ODE solving (dsolve) | `solve differential equation y'' + y = 0` |
| First-order ODEs | `solve ode dy/dx = y` |

### Series & Transforms
| Operation | Example Input |
|-----------|--------------|
| Taylor / Maclaurin series | `taylor series of e^x around 0 order 6` |
| Laplace transform | `laplace transform of sin(t)` |
| Inverse Laplace transform | `inverse laplace of 1/(s^2 + 1)` |
| Fourier transform | `fourier transform of exp(-x^2)` |

### Linear Algebra
| Operation | Example Input |
|-----------|--------------|
| Determinant | `determinant of [[1,2],[3,4]]` |
| Matrix inverse | `inverse matrix [[2,1],[5,3]]` |
| Eigenvalues & eigenvectors | `eigenvalue [[4,1],[2,3]]` |
| Matrix rank | `rank of matrix [[1,2,3],[4,5,6]]` |
| Matrix trace | `trace of matrix [[1,2],[3,4]]` |

### Number Theory
| Operation | Example Input |
|-----------|--------------|
| GCD | `gcd of 48 and 18` |
| LCM | `lcm of 12 and 15` |
| Prime factorisation | `prime factorization of 360` |
| Modular arithmetic | `17 mod 5` |
| Modular inverse | `modular inverse of 3 mod 7` |

### Statistics
| Operation | Example Input |
|-----------|--------------|
| Descriptive stats | `mean of 2, 4, 6, 8, 10` |
| Standard deviation | `standard deviation of 1, 2, 3, 4, 5` |

### Combinatorics
| Operation | Example Input |
|-----------|--------------|
| Factorial | `factorial of 10` |
| Binomial coefficient | `binomial coefficient 10 choose 3` |
| Permutations | `permutation 6 P 2` |

### Summations & Products
| Operation | Example Input |
|-----------|--------------|
| Finite sum | `sum of k^2 for k from 1 to 10` |
| Infinite series | `summation of 1/n^2 for n from 1 to infinity` |

### Complex Numbers
| Operation | Example Input |
|-----------|--------------|
| All properties | `modulus of 3 + 4*I` |
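
Under the hood these operations map onto standard SymPy calls. The snippet below reproduces a few table entries directly in SymPy (the engine's actual wrapper functions and output formatting are not part of this README):

```python
# A few of the table entries above, expressed as plain SymPy calls.
import sympy as sp

x, n = sp.symbols("x n")

print(sp.integrate(x**2 * sp.sin(x), x))              # indefinite integration
print(sp.solve(x**2 - 5*x + 6, x))                    # [2, 3]
print(sp.limit(sp.sin(x) / x, x, 0))                  # 1
print(sp.summation(1 / n**2, (n, 1, sp.oo)))          # pi**2/6
```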

---

## Reasoning Engine

The **ReasoningEngine** adds structured chain-of-thought reasoning at every stage.

### Reasoning Pipeline (4 Stages)

**Stage 1 β€” Problem Analysis**
- Detects domain (calculus, linear algebra, statistics, physics, …)
- Classifies problem type (integration, ODE, comparative analysis, …)
- Identifies sub-questions implicit in the problem

**Stage 2 β€” Strategy Selection**
- Chooses the optimal solution method (u-substitution, L'HΓ΄pital, characteristic equation, …)
- Decomposes the problem into an ordered list of numbered reasoning steps

**Stage 3 β€” Verification & Confidence**
- Assigns confidence: `HIGH` (symbolic answer available), `MEDIUM`, or `LOW`
- Detects warnings: missing bounds, undetected variables, potential singularities

**Stage 4 β€” Prompt Engineering**
- Builds a structured LLM prompt that embeds the full reasoning plan
- For math: forces the LLM to follow the exact numbered steps toward the verified answer
- For knowledge: guides the LLM through the identified sub-questions in order

### ReasoningPlan Data Structure

```python
@dataclass
class ReasoningPlan:
    problem_type:    str        # e.g. "integration", "equation_solving"
    domain:          str        # e.g. "calculus", "linear_algebra"
    sub_problems:    list[str]  # ordered reasoning steps
    strategy:        str        # solution method description
    expected_form:   str        # what the answer should look like
    assumptions:     list[str]  # stated assumptions
    confidence:      str        # HIGH / MEDIUM / LOW
    warnings:        list[str]  # potential issues flagged
```

### Why Chain-of-Thought Matters for a Small LLM

The Qwen2.5-0.5B model has only 0.5 billion parameters. Without guidance it frequently:
- Skips algebraic steps
- Produces a plausible-looking but incorrect final answer
- Confuses method with result

By embedding a detailed reasoning plan in every prompt β€” and for math problems, by **providing the correct answer upfront** β€” the model's role becomes that of a *step-by-step explainer* rather than an *unsupervised solver*. This dramatically improves output quality without requiring a larger model.
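
A prompt built this way might look like the following sketch. The function name echoes `build_math_prompt()` from the pipeline, but the exact wording and format here are hypothetical:

```python
# Hypothetical sketch of embedding a reasoning plan plus the verified
# answer into the LLM prompt (the real build_math_prompt() may differ).
def build_math_prompt(question, steps, exact_answer):
    plan = "\n".join(f"Step {i}: {s}" for i, s in enumerate(steps, 1))
    return (
        f"Problem: {question}\n"
        f"The verified final answer is: {exact_answer}\n"
        "Explain the solution by following EXACTLY these steps. Do not "
        "state any final answer other than the one given.\n"
        f"{plan}"
    )

p = build_math_prompt(
    "integrate x^2 sin(x)",
    ["Apply integration by parts with u = x^2",
     "Apply integration by parts again",
     "Collect terms and add the constant C"],
    "-x**2*cos(x) + 2*x*sin(x) + 2*cos(x) + C",
)
print(p)
```

Because the verified answer appears before the instructions, even a small model is anchored to the correct result and only has to fill in the working.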

---

## Getting Started

### Requirements

```bash
pip install sympy llama-cpp-python colorama
```

The LLM model (`Qwen2.5-0.5B-Instruct-GGUF`, Q4_K_M, ~350 MB) is downloaded automatically from HuggingFace on first use and cached locally.
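
The "lazy loading" mentioned for `llm_engine.py` means the ~350 MB model is only loaded on the first LLM-requiring query, not at startup. A generic version of that pattern is sketched below with a stub loader; in the real module the loader would construct a `llama_cpp.Llama` instance instead.

```python
# Generic lazy-loading wrapper (sketch; llm_engine.py's actual API is
# not documented in this README).
class LazyModel:
    def __init__(self, loader):
        self._loader = loader    # e.g. a llama_cpp.Llama constructor call
        self._model = None

    def generate(self, prompt: str) -> str:
        if self._model is None:          # first call pays the load cost
            self._model = self._loader()
        return self._model(prompt)

# Stub loader standing in for the GGUF model:
llm = LazyModel(lambda: (lambda p: f"echo: {p}"))
print(llm.generate("hi"))  # echo: hi
```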

### Running

```bash
python main.py
```

---

## Usage Examples

```
You β€Ί integrate x^2 sin(x)

  SymPy β†’ ∫ (x**2*sin(x)) dx = -x**2*cos(x) + 2*x*sin(x) + 2*cos(x) + C
  Reasoning engine: decomposing problem…
  [Reasoning] domain=calculus | strategy=Apply integration rules (IBP) | confidence=HIGH
  Building chain-of-thought prompt β†’ LLM…

AnveshAI [AdvMath+CoT+LLM] β€Ί -x**2*cos(x) + 2*x*sin(x) + 2*cos(x) + C
  [Reasoning: integration | Apply integration rules (IBP)]
  Step 1: We use Integration by Parts twice…
  …
```

```
You β€Ί solve differential equation y'' + y = 0

AnveshAI [AdvMath+CoT+LLM] β€Ί Eq(y(x), C1*sin(x) + C2*cos(x))
  [Reasoning: ode_solving | Classify ODE and apply characteristic equation]
  Step 1: Classify the ODE β€” this is a 2nd-order linear homogeneous ODE…
  …
```

```
You β€Ί summation of 1/n^2 for n from 1 to infinity

AnveshAI [AdvMath+CoT+LLM] β€Ί Ξ£(n**(-2), n=1..oo) = pi**2/6
  [Reasoning: summation]
  Step 1: This is the Basel Problem…
```

---

## Commands

| Command | Action |
|---------|--------|
| `/help` | Show all commands and usage examples |
| `/history` | Display the last 10 conversation turns |
| `/exit` | Quit the assistant |

---

## Design Principles

1. **Correctness-first for mathematics** β€” The symbolic engine always runs before the LLM for mathematical queries. The LLM explains, it does not compute.

2. **Offline-first** β€” All computation runs locally. No API keys, no internet connection required after the one-time model download.

3. **Transparency** β€” The system prints its internal reasoning trace to the console (engine used, reasoning plan summary, confidence, warnings).

4. **Graceful degradation** β€” Every engine has a fallback: SymPy failures fall back to CoT-guided LLM, KB misses fall back to reasoning-guided LLM, and so on.

5. **Safety** β€” Arithmetic uses AST-based safe-eval (no `eval()`). Matrix parsing uses a validated bracket pattern before `eval()`. The LLM prompt explicitly forbids inventing a different answer.

6. **Modularity** β€” Every engine is independent and communicates through simple return types. Adding a new math operation requires only a new handler function and a keyword entry.

---

## File Structure

```
anveshai/
β”œβ”€β”€ main.py                 # REPL loop, orchestration, response composer
β”œβ”€β”€ router.py               # Intent classification (regex + keyword)
β”œβ”€β”€ math_engine.py          # Safe AST arithmetic evaluator
β”œβ”€β”€ advanced_math_engine.py # SymPy symbolic engine (31+ operations)
β”œβ”€β”€ reasoning_engine.py     # Chain-of-thought reasoning (CoT)
β”œβ”€β”€ llm_engine.py           # Qwen2.5-0.5B-Instruct GGUF loader
β”œβ”€β”€ knowledge_engine.py     # Local KB lookup (knowledge.txt)
β”œβ”€β”€ conversation_engine.py  # Pattern-response engine (conversation.txt)
β”œβ”€β”€ memory.py               # SQLite conversation history
β”œβ”€β”€ knowledge.txt           # Local knowledge base paragraphs
β”œβ”€β”€ conversation.txt        # PATTERN|||RESPONSE rule pairs
└── anveshai_memory.db      # Auto-created SQLite DB (gitignored)
```
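
The `conversation.txt` file uses a `PATTERN|||RESPONSE` line format. A loader for that format could look like the sketch below (hypothetical; `conversation_engine.py` may handle matching and malformed rows differently):

```python
# Sketch of a PATTERN|||RESPONSE rule loader for conversation.txt.
def load_rules(lines):
    rules = []
    for line in lines:
        line = line.strip()
        if not line or "|||" not in line:
            continue                       # skip blanks and malformed rows
        pattern, response = line.split("|||", 1)
        rules.append((pattern.strip().lower(), response.strip()))
    return rules

rules = load_rules(["hello|||Hi! How can I help?", "", "not a rule"])
print(rules)  # [('hello', 'Hi! How can I help?')]
```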

---

## Technical Details

### Model
- **Name:** Qwen2.5-0.5B-Instruct
- **Format:** GGUF (Q4_K_M quantisation, ~350 MB)
- **Runtime:** llama-cpp-python (CPU-only via llama.cpp)
- **Context window:** 16,384 tokens
- **Parameters:** 0.5B
- **Threads:** 4 CPU threads

### Symbolic Engine
- **Library:** SymPy 1.x
- **Parsing:** `sympy.parsing.sympy_parser` with implicit multiplication and XOR-to-power transforms
- **Supported variables:** x, y, z, t, n, k, a, b, c, m, p, q, r, s
- **Special constants:** Ο€, e, i (imaginary), ∞
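
The implicit-multiplication and XOR-to-power transforms correspond to standard `sympy.parsing.sympy_parser` transformations, so an input like `2x^2 + 3x` parses without explicit `*` or `**`:

```python
# Parsing with implicit multiplication and '^' treated as power,
# as described above (standard SymPy parser transformations).
from sympy.parsing.sympy_parser import (
    parse_expr,
    standard_transformations,
    implicit_multiplication_application,
    convert_xor,
)

T = standard_transformations + (implicit_multiplication_application, convert_xor)
e = parse_expr("2x^2 + 3x", transformations=T)
print(e)  # 2*x**2 + 3*x
```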

### Chain-of-Thought Reasoning
- **Domain detection:** 12 domain categories with keyword matching
- **Problem type classification:** 18 problem types via regex
- **Strategy library:** Pre-defined strategies for 18 problem types
- **Decomposition:** Problem-specific step generators for 15 operation types
- **Confidence levels:** HIGH (symbolic result available) / MEDIUM / LOW

### Response Labels

| Label | Meaning |
|-------|---------|
| `Math` | Instant arithmetic result |
| `AdvMath+CoT+LLM` | SymPy exact answer + CoT plan + LLM explanation |
| `AdvMath+CoT` | CoT-guided LLM fallback (SymPy failed) |
| `Knowledge` | Local KB answer |
| `LLM+CoT-KB` | KB miss β†’ reasoning-guided LLM |
| `Chat` | Conversation pattern match |
| `LLM+CoT` | Reasoning-guided LLM for open conversation |


## Link

- **GitHub:** [AnveshAI/AnveshAI-Edge](https://github.com/AnveshAI/AnveshAI-Edge)
- **Zenodo:** [Record 19045466](https://zenodo.org/records/19045466)

