developeranveshraman commited on
Commit
bc18020
Β·
verified Β·
1 Parent(s): 5d8fd4f

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +365 -3
README.md CHANGED
@@ -1,3 +1,365 @@
1
- ---
2
- license: mit
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: mit
3
+ tags:
4
+ - AI
5
+ - AnveshAI
6
+ ---
7
+ # AnveshAI Edge
8
+
9
+ **A fully offline, hybrid AI assistant with chain-of-thought reasoning and a symbolic mathematics engine.**
10
+
11
+ AnveshAI Edge is a terminal-based AI assistant designed to run entirely on-device (CPU only). It combines a rule-based symbolic math engine, a local knowledge base, and a compact large language model into a unified hierarchical pipeline. A dedicated **chain-of-thought reasoning engine** decomposes every problem before the LLM is invoked, dramatically improving answer quality and reducing hallucinations.
12
+
13
+ ---
14
+
15
+ ## Table of Contents
16
+
17
+ 1. [Architecture Overview](#architecture-overview)
18
+ 2. [Components](#components)
19
+ 3. [Advanced Math Engine](#advanced-math-engine)
20
+ 4. [Reasoning Engine](#reasoning-engine)
21
+ 5. [Getting Started](#getting-started)
22
+ 6. [Usage Examples](#usage-examples)
23
+ 7. [Commands](#commands)
24
+ 8. [Design Principles](#design-principles)
25
+ 9. [File Structure](#file-structure)
26
+ 10. [Technical Details](#technical-details)
27
+ 11. [Links](#link)
28
+
29
+ ---
30
+
31
+ ## Architecture Overview
32
+
33
+ AnveshAI Edge uses a **hierarchical fallback pipeline** with reasoning at every non-trivial stage:
34
+
35
+ ```
36
+ User Input
37
+ β”‚
38
+ β”œβ”€β”€ [/command] ──► System Handler (instant)
39
+ β”‚
40
+ β”œβ”€β”€ [arithmetic] ──► Math Engine (AST safe-eval, instant)
41
+ β”‚
42
+ β”œβ”€β”€ [advanced math] ──► Reasoning Engine: analyze()
43
+ β”‚ β”‚ (problem decomposition, strategy selection)
44
+ β”‚ β–Ό
45
+ β”‚ Advanced Math Engine (SymPy)
46
+ β”‚ β”‚ EXACT symbolic answer computed
47
+ β”‚ β–Ό
48
+ β”‚ Reasoning Engine: build_math_prompt()
49
+ β”‚ β”‚ (CoT plan embedded in LLM prompt)
50
+ β”‚ β–Ό
51
+ β”‚ LLM (Qwen2.5-0.5B)
52
+ β”‚ β””β–Ί Step-by-step explanation β†’ User
53
+ β”‚
54
+ β”œβ”€β”€ [knowledge] ──► Knowledge Engine (local KB)
55
+ β”‚ β”œβ”€β”€ match found β†’ User
56
+ β”‚ └── no match:
57
+ β”‚ Reasoning Engine: analyze() + build_general_prompt()
58
+ β”‚ β””β–Ί LLM with CoT context β†’ User
59
+ β”‚
60
+ └── [conversation] ──► Conversation Engine (pattern rules)
61
+ β”œβ”€β”€ matched β†’ User
62
+ └── no match:
63
+ Reasoning Engine: analyze() + build_general_prompt()
64
+ β””β–Ί LLM with CoT context β†’ User
65
+ ```
66
+
67
+ ### Key Design Principle
68
+
69
+ > **Correctness-first:** For mathematics, the symbolic engine computes the exact answer *before* the LLM is called. The LLM's only task is to explain the working β€” it cannot invent a wrong answer.
70
+
71
+ ---
72
+
73
+ ## Components
74
+
75
+ | Module | File | Role |
76
+ |--------|------|------|
77
+ | **Intent Router** | `router.py` | Keyword + regex classifier. Outputs: `system`, `advanced_math`, `math`, `knowledge`, `conversation`. Checked in priority order β€” advanced math is always detected before simple arithmetic. |
78
+ | **Math Engine** | `math_engine.py` | Safe AST-based evaluator for plain arithmetic (`2 + 3 * (4^2)`). No `eval()` β€” uses a whitelist of allowed AST node types. |
79
+ | **Advanced Math Engine** | `advanced_math_engine.py` | SymPy symbolic computation engine. 31+ operation types. Returns `(success, result_str, latex_str)`. |
80
+ | **Reasoning Engine** | `reasoning_engine.py` | Chain-of-thought decomposer. Identifies problem type, selects strategy, generates ordered sub-steps, assigns confidence, flags warnings. Builds structured LLM prompts. |
81
+ | **Knowledge Engine** | `knowledge_engine.py` | Local knowledge-base lookup from `knowledge.txt`. Returns `(response, found: bool)`. |
82
+ | **Conversation Engine** | `conversation_engine.py` | Pattern-matching response rules from `conversation.txt`. Returns `(response, matched: bool)`. |
83
+ | **LLM Engine** | `llm_engine.py` | Lazy-loading `Qwen2.5-0.5B-Instruct` (GGUF, Q4_K_M, ~350 MB) via `llama-cpp-python`. CPU-only, no GPU required. |
84
+ | **Memory** | `memory.py` | SQLite-backed conversation history. Powers the `/history` command. |
85
+ | **Main** | `main.py` | Terminal REPL loop. Orchestrates all engines. Displays colour-coded output. |
86
+
87
+ ---
88
+
89
+ ## Advanced Math Engine
90
+
91
+ The engine supports **31+ symbolic mathematics operations** across 11 categories:
92
+
93
+ ### Calculus
94
+ | Operation | Example Input |
95
+ |-----------|--------------|
96
+ | Indefinite integration | `integrate x^2 sin(x)` |
97
+ | Definite integration | `definite integral of x^2 from 0 to 3` |
98
+ | Differentiation (any order) | `second derivative of sin(x) * e^x` |
99
+ | Limits (including ±∞) | `limit of sin(x)/x as x approaches 0` |
100
+
101
+ ### Algebra & Equations
102
+ | Operation | Example Input |
103
+ |-----------|--------------|
104
+ | Equation solving | `solve x^2 - 5x + 6 = 0` |
105
+ | Factorisation | `factor x^3 - 8` |
106
+ | Expansion | `expand (x + y)^4` |
107
+ | Simplification | `simplify (x^2 - 1)/(x - 1)` |
108
+ | Partial fractions | `partial fraction 1/(x^2 - 1)` |
109
+ | Trig simplification | `simplify trig sin^2(x) + cos^2(x)` |
110
+
111
+ ### Differential Equations
112
+ | Operation | Example Input |
113
+ |-----------|--------------|
114
+ | ODE solving (dsolve) | `solve differential equation y'' + y = 0` |
115
+ | First-order ODEs | `solve ode dy/dx = y` |
116
+
117
+ ### Series & Transforms
118
+ | Operation | Example Input |
119
+ |-----------|--------------|
120
+ | Taylor / Maclaurin series | `taylor series of e^x around 0 order 6` |
121
+ | Laplace transform | `laplace transform of sin(t)` |
122
+ | Inverse Laplace transform | `inverse laplace of 1/(s^2 + 1)` |
123
+ | Fourier transform | `fourier transform of exp(-x^2)` |
124
+
125
+ ### Linear Algebra
126
+ | Operation | Example Input |
127
+ |-----------|--------------|
128
+ | Determinant | `determinant of [[1,2],[3,4]]` |
129
+ | Matrix inverse | `inverse matrix [[2,1],[5,3]]` |
130
+ | Eigenvalues & eigenvectors | `eigenvalue [[4,1],[2,3]]` |
131
+ | Matrix rank | `rank of matrix [[1,2,3],[4,5,6]]` |
132
+ | Matrix trace | `trace of matrix [[1,2],[3,4]]` |
133
+
134
+ ### Number Theory
135
+ | Operation | Example Input |
136
+ |-----------|--------------|
137
+ | GCD | `gcd of 48 and 18` |
138
+ | LCM | `lcm of 12 and 15` |
139
+ | Prime factorisation | `prime factorization of 360` |
140
+ | Modular arithmetic | `17 mod 5` |
141
+ | Modular inverse | `modular inverse of 3 mod 7` |
142
+
143
+ ### Statistics
144
+ | Operation | Example Input |
145
+ |-----------|--------------|
146
+ | Descriptive stats | `mean of 2, 4, 6, 8, 10` |
147
+ | Standard deviation | `standard deviation of 1, 2, 3, 4, 5` |
148
+
149
+ ### Combinatorics
150
+ | Operation | Example Input |
151
+ |-----------|--------------|
152
+ | Factorial | `factorial of 10` |
153
+ | Binomial coefficient | `binomial coefficient 10 choose 3` |
154
+ | Permutations | `permutation 6 P 2` |
155
+
156
+ ### Summations & Products
157
+ | Operation | Example Input |
158
+ |-----------|--------------|
159
+ | Finite sum | `sum of k^2 for k from 1 to 10` |
160
+ | Infinite series | `summation of 1/n^2 for n from 1 to infinity` |
161
+
162
+ ### Complex Numbers
163
+ | Operation | Example Input |
164
+ |-----------|--------------|
165
+ | All properties | `modulus of 3 + 4*I` |
166
+
167
+ ---
168
+
169
+ ## Reasoning Engine
170
+
171
+ The **ReasoningEngine** adds structured chain-of-thought reasoning at every stage.
172
+
173
+ ### Reasoning Pipeline (4 Stages)
174
+
175
+ **Stage 1 β€” Problem Analysis**
176
+ - Detects domain (calculus, linear algebra, statistics, physics, …)
177
+ - Classifies problem type (integration, ODE, comparative analysis, …)
178
+ - Identifies sub-questions implicit in the problem
179
+
180
+ **Stage 2 β€” Strategy Selection**
181
+ - Chooses the optimal solution method (u-substitution, L'HΓ΄pital, characteristic equation, …)
182
+ - Decomposes the problem into an ordered list of numbered reasoning steps
183
+
184
+ **Stage 3 β€” Verification & Confidence**
185
+ - Assigns confidence: `HIGH` (symbolic answer available), `MEDIUM`, or `LOW`
186
+ - Detects warnings: missing bounds, undetected variables, potential singularities
187
+
188
+ **Stage 4 β€” Prompt Engineering**
189
+ - Builds a structured LLM prompt that embeds the full reasoning plan
190
+ - For math: forces the LLM to follow the exact numbered steps toward the verified answer
191
+ - For knowledge: guides the LLM through the identified sub-questions in order
192
+
193
+ ### ReasoningPlan Data Structure
194
+
195
+ ```python
196
+ @dataclass
197
+ class ReasoningPlan:
198
+ problem_type: str # e.g. "integration", "equation_solving"
199
+ domain: str # e.g. "calculus", "linear_algebra"
200
+ sub_problems: list[str] # ordered reasoning steps
201
+ strategy: str # solution method description
202
+ expected_form: str # what the answer should look like
203
+ assumptions: list[str] # stated assumptions
204
+ confidence: str # HIGH / MEDIUM / LOW
205
+ warnings: list[str] # potential issues flagged
206
+ ```
207
+
208
+ ### Why Chain-of-Thought Matters for a Small LLM
209
+
210
+ The Qwen2.5-0.5B model has only 0.5 billion parameters. Without guidance it frequently:
211
+ - Skips algebraic steps
212
+ - Produces a plausible-looking but incorrect final answer
213
+ - Confuses method with result
214
+
215
+ By embedding a detailed reasoning plan in every prompt β€” and for math problems, by **providing the correct answer upfront** β€” the model's role becomes that of a *step-by-step explainer* rather than an *unsupervised solver*. This dramatically improves output quality without requiring a larger model.
216
+
217
+ ---
218
+
219
+ ## Getting Started
220
+
221
+ ### Requirements
222
+
223
+ ```bash
224
+ pip install sympy llama-cpp-python colorama
225
+ ```
226
+
227
+ The LLM model (`Qwen2.5-0.5B-Instruct-GGUF`, Q4_K_M, ~350 MB) is downloaded automatically from HuggingFace on first use and cached locally.
228
+
229
+ ### Running
230
+
231
+ ```bash
232
+ cd anveshai
233
+ python main.py
234
+ ```
235
+
236
+ Or use the **AnveshAI Edge** workflow in the Replit environment.
237
+
238
+ ---
239
+
240
+ ## Usage Examples
241
+
242
+ ```
243
+ You β€Ί integrate x^2 sin(x)
244
+
245
+ SymPy β†’ ∫ (x**2*sin(x)) dx = -x**2*cos(x) + 2*x*sin(x) + 2*cos(x) + C
246
+ Reasoning engine: decomposing problem…
247
+ [Reasoning] domain=calculus | strategy=Apply integration rules (IBP) | confidence=HIGH
248
+ Building chain-of-thought prompt β†’ LLM…
249
+
250
+ AnveshAI [AdvMath+CoT+LLM] β€Ί -x**2*cos(x) + 2*x*sin(x) + 2*cos(x) + C
251
+ [Reasoning: integration | Apply integration rules (IBP)]
252
+ Step 1: We use Integration by Parts twice…
253
+ …
254
+ ```
255
+
256
+ ```
257
+ You β€Ί solve differential equation y'' + y = 0
258
+
259
+ AnveshAI [AdvMath+CoT+LLM] β€Ί Eq(y(x), C1*sin(x) + C2*cos(x))
260
+ [Reasoning: ode_solving | Classify ODE and apply characteristic equation]
261
+ Step 1: Classify the ODE β€” this is a 2nd-order linear homogeneous ODE…
262
+ …
263
+ ```
264
+
265
+ ```
266
+ You β€Ί summation of 1/n^2 for n from 1 to infinity
267
+
268
+ AnveshAI [AdvMath+CoT+LLM] β€Ί Ξ£(n**(-2), n=1..oo) = pi**2/6
269
+ [Reasoning: summation]
270
+ Step 1: This is the Basel Problem…
271
+ ```
272
+
273
+ ---
274
+
275
+ ## Commands
276
+
277
+ | Command | Action |
278
+ |---------|--------|
279
+ | `/help` | Show all commands and usage examples |
280
+ | `/history` | Display the last 10 conversation turns |
281
+ | `/exit` | Quit the assistant |
282
+
283
+ ---
284
+
285
+ ## Design Principles
286
+
287
+ 1. **Correctness-first for mathematics** β€” The symbolic engine always runs before the LLM for mathematical queries. The LLM explains, it does not compute.
288
+
289
+ 2. **Offline-first** β€” All computation runs locally. No API keys, no internet connection required after the one-time model download.
290
+
291
+ 3. **Transparency** β€” The system prints its internal reasoning trace to the console (engine used, reasoning plan summary, confidence, warnings).
292
+
293
+ 4. **Graceful degradation** β€” Every engine has a fallback: SymPy failures fall back to CoT-guided LLM, KB misses fall back to reasoning-guided LLM, and so on.
294
+
295
+ 5. **Safety** β€” Arithmetic uses AST-based safe-eval (no `eval()`). Matrix parsing uses a validated bracket pattern before `eval()`. The LLM prompt explicitly forbids inventing a different answer.
296
+
297
+ 6. **Modularity** β€” Every engine is independent and communicates through simple return types. Adding a new math operation requires only a new handler function and a keyword entry.
298
+
299
+ ---
300
+
301
+ ## File Structure
302
+
303
+ ```
304
+ anveshai/
305
+ β”œβ”€β”€ main.py # REPL loop, orchestration, response composer
306
+ β”œβ”€β”€ router.py # Intent classification (regex + keyword)
307
+ β”œβ”€β”€ math_engine.py # Safe AST arithmetic evaluator
308
+ β”œβ”€β”€ advanced_math_engine.py # SymPy symbolic engine (31+ operations)
309
+ β”œβ”€β”€ reasoning_engine.py # Chain-of-thought reasoning (CoT)
310
+ β”œβ”€β”€ llm_engine.py # Qwen2.5-0.5B-Instruct GGUF loader
311
+ β”œβ”€β”€ knowledge_engine.py # Local KB lookup (knowledge.txt)
312
+ β”œβ”€β”€ conversation_engine.py # Pattern-response engine (conversation.txt)
313
+ β”œβ”€β”€ memory.py # SQLite conversation history
314
+ β”œβ”€β”€ knowledge.txt # Local knowledge base paragraphs
315
+ β”œβ”€β”€ conversation.txt # PATTERN|||RESPONSE rule pairs
316
+ └── anveshai_memory.db # Auto-created SQLite DB (gitignored)
317
+ ```
318
+
319
+ ---
320
+
321
+ ## Technical Details
322
+
323
+ ### Model
324
+ - **Name:** Qwen2.5-0.5B-Instruct
325
+ - **Format:** GGUF (Q4_K_M quantisation, ~350 MB)
326
+ - **Runtime:** llama-cpp-python (CPU-only via llama.cpp)
327
+ - **Context window:** 16,384 tokens
328
+ - **Parameters:** 0.5B
329
+ - **Threads:** 4 CPU threads
330
+
331
+ ### Symbolic Engine
332
+ - **Library:** SymPy 1.x
333
+ - **Parsing:** `sympy.parsing.sympy_parser` with implicit multiplication and XOR-to-power transforms
334
+ - **Supported variables:** x, y, z, t, n, k, a, b, c, m, n, p, q, r, s
335
+ - **Special constants:** Ο€, e, i (imaginary), ∞
336
+
337
+ ### Chain-of-Thought Reasoning
338
+ - **Domain detection:** 12 domain categories with keyword matching
339
+ - **Problem type classification:** 18 problem types via regex
340
+ - **Strategy library:** Pre-defined strategies for 18 problem types
341
+ - **Decomposition:** Problem-specific step generators for 15 operation types
342
+ - **Confidence levels:** HIGH (symbolic result available) / MEDIUM / LOW
343
+
344
+ ### Response Labels
345
+
346
+ | Label | Meaning |
347
+ |-------|---------|
348
+ | `Math` | Instant arithmetic result |
349
+ | `AdvMath+CoT+LLM` | SymPy exact answer + CoT plan + LLM explanation |
350
+ | `AdvMath+CoT` | CoT-guided LLM fallback (SymPy failed) |
351
+ | `Knowledge` | Local KB answer |
352
+ | `LLM+CoT-KB` | KB miss β†’ reasoning-guided LLM |
353
+ | `Chat` | Conversation pattern match |
354
+ | `LLM+CoT` | Reasoning-guided LLM for open conversation |
355
+
356
+
357
+ ## Link
358
+
359
+ - **GitHub:-** [Click Here!](https://github.com/AnveshAI/AnveshAI-Edge)
360
+ - **Zenodo:-** [Click Here!](https://zenodo.org/records/19045466)
361
+
362
+
363
+ ---
364
+ license: mit
365
+ ---