---
title: Calculus Agent
emoji: 🌌
colorFrom: gray
colorTo: gray
sdk: docker
pinned: false
license: mit
short_description: Multi-Agent Calculus Orchestration System
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Pochi 4.o: Multi-Agent Calculus Orchestration System
Pochi is a high-performance, asynchronous AI platform specialized in solving complex calculus problems. It utilizes a stateful multi-agent system built on LangGraph, coordinating multiple specialized LLMs and symbolic computation engines to achieve pedagogical excellence and mathematical precision.
## Live Demo
| Platform | URL |
| :--- | :--- |
| **Hugging Face** | [Visit Pochi on Hugging Face](https://huggingface.co/spaces/baeGil/calculus-agent) |
## Project Achievements & Performance
Pochi's performance and reliability are continuously monitored via LangSmith. The following data highlights the system's operational excellence and high-speed reasoning capabilities.

### System Health & Usage
| Metric | Value | Description |
| :--- | :--- | :--- |
| **Total Runs** | 476 | Cumulative successful execution cycles. |
| **Total Tokens** | 1.86M | Aggregate token throughput across all agents. |
| **Median Tokens** | 2,846 | Median context size per solver request. |
| **Success Rate** | 99% | System resilience against API and execution errors. |
| **Streaming Adoption** | 99% | Percentage of responses delivered via SSE for real-time feedback. |
### Latency Performance
> Latency varies significantly based on task complexity (e.g., Simple symbolic math vs. Multi-image OCR + Recursive code fixing).
| Stage | P50 (Median) | P99 (Tail) |
| :--- | :---: | :---: |
| **Time to First Token (TTFT)** | 0.53s | 5.30s |
| **End-to-End Latency** | 1.51s | 36.95s |
**Analysis**:
- **Responsiveness**: A P50 TTFT of **0.53s** ensures that users perceive an "instant" start to the response, crucial for engagement.
- **Efficiency**: The P50 latency of **1.51s** for full calculus resolution demonstrates the high-performance nature of the asynchronous multi-agent orchestration.
- **Complexity Buffer**: The P99 latency (**~37s**) accounts for the most intensive "Self-Healing" loops, where the system may perform multiple recursive code fixes or deep vision analysis.
## Highlight Features
- **Multi-Agent Orchestration**: Stateful DAG-based workflow using LangGraph for complex, multi-stage reasoning.
- **Parallel Sub-problem Processing**: Intelligent decomposition of complex queries into independent atomic tasks executed in parallel.
- **Multimodal OCR Intelligence**: High-fidelity vision extraction from up to 5 concurrent images with specialized math support.
- **Hybrid Solving Engine**: Seamlessly combines symbolic precision (Wolfram Alpha) with algorithmic logic (Python Executor).
- **Intelligent Long-Term Memory**: Massive 256K token context window with proactive memory management and token tracking.
- **Premium UI/UX**: Modern glassmorphism design with reactive animations, interactive tours, and native LaTeX rendering.
## System Architecture and Pipeline
The system is engineered as a directed acyclic graph (DAG) of specialized nodes, managed by a central orchestrator that maintains a consistent state throughout the conversation turn.
### The Execution Pipeline
1. **Vision Ingestion (OCR Agent)**: Processes up to 5 concurrent image inputs. Utilizing Llama-4 Maverick, it extracts raw text and LaTeX-formatted mathematical expressions.
2. **Strategic Decomposition (Planner)**: Analyzes user intent and OCR data to generate a structured execution plan, decomposing composite problems into independent atomic tasks defined as JSON.
3. **Parallel Orchestration (Executor)**: The core processing engine that spawns asynchronous execution threads for each atomic task:
- **Symbolic Branch**: Direct interface with Wolfram Alpha API for verified algebraic and calculus manipulation.
- **Algorithmic Branch**: Python Code Engine (Qwen3-32B) for numerical methods or complex multi-step logic.
- **Heuristic Branch**: Direct LLM solving for theoretical or conceptual queries.
4. **Self-Correction Loop (Code Engine)**: If the Algorithmic Branch encounters execution errors, a specialized CodeFixer (GPT-OSS-120B) performs recursive debugging and code modification.
5. **Contextual Synthesis (Synthetic Agent)**: Aggregates atomic results, resolves inter-task dependencies, and consults conversation history to produce a structured, pedagogical response.
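The parallel orchestration in step 3 can be sketched with `asyncio`. This is an illustrative model only: the branch handlers, `BRANCHES` table, and task schema below are hypothetical stand-ins, not Pochi's actual API.

```python
import asyncio

# Hypothetical stand-ins for the three solver branches; a real system would
# call Wolfram Alpha, a code engine, or an LLM here.
async def solve_symbolic(task: dict) -> str:
    return f"wolfram:{task['expr']}"

async def solve_algorithmic(task: dict) -> str:
    return f"python:{task['expr']}"

async def solve_heuristic(task: dict) -> str:
    return f"llm:{task['expr']}"

BRANCHES = {
    "symbolic": solve_symbolic,
    "algorithmic": solve_algorithmic,
    "heuristic": solve_heuristic,
}

async def run_plan(tasks: list[dict]) -> list[str]:
    # Spawn one coroutine per atomic task and await them all in parallel.
    return await asyncio.gather(*(BRANCHES[t["branch"]](t) for t in tasks))

plan = [
    {"branch": "symbolic", "expr": "integrate x^2"},
    {"branch": "heuristic", "expr": "explain continuity"},
]
results = asyncio.run(run_plan(plan))
print(results)  # ['wolfram:integrate x^2', 'llm:explain continuity']
```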
### Technical Workflow Diagram
```mermaid
graph TD
User([User Request]) --> API[FastAPI Entry]
API --> State[Agent State Initialization]
State --> OCR{OCR Node}
OCR -- Multi-Image --> Vision[Llama-4 Maverick]
Vision --> Planner[Planner Node: Kimi K2]
OCR -- Text Only --> Planner
Planner --> Plan{Execution Plan}
Plan -- All Direct --> Synthetic[Synthetic Agent]
Plan -- Tool Required --> Executor[Parallel Executor Node]
subgraph ParallelTasks["Async Task Orchestration"]
Executor --> Wolfram[Wolfram Alpha API]
Executor --> Code[Qwen3 Code Gen]
Code --> Exec[Python Executor]
Exec -- Error --> Fixer[GPT-OSS-120B Fixer]
Fixer --> Exec
end
ParallelTasks --> Synthetic
Synthetic --> Render[LaTeX Formatter]
Render --> SSE[SSE Stream]
SSE --> User
subgraph Observability["System Monitoring"]
Tracing[LangSmith Trace]
Memory[Session Memory Tracker]
RateLimit[Token/Request Limiter]
end
API -.-> Observability
Executor -.-> Observability
```
## Fault Tolerance and Error Handling
Pochi is built with a "Resilience-First" mindset, ensuring that the system remains operational and provides accurate results even when facing API failures or ambiguous inputs.
### 1. Model Redundancy and Failover
- **OCR Failover**: If the primary vision model (Maverick) encounters rate limits or internal errors, the system automatically redirects requests to a high-speed fallback model (Scout).
- **Model Switching**: The `ModelManager` dynamically monitors model health and rate limits (RPM/TPM), performing seamless transitions between tiers without task interruption.
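The tiered failover described above can be sketched as an ordered walk down a list of model tiers, advancing to the next tier on failure. The class and function names here are illustrative; the real `ModelManager` also tracks RPM/TPM health, which this sketch omits.

```python
class ModelManager:
    """Minimal tiered-failover sketch: try each model in order."""

    def __init__(self, tiers):
        self.tiers = tiers  # ordered list of (model_name, call_fn)

    def invoke(self, prompt):
        last_error = None
        for name, call in self.tiers:
            try:
                return name, call(prompt)
            except RuntimeError as err:  # e.g. rate limit or model error
                last_error = err         # fall through to the next tier
        raise RuntimeError("all tiers exhausted") from last_error

def maverick(prompt):
    raise RuntimeError("429: rate limited")  # primary tier fails

def scout(prompt):
    return f"ocr({prompt})"                  # fallback tier succeeds

mgr = ModelManager([("maverick", maverick), ("scout", scout)])
model, result = mgr.invoke("page.png")
print(model, result)  # scout ocr(page.png)
```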
### 2. "Self-Healing" Algorithmic Solving
- **Recursive Debugging**: The Python Code Engine is not a simple "one-shot" executor. If generated code fails (e.g., `SyntaxError`, `ZeroDivisionError`), the system sends the error log back to the `CodeFixer` agent.
- **Fix Loop**: The system allows for multiple recursive fix attempts, where the agent analyzes the stack trace and re-writes the logic until a successful execution is achieved.
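The fix loop amounts to a bounded retry around `exec`, feeding each failure back into the fixer. The sketch below uses a toy fixer that patches one known bug; the real fixer (GPT-OSS-120B) rewrites logic from the traceback, and `MAX_FIXES` is an assumed setting, not Pochi's actual budget.

```python
MAX_FIXES = 3  # assumed fix budget for illustration

def run_with_self_healing(code: str, fixer) -> float:
    for attempt in range(1 + MAX_FIXES):
        try:
            scope: dict = {}
            exec(code, scope)              # execute the generated code
            return scope["result"]
        except Exception as err:
            code = fixer(code, repr(err))  # feed the error back to the fixer
    raise RuntimeError("fix budget exhausted")

def toy_fixer(code: str, error: str) -> str:
    # A real fixer analyzes the stack trace and rewrites the logic;
    # here we simply patch the known bug for demonstration.
    return code.replace("1 / 0", "1 / 2")

print(run_with_self_healing("result = 1 / 0", toy_fixer))  # 0.5
```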
### 3. Graceful Degradation of Tools
- **Wolfram-to-Code Fallback**: Symbolic math is the gold standard for precision. However, if the Wolfram Alpha API exceeds its 2000-req/month quota or times out, the system automatically shifts the problem to the Algorithmic Branch for a numerical solve.
- **Synthesis Resilience**: If the Synthetic Agent fails to format the final response (e.g., due to context length), the system performs a "raw-safe" synthesis, delivering the tool results directly to the user to ensure no data is lost.
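The Wolfram-to-Code fallback is a try/except around the symbolic tool with a numerical solve as the second branch. In this sketch, `wolfram_query`, `QuotaExceeded`, and the central-difference solver are illustrative stand-ins, not Pochi's actual interfaces.

```python
class QuotaExceeded(Exception):
    """Raised when the (hypothetical) Wolfram wrapper hits its monthly quota."""

def wolfram_query(expr):
    # Stand-in: simulate the 2000 req/month quota being exhausted.
    raise QuotaExceeded("monthly quota reached")

def numerical_derivative(f, x, h=1e-6):
    # Central-difference approximation used by the fallback branch.
    return (f(x + h) - f(x - h)) / (2 * h)

def solve(expr, f, x):
    try:
        return "symbolic", wolfram_query(expr)
    except (QuotaExceeded, TimeoutError):
        # Graceful degradation: shift to the algorithmic branch.
        return "numerical", numerical_derivative(f, x)

branch, value = solve("d/dx x^2 at x=3", lambda x: x * x, 3.0)
print(branch, round(value, 4))  # numerical 6.0
```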
### 4. Robust State and Parsing
- **Durable IO**: The background agent task saves intermediate results to the database immediately upon generation. This ensures that even if a client disconnects during a 20-second calculation, the result is waiting for them upon refresh.
- **JSON Recovery**: LLMs occasionally return malformed JSON. The `Planner` includes a multi-stage recovery logic that uses regex and string normalization to repair broken JSON blocks, preventing system crashes on minor formatting errors.
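Multi-stage JSON recovery of this kind can be sketched as: strict parse first, then extract the first `{...}` block, then normalize common LLM mistakes (trailing commas, single quotes). This is a simplified sketch in the spirit of the `Planner` logic, not the actual Pochi code.

```python
import json
import re

def recover_json(text: str) -> dict:
    # Stage 1: strict parse.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Stage 2: extract the first {...} block (drops surrounding prose/fences).
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        candidate = match.group(0)
        # Stage 3: strip trailing commas and convert single quotes.
        candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
        candidate = candidate.replace("'", '"')
        return json.loads(candidate)
    raise ValueError("no JSON object found")

broken = "Here is the plan:\n```json\n{'tasks': ['t1', 't2',],}\n```"
print(recover_json(broken))  # {'tasks': ['t1', 't2']}
```

Note that the single-quote replacement is deliberately naive; production recovery logic needs to avoid corrupting apostrophes inside string values.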
### 5. Memory and Resource Safety
- **Context Protection**: The `SessionMemoryTracker` proactively blocks requests that would exceed the 256K token limit, preventing "half-baked" or truncated responses from the LLM.
- **Rate Limit Resilience**: Integrated backoff and retry mechanisms for all third-party API calls (Groq, Wolfram, LangSmith).
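The backoff-and-retry policy can be sketched as exponential backoff with jitter. The retry count and base delay below are illustrative defaults, not Pochi's configured values.

```python
import random
import time

def with_backoff(call, retries=3, base=0.5):
    """Retry `call` on ConnectionError with exponential backoff + jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except ConnectionError:
            if attempt == retries:
                raise  # budget exhausted: surface the error
            delay = base * (2 ** attempt) * (1 + random.random())
            time.sleep(delay)

attempts = []

def flaky():
    # Fails twice (simulated rate limit), then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("rate limited")
    return "ok"

print(with_backoff(flaky, base=0.01))  # ok
```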
## Model Distribution and Specialization
| Component | Model Identifier | Specialization |
| :--- | :--- | :--- |
| **OCR (Primary)** | Llama-4 Maverick | Multi-modal mathematical extraction. |
| **OCR (Fallback)** | Llama-4 Scout | High-speed redundancy for simple OCR. |
| **Planner & Synthesis** | Kimi K2-Instruct | 256K Context, complex reasoning, and pedagogy. |
| **Code Generation** | Qwen3-32B-Instruct | Optimized for Pythonic mathematical logic. |
| **Code Rectification** | GPT-OSS-120B | Deep-context code debugging and error resolution. |
| **Symbolic Logic** | Wolfram Alpha | Deterministic symbolic computation (2000 req/mo). |
## Project Structure
```text
.
├── backend/ # FastAPI Application & LangGraph Agents
│ ├── agent/ # Multi-agent logic (Nodes, Graph, State)
│ ├── database/ # SQLite models and migrations
│ ├── tools/ # Symbolic & Algorithmic executor tools
│ └── utils/ # Memory tracking, rate limiting, tracing
├── frontend/ # React (Vite) Application
│ ├── src/
│ │ ├── components/ # UI components (Math rendering, Tour)
│ │ └── App.jsx # Main application logic
├── Dockerfile # Containerized deployment
├── pyproject.toml # Python dependencies & metadata
└── README.md # Technical documentation
```
## Mathematics & Computation Stack
Pochi utilizes a heavy-duty scientific stack for high-precision calculations:
- **Symbolic**: SymPy, Wolfram Alpha API
- **Numerical**: NumPy, SciPy, Mpmath
- **Optimization**: CVXPY, PuLP
- **Visuals**: Matplotlib, Seaborn, Plotly
- **Data**: Pandas, Polars, Statsmodels
## Local Deployment
### Environment Configuration
Create a `.env` file in the root directory:
```env
GROQ_API_KEY=your_key_here
WOLFRAM_ALPHA_APP_ID=your_id_here
# Optional: only needed for LangSmith tracing
LANGSMITH_API_KEY=your_key_here
LANGSMITH_PROJECT=calculus-chatbot
LANGSMITH_TRACING=true
```
### Backend Infrastructure
1. Initialize virtual environment: `uv venv && source .venv/bin/activate`
2. Install dependencies: `uv pip install -r requirements.txt`
3. Launch Service: `python main.py`
### Frontend Application
1. Navigate to workspace: `cd frontend`
2. Install packages: `npm install`
3. Development server: `npm run dev`
### Docker Deployment
Build and run the entire stack:
```bash
docker build -t pochi-app .
docker run -p 7860:7860 -v ./data:/data --env-file .env pochi-app
```
## API Documentation
The backend service automatically generates interactive API documentation.
- **Swagger UI**: `http://localhost:7860/docs`
- **ReDoc**: `http://localhost:7860/redoc`
## Advanced Customization
### Prompt Engineering
The system's persona and logic are defined in `backend/agent/prompts.py`:
- **GUARD_PROMPT**: Defines the "Pochi" persona and strict safety guardrails.
- **TOT_PROMPT**: Enforces the Tree-of-Thought reasoning process (Plan -> Solve -> Verify).
- **PLANNER_SYSTEM_PROMPT**: Controls the multi-modal decomposition logic.
Developers can modify these constants to adjust the chatbot's tone or reasoning strictness.
## Security & Privacy Guidelines
- **Session Isolation**: User sessions are logically isolated in the database (`conversations` table) and memory cache.
- **Transient Data**: Uploaded images are processed in-memory (or temp storage) and converted to base64/embeddings; they are not permanently retained on disk for privacy.
## Known Limitations
- **Multimodal Cap**: Supports a maximum of 5 distinct images per query to manage context window limits.
- **Symbolic Rate Limit**: Wolfram Alpha requests are capped at 2000/month. Heavy usage will degrade to the numerical Python solver (Qwen3).
- **Latency**: Complex multi-step reasoning (Plan -> Code -> Fix -> Synthesize) may take 15-30s to fully resolve.
### AI Model Rate Limits
The system enforces strict rate limits to ensure stability and usage fairness:
| Model ID | RPM (Req/Min) | RPD (Req/Day) | TPM (Tokens/Min) | TPD (Tokens/Day) |
| :--- | :---: | :---: | :---: | :---: |
| **Kimi K2 Instruct** | 60 | 1,000 | 10,000 | 300,000 |
| **Llama-4 Maverick** | 30 | 1,000 | 6,000 | 500,000 |
| **Llama-4 Scout** | 30 | 1,000 | 30,000 | 500,000 |
| **Qwen3-32B** | 60 | 1,000 | 6,000 | 500,000 |
| **GPT-OSS-120B** | 30 | 1,000 | 8,000 | 200,000 |
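The RPM column above can be enforced with a per-model sliding-window limiter. The class name and API below are illustrative, not Pochi's internals.

```python
from collections import deque

class RequestLimiter:
    """Sliding-window limiter: at most `rpm` requests per 60-second window."""

    def __init__(self, rpm):
        self.rpm = rpm
        self.calls = deque()  # monotonic timestamps of accepted requests

    def allow(self, now):
        # Evict timestamps older than the 60s window.
        while self.calls and now - self.calls[0] >= 60.0:
            self.calls.popleft()
        if len(self.calls) < self.rpm:
            self.calls.append(now)
            return True
        return False

limiter = RequestLimiter(rpm=2)
print(limiter.allow(now=0.0))   # True
print(limiter.allow(now=1.0))   # True
print(limiter.allow(now=2.0))   # False (2 requests already in the last 60s)
print(limiter.allow(now=61.0))  # True  (first request aged out of the window)
```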
## API Usage Examples
### Natural Language Calculus
> "Compute the derivative of f(x) = x^2 + 3x + 2"
### Multimodal Math Analysis (Image Support)
> [Upload 2 images of a calculus problem] "Solve the problem in the attached images"
### Algorithmic Mathematical Tasks
> "Use Python code to find the first 100 prime numbers and explain the Sieve of Eratosthenes algorithm."
## Troubleshooting
| Issue | Possible Cause | Solution |
| :--- | :--- | :--- |
| **413 Payload Too Large** | Uploading images > 10MB total. | Reduce image size or upload fewer files per turn. |
| **429 Too Many Requests** | Exceeded Wolfram or LLM rate limits. | Wait 60s or switch to a different model tier in `.env`. |
| **LangSmith Error** | Invalid or missing API Key. | Set `LANGSMITH_TRACING=false` in `.env` to disable. |
| **Docker Build Fail** | Network timeout on `uv sync`. | Check internet connection or increase Docker memory limit. |
## Contributing
We welcome contributions! Please follow these steps:
1. Fork the repository.
2. Create a feature branch: `git checkout -b feature/amazing-feature`.
3. Commit your changes: `git commit -m 'Add amazing feature'`.
4. Push to the branch: `git push origin feature/amazing-feature`.
5. Open a Pull Request.
## License
Distributed under the MIT License. See `LICENSE` for more information.
## Acknowledgments
We deeply appreciate the open-source community and the providers of the powerful technologies that make Pochi possible:
- **AI & Logic Providers**:
- **LangChain & LangGraph**: For the robust orchestration framework.
- **Groq**: For ultra-low latency Llama inference.
- **Alibaba**: For the Qwen3 models.
- **OpenAI**: For the GPT-OSS models.
- **Moonshot AI**: For the Kimi reasoning model.
- **Meta AI**: For the Llama vision models.
- **Wolfram Alpha**: For the symbolic computation engine.
- **Frontend Ecosystem**:
- **React & Vite**: For the blazing fast UI.
- **Lucide React**: For the beautiful icon set.