# 🚀 PygmyClaw Multi-Agent System: Architecture & Progress
---
# 📌 Overview
This project is evolving into a **multi-agent AI command center** capable of:
* Code generation & execution
* Task routing (code / research / image)
* Persistent sessions
* Memory + history tracking
* Hugging Face dataset logging
* Autonomous workflows (future)
---
# 🎯 Vision
> A unified AI system that can **think, act, remember, and improve over time**
---
# 🧠 System Architecture
## High-Level Flow
```
User Prompt
↓
Agent Router
↓
Agent Executor (LLM)
↓
Output Parser
↓
Execution Layer (Code / Tools)
↓
Session Manager (Save State)
↓
HF Upload (Persistence)
↓
Final Response
```
---
# 🧱 Core Components
## 1. `session_manager.py`
### Responsibility:
* Manage session lifecycle
* Store:
  * `workspace.json`
  * `history.json`
### Key Methods:
```python
create_session()
load_session(session_id)
append_history(data)
update_workspace(key, value)
get_history()
```
### Structure:
```
/workspace/hf/sessions/
└── sess_xxxx/
    ├── workspace.json
    └── history.json
```
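The methods and layout above could be implemented roughly as follows. This is a minimal sketch, not the project's actual code: the constructor argument, the private helpers, the timestamp field, and the `sess_` plus 8 hex characters ID format are all assumptions made for illustration.

```python
import json
import time
import uuid
from pathlib import Path


class SessionManager:
    """Sketch: one directory per session holding workspace.json and history.json."""

    def __init__(self, root="/workspace/hf/sessions"):
        self.root = Path(root)
        self.session_id = None

    def _path(self, name):
        return self.root / self.session_id / name

    def _read(self, name, default):
        path = self._path(name)
        return json.loads(path.read_text()) if path.exists() else default

    def _write(self, name, data):
        self._path(name).write_text(json.dumps(data, indent=2))

    def create_session(self):
        # Assumed ID format: "sess_" plus 8 random hex characters.
        self.session_id = f"sess_{uuid.uuid4().hex[:8]}"
        (self.root / self.session_id).mkdir(parents=True, exist_ok=True)
        self._write("workspace.json", {})
        self._write("history.json", [])
        return self.session_id

    def load_session(self, session_id):
        if not (self.root / session_id).is_dir():
            raise FileNotFoundError(f"unknown session: {session_id}")
        self.session_id = session_id
        return self._read("workspace.json", {})

    def append_history(self, data):
        # `data` is expected to be a dict; a timestamp is added (assumption).
        history = self._read("history.json", [])
        history.append({"timestamp": time.time(), **data})
        self._write("history.json", history)

    def update_workspace(self, key, value):
        workspace = self._read("workspace.json", {})
        workspace[key] = value
        self._write("workspace.json", workspace)

    def get_history(self):
        return self._read("history.json", [])
```

Rewriting the whole JSON file on each call keeps the sketch simple; a long-running session might prefer an append-only format for `history.json`.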
---
## 2. `agent_router.py`
### Responsibility:
Route user intent → correct agent type
### Logic:
```
"code"     → coding agent
"research" → research agent
"image"    → image agent
default    → command agent
```
### Example:
| Input | Routed To |
| ------------------- | --------- |
| "write python code" | code |
| "who is elon musk" | research |
| "generate image" | image |
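One way to get the routing shown in the table is simple keyword matching. The sketch below is an assumption about how `route_task()` might work, not the project's implementation; the keyword sets are invented to cover the three example inputs, and anything unmatched falls through to the command agent.

```python
import re

# Assumed keyword lists; the source only gives the three examples in the table.
CODE_WORDS = {"code", "python", "script", "function"}
IMAGE_WORDS = {"image", "picture", "draw"}
RESEARCH_WORDS = {"who", "what", "when", "where", "why", "how", "research"}


def route_task(prompt):
    """Return "code", "image", "research", or "command" for a user prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & CODE_WORDS:
        return "code"
    if words & IMAGE_WORDS:
        return "image"
    if words & RESEARCH_WORDS:
        return "research"
    return "command"  # default agent
```

The checks run in a fixed order, so a prompt such as "what code draws a circle" goes to the code agent. A real router would likely need a more robust intent classifier.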
---
## 3. `agent_executor.py`
### Responsibility:
Wrapper around **PygmyClaw LLM**
### Features:
* Injects **system prompts**
* Controls behavior per agent type
### Example:
```python
def system_prompt(agent_type):  # illustrative name; the source shows only the branch below
    if agent_type == "code":
        return """
- Only return Python code
- No JSON
- Must be executable
"""
```
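A wrapper with per-agent system prompts might look like the sketch below. The PygmyClaw call signature is not documented here, so the model is passed in as a plain callable `llm(system_prompt, user_prompt)`, which is an assumption. Only the `code` prompt comes from the example above; the other prompts are placeholders.

```python
SYSTEM_PROMPTS = {
    # Taken from the example above.
    "code": "- Only return Python code\n- No JSON\n- Must be executable",
    # Placeholder wording: the source does not show these prompts.
    "research": "- Answer factually and concisely",
    "image": "- Return one image-generation prompt",
}
DEFAULT_PROMPT = "- Return one command"  # placeholder for the command agent


class AgentExecutor:
    """Sketch of a thin wrapper that injects a system prompt per agent type."""

    def __init__(self, llm):
        # `llm` is any callable taking (system_prompt, user_prompt) and returning text.
        self.llm = llm

    def system_prompt(self, agent_type):
        return SYSTEM_PROMPTS.get(agent_type, DEFAULT_PROMPT)

    def run(self, prompt, agent_type="command"):
        return self.llm(self.system_prompt(agent_type), prompt)
```

Passing the model in as a callable keeps the wrapper testable without a running LLM, for example `AgentExecutor(lambda system, user: "print('hi')")`.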
---
## 4. `run_llm()` (Core Brain)
### Responsibilities:
* Call executor
* Log output
* Save to session
* Upload to HF
### Flow:
```
Prompt → LLM → Response
↓
Save history
↓
Upload session
```
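The flow above can be sketched as a single function. The parameter list is hypothetical: `executor` is any object with a `run(prompt, agent_type)` method, `session` exposes `append_history()` and `session_id`, and `upload` is a callable that takes a session ID. Treating an upload failure as non-fatal is also an assumption.

```python
import logging

logger = logging.getLogger("pyclaw")  # logger name is illustrative


def run_llm(prompt, agent_type, executor, session, upload):
    """Call the executor, log the response, save it, then upload the session."""
    response = executor.run(prompt, agent_type)
    logger.info("agent=%s response_chars=%d", agent_type, len(response))
    session.append_history(
        {"prompt": prompt, "agent_type": agent_type, "response": response}
    )
    try:
        upload(session.session_id)
    except Exception:
        # Assumption: a failed upload should not lose the local result.
        logger.exception("upload failed for %s", session.session_id)
    return response
```

Saving to the session before uploading means the local history stays complete even when the network call fails.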
---
## 5. `run_agent()`
### Responsibilities:
* Route task
* Call LLM
* Extract code
* Execute if needed
### Flow:
```
Prompt
↓
Router → agent_type
↓
LLM
↓
Code Extract
↓
Execute (if code)
```
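A minimal sketch of the extract-and-execute step follows. The regular expression and the dependency-injected signature of `run_agent` are assumptions: `route_task`, `call_llm`, and `run_code` are passed in as callables so the example stays self-contained.

```python
import re

FENCE = "`" * 3  # built indirectly so this snippet can sit inside a Markdown fence
FENCE_RE = re.compile(
    FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_code(text):
    """Return the first fenced code block, or the whole reply if it has no fence."""
    match = FENCE_RE.search(text)
    return (match.group(1) if match else text).strip()


def run_agent(prompt, route_task, call_llm, run_code):
    """Route the prompt, call the LLM, and execute the reply only for code tasks."""
    agent_type = route_task(prompt)
    reply = call_llm(prompt, agent_type)
    result = {"agent_type": agent_type, "reply": reply, "execution": None}
    if agent_type == "code":
        result["execution"] = run_code(extract_code(reply))
    return result
```

Falling back to the whole reply when no fence is found matches a system prompt that asks for Python only, but it also means stray prose would be executed, so validating the code before running it is worthwhile.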
---
## 6. Code Execution Layer
### Current:
```python
def run_code(code):
    # 1. Write the code to a temp file.
    # 2. Execute that file with python3.
    ...
```
### Issues solved:
* Syntax errors handled
* Execution isolated
* Logging added
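The two steps in `run_code` could be filled in as below. This is a sketch under assumptions: the return format, the 30-second timeout, and the `ast.parse` pre-check are illustrative choices, and `sys.executable` stands in for the `python3` command so the child process uses the same interpreter.

```python
import ast
import os
import subprocess
import sys
import tempfile


def run_code(code, timeout=30):
    """Validate, write to a temp file, and run the code in a child interpreter."""
    try:
        ast.parse(code)  # reject syntax errors before spawning a process
    except SyntaxError as exc:
        return {"ok": False, "stdout": "", "stderr": f"SyntaxError: {exc}"}

    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(code)
        path = handle.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout}s"}
    finally:
        os.unlink(path)  # always remove the temp file
```

A separate process protects the host from crashes and infinite loops, but it is not a security sandbox: generated code still runs with the same user permissions.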
---
## 7. Hugging Face Upload
### Dataset:
```
rahul7star/pyclaw2
```
### Structure:
```
sessions/
└── sess_xxx/
    ├── workspace.json
    └── history.json
```
### Upload Logic:
```python
upload_session(session_id)
```
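A sketch of `upload_session` built on `HfApi.upload_folder` from `huggingface_hub` is shown below. The `api` and `root` parameters and the commit message are assumptions added for testability; the repo ID and folder layout come from this document. It assumes a Hugging Face token with write access is already configured.

```python
from pathlib import Path

REPO_ID = "rahul7star/pyclaw2"
SESSIONS_ROOT = Path("/workspace/hf/sessions")


def upload_session(session_id, api=None, root=SESSIONS_ROOT):
    """Upload one session folder to sessions/<session_id>/ in the dataset repo."""
    folder = Path(root) / session_id
    if not folder.is_dir():
        raise FileNotFoundError(f"no such session folder: {folder}")
    if api is None:
        # Imported lazily so the module loads without huggingface_hub installed.
        from huggingface_hub import HfApi
        api = HfApi()
    return api.upload_folder(
        folder_path=str(folder),
        path_in_repo=f"sessions/{session_id}",
        repo_id=REPO_ID,
        repo_type="dataset",
        commit_message=f"Update session {session_id}",
    )
```

Uploading the whole folder in one call produces a single commit per update, which keeps `workspace.json` and `history.json` consistent with each other in the dataset.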
---
# 🔄 Current System Flow (Detailed)
```
User Input
↓
route_task()
↓
AgentExecutor.run()
↓
LLM Output
↓
extract_code()
↓
run_code() (if needed)
↓
SessionManager.append_history()
↓
upload_session()
↓
Return result
```
---
# ✅ What We Have Built So Far
## ✔ Core System
* ✅ Multi-agent routing
* ✅ LLM execution wrapper
* ✅ Code extraction & execution
* ✅ Session persistence
* ✅ HF dataset integration
* ✅ Command system (REPL)
---
## ✔ Stability Fixes
* Fixed session initialization errors
* Fixed missing `session_id`
* Fixed upload path issues
* Fixed PygmyClaw API mismatches
* Fixed shell execution fallback
* Fixed malformed LLM outputs (partial)
---
## ✔ Working Features
* Generate & run Python code
* Save execution history
* Persist sessions across runs
* Upload sessions to HF
* Basic autonomous agent loop
---
# ⚠️ Known Limitations
* ❌ LLM sometimes returns invalid code
* ❌ No retry/fix loop yet
* ❌ Tools not fully functional
* ❌ No argument parsing for tools
* ❌ No model switching (yet)
---
# 🧠 Design Philosophy
## 1. Local-First Execution
* Prefer running code locally over LLM calls
## 2. Structured Memory
* Everything stored in session files
## 3. Deterministic Flow
* Avoid unpredictable LLM outputs when possible
## 4. Modular Agents
* Each agent has a clear responsibility
---
# 🚀 Next Steps (Roadmap)
---
## 🔥 Phase 1 (Current: Stabilization)
* [ ] Improve code validation before execution
* [ ] Add retry loop for failed code
* [ ] Improve logging & debugging
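The first two items could be combined into one validate-and-retry loop, sketched below. Nothing here exists in the project yet: the function name, the `generate` callable (prompt in, code out), the feedback wording, and the attempt limit are all hypothetical.

```python
import ast


def generate_valid_code(prompt, generate, max_attempts=3):
    """Ask `generate` for code until it parses, feeding the error back each time."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        code = generate(prompt + feedback)
        try:
            ast.parse(code)  # validation step: syntax only, nothing is executed
            return code, attempt
        except SyntaxError as exc:
            feedback = (
                f"\n\nThe previous attempt failed with: {exc}. "
                "Return corrected Python only."
            )
    raise RuntimeError(f"no valid code after {max_attempts} attempts")
```

`ast.parse` only catches syntax errors. Runtime failures would need a second loop that feeds the captured `stderr` back to the model in the same way.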
---
## 🔥 Phase 2 (Next)
### Multi-Model Routing
```
Code     → Code LLM
Research → Reasoning LLM
Image    → Diffusion model
```
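The mapping above could start as a lookup table keyed by agent type. The model identifiers below are placeholders rather than real model names, and falling back to the reasoning model for unknown agent types is an assumption.

```python
# Placeholder identifiers; substitute real model names once they are chosen.
MODEL_FOR_AGENT = {
    "code": "code-llm",
    "research": "reasoning-llm",
    "image": "diffusion-model",
}
DEFAULT_MODEL = "reasoning-llm"  # assumed fallback for the command agent


def select_model(agent_type):
    """Return the model identifier to use for a given agent type."""
    return MODEL_FOR_AGENT.get(agent_type, DEFAULT_MODEL)
```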
---
## 🔥 Phase 3
### Tool System (Deferred)
* Convert code → reusable tools
* Tool execution engine
* Tool selection logic
---
## 🔥 Phase 4
### Autonomous Intelligence
* Self-improving agent
* Task decomposition
* Planning + execution loop
---
# 🧭 Future Architecture (Target)
```
            ┌──────────────┐
            │   USER UI    │
            └──────┬───────┘
                   ↓
            ┌──────────────┐
            │    ROUTER    │
            └──────┬───────┘
                   ↓
    ┌──────────────┼──────────────┐
    ↓              ↓              ↓
Code Agent  Research Agent   Image Agent
    ↓              ↓              ↓
    └──────→ Executor Layer ←─────┘
                   ↓
           Execution Engine
                   ↓
            Session Manager
                   ↓
           HF Dataset Store
```
---
# 💡 Key Insight
You are building:
> ❌ NOT just a chatbot
> ✅ BUT a **persistent AI system with memory, execution, and evolution**
---
# 🧪 Debugging Tips
* Check logs:
```
/workspace/api.log
```
* Check session:
```
/workspace/hf/sessions/
```
* Verify upload: open the `rahul7star/pyclaw2` dataset repo on Hugging Face and check that the session appears under `sessions/`
---
# 🏁 Summary
## Current State:
* ✔ Functional multi-agent system
* ✔ Code execution working
* ✔ Session persistence working
* ⚠ Needs stabilization
---
## Next Milestone:
👉 **Multi-model intelligence layer**
---
# 🚀 Final Thought
Once stable, this system becomes:
> 🧠 A self-building AI platform, not just an interface
---