pyclaw-v2 / Design.md

rahul7star

Update Design.md

b56e79e verified 20 days ago

	# 🚀 PygmyClaw Multi-Agent System — Architecture & Progress

	---

	# 📌 Overview

	This project is evolving into a multi-agent AI command center capable of:

	* Code generation & execution
	* Task routing (code / research / image)
	* Persistent sessions
	* Memory + history tracking
	* Hugging Face dataset logging
	* Autonomous workflows (future)

	---

	# 🎯 Vision

	> A unified AI system that can think, act, remember, and improve over time

	---

	# 🧠 System Architecture

	## High-Level Flow

	```
	User Prompt
	↓
	Agent Router
	↓
	Agent Executor (LLM)
	↓
	Output Parser
	↓
	Execution Layer (Code / Tools)
	↓
	Session Manager (Save State)
	↓
	HF Upload (Persistence)
	↓
	Final Response
	```

	---

	# 🧱 Core Components

	## 1. `session_manager.py`

	### Responsibility:

	* Manage session lifecycle
	* Store:
	  * `workspace.json`
	  * `history.json`

	### Key Methods:

	```python
	create_session()
	load_session(session_id)
	append_history(data)
	update_workspace(key, value)
	get_history()
	```

	### Structure:

	```
	/workspace/hf/sessions/
	└── sess_xxxx/
	    ├── workspace.json
	    └── history.json
	```
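A minimal sketch of how the five methods above could sit on top of this directory layout. The class body, the random suffix after `sess_`, and the choice of a dict for `workspace.json` and a list for `history.json` are assumptions for illustration, not the project's actual implementation.

```python
import json
import uuid
from pathlib import Path


class SessionManager:
    """Keeps one directory per session holding workspace.json and history.json."""

    def __init__(self, root="/workspace/hf/sessions"):
        self.root = Path(root)
        self.session_id = None

    def _path(self, name):
        return self.root / self.session_id / name

    def _read(self, name):
        return json.loads(self._path(name).read_text())

    def _write(self, name, data):
        self._path(name).write_text(json.dumps(data, indent=2))

    def create_session(self):
        # The "sess_" prefix follows the layout above; the hex suffix is assumed.
        self.session_id = f"sess_{uuid.uuid4().hex[:8]}"
        (self.root / self.session_id).mkdir(parents=True)
        self._write("workspace.json", {})
        self._write("history.json", [])
        return self.session_id

    def load_session(self, session_id):
        if not (self.root / session_id).is_dir():
            raise FileNotFoundError(session_id)
        self.session_id = session_id

    def append_history(self, data):
        history = self._read("history.json")
        history.append(data)
        self._write("history.json", history)

    def update_workspace(self, key, value):
        workspace = self._read("workspace.json")
        workspace[key] = value
        self._write("workspace.json", workspace)

    def get_history(self):
        return self._read("history.json")
```

Rewriting the whole JSON file on every call keeps the sketch simple; a long-running session would probably want an append-only format instead.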

	---

	## 2. `agent_router.py`

	### Responsibility:

	Route user intent → correct agent type

	### Logic:

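	```python
	# Intent label → agent type; unmatched intents go to the command agent.
	AGENT_FOR_INTENT = {
	    "code": "coding agent",
	    "research": "research agent",
	    "image": "image agent",
	}
	DEFAULT_AGENT = "command agent"
	```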

	### Example:

	| Input               | Routed To |
	| ------------------- | --------- |
	| "write python code" | code      |
	| "who is elon musk"  | research  |
	| "generate image"    | image     |
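One way the routing in the table could work is plain keyword matching. The sketch below is an assumption: the keyword lists are hypothetical, and the real `route_task()` may classify intent differently (for example, by asking the LLM).

```python
import re

# Hypothetical keyword hints; the real router's rules are not documented here.
KEYWORDS = {
    "code": ("code", "python", "script", "function"),
    "research": ("who", "what", "why", "research", "explain"),
    "image": ("image", "picture", "draw"),
}


def route_task(prompt):
    """Return the first agent type whose keyword appears in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    for agent_type, hints in KEYWORDS.items():
        if words.intersection(hints):
            return agent_type
    return "command"  # default agent
```

Agent types are checked in dictionary order, so a prompt that mentions both "code" and "what" goes to the code agent in this sketch.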

	---

	## 3. `agent_executor.py`

	### Responsibility:

	Wrapper around PygmyClaw LLM

	### Features:

	* Injects system prompts
	* Controls behavior per agent type

	### Example:

	```python
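	def build_system_prompt(agent_type):  # function name is illustrative
	    # Only the "code" branch is documented; other agent types are omitted here.
	    if agent_type == "code":
	        return """
	- Only return Python code
	- No JSON
	- Must be executable
	"""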
	```
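The wrapper idea can be sketched as a small class that looks up a system prompt per agent type and passes it to the model. The `llm` callable and its `(system_prompt, user_prompt)` signature are assumptions standing in for the PygmyClaw call, and the non-code prompts are placeholders.

```python
SYSTEM_PROMPTS = {
    "code": "- Only return Python code\n- No JSON\n- Must be executable",
    # Placeholder prompts, not taken from the project.
    "research": "Answer factually and concisely.",
    "image": "Describe the image to generate.",
}


class AgentExecutor:
    def __init__(self, llm):
        # `llm` is any callable taking (system_prompt, user_prompt) -> str.
        self.llm = llm

    def run(self, agent_type, prompt):
        # Unknown agent types get an empty system prompt.
        system_prompt = SYSTEM_PROMPTS.get(agent_type, "")
        return self.llm(system_prompt, prompt)
```

Injecting `llm` keeps the executor testable with a fake model.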

	---

	## 4. `run_llm()` (Core Brain)

	### Responsibilities:

	* Call executor
	* Log output
	* Save to session
	* Upload to HF

	### Flow:

	```
	Prompt → LLM → Response
	↓
	Save history
	↓
	Upload session
	```
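The four responsibilities can be sketched as one function with its collaborators passed in. The parameter names, the history entry fields, and the dict-shaped `session` are assumptions made so the sketch stays independent of the real PygmyClaw, session, and upload code.

```python
import logging

log = logging.getLogger("pyclaw")  # logger name is an assumption


def run_llm(prompt, agent_type, executor, session, upload):
    """Call the executor, log the output, save it, then upload the session.

    executor: callable (agent_type, prompt) -> response text
    session:  dict with "id" and a "history" list (stand-in for the session manager)
    upload:   callable (session_id) -> None
    """
    response = executor(agent_type, prompt)
    log.info("agent=%s response=%r", agent_type, response)
    entry = {"agent_type": agent_type, "prompt": prompt, "response": response}
    session["history"].append(entry)  # save history
    upload(session["id"])             # upload session
    return response
```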

	---

	## 5. `run_agent()`

	### Responsibilities:

	* Route task
	* Call LLM
	* Extract code
	* Execute if needed

	### Flow:

	```
	Prompt
	↓
	Router → agent_type
	↓
	LLM
	↓
	Code Extract
	↓
	Execute (if code)
	```
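The extract-then-execute step might look like the sketch below. It assumes the LLM wraps code in a Markdown fence; the regular expression and the injected `route`, `llm`, and `execute` callables are illustrative, not the project's actual code.

```python
import re

FENCE = "`" * 3  # a Markdown code fence marker
CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(llm_output):
    """Return the first fenced code block, or None if there is none."""
    match = CODE_BLOCK.search(llm_output)
    return match.group(1).strip() if match else None


def run_agent(prompt, route, llm, execute):
    """Route the prompt, call the LLM, and execute code only for the code agent."""
    agent_type = route(prompt)
    output = llm(agent_type, prompt)
    code = extract_code(output) if agent_type == "code" else None
    result = execute(code) if code else None
    return {"agent_type": agent_type, "output": output, "result": result}
```

If the model returns bare code without a fence, `extract_code` returns `None` in this sketch, so a fallback that treats the whole reply as code may be needed.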

	---

	## 6. Code Execution Layer

	### Current:

	```python
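	import subprocess, tempfile

	def run_code(code):
	    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
	        f.write(code)  # write → temp file
	    return subprocess.run(["python3", f.name], capture_output=True, text=True)  # execute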
	```

	### Issues solved:

	* Syntax errors handled
	* Execution isolated: generated code runs from a temp file in a separate `python3` process
	* Logging added

	---

	## 7. Hugging Face Upload

	### Dataset:

	```
	rahul7star/pyclaw2
	```

	### Structure:

	```
	sessions/
	└── sess_xxx/
	    ├── workspace.json
	    └── history.json
	```

	### Upload Logic:

	```python
	upload_session(session_id)
	```
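A possible shape for `upload_session`, using `HfApi.upload_folder` from `huggingface_hub` to mirror the local session folder into the dataset repo. The helper `build_upload_args` is hypothetical, and the sketch assumes a write token is already configured (for example through the `HF_TOKEN` environment variable).

```python
from pathlib import Path

DATASET_REPO = "rahul7star/pyclaw2"
SESSIONS_ROOT = Path("/workspace/hf/sessions")


def build_upload_args(session_id, root=SESSIONS_ROOT):
    """Arguments that map the local session folder to sessions/<id> in the repo."""
    return {
        "folder_path": str(Path(root) / session_id),
        "path_in_repo": f"sessions/{session_id}",
        "repo_id": DATASET_REPO,
        "repo_type": "dataset",
    }


def upload_session(session_id):
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import HfApi

    # Uploads workspace.json and history.json in a single commit.
    return HfApi().upload_folder(**build_upload_args(session_id))
```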

	---

	# 🔄 Current System Flow (Detailed)

	```
	User Input
	↓
	route_task()
	↓
	AgentExecutor.run()
	↓
	LLM Output
	↓
	extract_code()
	↓
	run_code() (if needed)
	↓
	SessionManager.append_history()
	↓
	upload_session()
	↓
	Return result
	```

	---

	# ✅ What We Have Built So Far

	## ✔ Core System

	* ✅ Multi-agent routing
	* ✅ LLM execution wrapper
	* ✅ Code extraction & execution
	* ✅ Session persistence
	* ✅ HF dataset integration
	* ✅ Command system (REPL)

	---

	## ✔ Stability Fixes

	* Fixed session initialization errors
	* Fixed missing `session_id`
	* Fixed upload path issues
	* Fixed PygmyClaw API mismatches
	* Fixed shell execution fallback
	* Fixed malformed LLM outputs (partial)

	---

	## ✔ Working Features

	* Generate & run Python code
	* Save execution history
	* Persist sessions across runs
	* Upload sessions to HF
	* Basic autonomous agent loop

	---

	# ⚠️ Known Limitations

	* ❌ LLM sometimes returns invalid code
	* ❌ No retry/fix loop yet
	* ❌ Tools not fully functional
	* ❌ No argument parsing for tools
	* ❌ No model switching (yet)

	---

	# 🧠 Design Philosophy

	## 1. Local-First Execution

	* Prefer running code locally over LLM calls

	## 2. Structured Memory

	* Everything stored in session files

	## 3. Deterministic Flow

	* Avoid unpredictable LLM outputs when possible

	## 4. Modular Agents

	* Each agent has a clear responsibility

	---

	# 🚀 Next Steps (Roadmap)

	---

	## 🔥 Phase 1 (Current — Stabilization)

	* [ ] Improve code validation before execution
	* [ ] Add retry loop for failed code
	* [ ] Improve logging & debugging
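The first two items could start from something like the sketch below: a syntax check with `ast.parse` before execution, and a retry loop that feeds the error back into the next prompt. The `generate` callable is a hypothetical stand-in for the LLM call, and the retry prompt wording is an assumption.

```python
import ast


def validate_code(code):
    """Return None if the code parses, otherwise a short error message."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"
    return None


def generate_with_retry(prompt, generate, max_attempts=3):
    """Ask `generate` (prompt -> code string) until the code parses."""
    attempt_prompt = prompt
    for _ in range(max_attempts):
        code = generate(attempt_prompt)
        error = validate_code(code)
        if error is None:
            return code
        # Feed the error back so the next attempt can correct it.
        attempt_prompt = f"{prompt}\n\nPrevious attempt failed with: {error}"
    raise ValueError(f"no valid code after {max_attempts} attempts")
```

`ast.parse` only catches syntax errors. Runtime failures would still need to be detected from the execution result and retried in a similar way.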

	---

	## 🔥 Phase 2 (Next)

	### Multi-Model Routing

	```
	Code → Code LLM
	Research → Reasoning LLM
	Image → Diffusion model
	```
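In code, this mapping could be a lookup table keyed by agent type. The model identifiers below are placeholders, since no concrete models are named here, and the fallback for the command agent is an assumption.

```python
# Placeholder model identifiers; no concrete models have been chosen.
MODEL_FOR_AGENT = {
    "code": "code-llm",
    "research": "reasoning-llm",
    "image": "diffusion-model",
}
DEFAULT_MODEL = "reasoning-llm"  # assumed fallback for the command agent


def select_model(agent_type):
    """Pick the model for an agent type, falling back to the default."""
    return MODEL_FOR_AGENT.get(agent_type, DEFAULT_MODEL)
```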

	---

	## 🔥 Phase 3

	### Tool System (Deferred)

	* Convert code → reusable tools
	* Tool execution engine
	* Tool selection logic

	---

	## 🔥 Phase 4

	### Autonomous Intelligence

	* Self-improving agent
	* Task decomposition
	* Planning + execution loop

	---

	# 🧭 Future Architecture (Target)

	```
	             ┌──────────────┐
	             │   USER UI    │
	             └──────┬───────┘
	                    ↓
	             ┌──────────────┐
	             │    ROUTER    │
	             └──────┬───────┘
	                    ↓
	     ┌──────────────┼──────────────┐
	     ↓              ↓              ↓
	Code Agent   Research Agent   Image Agent
	     ↓              ↓              ↓
	     └─────→ Executor Layer ←──────┘
	                    ↓
	            Execution Engine
	                    ↓
	             Session Manager
	                    ↓
	            HF Dataset Store
	```

	---

	# 💡 Key Insight

	You are building:

	> ❌ NOT just a chatbot
	> ✅ BUT a persistent AI system with memory, execution, and evolution

	---

	# 🧪 Debugging Tips

	* Check logs:

	```
	/workspace/api.log
	```

	* Check session:

	```
	/workspace/hf/sessions/
	```

	* Verify upload: check that the session folder appears in the `rahul7star/pyclaw2` dataset repo on Hugging Face

	---

	# 🏁 Summary

	## Current State:

	* ✔ Functional multi-agent system
	* ✔ Code execution working
	* ✔ Session persistence working
	* ⚠ Needs stabilization

	---

	## Next Milestone:

	👉 Multi-model intelligence layer

	---

	# 🚀 Final Thought

	Once stable, this system becomes:

	> 🧠 A self-building AI platform — not just an interface

	---