# 🚀 PygmyClaw Multi-Agent System — Architecture & Progress

---

# 📌 Overview

This project is evolving into a **multi-agent AI command center** capable of:

* Code generation & execution
* Task routing (code / research / image)
* Persistent sessions
* Memory + history tracking
* Hugging Face dataset logging
* Autonomous workflows (future)

---

# 🎯 Vision

> A unified AI system that can **think, act, remember, and improve over time**

---

# 🧠 System Architecture

## High-Level Flow

```
User Prompt
    ↓
Agent Router
    ↓
Agent Executor (LLM)
    ↓
Output Parser
    ↓
Execution Layer (Code / Tools)
    ↓
Session Manager (Save State)
    ↓
HF Upload (Persistence)
    ↓
Final Response
```

---

# 🧱 Core Components

## 1. `session_manager.py`

### Responsibility:

* Manage session lifecycle
* Store:

  * `workspace.json`
  * `history.json`

### Key Methods:

```python
create_session()
load_session(session_id)
append_history(data)
update_workspace(key, value)
get_history()
```

### Structure:

```
/workspace/hf/sessions/
    └── sess_xxxx/
         ├── workspace.json
         └── history.json
```

---

## 2. `agent_router.py`

### Responsibility:

Route user intent → correct agent type

### Logic:

```python
"code"     → coding agent
"research" → research agent
"image"    → image agent
default    → command agent
```

### Example:

| Input               | Routed To |
| ------------------- | --------- |
| "write python code" | code      |
| "who is elon musk"  | research  |
| "generate image"    | image     |

---

## 3. `agent_executor.py`

### Responsibility:

Wrapper around **PygmyClaw LLM**

### Features:

* Injects **system prompts**
* Controls behavior per agent type

### Example:

```python
if agent_type == "code":
    return """
    - Only return Python code
    - No JSON
    - Must be executable
    """
```

---

## 4. `run_llm()` (Core Brain)

### Responsibilities:

* Call executor
* Log output
* Save to session
* Upload to HF

### Flow:

```
Prompt → LLM → Response
           ↓
    Save history
           ↓
    Upload session
```

---

## 5. `run_agent()`

### Responsibilities:

* Route task
* Call LLM
* Extract code
* Execute if needed

### Flow:

```
Prompt
  ↓
Router → agent_type
  ↓
LLM
  ↓
Code Extract
  ↓
Execute (if code)
```

---

## 6. Code Execution Layer

### Current:

```python
def run_code(code):
    write → temp file
    execute → python3 file
```

### Issues solved:

* Syntax errors handled
* Execution isolated
* Logging added

---

## 7. Hugging Face Upload

### Dataset:

```
rahul7star/pyclaw2
```

### Structure:

```
sessions/
   └── sess_xxx/
        ├── workspace.json
        └── history.json
```

### Upload Logic:

```python
upload_session(session_id)
```

---

# 🔄 Current System Flow (Detailed)

```
User Input
   ↓
route_task()
   ↓
AgentExecutor.run()
   ↓
LLM Output
   ↓
extract_code()
   ↓
run_code() (if needed)
   ↓
SessionManager.append_history()
   ↓
upload_session()
   ↓
Return result
```

---

# ✅ What We Have Built So Far

## ✔ Core System

* ✅ Multi-agent routing
* ✅ LLM execution wrapper
* ✅ Code extraction & execution
* ✅ Session persistence
* ✅ HF dataset integration
* ✅ Command system (REPL)

---

## ✔ Stability Fixes

* Fixed session initialization errors
* Fixed missing `session_id`
* Fixed upload path issues
* Fixed PygmyClaw API mismatches
* Fixed shell execution fallback
* Fixed malformed LLM outputs (partial)

---

## ✔ Working Features

* Generate & run Python code
* Save execution history
* Persist sessions across runs
* Upload sessions to HF
* Basic autonomous agent loop

---

# ⚠️ Known Limitations

* ❌ LLM sometimes returns invalid code
* ❌ No retry/fix loop yet
* ❌ Tools not fully functional
* ❌ No argument parsing for tools
* ❌ No model switching (yet)

---

# 🧠 Design Philosophy

## 1. Local-First Execution

* Prefer running code locally over LLM calls

## 2. Structured Memory

* Everything stored in session files

## 3. Deterministic Flow

* Avoid unpredictable LLM outputs when possible

## 4. Modular Agents

* Each agent has a clear responsibility

---

# 🚀 Next Steps (Roadmap)

---

## 🔥 Phase 1 (Current — Stabilization)

* [ ] Improve code validation before execution
* [ ] Add retry loop for failed code
* [ ] Improve logging & debugging

---

## 🔥 Phase 2 (Next)

### Multi-Model Routing

```
Code → Code LLM
Research → Reasoning LLM
Image → Diffusion model
```

---

## 🔥 Phase 3

### Tool System (Deferred)

* Convert code → reusable tools
* Tool execution engine
* Tool selection logic

---

## 🔥 Phase 4

### Autonomous Intelligence

* Self-improving agent
* Task decomposition
* Planning + execution loop

---

# 🧭 Future Architecture (Target)

```
                ┌──────────────┐
                │   USER UI    │
                └──────┬───────┘
                       ↓
                ┌──────────────┐
                │   ROUTER     │
                └──────┬───────┘
                       ↓
        ┌──────────────┼──────────────┐
        ↓              ↓              ↓
   Code Agent     Research Agent   Image Agent
        ↓              ↓              ↓
        └──────→ Executor Layer ←────┘
                       ↓
              Execution Engine
                       ↓
              Session Manager
                       ↓
                HF Dataset Store
```

---

# 💡 Key Insight

You are building:

> ❌ NOT just a chatbot
> ✅ BUT a **persistent AI system with memory, execution, and evolution**

---

# 🧪 Debugging Tips

* Check logs:

```
/workspace/api.log
```

* Check session:

```
/workspace/hf/sessions/
```

* Verify upload:
* HF dataset repo

---

# 🏁 Summary

## Current State:

✔ Functional multi-agent system
✔ Code execution working
✔ Session persistence working
⚠ Needs stabilization

---

## Next Milestone:

👉 **Multi-model intelligence layer**

---

# 🚀 Final Thought

Once stable, this system becomes:

> 🧠 A self-building AI platform — not just an interface

---