
🚀 PygmyClaw Multi-Agent System: Architecture & Progress


📌 Overview

This project is evolving into a multi-agent AI command center capable of:

  • Code generation & execution
  • Task routing (code / research / image)
  • Persistent sessions
  • Memory + history tracking
  • Hugging Face dataset logging
  • Autonomous workflows (future)

🎯 Vision

A unified AI system that can think, act, remember, and improve over time


🧠 System Architecture

High-Level Flow

User Prompt
    ↓
Agent Router
    ↓
Agent Executor (LLM)
    ↓
Output Parser
    ↓
Execution Layer (Code / Tools)
    ↓
Session Manager (Save State)
    ↓
HF Upload (Persistence)
    ↓
Final Response

🧱 Core Components

1. session_manager.py

Responsibility:

  • Manage session lifecycle

  • Store:

    • workspace.json
    • history.json

Key Methods:

create_session()
load_session(session_id)
append_history(data)
update_workspace(key, value)
get_history()

Structure:

/workspace/hf/sessions/
    └── sess_xxxx/
         ├── workspace.json
         └── history.json
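
The methods and layout above can be sketched as a small class. This is a minimal illustration, not the project's actual implementation: the `sess_` plus random hex ID format, the default root argument, and the private helpers `_path`, `_read`, and `_write` are assumptions.

```python
import json
import uuid
from pathlib import Path


class SessionManager:
    """Minimal sketch: one directory per session, two JSON files inside it."""

    def __init__(self, root="/workspace/hf/sessions"):
        self.root = Path(root)
        self.session_id = None

    def _path(self, name):
        # Guard against use before create_session() / load_session().
        if self.session_id is None:
            raise RuntimeError("no active session")
        return self.root / self.session_id / name

    def _read(self, name, default):
        path = self._path(name)
        return json.loads(path.read_text()) if path.exists() else default

    def _write(self, name, data):
        self._path(name).write_text(json.dumps(data, indent=2))

    def create_session(self):
        # Assumed ID format: "sess_" followed by 8 random hex characters.
        self.session_id = "sess_" + uuid.uuid4().hex[:8]
        (self.root / self.session_id).mkdir(parents=True, exist_ok=True)
        self._write("workspace.json", {})
        self._write("history.json", [])
        return self.session_id

    def load_session(self, session_id):
        if not (self.root / session_id).is_dir():
            raise FileNotFoundError(f"unknown session: {session_id}")
        self.session_id = session_id

    def append_history(self, data):
        history = self._read("history.json", [])
        history.append(data)
        self._write("history.json", history)

    def update_workspace(self, key, value):
        workspace = self._read("workspace.json", {})
        workspace[key] = value
        self._write("workspace.json", workspace)

    def get_history(self):
        return self._read("history.json", [])
```

Rewriting the whole JSON file on every append keeps the sketch simple; a long-running session would probably want an append-only format instead.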

2. agent_router.py

Responsibility:

Route user intent → correct agent type

Logic:

"code"     → coding agent
"research" → research agent
"image"    → image agent
default    → command agent

Example:

Input                  Routed To
"write python code"    code
"who is elon musk"     research
"generate image"       image
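
One way to get the routing in the table is plain keyword matching. The sketch below assumes that approach; the keyword sets are illustrative guesses, and only the four agent type names and the `route_task` name come from this document.

```python
import re

# Hypothetical keyword sets; checked in insertion order, so "code" wins ties.
KEYWORDS = {
    "code": {"code", "python", "script", "function"},
    "image": {"image", "picture", "draw"},
    "research": {"who", "what", "why", "research", "explain"},
}


def route_task(prompt: str) -> str:
    """Return the agent type for a prompt, falling back to "command"."""
    # Match whole words so that "what" does not fire on "whatever".
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    for agent_type, keywords in KEYWORDS.items():
        if words & keywords:
            return agent_type
    return "command"
```

Keyword routing is deterministic and needs no LLM call, which fits the local-first philosophy, but it will misroute prompts that contain none of the listed words.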

3. agent_executor.py

Responsibility:

Wrapper around PygmyClaw LLM

Features:

  • Injects system prompts
  • Controls behavior per agent type

Example:

if agent_type == "code":
    return """
    - Only return Python code
    - No JSON
    - Must be executable
    """
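
A fuller sketch of the wrapper might look like the following. The PygmyClaw API is not shown in this document, so the model is represented by any callable that takes prompt text and returns a string. The `SYSTEM_PROMPTS` table, the `build_prompt` helper, and the "command" prompt text are assumptions; only the "code" rules are taken from the example above.

```python
# Only the "code" rules come from this document; the rest is illustrative.
SYSTEM_PROMPTS = {
    "code": "- Only return Python code\n- No JSON\n- Must be executable",
    "command": "Answer the request directly and concisely.",
}


class AgentExecutor:
    """Prepends a per-agent system prompt before calling the model."""

    def __init__(self, llm):
        # `llm` is a stand-in for the PygmyClaw call: str -> str.
        self.llm = llm

    def build_prompt(self, agent_type, user_prompt):
        # Unknown agent types fall back to the "command" prompt.
        system = SYSTEM_PROMPTS.get(agent_type, SYSTEM_PROMPTS["command"])
        return f"{system}\n\nUser request:\n{user_prompt}"

    def run(self, agent_type, user_prompt):
        return self.llm(self.build_prompt(agent_type, user_prompt))
```

Passing the model in as a callable keeps the wrapper testable with a fake, for example `AgentExecutor(lambda text: text)`.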

4. run_llm() (Core Brain)

Responsibilities:

  • Call executor
  • Log output
  • Save to session
  • Upload to HF

Flow:

Prompt → LLM → Response
           ↓
    Save history
           ↓
    Upload session
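
The flow above could be wired together roughly as follows. This is a sketch under assumptions: the real `run_llm()` signature is not shown here, so the executor, save, and upload steps are passed in as callables, and the logger name is made up. Treating a failed upload as non-fatal is a design choice of this sketch, not something the document states.

```python
import logging

log = logging.getLogger("pyclaw")  # hypothetical logger name


def run_llm(prompt, agent_type, execute, save, upload):
    """Call the executor, log the output, save it, then upload the session.

    execute(agent_type, prompt) -> str   e.g. AgentExecutor.run
    save(entry: dict)                    e.g. SessionManager.append_history
    upload()                             e.g. a bound upload_session call
    """
    response = execute(agent_type, prompt)
    log.info("agent=%s response_chars=%d", agent_type, len(response))
    save({"agent": agent_type, "prompt": prompt, "response": response})
    try:
        upload()
    except Exception:
        # The session is already on disk, so a failed upload loses nothing.
        log.exception("upload failed; session kept locally")
    return response
```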

5. run_agent()

Responsibilities:

  • Route task
  • Call LLM
  • Extract code
  • Execute if needed

Flow:

Prompt
  ↓
Router → agent_type
  ↓
LLM
  ↓
Code Extract
  ↓
Execute (if code)
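
A compact sketch of this flow, including one possible `extract_code()`. The extraction rule (take the first fenced block if there is one, otherwise treat the whole reply as code) is an assumption, as are the parameter names and the shape of the returned dict.

```python
import re

TICKS = "`" * 3  # a Markdown code fence, built this way to keep the example readable
FENCE = re.compile(TICKS + r"(?:python|py)?[ \t]*\n(.*?)" + TICKS, re.DOTALL)


def extract_code(text):
    """Return the first fenced block, or the whole reply if it has no fence."""
    match = FENCE.search(text)
    return (match.group(1) if match else text).strip()


def run_agent(prompt, route, llm, run_code):
    """Route the prompt, call the model, and execute the reply for code tasks.

    route(prompt) -> agent_type, llm(prompt, agent_type) -> str,
    run_code(code) -> any result object.
    """
    agent_type = route(prompt)
    response = llm(prompt, agent_type)
    result = {"agent": agent_type, "response": response, "execution": None}
    if agent_type == "code":
        result["execution"] = run_code(extract_code(response))
    return result
```

The fallback to the whole reply matches a code agent that is told to return only Python code, but it also means a chatty reply without a fence will be executed as is and fail.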

6. Code Execution Layer

Current:

def run_code(code):
    write → temp file
    execute → python3 file

Issues solved:

  • Syntax errors handled
  • Execution isolated
  • Logging added
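
A runnable version of `run_code()` that covers the three points above might look like this. It is a sketch, not the project's code: the `timeout` parameter, the returned dict, and the logger name are assumptions. Note that a child process gives process-level isolation only; it is not a security sandbox.

```python
import logging
import os
import subprocess
import sys
import tempfile

log = logging.getLogger("pyclaw.exec")  # hypothetical logger name


def run_code(code, timeout=10):
    """Write code to a temp file and run it in a separate Python process."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(code)
        path = handle.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        # Syntax and runtime errors both show up as a non-zero exit code
        # with the traceback on stderr.
        log.info("exit=%d stderr_chars=%d", proc.returncode, len(proc.stderr))
        return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
    except subprocess.TimeoutExpired:
        log.warning("code timed out after %ss", timeout)
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout}s"}
    finally:
        os.unlink(path)
```

For example, `run_code("print(1 + 1)")` returns a dict whose `stdout` is `"2\n"` and whose `ok` is `True`.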

7. Hugging Face Upload

Dataset:

rahul7star/pyclaw2

Structure:

sessions/
   └── sess_xxx/
        ├── workspace.json
        └── history.json

Upload Logic:

upload_session(session_id)
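
Assuming the upload goes through the `huggingface_hub` client, `upload_session()` could be a thin wrapper around `HfApi.upload_folder`. The `root`, `repo_id`, and `api` parameters are additions for this sketch; authentication is assumed to come from a cached login or the `HF_TOKEN` environment variable.

```python
from pathlib import Path


def upload_session(session_id, root="/workspace/hf/sessions",
                   repo_id="rahul7star/pyclaw2", api=None):
    """Upload one session folder to sessions/<session_id>/ in the dataset repo."""
    folder = Path(root) / session_id
    if not folder.is_dir():
        raise FileNotFoundError(f"no such session folder: {folder}")
    if api is None:
        # Imported lazily so the function can be tested with a fake client.
        from huggingface_hub import HfApi
        api = HfApi()
    return api.upload_folder(
        folder_path=str(folder),
        path_in_repo=f"sessions/{session_id}",
        repo_id=repo_id,
        repo_type="dataset",
        commit_message=f"Update session {session_id}",
    )
```

Each call creates one commit in the dataset repo, so uploading after every prompt produces a long commit history; batching uploads is one possible refinement.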

🔄 Current System Flow (Detailed)

User Input
   ↓
route_task()
   ↓
AgentExecutor.run()
   ↓
LLM Output
   ↓
extract_code()
   ↓
run_code() (if needed)
   ↓
SessionManager.append_history()
   ↓
upload_session()
   ↓
Return result

✅ What We Have Built So Far

✔ Core System

  • ✅ Multi-agent routing
  • ✅ LLM execution wrapper
  • ✅ Code extraction & execution
  • ✅ Session persistence
  • ✅ HF dataset integration
  • ✅ Command system (REPL)

✔ Stability Fixes

  • Fixed session initialization errors
  • Fixed missing session_id
  • Fixed upload path issues
  • Fixed PygmyClaw API mismatches
  • Fixed shell execution fallback
  • Fixed malformed LLM outputs (partial)

✔ Working Features

  • Generate & run Python code
  • Save execution history
  • Persist sessions across runs
  • Upload sessions to HF
  • Basic autonomous agent loop

⚠️ Known Limitations

  • ❌ LLM sometimes returns invalid code
  • ❌ No retry/fix loop yet
  • ❌ Tools not fully functional
  • ❌ No argument parsing for tools
  • ❌ No model switching (yet)

🧠 Design Philosophy

1. Local-First Execution

  • Prefer running code locally over LLM calls

2. Structured Memory

  • Everything stored in session files

3. Deterministic Flow

  • Avoid unpredictable LLM outputs when possible

4. Modular Agents

  • Each agent has a clear responsibility

🚀 Next Steps (Roadmap)


🔥 Phase 1 (Current: Stabilization)

  • Improve code validation before execution
  • Add retry loop for failed code
  • Improve logging & debugging
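
The first two items could be combined as sketched below: parse the generated code before running it, and feed any error back into the next prompt. Nothing here exists in the project yet; `validate`, `generate_with_retry`, the `max_attempts` default, and the wording of the retry prompt are all hypothetical. `execute` is expected to return a dict with `ok` and `stderr` keys.

```python
import ast


def validate(code):
    """Return None if the code parses, else a one-line error description."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return f"SyntaxError on line {exc.lineno}: {exc.msg}"
    return None


def generate_with_retry(prompt, generate, execute, max_attempts=3):
    """Ask for code, validate and run it, and retry with the error on failure."""
    attempt_prompt = prompt
    error = None
    for attempt in range(1, max_attempts + 1):
        code = generate(attempt_prompt)
        error = validate(code)
        if error is None:
            result = execute(code)
            if result["ok"]:
                return {"code": code, "result": result, "attempts": attempt}
            error = result["stderr"]
        # Feed the failure back so the next attempt can correct it.
        attempt_prompt = (
            f"{prompt}\n\nThe previous attempt failed with:\n{error}\n"
            "Return corrected Python code only."
        )
    return {"code": None, "result": None, "attempts": max_attempts, "error": error}
```

Parsing first means obviously broken output never reaches the execution layer, and the cap on attempts bounds the number of LLM calls per task.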

🔥 Phase 2 (Next)

Multi-Model Routing

Code → Code LLM
Research → Reasoning LLM
Image → Diffusion model
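
In code, this mapping could start as a lookup table keyed by agent type. The model identifiers below are placeholders, not real model names, and `pick_model` is a hypothetical helper.

```python
# Placeholder identifiers; substitute whatever models the project adopts.
MODEL_FOR_AGENT = {
    "code": "code-llm",
    "research": "reasoning-llm",
    "image": "diffusion-model",
}
DEFAULT_MODEL = "general-llm"


def pick_model(agent_type):
    """Return the model identifier for an agent type, with a general fallback."""
    return MODEL_FOR_AGENT.get(agent_type, DEFAULT_MODEL)
```

Keeping the table separate from the router means a model can be swapped without touching the routing logic.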

🔥 Phase 3

Tool System (Deferred)

  • Convert code → reusable tools
  • Tool execution engine
  • Tool selection logic
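
Since the tool system is deferred, the following is only one possible shape for it: a registry decorator, a dispatcher, and a naive selector that scores tools by word overlap with their descriptions. Every name here (`TOOLS`, `tool`, `run_tool`, `select_tool`, `word_count`) is hypothetical.

```python
TOOLS = {}


def tool(name, description):
    """Register a function as a named tool."""
    def register(func):
        TOOLS[name] = {"func": func, "description": description}
        return func
    return register


@tool("word_count", "Count the words in a text")
def word_count(text):
    return len(text.split())


def run_tool(name, **kwargs):
    """Execution engine: look the tool up and call it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name]["func"](**kwargs)


def select_tool(prompt):
    """Selection logic: pick the tool whose description shares the most words."""
    words = set(prompt.lower().split())
    best, best_score = None, 0
    for name, entry in TOOLS.items():
        score = len(words & set(entry["description"].lower().split()))
        if score > best_score:
            best, best_score = name, score
    return best
```

With this registry, `run_tool("word_count", text="a b c")` returns 3. Word overlap is a crude selector; an LLM-based or embedding-based choice would be a natural replacement later.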

🔥 Phase 4

Autonomous Intelligence

  • Self-improving agent
  • Task decomposition
  • Planning + execution loop

🧭 Future Architecture (Target)

                ┌──────────────┐
                │   USER UI    │
                └──────┬───────┘
                       ↓
                ┌──────────────┐
                │   ROUTER     │
                └──────┬───────┘
                       ↓
        ┌──────────────┼──────────────┐
        ↓              ↓              ↓
   Code Agent     Research Agent   Image Agent
        ↓              ↓              ↓
        └──────→ Executor Layer ←────┘
                       ↓
              Execution Engine
                       ↓
              Session Manager
                       ↓
                HF Dataset Store

💡 Key Insight

You are building:

❌ NOT just a chatbot
✅ BUT a persistent AI system with memory, execution, and evolution


🧪 Debugging Tips

  • Check logs: /workspace/api.log
  • Check session: /workspace/hf/sessions/
  • Verify upload: HF dataset repo (rahul7star/pyclaw2)
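
The first two checks can be scripted. The helpers below are a hypothetical convenience, not part of the project: `tail` returns the last lines of the log, and `latest_session` finds the most recently modified session folder.

```python
import json
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file."""
    lines = Path(path).read_text(errors="replace").splitlines()
    return lines[-n:]


def latest_session(root="/workspace/hf/sessions"):
    """Return the most recently modified session folder, or None if empty."""
    sessions = [p for p in Path(root).iterdir() if p.is_dir()]
    return max(sessions, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print("\n".join(tail("/workspace/api.log")))
    session = latest_session()
    if session is not None:
        history = json.loads((session / "history.json").read_text())
        print(f"{session.name}: {len(history)} history entries")
```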

🏁 Summary

Current State:

  • ✔ Functional multi-agent system
  • ✔ Code execution working
  • ✔ Session persistence working
  • ⚠ Needs stabilization


Next Milestone:

👉 Multi-model intelligence layer


🚀 Final Thought

Once stable, this system becomes:

🧠 A self-building AI platform, not just an interface