design.md

PygmyClaw Agent System

app.py  → UI
pygmyclaw.py → AI agent + queue + HF storage
pygmyclaw_multitool.py → tool execution engine
tools.json → tool registry stored on Hugging Face
memory.json → task memory stored on Hugging Face



        ┌───────────────┐
        │    Gradio     │
        │    app.py     │
        └──────┬────────┘
               │
               ▼
        ┌───────────────┐
        │   PygmyClaw   │
        │  pygmyclaw.py │
        └──────┬────────┘
               │
      Queue + Agent logic
               │
               ▼
    ┌─────────────────────┐
    │ pygmyclaw_multitool │
    │     run_tool()      │
    └─────────────────────┘
               │
               ▼
    HF Storage (tools.json / memory.json)

1. Overview

PygmyClaw is a local AI agent framework designed to run in a Hugging Face Docker Space.

The system combines:

  • speculative decoding for faster generation
  • tool-calling architecture
  • dynamic tool creation
  • artifact generation (files/code/data)
  • background task execution
  • interactive UI

The goal is to create a self-extensible AI system where the agent can:

  • use tools
  • create new tools
  • generate code
  • install dependencies
  • run programs
  • manage tasks
  • interact through a UI

2. High-Level System Architecture

User
 ↓
HF Space UI (Gradio)
 ↓
Agent Engine (pygmyclaw.py)
 ↓
Speculative Decoding Engine
 ↓
Tool Call Parser
 ↓
Tool Executor (subprocess)
 ↓
Dynamic Tool Registry
 ↓
Workspace / Artifacts / Queue
 ↓
Result returned to LLM

3. System Layers

The system is organized into five layers.

Layer 1 β€” User Interface

Provides an interactive interface to the agent.

Responsibilities:

  • display chat interaction
  • show generated artifacts
  • provide code editor
  • allow running generated code
  • show execution output

UI components:

Chat Panel
Artifact Viewer
Code Editor
Execution Console

Example layout:

----------------------------------
Chat                | Code Editor
                    |
                    | demo_nn.py
                    |
                    | [Run] [Save]
----------------------------------
Console Output
----------------------------------

4. Layer 2 β€” Agent Engine

Implemented in:

pygmyclaw.py

This is the central orchestrator of the system.

Responsibilities:

  • manage LLM interaction
  • run speculative decoding
  • parse tool calls
  • execute tools
  • maintain context
  • handle queues
  • coordinate artifacts

The engine runs an agent execution loop.


5. Agent Execution Loop

The entire system is driven by the following loop.

User Prompt
   ↓
Agent (LLM)
   ↓
LLM outputs JSON tool call
   ↓
Tool executes
   ↓
Result returned to LLM
   ↓
LLM continues reasoning
   ↓
Final response

Expanded loop:

User Prompt
   ↓
LLM reasoning
   ↓
Tool call JSON
   ↓
Tool executor
   ↓
Tool result
   ↓
Context updated
   ↓
LLM reasoning continues

The loop stops when the LLM returns a final answer instead of a tool call.
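
The loop above can be sketched in Python. `call_llm` and `run_tool` are hypothetical stand-ins for the real engine and tool executor, and the real parser is more forgiving about mixed text and JSON; this is a minimal illustration of the control flow only.

```python
import json

def agent_loop(prompt, call_llm, run_tool, max_steps=8):
    """Drive the tool-calling loop until the LLM returns a final answer."""
    context = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = call_llm(context)
        try:
            call = json.loads(reply)          # a tool call is a JSON object
        except json.JSONDecodeError:
            return reply                      # plain text => final answer
        if not isinstance(call, dict) or "tool" not in call:
            return reply
        result = run_tool(call["tool"], call.get("parameters", {}))
        context.append({"role": "tool", "content": json.dumps(result)})
    return "stopped: step limit reached"
```

The step limit guards against an LLM that never stops emitting tool calls.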


6. Speculative Decoding Engine

PygmyClaw speeds up generation using speculative decoding.

Architecture:

Drafter 1
Drafter 2
Drafter 3
   ↓
Verifier

Flow:

User prompt
  ↓
Draft tokens generated
  ↓
Verifier checks tokens
  ↓
Accept or reject

This improves generation speed while maintaining accuracy.

Typical configuration:

3 draft models
1 verifier model

Each runs in a separate Ollama instance.
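
The accept/reject step can be illustrated with a greedy, exact-match simplification. Real speculative decoding accepts or rejects draft tokens probabilistically against the verifier's distribution; this sketch only shows the prefix-acceptance idea.

```python
def verify_draft(draft_tokens, verifier_tokens):
    """Accept the longest prefix of draft tokens the verifier agrees with.

    Greedy exact-match simplification: the first disagreement rejects the
    rest of the draft, and generation resumes from the verifier's token.
    """
    accepted = []
    for draft, verified in zip(draft_tokens, verifier_tokens):
        if draft != verified:
            break
        accepted.append(draft)
    return accepted
```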


7. Tool System

Tools allow the agent to perform actions outside the LLM.

Tools run in an isolated subprocess:

pygmyclaw_multitool.py

Execution flow:

Agent
 ↓
Tool call JSON
 ↓
Subprocess execution
 ↓
Tool result JSON

Example tool call:

{
 "tool": "sys_info",
 "parameters": {}
}

Tool response:

{
 "os": "Linux",
 "python_version": "3.11"
}
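
A minimal sketch of the dispatch path, assuming tools are plain Python callables registered by name. The names here are illustrative, not the actual pygmyclaw_multitool.py API.

```python
import json
import platform
import sys

def sys_info(params):
    # Minimal sys_info tool: report OS and Python version.
    return {"os": platform.system(),
            "python_version": f"{sys.version_info.major}.{sys.version_info.minor}"}

TOOL_HANDLERS = {"sys_info": sys_info}

def run_tool(call_json):
    """Parse a tool-call JSON string, dispatch, and return a result JSON string."""
    call = json.loads(call_json)
    handler = TOOL_HANDLERS.get(call["tool"])
    if handler is None:
        return json.dumps({"error": f"unknown tool: {call['tool']}"})
    return json.dumps(handler(call.get("parameters", {})))
```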

8. Tool Categories

The system supports multiple tool types.

System Tools

sys_info
list_files
read_file
write_file

Environment Tools

install_python_package
check_package_installed
execute_shell

Code Tools

write_python_code
run_python_file
format_code

Artifact Tools

create_artifact
update_artifact
delete_artifact

Agent Tools

create_agent
list_agents
run_agent

9. Dynamic Tool Creation

Agents can create new tools dynamically.

Example prompt:

create a tool to fetch python documentation

Agent calls:

{
 "tool": "create_tool",
 "parameters": {
  "name": "python_docs",
  "description": "search python docs"
 }
}

System creates:

tools/python_docs.py

The tool becomes immediately available.

This enables self-extending agents.
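
A sketch of what `create_tool` might do, assuming each generated tool is a Python file exposing a `run(parameters)` function. The template and file layout are assumptions for illustration.

```python
from pathlib import Path

TOOL_TEMPLATE = '''\
def run(parameters):
    """{description}"""
    # TODO: generated tool body goes here
    return {{"status": "not implemented"}}
'''

def create_tool(name, description, tools_dir="tools"):
    """Write a new tool stub into the tools directory so discovery can pick it up."""
    path = Path(tools_dir) / f"{name}.py"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(TOOL_TEMPLATE.format(description=description))
    return {"created": str(path)}
```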


10. Tool Discovery

At startup the engine scans the tools directory.

tools/
   echo.py
   sys_info.py
   run_python.py

A tool registry is generated.

Example:

TOOLS = {
 "echo": echo,
 "sys_info": sys_info,
 "run_python": run_python
}

The tool list is provided to the LLM.
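
The startup scan can be sketched as follows, again assuming one Python module per tool. The loader shown here uses `importlib`; the real registry may store metadata rather than live modules.

```python
import importlib.util
from pathlib import Path

def discover_tools(tools_dir="tools"):
    """Scan the tools directory and build a name -> module registry."""
    registry = {}
    for path in sorted(Path(tools_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)   # import the tool module from its file
        registry[path.stem] = module
    return registry
```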


11. Artifact System

Artifacts are files generated by the agent.

Examples:

code
datasets
documents
images
logs

Directory structure:

artifacts/
   code/
   data/
   documents/

Example artifact:

artifacts/code/addition.py

Artifacts enable UI interaction.


12. Artifact UI Interaction

When artifacts are created, the UI displays controls.

Example:

addition.py

Edit
Run
Download

Artifact metadata may include:

language
dependencies
runnable
created_by

13. Workspace Environment

All agent work occurs in a dedicated workspace.

workspace/
   venv/
   artifacts/
   tools/
   agents/

The workspace provides:

  • dependency isolation
  • file management
  • persistent agent data

14. Dependency Management

Agents may install packages.

Example prompt:

create a pytorch neural network demo

Agent detects dependency:

torch

Tool call:

install_python_package("torch")

The package is installed inside the workspace environment:

workspace/venv/
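
A possible shape for the install tool, invoking pip in the workspace interpreter via subprocess. The `dry_run` flag and return shape are assumptions; it exists here so the command can be inspected without actually installing anything.

```python
import subprocess
import sys

def install_python_package(name, python=sys.executable, dry_run=False):
    """Install a package into the given interpreter with pip."""
    cmd = [python, "-m", "pip", "install", name]
    if dry_run:                       # let callers inspect the command first
        return {"ok": True, "cmd": cmd}
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return {"ok": proc.returncode == 0, "stderr": proc.stderr[-500:]}
```

Passing the workspace venv's interpreter as `python` keeps installs isolated from the system environment.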

15. Code Generation Workflow

Example request:

create a neural network code demo in python using pytorch

Execution flow:

User prompt
 ↓
Agent reasoning
 ↓
Install dependency
 ↓
Generate code
 ↓
Save artifact
 ↓
UI displays code

Artifact example:

artifacts/code/pytorch_nn_demo.py

UI shows:

Edit
Run
Download

16. Code Execution Workflow

When the user presses Run:

UI
 ↓
run_python_file tool
 ↓
workspace python interpreter
 ↓
program execution
 ↓
stdout returned

Console output appears in the UI.
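
The Run path can be sketched as a subprocess call that captures stdout and stderr for the console panel. The function name matches the tool above; the timeout and return shape are assumptions.

```python
import subprocess
import sys

def run_python_file(path, python=sys.executable, timeout=30):
    """Execute a Python file in a subprocess and capture its output."""
    proc = subprocess.run([python, path], capture_output=True,
                          text=True, timeout=timeout)
    return {"stdout": proc.stdout, "stderr": proc.stderr,
            "returncode": proc.returncode}
```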


17. Task Queue

The system supports background tasks.

Queue storage options:

Redis
JSON file

Example task:

{
 "id": "123",
 "prompt": "generate dataset"
}

Queue worker processes tasks asynchronously.
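
The JSON-file option can be sketched as a tiny FIFO queue. This single-process version has no locking; Redis would replace it when concurrent workers are needed.

```python
import json
from pathlib import Path

class JsonQueue:
    """Minimal JSON-file task queue (single-process sketch)."""

    def __init__(self, path):
        self.path = Path(path)
        if not self.path.exists():
            self.path.write_text("[]")

    def enqueue(self, task):
        tasks = json.loads(self.path.read_text())
        tasks.append(task)
        self.path.write_text(json.dumps(tasks))

    def dequeue(self):
        tasks = json.loads(self.path.read_text())
        if not tasks:
            return None
        task = tasks.pop(0)               # FIFO: oldest task first
        self.path.write_text(json.dumps(tasks))
        return task
```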


18. Agent Registry

Agents are stored as configuration files.

agents/
   python_coder.json
   research_agent.json

Example agent definition:

{
 "name": "python_coder",
 "model": "qwen2.5",
 "tools": [
   "write_python_code",
   "run_python_file"
 ]
}

Agents can be created dynamically.
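
Loading the registry can be sketched as reading every JSON definition from the agents directory. The directory layout follows the example above; the return shape is an assumption.

```python
import json
from pathlib import Path

def load_agents(agents_dir="agents"):
    """Load every agent definition from the agents directory into a dict."""
    agents = {}
    for path in Path(agents_dir).glob("*.json"):
        config = json.loads(path.read_text())
        agents[config["name"]] = config   # key the registry by agent name
    return agents
```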


19. Redis Usage

Redis is optional but improves scalability.

Uses:

task queue
agent memory
caching

Configuration can be stored in Redis or local config.


20. Hugging Face Space Deployment

The system runs inside a Docker Space.

Components:

Python
Ollama
Gradio
Redis (optional)

Startup sequence:

Start container
 ↓
Start Ollama
 ↓
Load models
 ↓
Start agent engine
 ↓
Launch UI

21. Security Considerations

Important restrictions include:

  • package allowlist
  • sandboxed tool execution
  • workspace file isolation
  • limited shell access

Example allowed packages:

numpy
pandas
torch
scikit-learn
matplotlib
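
The allowlist check can be sketched as a guard run before any install tool call. The version-pin handling is an assumption about how requests might arrive.

```python
ALLOWED_PACKAGES = {"numpy", "pandas", "torch", "scikit-learn", "matplotlib"}

def check_package_allowed(name):
    """Reject install requests for packages outside the allowlist."""
    base = name.split("==")[0].strip().lower()   # ignore version pins
    return base in ALLOWED_PACKAGES
```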

22. End-to-End Example

User request:

create a neural network code demo in python that uses pytorch

System execution:

User prompt
 ↓
Agent reasoning
 ↓
install_python_package(torch)
 ↓
write_file(pytorch_nn_demo.py)
 ↓
artifact created
 ↓
UI displays code

User interaction:

Edit code
Run code
Download file

Execution output appears in the console.


23. Final System Architecture

User
 ↓
HF Space UI
 ↓
Agent Engine
 ↓
Speculative Decoding
 ↓
Tool Parser
 ↓
Tool Executor
 ↓
Dynamic Tools
 ↓
Workspace
 ↓
Artifacts / Agents / Queue
 ↓
Result returned to LLM

24. Design Principles

The system follows several core principles.

  1. Everything is a tool

Actions are performed through tools rather than hardcoded logic.

  2. Agents can extend themselves

Agents may create tools and agents.

  3. Artifacts are first-class outputs

Generated files are accessible through the UI.

  4. Isolation and safety

Tools run in subprocesses.

  5. Local and lightweight

The system runs entirely locally using Ollama.


25. Future Extensions

Potential future capabilities include:

multi-agent collaboration
autonomous project generation
browser automation
dataset generation
long-term memory