2. System Components
2.1 User Interface
Streamlit-based web UI.
Features:
- Prompt input and `/HELP` commands
- Code generation via `/WRITE_PY`
- Code editing + execution (CPU or optional GPU)
- Book, story, or poem management
- Session logs and downloads
2.2 Agent Core – pygmyclaw.py
Handles all user-agent interaction:
- Receives user prompts.
- Converts LLM output into JSON tool calls.
- Dynamically loads and executes Python tools.
Speculative decoding:
- 3 drafters + 1 verifier for robust output.
Queue system:
- Redis or JSON-file queue for task scheduling.
Artifact management:
- Stores code, logs, and task outputs in workspace.
- Supports automatic dependency installation.
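The JSON tool-call dispatch described above could be sketched as follows; the `dispatch` function, the registry dict, and the call shape `{"tool": ..., "args": ...}` are illustrative assumptions, not the actual pygmyclaw.py API:

```python
import json

# Illustrative tool registry; the real agent loads tools dynamically.
TOOLS = {
    "echo": lambda text: text,
    "sys_info": lambda: {"platform": "linux"},  # stub for illustration
}

def dispatch(llm_output: str) -> dict:
    """Parse an LLM reply as a JSON tool call and execute the named tool."""
    call = json.loads(llm_output)  # e.g. {"tool": "echo", "args": {"text": "hi"}}
    tool = TOOLS.get(call["tool"])
    if tool is None:
        return {"error": f"unknown tool: {call['tool']}"}
    return {"result": tool(**call.get("args", {}))}

# Example: the LLM asks to run the echo tool.
print(dispatch('{"tool": "echo", "args": {"text": "hi"}}'))
```

Keeping the tool interface as plain JSON is what lets new tools be added at runtime without changing the dispatch loop.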
2.3 Python Multitool – pygmyclaw_multitool.py
Tool registry:
`list_tools_detailed`, `sys_info`, `log_error`, `echo`, etc.
Dynamic tool addition:
- Agents can create new tools that are callable via JSON.
Safe execution sandbox:
- Python subprocess with controlled input/output.
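A minimal sketch of the registry plus sandboxed subprocess execution; the `tool` decorator and the 30-second timeout are assumptions, not the actual multitool code:

```python
import subprocess
import sys

REGISTRY = {}

def tool(fn):
    """Register a function so it is callable by name via JSON tool calls."""
    REGISTRY[fn.__name__] = fn
    return fn

@tool
def echo(text: str) -> str:
    return text

@tool
def run_python(code: str, timeout: int = 30) -> dict:
    """Execute untrusted code in a subprocess with captured, bounded I/O."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return {"stdout": proc.stdout, "stderr": proc.stderr, "rc": proc.returncode}

print(run_python("print(2 + 2)")["stdout"])
```

Running generated code in a child interpreter keeps a crash or infinite loop from taking down the agent process itself.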
2.4 LLM Interaction
Ollama / HF-hosted model as backend:
- Multi-instance support for parallel drafters.
- Token-based speculative decoding.
Workflow:
User prompt → Agent (LLM) → JSON tool call → Python tool executes → Result returned to LLM → Final response to user
Dynamic code generation & execution integrated:
- e.g., user asks for PyTorch demo → agent installs PyTorch → generates editable code → runs it in UI.
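The drafter/verifier arrangement could be sketched as below; the majority-vote rule and the stubbed `draft` function are assumptions (the real system runs multiple model instances against Ollama or an HF-hosted backend):

```python
from collections import Counter

def draft(prompt: str, seed: int) -> str:
    """Stub for one drafter instance; a real drafter would query the LLM backend."""
    candidates = ["print('hello')", "print('hello')", "print('helo')"]
    return candidates[seed % len(candidates)]

def verify(prompt: str, drafts: list[str]) -> str:
    """Stub verifier: accept the draft the most drafters agree on."""
    return Counter(drafts).most_common(1)[0][0]

def speculative_generate(prompt: str, n_drafters: int = 3) -> str:
    """3 drafters + 1 verifier, as in the architecture above."""
    drafts = [draft(prompt, seed) for seed in range(n_drafters)]
    return verify(prompt, drafts)

print(speculative_generate("write hello world"))
```

The point of the pattern is that independent drafts expose unreliable outputs: a verifier comparing them catches the one-off typo a single model call would have returned.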
2.5 Persistent Memory – Hugging Face
Repository:
`rahul7star/pyclaw`
Stores:
- Generated code & scripts
- Task outputs (`.out`)
- Logs & session history
- Dynamic tools and metadata
Mechanism:
- Push artifacts via the `huggingface_hub` API
- Pull existing artifacts for agent memory
Benefits: Enables long-term learning, cross-session continuity, and reproducibility.
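The push/pull mechanism might look like the following; `HfApi.upload_file` and `hf_hub_download` are real `huggingface_hub` calls, but the `kind/name` repo layout is an assumption (pushing requires a valid HF token):

```python
def repo_path(kind: str, name: str) -> str:
    """Map an artifact kind to its location in the memory repo (assumed layout)."""
    return f"{kind}/{name}"  # e.g. "outputs/task42.out", "tools/new_tool.py"

def push_artifact(local_path: str, kind: str, name: str,
                  repo_id: str = "rahul7star/pyclaw") -> None:
    """Push one artifact to the persistent memory repo."""
    from huggingface_hub import HfApi  # imported lazily; needs auth to run
    HfApi().upload_file(
        path_or_fileobj=local_path,
        path_in_repo=repo_path(kind, name),
        repo_id=repo_id,
    )

def pull_artifact(kind: str, name: str,
                  repo_id: str = "rahul7star/pyclaw") -> str:
    """Fetch an artifact back for agent memory; returns a local file path."""
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=repo_path(kind, name))
```

Because the repo is just versioned files, a later session can pull the same paths and resume with the previous session's code, logs, and tools.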
2.6 Autonomous Task Management
Queue-based execution:
- Tasks added by user or agent itself.
- Background processor executes tasks in order.
Speculative execution:
- Multi-instance drafters improve code reliability.
- Verifier ensures correctness of outputs.
Dynamic tools:
- Tools can evolve or new tools can be created on-the-fly.
2.7 Safety & Resource Management
Code execution sandbox:
- Controlled Python subprocess.
- Auto cleanup of temporary files.
CPU/GPU selection:
- Default: CPU
- Optional: GPU if available and environment variable set.
Dependency management:
- Automatic package installation (e.g., PyTorch for user-requested demos).
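The device selection and auto-install logic could be sketched as follows; the `PYGMY_USE_GPU` variable name is an assumption, while `torch.cuda.is_available()` and `pip install` via the running interpreter are standard calls:

```python
import importlib
import os
import subprocess
import sys

def pick_device() -> str:
    """Default to CPU; use GPU only if explicitly requested and available."""
    if os.environ.get("PYGMY_USE_GPU") != "1":  # assumed env var name
        return "cpu"
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"

def ensure_package(name: str) -> None:
    """Install a package on demand, e.g. torch for a user-requested demo."""
    try:
        importlib.import_module(name)
    except ImportError:
        subprocess.run([sys.executable, "-m", "pip", "install", name], check=True)

print(pick_device())
```

Defaulting to CPU keeps demos runnable on the free Spaces tier; the opt-in env var makes GPU use an explicit choice rather than a silent dependency.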
3. Example User Workflow
User Prompt:
"Create a neural network demo in Python using PyTorch."
Agent Actions:
- Detects PyTorch requirement → installs on CPU.
- Generates Python code using `/WRITE_PY`.
- Saves code in Hugging Face repo for memory.
- Displays code in UI for editing.
- Runs code → outputs printed and logged.
- Updates agent memory with results and execution logs.
- Optionally creates a new dynamic tool for future NN generation.
4. Evolvable Architecture
| Feature | Current Status | Future Evolution |
|---|---|---|
| Dynamic Tool Creation | ✅ | Can auto-generate new tools from tasks |
| Long-Term Memory | ✅ via HF | Add semantic search, embeddings for context |
| Speculative Decoding | ✅ | Increase drafters, multi-agent cooperation |
| Autonomous Task Execution | Partial | Recursive task planning, multi-step project handling |
| Dependency Management | ✅ | Expand to virtual environments per project |
| Safe Execution | Partial | Containerized execution (Docker/WSL) |
5. Roadmap to Claude-Like Autonomy
- Enhance memory: Semantic embeddings + search in HF repo.
- Recursive reasoning: Agent generates subtasks autonomously.
- Multi-agent collaboration: Multiple PygmyClaws coordinate on large projects.
- Learning from outputs: Store completed tasks + feedback for continuous improvement.
- Safety & isolation: Dockerized Python execution with resource limits.
- Dynamic UI: Allow live editing, execution, and visualization of code outputs.
6. Diagram of End-to-End Flow
┌───────────────┐
│  User Prompt  │
└───────┬───────┘
        │
        ▼
┌───────────────┐
│  Agent (LLM)  │
└───────┬───────┘
        │ JSON Tool Call
        ▼
┌───────────────┐
│  Python Tool  │
│ Executes Task │
└───────┬───────┘
        │ Result
        ▼
┌───────────────┐
│  Agent (LLM)  │
│   Processes   │
└───────┬───────┘
        │
        ▼
┌───────────────┐
│   UI Output   │
│ (Code/Result) │
└───────┬───────┘
        │
        ▼
┌───────────────┐
│  Persistent   │
│  Memory Repo  │
└───────────────┘