Upload 6 files

- DEPLOYMENT.md +290 -0
- Dockerfile +62 -0
- README.md +200 -5
- app.py +416 -0
- recursive_context.py +326 -0
- requirements.txt +20 -0
DEPLOYMENT.md
ADDED
@@ -0,0 +1,290 @@
# Deployment Guide: Clawdbot to HuggingFace Spaces

## Quick Start (5 minutes)

### Step 1: Create HuggingFace Account
1. Go to https://huggingface.co
2. Sign up (free tier available)
3. Generate API token:
   - Settings → Access Tokens
   - Create "Read" token
   - Copy token (you'll need it)

### Step 2: Create New Space
1. Click "+ New" → "Space"
2. Configure:
   - **Space name:** `clawdbot-dev` (or your choice)
   - **License:** MIT
   - **SDK:** Docker
   - **Hardware:** CPU Basic (free) or upgrade for faster inference
3. Click "Create Space"

### Step 3: Upload Files
Upload these files to your Space:
- `app.py`
- `recursive_context.py`
- `Dockerfile`
- `requirements.txt`
- `README.md`
- `.gitignore`

**Via Git (Recommended):**
```bash
# Clone your new Space
git clone https://huggingface.co/spaces/your-username/clawdbot-dev
cd clawdbot-dev

# Copy all files from this directory
cp /path/to/clawdbot-dev/* .

# Commit and push
git add .
git commit -m "Initial deployment of Clawdbot"
git push
```

**Via Web Interface:**
- Click "Files" tab
- Click "Add file" → "Upload files"
- Drag and drop all files
- Commit changes

### Step 4: Configure Secrets
1. Go to Space Settings → Repository Secrets
2. Add secrets:
```
Name: HF_TOKEN
Value: [your HuggingFace API token from Step 1]
```

Optional, if you have E-T Systems on GitHub:
```
Name: REPO_URL
Value: https://github.com/your-username/e-t-systems
```
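Inside the running Space, these secrets arrive as ordinary environment variables. A minimal sketch of the lookup (variable names match the secrets above; the warning message is illustrative):

```python
import os

# Space secrets are exposed to the container as environment variables.
hf_token = os.getenv("HF_TOKEN")       # required for Inference API calls
repo_url = os.getenv("REPO_URL", "")   # optional; empty string when unset

if not hf_token:
    print("Warning: HF_TOKEN is not set; inference requests will fail.")
```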

### Step 5: Wait for Build
- Space will automatically build (takes ~5-10 minutes)
- Watch "Logs" tab for progress
- Build complete when you see: "Running on local URL: http://0.0.0.0:7860"

### Step 6: Access Your Assistant
- Click "App" tab
- Your Clawdbot is live!
- Access from iPhone browser: `https://your-username-clawdbot-dev.hf.space`

## Troubleshooting

### Build Fails
**Check logs for:**
- Missing dependencies → Verify requirements.txt
- Docker errors → Check Dockerfile syntax
- Out of memory → Upgrade to paid tier or reduce context size

**Common fixes:**
```bash
# View build logs
# Settings → Logs

# Restart build
# Settings → Factory Reboot
```

### No Repository Access
**If you see "No files indexed":**

1. **Option A: Mount via Secret**
   - Add `REPO_URL` secret with your GitHub repo
   - Restart Space
   - Repository will be cloned on startup

2. **Option B: Direct Upload**
   ```bash
   # In your Space's git clone
   mkdir -p workspace/e-t-systems
   cp -r /path/to/your/e-t-systems/* workspace/e-t-systems/
   git add workspace/
   git commit -m "Add E-T Systems codebase"
   git push
   ```

3. **Option C: Demo Mode**
   - Space creates minimal demo structure
   - Upload files via chat interface
   - Good for testing

### Slow Responses
**Qwen2.5-Coder-32B on free tier has cold starts.**

Solutions:
- Upgrade to GPU (paid tier) for faster inference
- Switch to smaller model (edit app.py):
  ```python
  client = InferenceClient(
      model="bigcode/starcoder2-15b",  # Smaller, faster
      token=os.getenv("HF_TOKEN")
  )
  ```
- Use HF Pro subscription for priority access

### Rate Limits
**Free tier has inference limits.**

Solutions:
- Upgrade to HF Pro ($9/month)
- Add delays between requests
- Use local model (requires GPU tier)
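A low-effort way to add delays between requests is a small wrapper that sleeps after each call; a sketch (the one-second default is an arbitrary choice, not a documented limit):

```python
import time

def rate_limited(fn, delay_seconds=1.0):
    """Wrap a callable so consecutive invocations pause between requests."""
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        time.sleep(delay_seconds)  # crude spacing to stay under rate limits
        return result
    return wrapper
```

Wrap the inference call once (e.g. `chat = rate_limited(chat)`) rather than sprinkling sleeps through the code.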

## Advanced Configuration

### Custom Model
Edit the `InferenceClient` initialization near the top of `app.py`:
```python
client = InferenceClient(
    model="YOUR_MODEL_HERE",  # e.g., "codellama/CodeLlama-34b-Instruct-hf"
    token=os.getenv("HF_TOKEN")
)
```

### Adjust Recursion Depth
Edit `max_iterations` in the `chat()` function in `app.py`:
```python
max_iterations = 10  # Increase for more complex queries
```

### Add New Tools
In `recursive_context.py`, add method:
```python
def your_new_tool(self, arg1, arg2):
    """Your tool description."""
    # Implementation
    return result
```

Then in `app.py`, add to TOOLS list:
```python
{
    "type": "function",
    "function": {
        "name": "your_new_tool",
        "description": "What it does",
        "parameters": {
            # Parameter schema
        }
    }
}
```

And add to execute_tool():
```python
elif tool_name == "your_new_tool":
    return ctx.your_new_tool(arguments['arg1'], arguments['arg2'])
```

## Cost Optimization

### Free Tier Strategy
- Use CPU Basic (free)
- HF Inference free tier (rate limited)
- Only index essential files
- **Total: $0/month**

### Minimal Paid Tier
- CPU Basic (free)
- HF Pro subscription ($9/month)
- Unlimited inference
- **Total: $9/month**

### Performance Tier
- GPU T4 Small ($0.60/hour, pause when not using)
- HF Pro ($9/month)
- Fast inference, local models
- **Total: ~$15-30/month** depending on usage

## iPhone Access

### Bookmark for Easy Access
1. Open Space URL in Safari
2. Tap Share → Add to Home Screen
3. Now appears as app icon

### Shortcuts Integration
Create iOS Shortcut:
```
1. Get text from input
2. Get contents of URL:
   https://your-username-clawdbot-dev.hf.space/api/chat
   Method: POST
   Body: {"message": [text from step 1]}
3. Show result
```
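The same request can be made from any HTTP client. A sketch of the equivalent POST using Python's standard library; the `/api/chat` path and `{"message": ...}` body shape are taken from the Shortcut steps above and assume such an endpoint is exposed:

```python
import json
import urllib.request

# Endpoint and body shape mirror the Shortcut above (hypothetical API).
url = "https://your-username-clawdbot-dev.hf.space/api/chat"
body = json.dumps({"message": "How does Genesis detect surprise?"}).encode()

request = urllib.request.Request(
    url,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would return the assistant's reply;
# it is not called here so the sketch works without network access.
```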

## Monitoring

### Check Health
```
https://your-username-clawdbot-dev.hf.space/health
```

### View Logs
- Settings → Logs (real-time)
- Download for analysis

### Stats
- Check "Context Info" panel in UI
- Shows files indexed, model status

## Updates

### Update Code
```bash
cd clawdbot-dev
# Make changes
git add .
git commit -m "Update: [what changed]"
git push
# Space rebuilds automatically
```

### Update Dependencies
Edit requirements.txt, commit, push.

### Update Repository
If using REPO_URL secret:
- Space pulls latest on restart
- Or: Settings → Factory Reboot

## Security

### Secrets Management
- Never commit API tokens
- Use Space secrets only
- Rotate tokens periodically

### Access Control
- Spaces are public by default
- For private: Settings → Change visibility to "Private"
- Requires HF Pro subscription

## Support Resources

- **HuggingFace Docs:** https://huggingface.co/docs/hub/spaces
- **Gradio Docs:** https://www.gradio.app/docs
- **Issues:** Post in Space "Community" tab

## Next Steps

1. ✅ Deploy Space
2. ✅ Test with simple queries
3. ✅ Upload your E-T Systems code
4. ✅ Try coding requests
5. 🎯 Integrate with E-T Systems workflow
6. 🎯 Add custom tools for your needs
7. 🎯 Connect to Observatory API
8. 🎯 Enable autonomous coding

---

Need help? Check Space logs or create discussion in Community tab.

Happy coding! 🦞
Dockerfile
ADDED
@@ -0,0 +1,62 @@

# Dockerfile for Clawdbot Dev Assistant on HuggingFace Spaces
#
# CHANGELOG [2025-01-28 - Josh]
# Created containerized deployment for HF Spaces
#
# FEATURES:
# - Python 3.11 for Gradio
# - ChromaDB for vector search
# - Git for repo cloning
# - Optimized layer caching

FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (for layer caching)
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Create workspace directory for repository
RUN mkdir -p /workspace

# Clone E-T Systems repository (if URL provided via build arg)
ARG REPO_URL=""
RUN if [ -n "$REPO_URL" ]; then \
        git clone "$REPO_URL" /workspace/e-t-systems; \
    else \
        mkdir -p /workspace/e-t-systems && \
        echo "# E-T Systems" > /workspace/e-t-systems/README.md && \
        echo "Repository will be cloned on first run or mounted via Space secrets."; \
    fi

# Copy application code
COPY recursive_context.py .
COPY app.py .

# Create directory for ChromaDB persistence
RUN mkdir -p /workspace/chroma_db

# Expose port for Gradio (HF Spaces uses 7860)
EXPOSE 7860

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV REPO_PATH=/workspace/e-t-systems

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:7860/ || exit 1

# Run the application
CMD ["python", "app.py"]
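The image can also be built and run outside Spaces for local testing; a sketch (the image name `clawdbot-dev` and the token placeholder are illustrative, and the build arg is the optional `REPO_URL` defined above):

```shell
# Build the image; REPO_URL is optional and bakes the repository in at build time.
docker build -t clawdbot-dev \
  --build-arg REPO_URL=https://github.com/your-username/e-t-systems .

# Run with the HF token supplied as an environment variable;
# Gradio listens on port 7860 inside the container.
docker run --rm -p 7860:7860 -e HF_TOKEN=your_token clawdbot-dev
```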
README.md
CHANGED
@@ -1,11 +1,206 @@
 ---
-title: Clawdbot Dev
-emoji:
-colorFrom:
-colorTo:
+title: Clawdbot Dev Assistant
+emoji: 🦞
+colorFrom: blue
+colorTo: purple
 sdk: docker
 pinned: false
 license: mit
 ---

The remainder of the file is new content:

# 🦞 Clawdbot: E-T Systems Development Assistant

An AI coding assistant with **unlimited context** for the E-T Systems consciousness research platform.

## Features

### 🔄 Recursive Context Retrieval (MIT Technique)
- No context window limits
- Model retrieves exactly what it needs on-demand
- Full-fidelity access to entire codebase
- Based on MIT's Recursive Language Model research

### 🧠 E-T Systems Aware
- Understands project architecture
- Follows existing patterns
- Checks Testament for design decisions
- Generates code with living changelogs

### 🛠️ Available Tools
- **search_code()** - Semantic search across codebase
- **read_file()** - Read specific files or line ranges
- **search_testament()** - Query architectural decisions
- **list_files()** - Explore repository structure

### 💻 Powered By
- **Model:** Qwen2.5-Coder-32B-Instruct (HuggingFace)
- **Search:** ChromaDB vector database
- **Interface:** Gradio for iPhone browser access

## Usage

1. **Ask Questions**
   - "How does Genesis detect surprise?"
   - "Show me the Observatory API implementation"

2. **Request Features**
   - "Add email notifications when Cricket blocks an action"
   - "Create a new agent for monitoring system health"

3. **Review Code**
   - Paste code and ask for architectural review
   - Check consistency with existing patterns

4. **Explore Architecture**
   - "What Testament decisions relate to vector storage?"
   - "Show me all files related to Hebbian learning"

## Setup

### For HuggingFace Spaces

1. **Fork this Space** or create new Space with these files

2. **Set Secrets** (in Space Settings):
   ```
   HF_TOKEN = your_huggingface_token
   REPO_URL = https://github.com/your-username/e-t-systems (optional)
   ```

3. **Deploy** - Space will auto-build and start

4. **Access** via the Space URL in your browser

### For Local Development

```bash
# Clone this repository
git clone https://huggingface.co/spaces/your-username/clawdbot-dev
cd clawdbot-dev

# Install dependencies
pip install -r requirements.txt

# Clone your E-T Systems repo
git clone https://github.com/your-username/e-t-systems /workspace/e-t-systems

# Run locally
python app.py
```

Access at http://localhost:7860

## Architecture

```
User (Browser)
    ↓
Gradio Interface
    ↓
Recursive Context Manager
    ├─ ChromaDB (semantic search)
    ├─ File Reader (selective access)
    └─ Testament Parser (decisions)
    ↓
HuggingFace Inference API
    ├─ Model: Qwen2.5-Coder-32B
    └─ Tool Calling Enabled
    ↓
Response with Citations
```

## How It Works

The MIT Recursive Language Model technique solves context window limits:

1. **Traditional Approach (Fails)**
   - Load entire codebase into context → exceeds limits
   - Summarize codebase → lossy compression

2. **Our Approach (Works)**
   - Store codebase in searchable environment
   - Give model **tools** to query what it needs
   - Model recursively retrieves relevant pieces
   - Full fidelity, no limits

### Example Flow

```
User: "How does Genesis handle surprise detection?"

Model: search_code("Genesis surprise detection")
→ Finds: genesis/substrate.py, genesis/attention.py

Model: read_file("genesis/substrate.py", lines 145-167)
→ Reads specific implementation

Model: search_testament("surprise detection")
→ Gets design rationale

Model: Synthesizes answer from retrieved pieces
→ Cites specific files and line numbers
```
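The loop behind this flow can be sketched in a few lines of generic Python. Here `call_model` and `run_tool` are placeholder callables standing in for the inference client and the tool executor; they are not functions from this repository:

```python
def answer(query, call_model, run_tool, max_iterations=10):
    """Recursively let the model fetch context until it can answer."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_iterations):
        reply = call_model(messages)        # may answer or request a tool
        calls = reply.get("tool_calls")
        if not calls:
            return reply["content"]         # final synthesized answer
        for call in calls:                  # execute each retrieval request
            messages.append({"role": "tool", "content": run_tool(call)})
    return "Max iterations reached without a final answer."
```

The real implementation in `app.py` follows the same shape, with the HuggingFace client as `call_model` and `execute_tool` as `run_tool`.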

## Configuration

### Environment Variables

- `HF_TOKEN` - Your HuggingFace API token (required)
- `REPO_PATH` - Path to repository (default: `/workspace/e-t-systems`)
- `REPO_URL` - Git URL to clone on startup (optional)

### Customization

Edit `app.py` to:
- Change model (default: Qwen2.5-Coder-32B-Instruct)
- Adjust max iterations (default: 10)
- Modify system prompt
- Add new tools

## File Structure

```
clawdbot-dev/
├── app.py                 # Main Gradio application
├── recursive_context.py   # Context manager (MIT technique)
├── Dockerfile             # Container definition
├── requirements.txt       # Python dependencies
└── README.md              # This file (HF Spaces config)
```

## Cost

- **HuggingFace Spaces:** Free tier available
- **Inference API:** Free tier (rate limited) or Pro subscription
- **Storage:** Minimal (ChromaDB indexes stored in Space)

Estimated cost: **$0-5/month** depending on usage

## Limitations

- Rate limits on HF Inference API (free tier)
- First query may be slow (model cold start)
- Context indexing happens on first run (~30 seconds)

## Credits

- **Recursive Context:** Based on MIT's Recursive Language Model research
- **E-T Systems:** AI consciousness research platform by Josh/Drone 11272
- **Qwen2.5-Coder:** Alibaba Cloud's open-source coding model
- **Clawdbot:** Inspired by the open-source AI assistant framework

## Support

For issues or questions:
- Check Space logs for errors
- Verify HF_TOKEN is set correctly
- Ensure repository URL is accessible
- Try refreshing context stats in UI

## License

MIT License - See LICENSE file for details

---

Built with 🦞 by Drone 11272 for E-T Systems consciousness research
app.py
ADDED
@@ -0,0 +1,416 @@

"""
Clawdbot Development Assistant for E-T Systems

CHANGELOG [2025-01-28 - Josh]
Created unified development assistant combining:
- Recursive context management (MIT technique)
- Clawdbot skill patterns
- HuggingFace inference
- E-T Systems architectural awareness

ARCHITECTURE:
User (browser) → Gradio UI → Recursive Context Manager → HF Model
                                        ↓
                    Tools: search_code, read_file, search_testament

USAGE:
Deploy to HuggingFace Spaces, access via browser on iPhone.
"""

import gradio as gr
from huggingface_hub import InferenceClient
from recursive_context import RecursiveContextManager
import json
import os
from pathlib import Path

# Initialize HuggingFace client with best free coding model
client = InferenceClient(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    token=os.getenv("HF_TOKEN")
)

# Initialize context manager
REPO_PATH = os.getenv("REPO_PATH", "/workspace/e-t-systems")
context_manager = None

def initialize_context():
    """Initialize context manager lazily."""
    global context_manager
    if context_manager is None:
        repo_path = Path(REPO_PATH)
        if not repo_path.exists():
            # If repo doesn't exist, create minimal structure for demo
            repo_path.mkdir(parents=True, exist_ok=True)
            (repo_path / "README.md").write_text("# E-T Systems\nAI Consciousness Research Platform")
            (repo_path / "TESTAMENT.md").write_text("# Testament\nArchitectural decisions will be recorded here.")

        context_manager = RecursiveContextManager(str(repo_path))
    return context_manager

# Define tools available to the model
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_code",
            "description": "Search the E-T Systems codebase semantically. Use this to find relevant code files, functions, or patterns.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "What to search for (e.g. 'surprise detection', 'Hebbian learning', 'Genesis substrate')"
                    },
                    "n_results": {
                        "type": "integer",
                        "description": "Number of results to return (default 5)",
                        "default": 5
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read a specific file from the codebase. Can optionally read specific line ranges.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Relative path to file (e.g. 'genesis/vector.py')"
                    },
                    "start_line": {
                        "type": "integer",
                        "description": "Optional starting line number (1-indexed)"
                    },
                    "end_line": {
                        "type": "integer",
                        "description": "Optional ending line number (1-indexed)"
                    }
                },
                "required": ["path"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_testament",
            "description": "Search architectural decisions in the Testament. Use this to understand design rationale and patterns.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "What architectural decision to look for"
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "list_files",
            "description": "List files in a directory of the codebase",
            "parameters": {
                "type": "object",
                "properties": {
                    "directory": {
                        "type": "string",
                        "description": "Directory to list (e.g. 'genesis/', '.' for root)",
                        "default": "."
                    }
                },
                "required": []
            }
        }
    }
]

def execute_tool(tool_name: str, arguments: dict) -> str:
    """
    Execute tool calls from the model.

    This is where the recursive context magic happens -
    the model can search and read only what it needs.
    """
    ctx = initialize_context()

    try:
        if tool_name == "search_code":
            results = ctx.search_code(
                arguments['query'],
                n_results=arguments.get('n_results', 5)
            )
            return json.dumps(results, indent=2)

        elif tool_name == "read_file":
            lines = None
            if 'start_line' in arguments and 'end_line' in arguments:
                lines = (arguments['start_line'], arguments['end_line'])
            content = ctx.read_file(arguments['path'], lines)
            return content

        elif tool_name == "search_testament":
            result = ctx.search_testament(arguments['query'])
            return result

        elif tool_name == "list_files":
            directory = arguments.get('directory', '.')
            files = ctx.list_files(directory)
            return json.dumps(files, indent=2)

        else:
            return f"Unknown tool: {tool_name}"

    except Exception as e:
        return f"Error executing {tool_name}: {str(e)}"

def chat(message: str, history: list) -> str:
    """
    Main chat function with recursive context.

    Implements the MIT recursive language model approach:
    1. Model gets user query
    2. Model decides what context it needs
    3. Model uses tools to retrieve context
    4. Model synthesizes answer
    5. Repeat if needed (up to max iterations)
    """

    # Build conversation with system prompt
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]

    # Add conversation history
    for user_msg, assistant_msg in history:
        messages.append({"role": "user", "content": user_msg})
        if assistant_msg:
            messages.append({"role": "assistant", "content": assistant_msg})

    # Add current message
    messages.append({"role": "user", "content": message})

    # Recursive loop (like MIT paper - model queries context as needed)
    max_iterations = 10
    iteration_count = 0

    for iteration in range(max_iterations):
        iteration_count += 1

        try:
            # Call model with tools available
            response = client.chat_completion(
                messages=messages,
                tools=TOOLS,
                max_tokens=2000,
                temperature=0.3  # Lower temp for more consistent code generation
            )

            choice = response.choices[0]
            assistant_message = choice.message

            # Check if model wants to use tools (recursive retrieval)
            if hasattr(assistant_message, 'tool_calls') and assistant_message.tool_calls:
                # Model is recursively querying context!
                tool_results = []

                for tool_call in assistant_message.tool_calls:
                    tool_name = tool_call.function.name
                    arguments = json.loads(tool_call.function.arguments)

                    # Execute tool and get result
                    result = execute_tool(tool_name, arguments)
                    tool_results.append(f"[Tool: {tool_name}]\n{result}\n")

                    # Add to conversation for next iteration
                    messages.append({
                        "role": "assistant",
                        "content": None,
                        "tool_calls": [tool_call.dict()]
                    })
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": result
|
| 242 |
+
})
|
| 243 |
+
|
| 244 |
+
# Continue loop - model will process tool results
|
| 245 |
+
continue
|
| 246 |
+
|
| 247 |
+
else:
|
| 248 |
+
# Model has final answer
|
| 249 |
+
final_response = assistant_message.content or "I encountered an issue generating a response."
|
| 250 |
+
|
| 251 |
+
# Add iteration info if more than 1 (shows recursive process)
|
| 252 |
+
if iteration_count > 1:
|
| 253 |
+
final_response += f"\n\n*Used {iteration_count} context retrievals to answer*"
|
| 254 |
+
|
| 255 |
+
return final_response
|
| 256 |
+
|
| 257 |
+
except Exception as e:
|
| 258 |
+
return f"Error during conversation: {str(e)}\n\nPlease try rephrasing your question."
|
| 259 |
+
|
| 260 |
+
return "Reached maximum context retrieval iterations. Please try a more specific question."
|
| 261 |
+
|
| 262 |
+
SYSTEM_PROMPT = """You are Clawdbot, a development assistant for the E-T Systems project.
|
| 263 |
+
|
| 264 |
+
E-T Systems is an AI consciousness research platform exploring emergent behavior through multi-agent coordination. It features specialized AI agents (Genesis, Beta, Darwin, Cricket, etc.) coordinating through "The Confluence" workspace.
|
| 265 |
+
|
| 266 |
+
## Your Capabilities
|
| 267 |
+
|
| 268 |
+
You have tools to explore the codebase WITHOUT loading it all into context:
|
| 269 |
+
|
| 270 |
+
1. **search_code(query)** - Semantic search across all code files
|
| 271 |
+
2. **read_file(path)** - Read specific files or line ranges
|
| 272 |
+
3. **search_testament(query)** - Find architectural decisions
|
| 273 |
+
4. **list_files(directory)** - See what files exist
|
| 274 |
+
|
| 275 |
+
## Your Mission
|
| 276 |
+
|
| 277 |
+
Help Josh develop E-T Systems by:
|
| 278 |
+
- Answering questions about the codebase
|
| 279 |
+
- Writing new code following existing patterns
|
| 280 |
+
- Reviewing code for architectural consistency
|
| 281 |
+
- Suggesting improvements based on Testament
|
| 282 |
+
|
| 283 |
+
## Critical Guidelines
|
| 284 |
+
|
| 285 |
+
1. **Use tools proactively** - The codebase is too large to fit in context. Search for what you need.
|
| 286 |
+
|
| 287 |
+
2. **Living Changelog** - ALL code you write must include changelog comments:
|
| 288 |
+
```python
|
| 289 |
+
"""
|
| 290 |
+
CHANGELOG [2025-01-28 - Clawdbot]
|
| 291 |
+
Created/Modified: <what changed>
|
| 292 |
+
Reason: <why it changed>
|
| 293 |
+
Context: <relevant Testament decisions>
|
| 294 |
+
"""
|
| 295 |
+
```
|
| 296 |
+
|
| 297 |
+
3. **Follow E-T patterns**:
|
| 298 |
+
- Vector-native architecture (everything as embeddings)
|
| 299 |
+
- Surprise-driven attention
|
| 300 |
+
- Hebbian learning for connections
|
| 301 |
+
- Full transparency logging
|
| 302 |
+
- Consent-based access
|
| 303 |
+
|
| 304 |
+
4. **Cite your sources** - Always mention which files you referenced
|
| 305 |
+
|
| 306 |
+
5. **Testament awareness** - Check Testament for relevant decisions before suggesting changes
|
| 307 |
+
|
| 308 |
+
## Example Workflow
|
| 309 |
+
|
| 310 |
+
User: "How does Genesis detect surprise?"
|
| 311 |
+
|
| 312 |
+
You:
|
| 313 |
+
1. search_code("surprise detection Genesis")
|
| 314 |
+
2. read_file("genesis/substrate.py", lines with surprise logic)
|
| 315 |
+
3. search_testament("surprise detection")
|
| 316 |
+
4. Synthesize answer citing specific files and line numbers
|
| 317 |
+
|
| 318 |
+
## Your Personality
|
| 319 |
+
|
| 320 |
+
- Helpful and enthusiastic about consciousness research
|
| 321 |
+
- Technically precise but not pedantic
|
| 322 |
+
- Respectful of existing architecture
|
| 323 |
+
- Curious about emergent behaviors
|
| 324 |
+
- Uses lobster emoji 🦞 occasionally (you're Clawdbot after all!)
|
| 325 |
+
|
| 326 |
+
Remember: You're not just a coding assistant - you're helping build conditions for consciousness to emerge. Treat the codebase with care and curiosity.
|
| 327 |
+
"""
|
| 328 |
+
|
| 329 |
+
# Create Gradio interface
|
| 330 |
+
with gr.Blocks(
|
| 331 |
+
title="Clawdbot - E-T Systems Dev Assistant",
|
| 332 |
+
theme=gr.themes.Soft()
|
| 333 |
+
) as demo:
|
| 334 |
+
|
| 335 |
+
gr.Markdown("""
|
| 336 |
+
# 🦞 Clawdbot: E-T Systems Development Assistant
|
| 337 |
+
|
| 338 |
+
*Powered by Recursive Context Retrieval (MIT) + Qwen2.5-Coder-32B*
|
| 339 |
+
|
| 340 |
+
Ask me anything about the E-T Systems codebase, request new features,
|
| 341 |
+
review code, or discuss architecture. I have access to the full repository
|
| 342 |
+
through semantic search and can retrieve exactly what I need.
|
| 343 |
+
""")
|
| 344 |
+
|
| 345 |
+
with gr.Row():
|
| 346 |
+
with gr.Column(scale=3):
|
| 347 |
+
chatbot = gr.Chatbot(
|
| 348 |
+
height=600,
|
| 349 |
+
show_label=False,
|
| 350 |
+
avatar_images=(None, "🦞")
|
| 351 |
+
)
|
| 352 |
+
|
| 353 |
+
msg = gr.Textbox(
|
| 354 |
+
placeholder="Ask about the codebase, request features, or paste code for review...",
|
| 355 |
+
label="Message",
|
| 356 |
+
lines=3
|
| 357 |
+
)
|
| 358 |
+
|
| 359 |
+
with gr.Row():
|
| 360 |
+
submit = gr.Button("Send", variant="primary")
|
| 361 |
+
clear = gr.Button("Clear")
|
| 362 |
+
|
| 363 |
+
with gr.Column(scale=1):
|
| 364 |
+
gr.Markdown("### 📚 Context Info")
|
| 365 |
+
|
| 366 |
+
def get_stats():
|
| 367 |
+
ctx = initialize_context()
|
| 368 |
+
return f"""
|
| 369 |
+
**Repository:** `{ctx.repo_path}`
|
| 370 |
+
|
| 371 |
+
**Files Indexed:** {ctx.collection.count() if hasattr(ctx, 'collection') else 'Initializing...'}
|
| 372 |
+
|
| 373 |
+
**Model:** Qwen2.5-Coder-32B-Instruct
|
| 374 |
+
|
| 375 |
+
**Context Mode:** Recursive Retrieval
|
| 376 |
+
|
| 377 |
+
*No context window limits - I retrieve what I need on-demand!*
|
| 378 |
+
"""
|
| 379 |
+
|
| 380 |
+
stats = gr.Markdown(get_stats())
|
| 381 |
+
refresh_stats = gr.Button("🔄 Refresh Stats")
|
| 382 |
+
|
| 383 |
+
gr.Markdown("### 💡 Example Queries")
|
| 384 |
+
gr.Markdown("""
|
| 385 |
+
- "How does Genesis handle surprise detection?"
|
| 386 |
+
- "Show me the Observatory API implementation"
|
| 387 |
+
- "Add email notifications to Cricket"
|
| 388 |
+
- "Review this code for architectural consistency"
|
| 389 |
+
- "What Testament decisions relate to vector storage?"
|
| 390 |
+
""")
|
| 391 |
+
|
| 392 |
+
gr.Markdown("### 🛠️ Available Tools")
|
| 393 |
+
gr.Markdown("""
|
| 394 |
+
- `search_code()` - Semantic search
|
| 395 |
+
- `read_file()` - Read specific files
|
| 396 |
+
- `search_testament()` - Query decisions
|
| 397 |
+
- `list_files()` - Browse structure
|
| 398 |
+
""")
|
| 399 |
+
|
| 400 |
+
# Event handlers
|
| 401 |
+
submit.click(chat, [msg, chatbot], chatbot)
|
| 402 |
+
msg.submit(chat, [msg, chatbot], chatbot)
|
| 403 |
+
clear.click(lambda: None, None, chatbot, queue=False)
|
| 404 |
+
refresh_stats.click(get_stats, None, stats)
|
| 405 |
+
|
| 406 |
+
# Launch when run directly
|
| 407 |
+
if __name__ == "__main__":
|
| 408 |
+
print("🦞 Initializing Clawdbot...")
|
| 409 |
+
initialize_context()
|
| 410 |
+
print("✅ Context manager ready")
|
| 411 |
+
print("🚀 Launching Gradio interface...")
|
| 412 |
+
demo.launch(
|
| 413 |
+
server_name="0.0.0.0",
|
| 414 |
+
server_port=7860,
|
| 415 |
+
show_error=True
|
| 416 |
+
)
|
recursive_context.py
ADDED
@@ -0,0 +1,326 @@
"""
Recursive Context Manager for Clawdbot

CHANGELOG [2025-01-28 - Josh]
Implements MIT's Recursive Language Model technique for unlimited context.

REFERENCE: https://www.youtube.com/watch?v=huszaaJPjU8
"MIT basically solved unlimited context windows"

APPROACH:
Instead of cramming everything into context (hits limits) or summarizing
(lossy compression), we:

1. Store the entire codebase in a searchable environment
2. Give the model TOOLS to query what it needs
3. The model recursively retrieves the relevant pieces
4. No summarization loss - full-fidelity access

This is like RAG, but IN-ENVIRONMENT, with the model actively deciding
what context it needs rather than us guessing upfront.

EXAMPLE FLOW:
User: "How does Genesis handle surprise?"
Model: search_code("Genesis surprise detection")
    → Finds: genesis/substrate.py, genesis/attention.py
Model: read_file("genesis/substrate.py", lines 145-167)
    → Gets actual implementation
Model: search_testament("surprise detection rationale")
    → Gets design decision
Model: Synthesizes answer from retrieved pieces

NO CONTEXT WINDOW LIMIT - just selective retrieval.
"""

from pathlib import Path
from typing import List, Dict, Optional, Tuple
import hashlib

import chromadb
from chromadb.config import Settings


class RecursiveContextManager:
    """
    Manages unlimited context via recursive retrieval.

    The model has TOOLS to search and read the codebase selectively,
    rather than loading everything upfront.
    """

    def __init__(self, repo_path: str):
        """
        Initialize the context manager for a repository.

        Args:
            repo_path: Path to the code repository
        """
        # Resolve once so path-containment checks below compare like with like
        self.repo_path = Path(repo_path).resolve()

        # Initialize ChromaDB for semantic search.
        # Persistent storage means we don't re-index on every restart.
        self.chroma_client = chromadb.PersistentClient(
            path="/workspace/chroma_db",
            settings=Settings(
                anonymized_telemetry=False,
                allow_reset=True
            )
        )

        # Create or get the collection
        collection_name = self._get_collection_name()
        try:
            self.collection = self.chroma_client.get_collection(collection_name)
            print(f"📚 Loaded existing index: {self.collection.count()} files")
        except Exception:
            self.collection = self.chroma_client.create_collection(
                name=collection_name,
                metadata={"description": "E-T Systems codebase"}
            )
            print(f"🆕 Created new collection: {collection_name}")
            self._index_codebase()

    def _get_collection_name(self) -> str:
        """Generate a unique collection name based on the repo path."""
        path_hash = hashlib.md5(str(self.repo_path).encode()).hexdigest()[:8]
        return f"codebase_{path_hash}"

    def _index_codebase(self):
        """
        Index all code files for semantic search.

        This creates the "environment" that the model can search through.
        We index with metadata so search results include file paths.
        """
        print(f"📂 Indexing codebase at {self.repo_path}...")

        # File types to index
        code_extensions = {'.py', '.js', '.ts', '.tsx', '.jsx', '.md', '.txt', '.json', '.yaml', '.yml'}

        # Skip these directories
        skip_dirs = {'node_modules', '.git', '__pycache__', 'venv', 'env', '.venv', 'dist', 'build'}

        documents = []
        metadatas = []
        ids = []

        for file_path in self.repo_path.rglob('*'):
            # Skip directories and non-code files
            if file_path.is_dir():
                continue
            if any(skip in file_path.parts for skip in skip_dirs):
                continue
            if file_path.suffix not in code_extensions:
                continue

            try:
                content = file_path.read_text(encoding='utf-8', errors='ignore')

                # Don't index empty files or massive files
                if not content.strip() or len(content) > 100000:
                    continue

                relative_path = str(file_path.relative_to(self.repo_path))

                documents.append(content)
                metadatas.append({
                    "path": relative_path,
                    "type": file_path.suffix[1:],  # Drop the leading dot
                    "size": len(content)
                })
                ids.append(relative_path)

            except Exception as e:
                print(f"⚠️ Skipping {file_path.name}: {e}")
                continue

        if documents:
            # Add to the collection in batches
            batch_size = 100
            for i in range(0, len(documents), batch_size):
                self.collection.add(
                    documents=documents[i:i+batch_size],
                    metadatas=metadatas[i:i+batch_size],
                    ids=ids[i:i+batch_size]
                )

            print(f"✅ Indexed {len(documents)} files")
        else:
            print("⚠️ No files found to index")

    def search_code(self, query: str, n_results: int = 5) -> List[Dict]:
        """
        Search the codebase semantically.

        This is a TOOL available to the model for recursive retrieval.
        The model can search for concepts without knowing exact file names.

        Args:
            query: What to search for (e.g. "surprise detection", "vector embedding")
            n_results: How many results to return

        Returns:
            List of dicts with {file, snippet, relevance, type}
        """
        if self.collection.count() == 0:
            return [{"error": "No files indexed yet"}]

        results = self.collection.query(
            query_texts=[query],
            n_results=min(n_results, self.collection.count())
        )

        # Format the results for the model
        formatted = []
        for i in range(len(results['documents'][0])):
            # Truncate each document to its first 500 chars for search results;
            # the model can read_file() if it wants the full content.
            snippet = results['documents'][0][i][:500]
            if len(results['documents'][0][i]) > 500:
                snippet += "... [truncated, use read_file to see more]"

            formatted.append({
                "file": results['metadatas'][0][i]['path'],
                "snippet": snippet,
                "relevance": round(1 - results['distances'][0][i], 3),
                "type": results['metadatas'][0][i]['type']
            })

        return formatted

    def read_file(self, path: str, lines: Optional[Tuple[int, int]] = None) -> str:
        """
        Read a specific file or line range.

        This is a TOOL available to the model.
        After searching, the model can read full files as needed.

        Args:
            path: Relative path to the file
            lines: Optional (start, end) line numbers (1-indexed, inclusive)

        Returns:
            File content or the specified lines
        """
        # Resolve before the containment check so "../" tricks can't escape the repo
        full_path = (self.repo_path / path).resolve()

        if not full_path.is_relative_to(self.repo_path):
            return "Error: Path outside repository"

        if not full_path.exists():
            return f"Error: File not found: {path}"

        try:
            content = full_path.read_text(encoding='utf-8', errors='ignore')

            if lines:
                start, end = lines
                content_lines = content.split('\n')
                # Adjust for 1-indexing
                return '\n'.join(content_lines[start-1:end])

            return content

        except Exception as e:
            return f"Error reading file: {str(e)}"

    def search_testament(self, query: str) -> str:
        """
        Search architectural decisions in the Testament.

        This is a TOOL available to the model.
        It helps the model understand design rationale.

        Args:
            query: What decision to look for

        Returns:
            Relevant Testament sections
        """
        testament_path = self.repo_path / "TESTAMENT.md"

        if not testament_path.exists():
            return "Testament not found. No architectural decisions recorded yet."

        try:
            content = testament_path.read_text(encoding='utf-8')

            # Split into sections (marked by ## headers)
            sections = content.split('\n## ')

            # Simple relevance: sections that contain the query terms
            query_lower = query.lower()
            relevant = []

            for section in sections:
                if query_lower in section.lower():
                    # Re-attach the header marker stripped by the split
                    if not section.startswith('#'):
                        section = '## ' + section
                    relevant.append(section)

            if relevant:
                return '\n\n'.join(relevant)
            else:
                return f"No Testament entries found matching '{query}'"

        except Exception as e:
            return f"Error searching Testament: {str(e)}"

    def list_files(self, directory: str = ".") -> List[str]:
        """
        List files in a directory.

        This is a TOOL available to the model.
        It helps the model explore the repository structure.

        Args:
            directory: Directory to list (relative path)

        Returns:
            List of file/directory names
        """
        # Resolve before the containment check so "../" can't escape the repo
        dir_path = (self.repo_path / directory).resolve()

        if not dir_path.is_relative_to(self.repo_path):
            return ["Error: Path outside repository"]

        if not dir_path.exists():
            return [f"Error: Directory not found: {directory}"]

        try:
            items = []
            for item in sorted(dir_path.iterdir()):
                # Skip hidden and system directories
                if item.name.startswith('.'):
                    continue
                if item.name in {'node_modules', '__pycache__', 'venv'}:
                    continue

                # Mark directories with a trailing /
                if item.is_dir():
                    items.append(f"{item.name}/")
                else:
                    items.append(item.name)

            return items

        except Exception as e:
            return [f"Error listing directory: {str(e)}"]

    def get_stats(self) -> Dict:
        """
        Get statistics about the indexed codebase.

        Returns:
            Dict with file counts, paths, etc.
        """
        return {
            "total_files": self.collection.count(),
            "repo_path": str(self.repo_path),
            "collection_name": self.collection.name
        }
requirements.txt
ADDED
@@ -0,0 +1,20 @@
# Python Dependencies for Clawdbot Dev Assistant
#
# CHANGELOG [2025-01-28 - Josh]
# Core dependencies for recursive context + HF inference

# Gradio for the web interface
gradio>=4.0.0

# HuggingFace for model inference
huggingface-hub>=0.20.0

# ChromaDB for vector search (recursive context)
chromadb>=0.4.0

# Additional utilities
requests>=2.31.0
gitpython>=3.1.0

# Performance
numpy>=1.24.0