Commit baee379 by Henri Bonamy
Parents: df460d9, 32f776a

main merge

.gitignore CHANGED
@@ -15,4 +15,5 @@ wheels/
 *.csv
 /logs
 hf-agent-leaderboard/
-.cursor/
+.cursor/
+session_logs/
README.md CHANGED
@@ -11,9 +11,11 @@ An MLE agent CLI with MCP (Model Context Protocol) integration and built-in tool
 # Clone the repository
 git clone git@github.com:huggingface/hf_agent.git
 cd hf-agent
+```
 
-# Install dependencies (using uv)
-uv sync
+#### Install recommended dependencies
+```bash
+uv sync --extra agent  # or uv sync --extra all
 ```
 
 ### Interactive CLI
@@ -21,11 +23,19 @@ uv sync
 ```bash
 uv run python -m agent.main
 ```
-
 This starts an interactive chat session with the agent. Type your messages and the agent will respond, using tools as needed.
 
 The agent will automatically discover and register all tools from configured MCP servers.
 
+
+### Env Setup
+```bash
+ANTHROPIC_API_KEY=<one-key-to-rule-them-all>
+HF_TOKEN=<hf-token-to-access-the-hub>
+GITHUB_TOKEN=<gh-pat-key-for-not-reinventing-the-wheel>
+HF_NAMESPACE=<hf-namespace-to-use>
+```
+
 ## Architecture
 
 ### Component Overview
@@ -58,16 +68,20 @@ The agent will automatically discover and register all tools from configured MCP
 │ │ │ │ │ ContextManager             │ │ │ │ │ │
 │ │ │ │ │ • Message history          │ │ │ │ │ │
 │ │ │ │ │   (litellm.Message[])      │ │ │ │ │ │
+│ │ │ │ │ • Auto-compaction (180k)   │ │ │ │ │ │
 │ │ │ │ └────────────────────────────┘ │ │ │ │ │
 │ │ │ │              │                 │ │ │ │ │
 │ │ │ │ ┌────────────────────────────┐ │ │ │ │ │
 │ │ │ │ │ ToolRouter                 │ │ │ │ │ │
-│ │ │ │ │ ├─ bash                    │ │ │ │ │ │
-│ │ │ │ │ ├─ read_file               │ │ │ │ │ │
-│ │ │ │ │ ├─ write_file              │ │ │ │ │ │
-│ │ │ │ │ └─ McpConnectionManager    │ │ │ │ │ │
-│ │ │ │ │    ├─ mcp__server1__*      │ │ │ │ │ │
-│ │ │ │ │    └─ mcp__server2__*      │ │ │ │ │ │
+│ │ │ │ │ ├─ explore_hf_docs         │ │ │ │ │ │
+│ │ │ │ │ ├─ fetch_hf_docs           │ │ │ │ │ │
+│ │ │ │ │ ├─ search_hf_api_endpoints │ │ │ │ │ │
+│ │ │ │ │ ├─ plan_tool               │ │ │ │ │ │
+│ │ │ │ │ ├─ hf_jobs*                │ │ │ │ │ │
+│ │ │ │ │ ├─ hf_private_repos*       │ │ │ │ │ │
+│ │ │ │ │ ├─ github_* (3 tools)      │ │ │ │ │ │
+│ │ │ │ │ └─ MCP tools (e.g.,        │ │ │ │ │ │
+│ │ │ │ │    model_search, etc.)     │ │ │ │ │ │
 │ │ │ │ └────────────────────────────┘ │ │ │ │ │
 │ │ │ └──────────────────────────────────┘ │ │ │ │
 │ │ │                  │                    │ │ │ │
@@ -121,16 +135,20 @@ User Message
 agent/
 ├── config.py              # Configuration models
 ├── main.py                # Interactive CLI entry point
+├── prompts/
+│   └── system_prompt.yaml # Agent behavior and personality
 ├── context_manager/
-│   └── manager.py         # Message history management
+│   └── manager.py         # Message history & auto-compaction
 └── core/
     ├── agent_loop.py      # Main agent loop and handlers
     ├── session.py         # Session management
     ├── mcp_client.py      # MCP SDK integration
     └── tools.py           # ToolRouter and built-in tools
 
-test_integration.py        # Basic integration tests
-test_tools.py              # Tool execution tests
+configs/
+└── main_agent_config.json # Model and MCP server configuration
+
+tests/                     # Integration and unit tests
 eval/                      # Evaluation suite (see eval/README.md)
 ```
@@ -143,6 +161,7 @@ The agent emits the following events via `event_queue`:
 - `assistant_message` - LLM response text
 - `tool_call` - Tool being called with arguments
 - `tool_output` - Tool execution result
+- `approval_request` - Requesting user approval for sensitive operations
 - `turn_complete` - Agent finished processing
 - `error` - Error occurred during processing
 - `interrupted` - Agent was interrupted
@@ -177,18 +196,21 @@ def create_builtin_tools() -> list[ToolSpec]:
 
 ### Adding MCP Servers
 
-Add to your config:
-
-```python
-config = Config(
-    model_name="anthropic/claude-sonnet-4-5-20250929",
-    mcp_servers=[
-        MCPServerConfig(
-            name="your_server",
-            command="command",
-            args=["arg1", "arg2"],
-            env={"KEY": "value"}
-        )
-    ]
-)
+Edit `configs/main_agent_config.json`:
+
+```json
+{
+  "model_name": "anthropic/claude-sonnet-4-5-20250929",
+  "mcpServers": {
+    "your-server-name": {
+      "transport": "http",
+      "url": "https://example.com/mcp",
+      "headers": {
+        "Authorization": "Bearer ${YOUR_TOKEN}"
+      }
+    }
+  }
+}
 ```
+
+Note: Environment variables like `${YOUR_TOKEN}` are auto-substituted from `.env`.
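The `${VAR}` substitution that the README note describes can be sketched in a few lines. This is a hypothetical stand-in, not the actual `substitute_env_vars` from `agent/config.py` (which operates on nested config objects); it only illustrates the placeholder-replacement behavior:

```python
import os
import re

def substitute_env_vars(text: str) -> str:
    # Replace ${VAR} with the value from the environment; leave unknown
    # placeholders untouched so missing secrets are easy to spot.
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: os.environ.get(m.group(1), m.group(0)),
        text,
    )

os.environ["YOUR_TOKEN"] = "hf_xxx"
print(substitute_env_vars("Bearer ${YOUR_TOKEN}"))  # Bearer hf_xxx
print(substitute_env_vars("${UNSET_VAR}"))          # ${UNSET_VAR}
```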
agent/config.py CHANGED
@@ -19,6 +19,9 @@ class Config(BaseModel):
 
     model_name: str
     mcpServers: dict[str, MCPServerConfig] = {}
+    save_sessions: bool = True
+    session_dataset_repo: str = "smolagents/hf-agent-sessions"
+    auto_save_interval: int = 3  # Save every N user turns (0 = disabled)
 
 
 def substitute_env_vars(obj: Any) -> Any:
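To illustrate how the three new fields behave, here is a minimal dataclass stand-in (the real `Config` is a Pydantic `BaseModel`; only the defaults are mirrored here):

```python
from dataclasses import dataclass

@dataclass
class SessionSaveConfig:
    # Hypothetical stand-in mirroring the fields added to Config in this commit
    save_sessions: bool = True
    session_dataset_repo: str = "smolagents/hf-agent-sessions"
    auto_save_interval: int = 3  # save every N user turns (0 = disabled)

cfg = SessionSaveConfig()
print(cfg.save_sessions, cfg.auto_save_interval)  # True 3

# Auto-save is effectively disabled when the interval is zero or negative
disabled = SessionSaveConfig(auto_save_interval=0)
print(disabled.auto_save_interval > 0)  # False
```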
agent/context_manager/manager.py CHANGED
@@ -2,6 +2,8 @@
 Context management for conversation history
 """
 
+import zoneinfo
+from datetime import datetime
 from pathlib import Path
 from typing import Any
 
@@ -42,10 +44,20 @@ class ContextManager:
             prompt_data = yaml.safe_load(f)
             template_str = prompt_data.get("system_prompt", "")
 
+        # Get current date and time
+        tz = zoneinfo.ZoneInfo("Europe/Paris")
+        now = datetime.now(tz)
+        current_date = now.strftime("%d-%m-%Y")
+        current_time = now.strftime("%H:%M:%S.%f")[:-3]
+        current_timezone = f"{now.strftime('%Z')} (UTC{now.strftime('%z')[:3]}:{now.strftime('%z')[3:]})"
+
         template = Template(template_str)
         return template.render(
             tools=tool_specs,
             num_tools=len(tool_specs),
+            current_date=current_date,
+            current_time=current_time,
+            current_timezone=current_timezone,
        )
 
     def add_message(self, message: Message, token_count: int = None) -> None:
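The date/time strings injected into the system prompt look like the following; a standalone check of the same `strftime` formatting (actual values depend on when it runs):

```python
import zoneinfo
from datetime import datetime

tz = zoneinfo.ZoneInfo("Europe/Paris")
now = datetime.now(tz)

current_date = now.strftime("%d-%m-%Y")          # e.g. "25-12-2025"
current_time = now.strftime("%H:%M:%S.%f")[:-3]  # trim microseconds to milliseconds
offset = now.strftime("%z")                      # "+0100" or "+0200"
current_timezone = f"{now.strftime('%Z')} (UTC{offset[:3]}:{offset[3:]})"

print(current_timezone)  # "CET (UTC+01:00)" in winter, "CEST (UTC+02:00)" in summer
```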
agent/core/agent_loop.py CHANGED
@@ -25,9 +25,15 @@ def _validate_tool_args(tool_args: dict) -> tuple[bool, str | None]:
     args = tool_args.get("args", {})
     # Sometimes LLM passes args as string instead of dict
     if isinstance(args, str):
-        return False, f"Tool call error: 'args' must be a JSON object, not a string. You passed: {repr(args)}"
+        return (
+            False,
+            f"Tool call error: 'args' must be a JSON object, not a string. You passed: {repr(args)}",
+        )
     if not isinstance(args, dict) and args is not None:
-        return False, f"Tool call error: 'args' must be a JSON object. You passed type: {type(args).__name__}"
+        return (
+            False,
+            f"Tool call error: 'args' must be a JSON object. You passed type: {type(args).__name__}",
+        )
     return True, None
 
 
@@ -38,8 +44,6 @@ def _needs_approval(tool_name: str, tool_args: dict) -> bool:
     if not args_valid:
         return False
 
-    args = tool_args.get("args", {})
-
     if tool_name == "hf_jobs":
         # Check if it's a run or uv operation
         operation = tool_args.get("operation", "")
@@ -251,6 +255,11 @@ class Handlers:
                 data={"history_size": len(session.context_manager.items)},
             )
         )
+
+        # Increment turn counter and check for auto-save
+        session.increment_turn()
+        await session.auto_save_if_needed()
+
         return final_response
 
     @staticmethod
@@ -410,6 +419,14 @@ class Handlers:
     @staticmethod
     async def shutdown(session: Session) -> bool:
         """Handle shutdown (like shutdown in codex.rs:1329)"""
+        # Save session trajectory if enabled (fire-and-forget, returns immediately)
+        if session.config.save_sessions:
+            print("💾 Saving session...")
+            repo_id = session.config.session_dataset_repo
+            local_path = session.save_and_upload_detached(repo_id)
+            if local_path:
+                print("✅ Session saved locally, upload in progress")
+
         session.is_running = False
         await session.send_event(Event(event_type="shutdown"))
         return True
@@ -470,26 +487,47 @@ async def submission_loop(
     session = Session(event_queue, config=config, tool_router=tool_router)
     print("Agent loop started")
 
-    # Main processing loop
-    async with tool_router:
-        # Emit ready event after initialization
-        await session.send_event(
-            Event(event_type="ready", data={"message": "Agent initialized"})
+    # Retry any failed uploads from previous sessions (fire-and-forget)
+    if config and config.save_sessions:
+        Session.retry_failed_uploads_detached(
+            directory="session_logs", repo_id=config.session_dataset_repo
         )
 
-        while session.is_running:
-            submission = await submission_queue.get()
+    try:
+        # Main processing loop
+        async with tool_router:
+            # Emit ready event after initialization
+            await session.send_event(
+                Event(event_type="ready", data={"message": "Agent initialized"})
+            )
 
-            try:
-                should_continue = await process_submission(session, submission)
-                if not should_continue:
-                    break
-            except asyncio.CancelledError:
-                break
-            except Exception as e:
-                print(f"Error in agent loop: {e}")
-                await session.send_event(
-                    Event(event_type="error", data={"error": str(e)})
-                )
+            while session.is_running:
+                submission = await submission_queue.get()
+
+                try:
+                    should_continue = await process_submission(session, submission)
+                    if not should_continue:
+                        break
+                except asyncio.CancelledError:
+                    print("\n⚠️ Agent loop cancelled")
+                    break
+                except Exception as e:
+                    print(f"❌ Error in agent loop: {e}")
+                    await session.send_event(
+                        Event(event_type="error", data={"error": str(e)})
+                    )
+
+        print("🛑 Agent loop exited")
 
-    print("🛑 Agent loop exited")
+    finally:
+        # Emergency save if session saving is enabled and shutdown wasn't called properly
+        if session.config.save_sessions and session.is_running:
+            print("\n💾 Emergency save: preserving session before exit...")
+            try:
+                local_path = session.save_and_upload_detached(
+                    session.config.session_dataset_repo
+                )
+                if local_path:
+                    print("✅ Emergency save successful, upload in progress")
+            except Exception as e:
+                print(f"❌ Emergency save failed: {e}")
agent/core/session.py CHANGED
@@ -1,7 +1,12 @@
 import asyncio
+import json
+import subprocess
+import sys
 import uuid
 from dataclasses import dataclass
+from datetime import datetime
 from enum import Enum
+from pathlib import Path
 from typing import Any, Optional
 
 from litellm import get_max_tokens
@@ -55,11 +60,176 @@ class Session:
         self.current_task: asyncio.Task | None = None
         self.pending_approval: Optional[dict[str, Any]] = None
 
+        # Session trajectory logging
+        self.logged_events: list[dict] = []
+        self.session_start_time = datetime.now().isoformat()
+        self.turn_count: int = 0
+        self.last_auto_save_turn: int = 0
+
     async def send_event(self, event: Event) -> None:
-        """Send event back to client"""
+        """Send event back to client and log to trajectory"""
         await self.event_queue.put(event)
 
+        # Log event to trajectory
+        self.logged_events.append(
+            {
+                "timestamp": datetime.now().isoformat(),
+                "event_type": event.event_type,
+                "data": event.data,
+            }
+        )
+
     def interrupt(self) -> None:
         """Interrupt current running task"""
         if self.current_task and not self.current_task.done():
             self.current_task.cancel()
+
+    def increment_turn(self) -> None:
+        """Increment turn counter (called after each user interaction)"""
+        self.turn_count += 1
+
+    async def auto_save_if_needed(self) -> None:
+        """Check if auto-save should trigger and save if so (completely non-blocking)"""
+        if not self.config.save_sessions:
+            return
+
+        interval = self.config.auto_save_interval
+        if interval <= 0:
+            return
+
+        turns_since_last_save = self.turn_count - self.last_auto_save_turn
+        if turns_since_last_save >= interval:
+            print(f"\n💾 Auto-saving session (turn {self.turn_count})...")
+            # Fire-and-forget save - returns immediately
+            self.save_and_upload_detached(self.config.session_dataset_repo)
+            self.last_auto_save_turn = self.turn_count
+
+    def get_trajectory(self) -> dict:
+        """Serialize complete session trajectory for logging"""
+        return {
+            "session_id": self.session_id,
+            "session_start_time": self.session_start_time,
+            "session_end_time": datetime.now().isoformat(),
+            "model_name": self.config.model_name,
+            "messages": [msg.model_dump() for msg in self.context_manager.items],
+            "events": self.logged_events,
+        }
+
+    def save_trajectory_local(
+        self,
+        directory: str = "session_logs",
+        upload_status: str = "pending",
+        dataset_url: Optional[str] = None,
+    ) -> Optional[str]:
+        """
+        Save trajectory to local JSON file as backup with upload status
+
+        Args:
+            directory: Directory to save logs (default: "session_logs")
+            upload_status: Status of upload attempt ("pending", "success", "failed")
+            dataset_url: URL of dataset if upload succeeded
+
+        Returns:
+            Path to saved file if successful, None otherwise
+        """
+        try:
+            log_dir = Path(directory)
+            log_dir.mkdir(parents=True, exist_ok=True)
+
+            trajectory = self.get_trajectory()
+
+            # Add upload metadata
+            trajectory["upload_status"] = upload_status
+            trajectory["upload_url"] = dataset_url
+            trajectory["last_save_time"] = datetime.now().isoformat()
+
+            filename = f"session_{self.session_id}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
+            filepath = log_dir / filename
+
+            with open(filepath, "w") as f:
+                json.dump(trajectory, f, indent=2)
+
+            return str(filepath)
+        except Exception as e:
+            print(f"Failed to save session locally: {e}")
+            return None
+
+    def update_local_save_status(
+        self, filepath: str, upload_status: str, dataset_url: Optional[str] = None
+    ) -> bool:
+        """Update the upload status of an existing local save file"""
+        try:
+            with open(filepath, "r") as f:
+                data = json.load(f)
+
+            data["upload_status"] = upload_status
+            data["upload_url"] = dataset_url
+            data["last_save_time"] = datetime.now().isoformat()
+
+            with open(filepath, "w") as f:
+                json.dump(data, f, indent=2)
+
+            return True
+        except Exception as e:
+            print(f"Failed to update local save status: {e}")
+            return False
+
+    def save_and_upload_detached(self, repo_id: str) -> Optional[str]:
+        """
+        Save session locally and spawn detached subprocess for upload (fire-and-forget)
+
+        Args:
+            repo_id: HuggingFace dataset repo ID
+
+        Returns:
+            Path to local save file
+        """
+        # Save locally first (fast, synchronous)
+        local_path = self.save_trajectory_local(upload_status="pending")
+        if not local_path:
+            return None
+
+        # Spawn detached subprocess for upload (fire-and-forget)
+        try:
+            uploader_script = Path(__file__).parent / "session_uploader.py"
+
+            # Use Popen with detached process
+            subprocess.Popen(
+                [sys.executable, str(uploader_script), "upload", local_path, repo_id],
+                stdin=subprocess.DEVNULL,
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+                start_new_session=True,  # Detach from parent
+            )
+        except Exception as e:
+            print(f"⚠️ Failed to spawn upload subprocess: {e}")
+
+        return local_path
+
+    @staticmethod
+    def retry_failed_uploads_detached(
+        directory: str = "session_logs", repo_id: Optional[str] = None
+    ) -> None:
+        """
+        Spawn detached subprocess to retry failed/pending uploads (fire-and-forget)
+
+        Args:
+            directory: Directory containing session logs
+            repo_id: Target dataset repo ID
+        """
+        if not repo_id:
+            return
+
+        try:
+            uploader_script = Path(__file__).parent / "session_uploader.py"
+
+            # Spawn detached subprocess for retry
+            subprocess.Popen(
+                [sys.executable, str(uploader_script), "retry", directory, repo_id],
+                stdin=subprocess.DEVNULL,
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+                start_new_session=True,  # Detach from parent
+            )
        except Exception as e:
+            print(f"⚠️ Failed to spawn retry subprocess: {e}")
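The fire-and-forget pattern used by `save_and_upload_detached` boils down to a `Popen` with closed pipes and `start_new_session=True`. A minimal POSIX sketch (the child here just writes a marker file so the effect is observable; the real uploader child is `session_uploader.py`):

```python
import os
import subprocess
import sys
import tempfile
import time

def spawn_detached(argv: list[str]) -> None:
    # Fire-and-forget: no pipes to the child, and a new session so the
    # child keeps running even if the parent exits immediately.
    subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )

marker = os.path.join(tempfile.mkdtemp(), "done.txt")
spawn_detached([sys.executable, "-c", f"open({marker!r}, 'w').write('ok')"])

# The parent never waits on the child; poll briefly just to observe the result
for _ in range(100):
    if os.path.exists(marker):
        break
    time.sleep(0.05)
print(open(marker).read())  # ok
```

Note that `start_new_session` is POSIX-only; detaching on Windows would need `creationflags` instead.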
agent/core/session_uploader.py ADDED (new file, 194 lines)

#!/usr/bin/env python3
"""
Standalone script for uploading session trajectories to HuggingFace.
This runs as a separate process to avoid blocking the main agent.
Uses individual file uploads to avoid race conditions.
"""

import json
import os
import sys
from datetime import datetime
from pathlib import Path


def upload_session_as_file(
    session_file: str, repo_id: str, max_retries: int = 3
) -> bool:
    """
    Upload a single session as an individual JSONL file (no race conditions)

    Args:
        session_file: Path to local session JSON file
        repo_id: HuggingFace dataset repo ID
        max_retries: Number of retry attempts

    Returns:
        True if successful, False otherwise
    """
    try:
        from huggingface_hub import HfApi
    except ImportError:
        print("Error: huggingface_hub library not available", file=sys.stderr)
        return False

    try:
        # Load session data
        with open(session_file, "r") as f:
            data = json.load(f)

        # Check if already uploaded
        upload_status = data.get("upload_status")
        if upload_status == "success":
            return True

        hf_token = os.getenv("HF_TOKEN")
        if not hf_token:
            # Update status to failed
            data["upload_status"] = "failed"
            with open(session_file, "w") as f:
                json.dump(data, f, indent=2)
            return False

        # Prepare JSONL content (single line)
        # Store messages and events as JSON strings to avoid schema conflicts
        session_row = {
            "session_id": data["session_id"],
            "session_start_time": data["session_start_time"],
            "session_end_time": data["session_end_time"],
            "model_name": data["model_name"],
            "messages": json.dumps(data["messages"]),
            "events": json.dumps(data["events"]),
        }

        # Create temporary JSONL file
        import tempfile

        with tempfile.NamedTemporaryFile(
            mode="w", suffix=".jsonl", delete=False
        ) as tmp:
            json.dump(session_row, tmp)  # Single line JSON
            tmp_path = tmp.name

        try:
            # Generate unique path in repo: sessions/YYYY-MM-DD/session_id.jsonl
            session_id = data["session_id"]
            date_str = datetime.fromisoformat(data["session_start_time"]).strftime(
                "%Y-%m-%d"
            )
            repo_path = f"sessions/{date_str}/{session_id}.jsonl"

            # Upload with retries
            api = HfApi()
            for attempt in range(max_retries):
                try:
                    # Try to create repo if it doesn't exist (idempotent)
                    try:
                        api.create_repo(
                            repo_id=repo_id,
                            repo_type="dataset",
                            private=True,
                            token=hf_token,
                            exist_ok=True,  # Don't fail if already exists
                        )
                    except Exception:
                        # Repo might already exist, continue
                        pass

                    # Upload the session file
                    api.upload_file(
                        path_or_fileobj=tmp_path,
                        path_in_repo=repo_path,
                        repo_id=repo_id,
                        repo_type="dataset",
                        token=hf_token,
                        commit_message=f"Add session {session_id}",
                    )

                    # Update local status to success
                    data["upload_status"] = "success"
                    data["upload_url"] = f"https://huggingface.co/datasets/{repo_id}"
                    with open(session_file, "w") as f:
                        json.dump(data, f, indent=2)

                    return True

                except Exception:
                    if attempt < max_retries - 1:
                        import time

                        wait_time = 2**attempt
                        time.sleep(wait_time)
                    else:
                        # Final attempt failed
                        data["upload_status"] = "failed"
                        with open(session_file, "w") as f:
                            json.dump(data, f, indent=2)
                        return False

        finally:
            # Clean up temp file
            try:
                os.unlink(tmp_path)
            except Exception:
                pass

    except Exception as e:
        print(f"Error uploading session: {e}", file=sys.stderr)
        return False


def retry_failed_uploads(directory: str, repo_id: str):
    """Retry all failed/pending uploads in a directory"""
    log_dir = Path(directory)
    if not log_dir.exists():
        return

    session_files = list(log_dir.glob("session_*.json"))

    for filepath in session_files:
        try:
            with open(filepath, "r") as f:
                data = json.load(f)

            upload_status = data.get("upload_status", "unknown")

            # Only retry pending or failed uploads
            if upload_status in ["pending", "failed"]:
                upload_session_as_file(str(filepath), repo_id)

        except Exception:
            pass


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: session_uploader.py <command> <args...>")
        sys.exit(1)

    command = sys.argv[1]

    if command == "upload":
        # python session_uploader.py upload <session_file> <repo_id>
        if len(sys.argv) < 4:
            print("Usage: session_uploader.py upload <session_file> <repo_id>")
            sys.exit(1)
        session_file = sys.argv[2]
        repo_id = sys.argv[3]
        success = upload_session_as_file(session_file, repo_id)
        sys.exit(0 if success else 1)

    elif command == "retry":
        # python session_uploader.py retry <directory> <repo_id>
        if len(sys.argv) < 4:
            print("Usage: session_uploader.py retry <directory> <repo_id>")
            sys.exit(1)
        directory = sys.argv[2]
        repo_id = sys.argv[3]
        retry_failed_uploads(directory, repo_id)
        sys.exit(0)

    else:
        print(f"Unknown command: {command}")
        sys.exit(1)
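The retry loop in `upload_session_as_file` waits `2**attempt` seconds between attempts. The shape of that loop, extracted into a small helper with a configurable base delay (shortened here so the demo runs quickly; the uploader effectively uses a one-second base):

```python
import time

def with_retries(fn, max_retries: int = 3, base_delay: float = 1.0):
    # Retry fn() with exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt < max_retries - 1:
                time.sleep(base_delay * 2**attempt)
            else:
                raise

attempts = []

def flaky_upload():
    # Fails twice, then succeeds - mimics a transient network error
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient network error")
    return "uploaded"

print(with_retries(flaky_upload, base_delay=0.01))  # uploaded
print(len(attempts))                                # 3
```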
agent/core/tools.py CHANGED
@@ -19,13 +19,27 @@ from agent.tools.docs_tools import (
19
  explore_hf_docs_handler,
20
  hf_docs_fetch_handler,
21
  )
 
 
 
 
 
 
 
 
 
 
 
 
22
  from agent.tools.jobs_tool import HF_JOBS_TOOL_SPEC, hf_jobs_handler
23
  from agent.tools.plan_tool import PLAN_TOOL_SPEC, plan_tool_handler
24
  from agent.tools.private_hf_repo_tools import (
25
  PRIVATE_HF_REPO_TOOL_SPEC,
26
  private_hf_repo_handler,
27
  )
28
- from agent.tools.utils_tools import UTILS_TOOL_SPEC, utils_handler
 
 
29
 
30
  # Suppress aiohttp deprecation warning
31
  warnings.filterwarnings(
@@ -118,11 +132,13 @@ class ToolRouter:
118
 
119
  async def register_mcp_tools(self) -> None:
120
  tools = await self.mcp_client.list_tools()
 
 
121
  for tool in tools:
122
  if tool.name in NOT_ALLOWED_TOOL_NAMES:
123
- print(f"Skipping not MCP allowed tool: {tool.name}")
124
  continue
125
- print(f"MCP Tool: {tool.name}")
126
  self.register_tool(
127
  ToolSpec(
128
  name=tool.name,
@@ -131,6 +147,9 @@ class ToolRouter:
131
  handler=None,
132
  )
133
  )
 
 
 
134
 
135
  async def register_openapi_tool(self) -> None:
136
  """Register the OpenAPI search tool (requires async initialization)"""
@@ -139,8 +158,6 @@ class ToolRouter:
139
  search_openapi_handler,
140
  )
141
 
142
- print("Registering OpenAPI search tool...")
143
-
144
  # Register search_hf_api_endpoints with dynamic spec
145
  openapi_spec = await _get_api_search_tool_spec()
146
  self.register_tool(
@@ -151,7 +168,7 @@ class ToolRouter:
151
  handler=search_openapi_handler,
152
  )
153
  )
154
- print(f"Registered: {openapi_spec['name']}")
155
 
156
  def get_tool_specs_for_llm(self) -> list[dict[str, Any]]:
157
  """Get tool specifications in OpenAI format"""
@@ -175,11 +192,13 @@ class ToolRouter:
175
  await self.mcp_client.initialize()
176
  await self.register_mcp_tools()
177
  self._mcp_initialized = True
178
- print(f"MCP initialized: {self._mcp_initialized}")
179
 
180
  # Register OpenAPI tool (requires async initialization)
181
  await self.register_openapi_tool()
182
 
 
 
 
183
  return self
184
 
185
  async def __aexit__(self, exc_type, exc, tb) -> None:
@@ -223,11 +242,8 @@ class ToolRouter:
223
 
224
  def create_builtin_tools() -> list[ToolSpec]:
225
  """Create built-in tool specifications"""
226
- print(
227
- f"Creating built-in tools: {EXPLORE_HF_DOCS_TOOL_SPEC['name']}, {HF_DOCS_FETCH_TOOL_SPEC['name']}, {PLAN_TOOL_SPEC['name']}, {HF_JOBS_TOOL_SPEC['name']}, {PRIVATE_HF_REPO_TOOL_SPEC['name']}, {UTILS_TOOL_SPEC['name']}"
228
- )
229
  # in order of importance
230
- return [
231
  # Documentation search tools
232
  ToolSpec(
233
  name=EXPLORE_HF_DOCS_TOOL_SPEC["name"],
@@ -260,10 +276,42 @@ def create_builtin_tools() -> list[ToolSpec]:
260
  parameters=PRIVATE_HF_REPO_TOOL_SPEC["parameters"],
261
  handler=private_hf_repo_handler,
262
  ),
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
263
  ToolSpec(
264
- name=UTILS_TOOL_SPEC["name"],
265
- description=UTILS_TOOL_SPEC["description"],
266
- parameters=UTILS_TOOL_SPEC["parameters"],
267
- handler=utils_handler,
 
 
 
 
 
 
 
 
 
 
 
 
268
  ),
269
  ]
 
 
 
 
 
 
19
  explore_hf_docs_handler,
20
  hf_docs_fetch_handler,
21
  )
22
+ from agent.tools.github_find_examples import (
23
+ GITHUB_FIND_EXAMPLES_TOOL_SPEC,
24
+ github_find_examples_handler,
25
+ )
26
+ from agent.tools.github_list_repos import (
27
+ GITHUB_LIST_REPOS_TOOL_SPEC,
28
+ github_list_repos_handler,
29
+ )
30
+ from agent.tools.github_read_file import (
31
+ GITHUB_READ_FILE_TOOL_SPEC,
32
+ github_read_file_handler,
33
+ )
34
  from agent.tools.jobs_tool import HF_JOBS_TOOL_SPEC, hf_jobs_handler
35
  from agent.tools.plan_tool import PLAN_TOOL_SPEC, plan_tool_handler
36
  from agent.tools.private_hf_repo_tools import (
37
  PRIVATE_HF_REPO_TOOL_SPEC,
38
  private_hf_repo_handler,
39
  )
40
+
41
+ # NOTE: Utils tool disabled - date/time now loaded into system prompt at initialization
42
+ # from agent.tools.utils_tools import UTILS_TOOL_SPEC, utils_handler
43
 
44
  # Suppress aiohttp deprecation warning
45
  warnings.filterwarnings(
 
 
     async def register_mcp_tools(self) -> None:
         tools = await self.mcp_client.list_tools()
+        registered_names = []
+        skipped_count = 0
         for tool in tools:
             if tool.name in NOT_ALLOWED_TOOL_NAMES:
+                skipped_count += 1
                 continue
+            registered_names.append(tool.name)
             self.register_tool(
                 ToolSpec(
                     name=tool.name,

                     handler=None,
                 )
             )
+        print(
+            f"Loaded {len(registered_names)} MCP tools: {', '.join(registered_names)} ({skipped_count} disabled)"
+        )
 
     async def register_openapi_tool(self) -> None:
         """Register the OpenAPI search tool (requires async initialization)"""

             search_openapi_handler,
         )
 
         # Register search_hf_api_endpoints with dynamic spec
         openapi_spec = await _get_api_search_tool_spec()
         self.register_tool(

                 handler=search_openapi_handler,
             )
         )
+        print(f"Loaded OpenAPI search tool: {openapi_spec['name']}")
 
     def get_tool_specs_for_llm(self) -> list[dict[str, Any]]:
         """Get tool specifications in OpenAI format"""

         await self.mcp_client.initialize()
         await self.register_mcp_tools()
         self._mcp_initialized = True
 
         # Register OpenAPI tool (requires async initialization)
         await self.register_openapi_tool()
 
+        total_tools = len(self.tools)
+        print(f"\nAgent ready with {total_tools} tools total\n")
+
         return self
 
     async def __aexit__(self, exc_type, exc, tb) -> None:
 
 def create_builtin_tools() -> list[ToolSpec]:
     """Create built-in tool specifications"""
     # in order of importance
+    tools = [
         # Documentation search tools
         ToolSpec(
             name=EXPLORE_HF_DOCS_TOOL_SPEC["name"],

             parameters=PRIVATE_HF_REPO_TOOL_SPEC["parameters"],
             handler=private_hf_repo_handler,
         ),
+        # NOTE: Utils tool disabled - date/time now loaded into system prompt at initialization (fewer tool calls = more reliability)
+        # ToolSpec(
+        #     name=UTILS_TOOL_SPEC["name"],
+        #     description=UTILS_TOOL_SPEC["description"],
+        #     parameters=UTILS_TOOL_SPEC["parameters"],
+        #     handler=utils_handler,
+        # ),
+        # GitHub tools
+        # NOTE: GitHub search code tool disabled - a bit buggy
+        # ToolSpec(
+        #     name=GITHUB_SEARCH_CODE_TOOL_SPEC["name"],
+        #     description=GITHUB_SEARCH_CODE_TOOL_SPEC["description"],
+        #     parameters=GITHUB_SEARCH_CODE_TOOL_SPEC["parameters"],
+        #     handler=github_search_code_handler,
+        # ),
         ToolSpec(
+            name=GITHUB_FIND_EXAMPLES_TOOL_SPEC["name"],
+            description=GITHUB_FIND_EXAMPLES_TOOL_SPEC["description"],
+            parameters=GITHUB_FIND_EXAMPLES_TOOL_SPEC["parameters"],
+            handler=github_find_examples_handler,
+        ),
+        ToolSpec(
+            name=GITHUB_LIST_REPOS_TOOL_SPEC["name"],
+            description=GITHUB_LIST_REPOS_TOOL_SPEC["description"],
+            parameters=GITHUB_LIST_REPOS_TOOL_SPEC["parameters"],
+            handler=github_list_repos_handler,
+        ),
+        ToolSpec(
+            name=GITHUB_READ_FILE_TOOL_SPEC["name"],
+            description=GITHUB_READ_FILE_TOOL_SPEC["description"],
+            parameters=GITHUB_READ_FILE_TOOL_SPEC["parameters"],
+            handler=github_read_file_handler,
         ),
     ]
+
+    tool_names = ", ".join([t.name for t in tools])
+    print(f"Loaded {len(tools)} built-in tools: {tool_names}")
+
+    return tools
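The registration changes in this file boil down to collecting the names of accepted tools, counting the skipped ones, and printing a one-line summary. A toy sketch of that pattern, assuming nothing from the repo (`NOT_ALLOWED` and `register_tools` are illustrative stand-ins for `NOT_ALLOWED_TOOL_NAMES` and the router method):

```python
# Toy stand-in for the MCP registration summary added in the diff above.
# NOT_ALLOWED and register_tools are illustrative names, not from the repo.
NOT_ALLOWED = {"dangerous_tool"}

def register_tools(discovered: list[str]) -> str:
    registered: list[str] = []
    skipped = 0
    for name in discovered:
        if name in NOT_ALLOWED:  # mirrors the NOT_ALLOWED_TOOL_NAMES check
            skipped += 1
            continue
        registered.append(name)
    return f"Loaded {len(registered)} MCP tools: {', '.join(registered)} ({skipped} disabled)"

print(register_tools(["search", "dangerous_tool", "fetch"]))
# Loaded 2 MCP tools: search, fetch (1 disabled)
```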
agent/main.py CHANGED
@@ -120,7 +120,6 @@ async def event_listener(
         print(format_error(error))
         turn_complete_event.set()
     elif event.event_type == "shutdown":
-        print("Agent shutdown")
         break
     elif event.event_type == "processing":
         print("Processing...", flush=True)
@@ -228,11 +227,14 @@
 
         # Build repo URL
         type_path = "" if repo_type == "model" else f"{repo_type}s"
-        repo_url = f"https://huggingface.co/{type_path}/{repo_id}".replace("//", "/")
+        # Join only non-empty segments; a blanket replace("//", "/")
+        # would also collapse the "//" in "https://"
+        path = "/".join(p for p in (type_path, repo_id) if p)
+        repo_url = f"https://huggingface.co/{path}"
 
         print(f"Repository: {repo_id}")
         print(f"Type: {repo_type}")
-        print(f"Private: Yes")
+        print("Private: Yes")
         print(f"URL: {repo_url}")
 
         # Show file preview for upload_file operation
@@ -243,9 +246,9 @@
 
             if isinstance(file_content, str):
                 # Calculate metrics
-                all_lines = file_content.split('\n')
+                all_lines = file_content.split("\n")
                 line_count = len(all_lines)
-                size_bytes = len(file_content.encode('utf-8'))
+                size_bytes = len(file_content.encode("utf-8"))
                 size_kb = size_bytes / 1024
                 size_mb = size_kb / 1024
 
@@ -257,8 +260,10 @@
 
                 # Show preview
                 preview_lines = all_lines[:5]
-                preview = '\n'.join(preview_lines)
-                print(f"Content preview (first 5 lines):\n{preview}")
+                preview = "\n".join(preview_lines)
+                print(
+                    f"Content preview (first 5 lines):\n{preview}"
+                )
                 if len(all_lines) > 5:
                     print("...")
 
@@ -327,6 +332,8 @@ async def main():
     print(f"{Colors.YELLOW} {banner}{Colors.RESET}")
     print("Type your messages below. Type 'exit', 'quit', or '/quit' to end.\n")
     print(format_separator())
+    # Wait for agent to initialize
+    print("Initializing agent...")
 
     # Create queues for communication
     submission_queue = asyncio.Queue()
@@ -342,7 +349,7 @@
     config = load_config(config_path)
 
     # Create tool router
-    print(f"Config: {config.mcpServers}")
+    print(f"Loading MCP servers: {', '.join(config.mcpServers.keys())}")
    tool_router = ToolRouter(config.mcpServers)
 
     # Create prompt session for input
@@ -368,8 +375,6 @@
         )
     )
 
-    # Wait for agent to initialize
-    print("Initializing agent...")
     await ready_event.wait()
 
     submission_id = 0
@@ -416,8 +421,7 @@
         )
     await submission_queue.put(shutdown_submission)
 
-    # Wait for tasks to complete
-    await asyncio.wait_for(agent_task, timeout=2.0)
+    await asyncio.wait_for(agent_task, timeout=5.0)
     listener_task.cancel()
 
     print("✨ Goodbye!\n")
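A note on the URL construction touched in the hunks above: applying `str.replace("//", "/")` to a full URL is a classic pitfall, because `replace` rewrites every occurrence, including the `//` in `https://`. A minimal standalone sketch of the failure and a safer alternative:

```python
# Why a blanket replace("//", "/") is unsafe on full URLs:
repo_type = "dataset"
repo_id = "org/name"
type_path = "" if repo_type == "model" else f"{repo_type}s"

naive = f"https://huggingface.co/{type_path}/{repo_id}".replace("//", "/")
print(naive)  # https:/huggingface.co/datasets/org/name  <- scheme mangled

# Safer: join only the non-empty path segments, leaving the scheme alone.
path = "/".join(p for p in (type_path, repo_id) if p)
safe = f"https://huggingface.co/{path}"
print(safe)  # https://huggingface.co/datasets/org/name
```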
agent/prompts/system_prompt.yaml CHANGED
@@ -26,7 +26,7 @@ system_prompt: |
   - Invoke multiple independent tools simultaneously for efficiency
 
   # Available Tools
-
+
   You have access to the following main categories of tools. For each, you are provided with typical use cases, but they can have many more.
 
   - Hugging Face Hub
@@ -168,4 +168,3 @@ system_prompt: |
   3. Sort by trending or downloads.
   4. Report top results with short descriptions and links.
   </example>
-
agent/tools/__init__.py CHANGED
@@ -2,6 +2,22 @@
 """
 Hugging Face tools for the agent
 """
 
+from agent.tools.github_find_examples import (
+    GITHUB_FIND_EXAMPLES_TOOL_SPEC,
+    github_find_examples_handler,
+)
+from agent.tools.github_list_repos import (
+    GITHUB_LIST_REPOS_TOOL_SPEC,
+    github_list_repos_handler,
+)
+from agent.tools.github_read_file import (
+    GITHUB_READ_FILE_TOOL_SPEC,
+    github_read_file_handler,
+)
+from agent.tools.github_search_code import (
+    GITHUB_SEARCH_CODE_TOOL_SPEC,
+    github_search_code_handler,
+)
 from agent.tools.jobs_tool import HF_JOBS_TOOL_SPEC, HfJobsTool, hf_jobs_handler
 from agent.tools.types import ToolResult
 
@@ -10,4 +26,12 @@ __all__ = [
     "HF_JOBS_TOOL_SPEC",
     "hf_jobs_handler",
     "HfJobsTool",
+    "GITHUB_FIND_EXAMPLES_TOOL_SPEC",
+    "github_find_examples_handler",
+    "GITHUB_LIST_REPOS_TOOL_SPEC",
+    "github_list_repos_handler",
+    "GITHUB_READ_FILE_TOOL_SPEC",
+    "github_read_file_handler",
+    "GITHUB_SEARCH_CODE_TOOL_SPEC",
+    "github_search_code_handler",
 ]
agent/tools/docs_tools.py CHANGED
@@ -5,7 +5,6 @@ Tools for exploring and fetching HuggingFace documentation and API specification
 
 import asyncio
 import os
-import time
 from typing import Any
 
 import httpx
@@ -21,21 +20,15 @@ async def _fetch_html_page(hf_token: str, endpoint: str) -> str:
     url = f"{base_url}/{endpoint}"
     headers = {"Authorization": f"Bearer {hf_token}"}
 
-    fetch_start = time.perf_counter()
     async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
         response = await client.get(url, headers=headers)
         response.raise_for_status()
 
-    fetch_time = time.perf_counter() - fetch_start
-    print(f"[DEBUG] _fetch_html_page: Fetched in {fetch_time:.2f}s")
-
     return response.text
 
 
 def _parse_sidebar_navigation(html_content: str) -> list[dict[str, str]]:
     """Parse the sidebar navigation and extract all links"""
-    parse_start = time.perf_counter()
-
     soup = BeautifulSoup(html_content, "html.parser")
     sidebar = soup.find("nav", class_=lambda x: x and "flex-auto" in x)
 
@@ -53,11 +46,6 @@ def _parse_sidebar_navigation(html_content: str) -> list[dict[str, str]]:
         page_url = f"https://huggingface.co{href}" if href.startswith("/") else href
         nav_data.append({"title": title, "url": page_url})
 
-    parse_time = time.perf_counter() - parse_start
-    print(
-        f"[DEBUG] _parse_sidebar_navigation: Parsed in {parse_time:.2f}s, found {len(nav_data)} links"
-    )
-
     return nav_data
 
 
@@ -96,18 +84,11 @@ async def _fetch_all_glimpses(
     hf_token: str, nav_data: list[dict[str, str]]
 ) -> list[dict[str, str]]:
     """Fetch glimpses for all pages in parallel"""
-    glimpse_start = time.perf_counter()
-
     async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
         result_items = await asyncio.gather(
             *[_fetch_single_glimpse(client, hf_token, item) for item in nav_data]
         )
 
-    glimpse_time = time.perf_counter() - glimpse_start
-    print(
-        f"[DEBUG] _fetch_all_glimpses: Fetched {len(result_items)} glimpses in {glimpse_time:.2f}s"
-    )
-
     return list(result_items)
 
 
@@ -130,9 +111,6 @@ def _format_exploration_results(
 
 async def explore_hf_docs(hf_token: str, endpoint: str) -> str:
     """Main function to explore documentation structure"""
-    start_time = time.perf_counter()
-    print(f"[DEBUG] explore_hf_docs: Starting for endpoint '{endpoint}'")
-
     # Fetch HTML page
     html_content = await _fetch_html_page(hf_token, endpoint)
 
@@ -148,9 +126,6 @@ async def explore_hf_docs(hf_token: str, endpoint: str) -> str:
     # Format results
     result = _format_exploration_results(endpoint, result_items)
 
-    total_time = time.perf_counter() - start_time
-    print(f"[DEBUG] explore_hf_docs: Total time {total_time:.2f}s")
-
     return result
 
 
@@ -199,12 +174,8 @@ async def _fetch_openapi_spec() -> dict[str, Any]:
     global _openapi_spec_cache
 
     if _openapi_spec_cache is not None:
-        print("[DEBUG] _fetch_openapi_spec: Using cached spec")
         return _openapi_spec_cache
 
-    start_time = time.perf_counter()
-    print("[DEBUG] _fetch_openapi_spec: Fetching from API")
-
     url = "https://huggingface.co/.well-known/openapi.json"
 
     async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
@@ -214,9 +185,6 @@ async def _fetch_openapi_spec() -> dict[str, Any]:
     spec = response.json()
     _openapi_spec_cache = spec
 
-    fetch_time = time.perf_counter() - start_time
-    print(f"[DEBUG] _fetch_openapi_spec: Fetched and cached in {fetch_time:.2f}s")
-
     return spec
 
 
@@ -457,9 +425,7 @@ async def search_openapi_handler(arguments: dict[str, Any]) -> tuple[str, bool]:
     Returns:
         Tuple of (search_results, success)
     """
-    start_time = time.perf_counter()
     tag = arguments.get("tag", "")
-    print(f"[DEBUG] search_openapi: Starting for tag '{tag}'")
 
     if not tag:
         return "Error: No tag provided", False
@@ -474,9 +440,6 @@ async def search_openapi_handler(arguments: dict[str, Any]) -> tuple[str, bool]:
         # Format results
         formatted = _format_openapi_results(results, tag)
 
-        total_time = time.perf_counter() - start_time
-        print(f"[DEBUG] search_openapi: Total time {total_time:.2f}s")
-
         return formatted, True
 
     except httpx.HTTPStatusError as e:
@@ -497,9 +460,7 @@ async def hf_docs_fetch_handler(arguments: dict[str, Any]) -> tuple[str, bool]:
     Returns:
         Tuple of (full_markdown_content, success)
     """
-    start_time = time.perf_counter()
     url = arguments.get("url", "")
-    print(f"[DEBUG] fetch_hf_docs: Starting for URL '{url}'")
 
     if not url:
         return "Error: No URL provided", False
@@ -521,25 +482,15 @@ async def hf_docs_fetch_handler(arguments: dict[str, Any]) -> tuple[str, bool]:
         # Make request with auth
         headers = {"Authorization": f"Bearer {hf_token}"}
 
-        fetch_start = time.perf_counter()
         async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
             response = await client.get(url, headers=headers)
             response.raise_for_status()
 
-        fetch_time = time.perf_counter() - fetch_start
         content = response.text
-        content_size_kb = len(content) / 1024
-
-        print(
-            f"[DEBUG] fetch_hf_docs: Fetched {content_size_kb:.1f}KB in {fetch_time:.2f}s"
-        )
 
         # Return the markdown content directly
         result = f"Documentation from: {url}\n\n{content}"
 
-        total_time = time.perf_counter() - start_time
-        print(f"[DEBUG] fetch_hf_docs: Total time {total_time:.2f}s")
-
         return result, True
 
     except httpx.HTTPStatusError as e:
agent/tools/github_find_examples.py ADDED
@@ -0,0 +1,489 @@
+"""
+GitHub Find Examples Tool - Discover examples, tutorials, and guides for any library
+
+Lists all files in a repository and performs deterministic keyword search.
+"""
+
+import os
+from typing import Any, Dict, List
+
+import requests
+from thefuzz import fuzz
+
+from agent.tools.types import ToolResult
+
+# In order of priority (lower index = higher priority for sorting)
+EXAMPLE_PATTERNS = [
+    "scripts",
+    # General example patterns (catch-all, lower priority)
+    "examples",
+    "example",
+    # Notebook patterns
+    "notebooks",
+    "notebook",
+    # Tutorial/learning patterns
+    "tutorials",
+    "tutorial",
+    "quickstart",
+    "walkthroughs",
+    "walkthrough",
+    # Cookbook/recipe patterns
+    "cookbook",
+    "cookbooks",
+    "recipes",
+    "recipe",
+    # Demo/sample patterns
+    "demos",
+    "demo",
+    "samples",
+    "sample",
+    # Other patterns
+    "guides",
+    "guide",
+    "getting-started",
+    "getting_started",
+    "playground",
+    "howto",
+    "how-to",
+    "use-cases",
+    "usecases",
+    "use_cases",
+    "sandbox",
+    "showcase",
+]
+
+
+def _get_repo_tree(org: str, repo: str, token: str) -> tuple[List[Dict[str, Any]], str]:
+    """Get all files in a repository recursively. Returns (files, error_message)"""
+    headers = {
+        "Accept": "application/vnd.github+json",
+        "X-GitHub-Api-Version": "2022-11-28",
+        "Authorization": f"Bearer {token}",
+    }
+
+    full_repo = f"{org}/{repo}"
+
+    # Get default branch
+    try:
+        response = requests.get(
+            f"https://api.github.com/repos/{full_repo}", headers=headers, timeout=10
+        )
+        if response.status_code == 404:
+            return [], "not_found"
+        if response.status_code != 200:
+            return [], f"API error: {response.status_code}"
+
+        repo_data = response.json()
+        default_branch = repo_data.get("default_branch", "main")
+    except Exception as e:
+        return [], f"Error fetching repo: {str(e)}"
+
+    # Get repository tree recursively
+    try:
+        response = requests.get(
+            f"https://api.github.com/repos/{full_repo}/git/trees/{default_branch}",
+            headers=headers,
+            params={"recursive": "1"},
+            timeout=30,
+        )
+        if response.status_code != 200:
+            return [], f"Error fetching tree: {response.status_code}"
+
+        data = response.json()
+        tree = data.get("tree", [])
+
+        # Filter to only include files (not directories)
+        files = [
+            {
+                "path": item["path"],
+                "ref": item["sha"],
+                "size": item.get("size", 0),
+                "url": f"https://github.com/{full_repo}/blob/{default_branch}/{item['path']}",
+            }
+            for item in tree
+            if item["type"] == "blob"
+        ]
+
+        return files, ""
+    except Exception as e:
+        return [], f"Error processing tree: {str(e)}"
+
+
+def _search_similar_repos(org: str, repo: str, token: str) -> List[Dict[str, Any]]:
+    """Search for similar repository names in the organization"""
+    headers = {
+        "Accept": "application/vnd.github+json",
+        "X-GitHub-Api-Version": "2022-11-28",
+        "Authorization": f"Bearer {token}",
+    }
+
+    # Search for repos in the org with similar name
+    query = f"org:{org} {repo}"
+
+    try:
+        response = requests.get(
+            "https://api.github.com/search/repositories",
+            headers=headers,
+            params={"q": query, "sort": "stars", "order": "desc", "per_page": 10},
+            timeout=30,
+        )
+
+        if response.status_code != 200:
+            return []
+
+        data = response.json()
+        items = data.get("items", [])
+
+        return [
+            {
+                "name": item.get("name"),
+                "full_name": item.get("full_name"),
+                "description": item.get("description"),
+                "stars": item.get("stargazers_count", 0),
+                "url": item.get("html_url"),
+            }
+            for item in items
+        ]
+    except Exception:
+        return []
+
+
+def _score_against_example_patterns(file_path: str) -> int:
+    """Score file against example patterns using token_set_ratio"""
+    scores = []
+    for pattern in EXAMPLE_PATTERNS:
+        score = fuzz.token_set_ratio(pattern.lower(), file_path.lower())
+        scores.append(score)
+    return max(scores) if scores else 0
+
+
+def _score_against_keyword(file_path: str, keyword: str) -> int:
+    """Calculate fuzzy match score for a file path against a keyword"""
+    # Use partial_ratio for substring matching (good for paths)
+    # Also check token_set_ratio for word-level matching
+    partial_score = fuzz.partial_ratio(keyword.lower(), file_path.lower())
+    token_score = fuzz.token_set_ratio(keyword.lower(), file_path.lower())
+
+    # Return the higher of the two
+    return max(partial_score, token_score)
+
+
+def _get_pattern_priority(file_path: str) -> tuple[int, int, int]:
+    """
+    Get priority of a file path based on which example pattern directory it's in.
+
+    Returns: (in_examples_dir, pattern_priority, path_depth)
+    - in_examples_dir: 0 if in examples/ directory, 1 otherwise (lower is better)
+    - pattern_priority: Index in EXAMPLE_PATTERNS (lower is better), or 999 if no match
+    - path_depth: Number of path segments (lower is better)
+
+    Note: Prioritizes files in "examples/" directory first, then by most specific pattern match.
+    E.g., "examples/scripts/train.py" is better than "scripts/util.py"
+    """
+    path_lower = file_path.lower()
+    path_parts = path_lower.split("/")
+
+    # Check if file is in examples/ directory (highest priority)
+    in_examples_dir = 0 if (path_parts[0] in ["examples", "example"]) else 1
+
+    # Find ALL matching patterns and use the best (lowest index) one
+    # But prefer deeper matches (more specific) over shallow ones
+    best_priority = 999
+    best_depth_at_match = -1
+
+    for i, pattern in enumerate(EXAMPLE_PATTERNS):
+        # Check if pattern appears as a directory component in the path
+        if pattern in path_parts:
+            # Find the depth where this pattern appears (rightmost occurrence)
+            depth = len(path_parts) - 1 - path_parts[::-1].index(pattern)
+
+            # Prefer deeper matches, or better priority if at same depth
+            if depth > best_depth_at_match or (
+                depth == best_depth_at_match and i < best_priority
+            ):
+                best_priority = i
+                best_depth_at_match = depth
+
+    return (in_examples_dir, best_priority, len(path_parts))
+
+
+def _handle_repo_tree_errors(
+    all_files: List[Dict[str, Any]],
+    error: str,
+    org: str,
+    repo: str,
+    token: str,
+) -> ToolResult | None:
+    """Handle errors from repo tree fetch. Returns ToolResult if error, None if OK."""
+    if error == "not_found":
+        similar_repos = _search_similar_repos(org, repo, token)
+
+        if not similar_repos:
+            return {
+                "formatted": f"Repository '{org}/{repo}' not found and no similar repositories found.",
+                "totalResults": 0,
+                "resultsShared": 0,
+                "isError": True,
+            }
+
+        # Format similar repos
+        lines = [f"**Repository '{org}/{repo}' not found. Similar repositories:**\n"]
+        for i, r in enumerate(similar_repos, 1):
+            lines.append(f"{i}. **{r['full_name']}** (⭐ {r['stars']:,} stars)")
+            if r["description"]:
+                desc = (
+                    r["description"][:100] + "..."
+                    if len(r["description"]) > 100
+                    else r["description"]
+                )
+                lines.append(f"   {desc}")
+            lines.append(f"   {r['url']}\n")
+
+        return {
+            "formatted": "\n".join(lines),
+            "totalResults": len(similar_repos),
+            "resultsShared": len(similar_repos),
+            "isError": True,
+        }
+
+    if error:
+        return {
+            "formatted": f"Error accessing repository '{org}/{repo}': {error}",
+            "totalResults": 0,
+            "resultsShared": 0,
+            "isError": True,
+        }
+
+    if not all_files:
+        return {
+            "formatted": f"No files found in repository '{org}/{repo}'",
+            "totalResults": 0,
+            "resultsShared": 0,
+        }
+
+    return None
+
+
+def find_examples(
+    keyword: str = "",
+    repo: str = "",
+    org: str = "huggingface",
+    max_results: int = 10,
+    min_score: int = 80,
+) -> ToolResult:
+    """
+    Find example files in a repository using fuzzy matching.
+
+    Args:
+        keyword: Keyword to fuzzy match against file paths (e.g., "grpo")
+        repo: Repository name (e.g., "trl")
+        org: GitHub organization (default: "huggingface")
+        max_results: Maximum number of results (default 10)
+        min_score: Minimum fuzzy match score (0-100, default 80)
+
+    Returns:
+        ToolResult with matching files, or similar repos if repo not found
+    """
+    token = os.environ.get("GITHUB_TOKEN")
+    if not token:
+        return {
+            "formatted": "Error: GITHUB_TOKEN environment variable is required",
+            "totalResults": 0,
+            "resultsShared": 0,
+            "isError": True,
+        }
+
+    if not repo:
+        return {
+            "formatted": "Error: repo parameter is required",
+            "totalResults": 0,
+            "resultsShared": 0,
+            "isError": True,
+        }
+
+    # Get all files in the repository
+    all_files, error = _get_repo_tree(org, repo, token)
+
+    # Handle errors (not found, API errors, empty repo)
+    if error_result := _handle_repo_tree_errors(all_files, error, org, repo, token):
+        return error_result
+
+    # Step 1: Filter files by example patterns (score >= 60)
+    example_threshold = 60
+    example_files = []
+    for file in all_files:
+        example_score = _score_against_example_patterns(file["path"])
+        if example_score >= example_threshold:
+            example_files.append({**file, "example_score": example_score})
+
+    if not example_files:
+        return {
+            "formatted": f"No example files found in {org}/{repo} (no files match example patterns with score >= {example_threshold}).",
+            "totalResults": 0,
+            "resultsShared": 0,
+        }
+
+    # Step 2: If keyword provided, score and filter by keyword
+    if keyword:
+        scored_files = []
+        for file in example_files:
+            keyword_score = _score_against_keyword(file["path"], keyword)
+            if keyword_score >= min_score:
+                scored_files.append({**file, "score": keyword_score})
+
+        if not scored_files:
+            return {
+                "formatted": f"No files found in {org}/{repo} matching keyword '{keyword}' (min score: {min_score}) among {len(example_files)} example files.",
+                "totalResults": 0,
+                "resultsShared": 0,
+            }
+
+        # Sort by keyword score (descending) for best matches first
+        scored_files.sort(key=lambda x: x["score"], reverse=True)
+    else:
+        # No keyword: prioritize by pattern directory, then path depth
+        scored_files = []
+        for file in example_files:
+            in_examples_dir, pattern_priority, path_depth = _get_pattern_priority(
+                file["path"]
+            )
+            scored_files.append(
+                {
+                    **file,
+                    "score": file["example_score"],
+                    "in_examples_dir": in_examples_dir,
+                    "pattern_priority": pattern_priority,
+                    "path_depth": path_depth,
+                }
+            )
+
+        if not scored_files:
+            return {
+                "formatted": f"No example files found in {org}/{repo}.",
+                "totalResults": 0,
+                "resultsShared": 0,
+            }
+
+        # Sort by: 1) files in examples/ dir first, 2) pattern priority (scripts > examples > etc.), 3) path depth, 4) path name
+        scored_files.sort(
+            key=lambda x: (
+                x["in_examples_dir"],
+                x["pattern_priority"],
+                x["path_depth"],
+                x["path"],
+            )
+        )
+
+    # Limit results
+    results = scored_files[:max_results]
+
+    # Format output
+    keyword_desc = f" matching '{keyword}'" if keyword else ""
+    lines = [f"**Found {len(results)} example files in {org}/{repo}{keyword_desc}:**"]
+    if len(scored_files) > max_results:
+        lines[0] += f" (showing {max_results} of {len(scored_files)})"
+    lines.append("")
+
+    for i, file in enumerate(results, 1):
+        lines.append(f"{i}. **{file['path']}**")
+        lines.append(f"   Size: {file['size']:,} bytes | Ref: {file['ref'][:7]}")
+        lines.append(f"   URL: {file['url']}")
+
+        # Copyable parameters for read_file tool
+        read_params = f"{{'repo': '{org}/{repo}', 'path': '{file['path']}'}}"
+        lines.append(f"   To read, use: {read_params}")
+        lines.append("")
+
+    return {
+        "formatted": "\n".join(lines),
+        "totalResults": len(results),
+        "resultsShared": len(results),
+    }
+
+
+# Tool specification
+GITHUB_FIND_EXAMPLES_TOOL_SPEC = {
+    "name": "github_find_examples",
+    "description": "Discover best practices, reusable scripts, tutorials, and demos for using a specific library or framework. This is an important step before implementing anything ML-related. "
+    "Use together with the github_read_file tool.\n\n"
+    "## When to use this tool\n\n"
+    "- ALWAYS before implementing any training/inference/benchmarking or other ML-related code, or before answering a how-to question.\n"
+    "- When exploring a new repository and you need to understand how to use it\n\n"
+    "## How it works\n\n"
+    "1. Fetches all example-like files (examples, tutorials, demos, notebooks, scripts, etc.) from the repository\n"
+    "2. If a keyword is provided, scores the found files against the keyword using fuzzy matching\n"
+    "3. Returns the best matches sorted by relevance score\n\n"
+    "## Examples\n\n"
+    "<example>\n"
+    "// ML Workflow Step: Find GRPO/SFT/DPO/RLOO etc. training examples\n"
+    "// Task: Starting a GRPO fine-tuning project, need reference implementations\n"
+    "{\n"
+    "  keyword: 'grpo',\n"
+    "  repo: 'trl',\n"
+    "  org: 'huggingface'\n"
+    "}\n"
+    "// Returns: examples/scripts/grpo_agent.py, examples/scripts/grpo_vlm.py\n"
+    "// Next step: Use github_read_file to study the implementation\n"
+    "</example>\n\n"
+    "<example>\n"
+    "// ML Workflow Step: Discover all training examples in TRL\n"
+    "// Task: Exploring available training methods before choosing an approach\n"
+    "{\n"
+    "  repo: 'trl',\n"
+    "  org: 'huggingface',\n"
+    "  max_results: 20\n"
+    "}\n"
+    "// Lists all example scripts: PPO, DPO, GRPO, reward modeling, etc.\n"
+    "</example>\n\n"
+    "<example>\n"
+    "// ML Workflow Step: Find LoRA fine-tuning examples\n"
+    "// Task: Learning parameter-efficient fine-tuning with PEFT\n"
+    "{\n"
+    "  keyword: 'lora',\n"
+    "  repo: 'peft',\n"
+    "  org: 'huggingface'\n"
+    "}\n"
+    "// Discovers LoRA configuration and training examples\n"
447
+ "</example>",
448
+ "parameters": {
449
+ "type": "object",
450
+ "properties": {
451
+ "keyword": {
452
+ "type": "string",
453
+ "description": "Keyword to fuzzy match against file paths (e.g., 'grpo', 'sft').",
454
+ },
455
+ "repo": {
456
+ "type": "string",
457
+ "description": "Repository name (e.g., 'trl', 'transformers'). Required.",
458
+ },
459
+ "org": {
460
+ "type": "string",
461
+ "description": "GitHub organization or username. Default: 'huggingface'.",
462
+ },
463
+ "max_results": {
464
+ "type": "integer",
465
+ "description": "Maximum number of results to return. Default: 50.",
466
+ },
467
+ "min_score": {
468
+ "type": "integer",
469
+ "description": "Minimum fuzzy match score (0-100). Default: 60.",
470
+ },
471
+ },
472
+ "required": ["repo"],
473
+ },
474
+ }
475
+
476
+
477
+ async def github_find_examples_handler(arguments: Dict[str, Any]) -> tuple[str, bool]:
478
+ """Handler for agent tool router"""
479
+ try:
480
+ result = find_examples(
481
+ keyword=arguments.get("keyword", ""),
482
+ repo=arguments["repo"],
483
+ org=arguments.get("org", "huggingface"),
484
+ max_results=arguments.get("max_results", 50),
485
+ min_score=arguments.get("min_score", 60),
486
+ )
487
+ return result["formatted"], not result.get("isError", False)
488
+ except Exception as e:
489
+ return f"Error finding examples: {str(e)}", False
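The fuzzy scorer used by `find_examples` is defined earlier in the file and not shown in this hunk; as a rough sketch of the score-filter-sort pattern the keyword branch relies on (using stdlib `difflib` as a stand-in for the actual fuzzy matcher, which is an assumption):

```python
from difflib import SequenceMatcher


def fuzzy_score(keyword: str, path: str) -> int:
    # Stand-in fuzzy matcher: similarity ratio scaled to 0-100,
    # roughly matching rapidfuzz-style scorer output ranges.
    return int(SequenceMatcher(None, keyword.lower(), path.lower()).ratio() * 100)


def rank_files(files: list[dict], keyword: str, min_score: int = 60) -> list[dict]:
    # Score every candidate path, drop weak matches, sort best-first --
    # the same shape as the scored_files handling in find_examples.
    scored = [{**f, "score": fuzzy_score(keyword, f["path"])} for f in files]
    scored = [f for f in scored if f["score"] >= min_score]
    scored.sort(key=lambda x: x["score"], reverse=True)
    return scored
```

A low `min_score` keeps loosely related paths; the tool's default of 60 is stricter than this toy scorer typically reaches on long paths, which is why a real implementation would use a token- or substring-aware scorer.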
agent/tools/github_list_repos.py ADDED
@@ -0,0 +1,281 @@
+ """
+ GitHub List Repositories Tool - List and sort repositories for any user or organization
+
+ Efficiently discover repositories with flexible sorting options.
+ """
+
+ import os
+ from typing import Any, Dict, Literal, Optional
+
+ import requests
+
+ from agent.tools.types import ToolResult
+
+
+ def list_repos(
+ owner: str,
+ owner_type: Literal["user", "org"] = "org",
+ sort: Literal["stars", "forks", "updated", "created"] = "stars",
+ order: Literal["asc", "desc"] = "desc",
+ limit: Optional[int] = 30,
+ ) -> ToolResult:
+ """
+ List repositories for a user or organization using the GitHub REST API.
+
+ Args:
+ owner: GitHub username or organization name
+ owner_type: Whether the owner is a "user" or "org" (default: "org")
+ sort: Sort field - "stars", "forks", "updated", or "created"
+ order: Sort order - "asc" or "desc" (default: "desc")
+ limit: Maximum number of repositories to return
+
+ Returns:
+ ToolResult with repository information
+ """
+ token = os.environ.get("GITHUB_TOKEN")
+ if not token:
+ return {
+ "formatted": "Error: GITHUB_TOKEN environment variable is required",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ if owner_type == "org":
+ url = f"https://api.github.com/orgs/{owner}/repos"
+ else:
+ url = f"https://api.github.com/users/{owner}/repos"
+
+ headers = {
+ "Accept": "application/vnd.github+json",
+ "X-GitHub-Api-Version": "2022-11-28",
+ "Authorization": f"Bearer {token}",
+ }
+
+ all_repos = []
+ page = 1
+ per_page = 100  # Maximum allowed by GitHub
+
+ # Map our sort values to GitHub API sort values
+ # Note: GitHub list repos API doesn't support sorting by stars/forks
+ # We'll fetch all repos and sort in memory for those cases
+ api_sort_map = {
+ "created": "created",
+ "updated": "updated",
+ "stars": None,  # Not supported by list API
+ "forks": None,  # Not supported by list API
+ }
+
+ api_sort = api_sort_map.get(sort)
+ need_manual_sort = api_sort is None
+
+ try:
+ while True:
+ params = {
+ "page": page,
+ "per_page": per_page,
+ }
+
+ # Only add sort/direction if API supports it
+ if api_sort:
+ params["sort"] = api_sort
+ params["direction"] = order
+
+ response = requests.get(
+ url,
+ headers=headers,
+ params=params,
+ timeout=30,
+ )
+
+ if response.status_code == 403:
+ error_data = response.json()
+ return {
+ "formatted": f"GitHub API rate limit or permission error: {error_data.get('message', 'Unknown error')}",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ if response.status_code != 200:
+ error_msg = f"GitHub API error (status {response.status_code})"
+ try:
+ error_data = response.json()
+ if "message" in error_data:
+ error_msg += f": {error_data['message']}"
+ except Exception:
+ pass
+ return {
+ "formatted": error_msg,
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ items = response.json()
+
+ if not items:
+ break
+
+ for item in items:
+ all_repos.append(
+ {
+ "name": item.get("name"),
+ "full_name": item.get("full_name"),
+ "description": item.get("description"),
+ "html_url": item.get("html_url"),
+ "language": item.get("language"),
+ "stars": item.get("stargazers_count", 0),
+ "forks": item.get("forks_count", 0),
+ "open_issues": item.get("open_issues_count", 0),
+ "topics": item.get("topics", []),
+ "updated_at": item.get("updated_at"),
+ "created_at": item.get("created_at"),
+ }
+ )
+
+ # Check if we got fewer results than requested (last page)
+ if len(items) < per_page:
+ break
+
+ # Stop if we have enough repos
+ if limit and len(all_repos) >= limit:
+ break
+
+ page += 1
+
+ except requests.exceptions.RequestException as e:
+ return {
+ "formatted": f"Failed to connect to GitHub API: {str(e)}",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ # Manual sorting if needed (for stars/forks)
+ if need_manual_sort and all_repos:
+ reverse = order == "desc"
+ all_repos.sort(key=lambda x: x[sort], reverse=reverse)
+
+ # Apply limit after sorting
+ if limit:
+ all_repos = all_repos[:limit]
+
+ if not all_repos:
+ return {
+ "formatted": f"No repositories found for {owner_type} '{owner}'",
+ "totalResults": 0,
+ "resultsShared": 0,
+ }
+
+ # Format output
+ lines = [f"**Found {len(all_repos)} repositories for {owner}:**\n"]
+
+ for i, repo in enumerate(all_repos, 1):
+ lines.append(f"{i}. **{repo['full_name']}**")
+ lines.append(
+ f" ⭐ {repo['stars']:,} stars | 🍴 {repo['forks']:,} forks | Language: {repo['language'] or 'N/A'}"
+ )
+ if repo["description"]:
+ desc = (
+ repo["description"][:100] + "..."
+ if len(repo["description"]) > 100
+ else repo["description"]
+ )
+ lines.append(f" {desc}")
+ lines.append(f" URL: {repo['html_url']}")
+ if repo["topics"]:
+ lines.append(f" Topics: {', '.join(repo['topics'][:5])}")
+
+ # Copyable parameters for other tools
+ lines.append(f" Use in tools: {{'repo': '{repo['full_name']}'}}")
+ lines.append("")
+
+ return {
+ "formatted": "\n".join(lines),
+ "totalResults": len(all_repos),
+ "resultsShared": len(all_repos),
+ }
+
+
+ # Tool specification
+ GITHUB_LIST_REPOS_TOOL_SPEC = {
+ "name": "github_list_repos",
+ "description": (
+ "List and discover repositories for any GitHub user or organization with flexible sorting.\n\n"
+ "Returns comprehensive repository information including stars, forks, language, topics, and direct URLs. "
+ "Sorts by stars, forks, update date, or creation date.\n\n"
+ "## When to use this tool\n\n"
+ "- When you need to find libraries to use in your implementation, or to explore what repositories exist for a task\n"
+ "- When debugging an error, to look up whether others have hit the same issue in a repository\n"
+ "- When finding the most popular or active projects for a user or org\n\n"
+ "## Examples\n\n"
+ "<example>\n"
+ "// ML Workflow Step: Discover HF libraries for RLHF/alignment\n"
+ "// Use case: Find the right library for training with human feedback\n"
+ "{\n"
+ " owner: 'huggingface',\n"
+ " owner_type: 'org',\n"
+ " sort: 'stars',\n"
+ " limit: 10\n"
+ "}\n"
+ "// Returns: transformers, trl, peft, accelerate, diffusers...\n"
+ "</example>\n\n"
+ "<example>\n"
+ "// ML Workflow Step: Check for recently updated HF repos\n"
+ "// Use case: Find actively maintained libraries with latest features\n"
+ "{\n"
+ " owner: 'huggingface',\n"
+ " owner_type: 'org',\n"
+ " sort: 'updated',\n"
+ " order: 'desc',\n"
+ " limit: 15\n"
+ "}\n"
+ "// Helps identify which repos have recent improvements/fixes\n"
+ "</example>"
+ ),
+ "parameters": {
+ "type": "object",
+ "properties": {
+ "owner": {
+ "type": "string",
+ "description": "GitHub username or organization name. Required.",
+ },
+ "owner_type": {
+ "type": "string",
+ "enum": ["user", "org"],
+ "description": "Whether the owner is a 'user' or 'org'. Default: 'org'.",
+ },
+ "sort": {
+ "type": "string",
+ "enum": ["stars", "forks", "updated", "created"],
+ "description": "Sort field. Options: 'stars', 'forks', 'updated', 'created'. Default: 'stars'.",
+ },
+ "order": {
+ "type": "string",
+ "enum": ["asc", "desc"],
+ "description": "Sort order. Options: 'asc', 'desc'. Default: 'desc'.",
+ },
+ "limit": {
+ "type": "integer",
+ "description": "Maximum number of repositories to return. Default: 30.",
+ },
+ },
+ "required": ["owner"],
+ },
+ }
+
+
+ async def github_list_repos_handler(arguments: Dict[str, Any]) -> tuple[str, bool]:
+ """Handler for agent tool router"""
+ try:
+ result = list_repos(
+ owner=arguments["owner"],
+ owner_type=arguments.get("owner_type", "org"),
+ sort=arguments.get("sort", "stars"),
+ order=arguments.get("order", "desc"),
+ limit=arguments.get("limit"),
+ )
+ return result["formatted"], not result.get("isError", False)
+ except Exception as e:
+ return f"Error listing repositories: {str(e)}", False
agent/tools/github_read_file.py ADDED
@@ -0,0 +1,336 @@
+ """
+ GitHub Read File Tool - Read file contents from any GitHub repository with line range support
+
+ Fetch exact file contents with metadata, supporting line ranges for efficient reading.
+ """
+
+ import base64
+ import json
+ import os
+ from typing import Any, Dict, Optional
+
+ import nbformat
+ import requests
+ from nbconvert import MarkdownExporter
+ from nbconvert.preprocessors import ClearOutputPreprocessor, TagRemovePreprocessor
+
+ from agent.tools.types import ToolResult
+
+
+ def _convert_ipynb_to_markdown(content: str) -> str:
+ """
+ Convert Jupyter notebook JSON to LLM-friendly Markdown.
+
+ Args:
+ content: Raw notebook JSON string
+
+ Returns:
+ Converted Markdown string
+ """
+ try:
+ # Parse notebook JSON
+ nb_dict = json.loads(content)
+
+ # Normalize cell sources (can be string or list of strings)
+ if "cells" in nb_dict:
+ for cell in nb_dict["cells"]:
+ if "source" in cell and isinstance(cell["source"], list):
+ cell["source"] = "".join(cell["source"])
+
+ # Read notebook with explicit version
+ nb = nbformat.reads(json.dumps(nb_dict), as_version=4)
+
+ # Strip outputs for LLM readability (outputs can be noisy/large)
+ clear = ClearOutputPreprocessor()
+ nb, _ = clear.preprocess(nb, {})
+
+ # Optionally remove cells tagged with "hide" or similar
+ remove = TagRemovePreprocessor(
+ remove_cell_tags={"hide", "hidden", "remove"},
+ remove_input_tags=set(),
+ remove_all_outputs_tags=set(),
+ )
+ nb, _ = remove.preprocess(nb, {})
+
+ # Convert to markdown
+ exporter = MarkdownExporter()
+ markdown, _ = exporter.from_notebook_node(nb)
+
+ return markdown
+
+ except json.JSONDecodeError:
+ return content
+ except Exception:
+ return content
+
+
+ def read_file(
+ repo: str,
+ path: str,
+ ref: str = "HEAD",
+ line_start: Optional[int] = None,
+ line_end: Optional[int] = None,
+ ) -> ToolResult:
+ """
+ Read file contents from a GitHub repository with line range support.
+
+ Args:
+ repo: Repository in format "owner/repo" (e.g., "github/github-mcp-server")
+ path: Path to file in repository (e.g., "pkg/github/search.go")
+ ref: Git reference - branch name, tag, or commit SHA (default: "HEAD")
+ line_start: Starting line number (1-indexed, inclusive)
+ line_end: Ending line number (1-indexed, inclusive)
+
+ Returns:
+ ToolResult with file contents and metadata
+ """
+ token = os.environ.get("GITHUB_TOKEN")
+ if not token:
+ return {
+ "formatted": "Error: GITHUB_TOKEN environment variable is required",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ # Parse repo
+ if "/" not in repo:
+ return {
+ "formatted": "Error: repo must be in format 'owner/repo'",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ owner, repo_name = repo.split("/", 1)
+
+ headers = {
+ "Accept": "application/vnd.github+json",
+ "X-GitHub-Api-Version": "2022-11-28",
+ "Authorization": f"Bearer {token}",
+ }
+
+ # Fetch file contents
+ url = f"https://api.github.com/repos/{owner}/{repo_name}/contents/{path}"
+ params = {}
+ if ref and ref != "HEAD":
+ params["ref"] = ref
+
+ try:
+ response = requests.get(url, headers=headers, params=params, timeout=30)
+
+ if response.status_code == 404:
+ return {
+ "formatted": f"File not found: {path} in {repo} (ref: {ref})",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ if response.status_code != 200:
+ error_msg = f"GitHub API error (status {response.status_code})"
+ try:
+ error_data = response.json()
+ if "message" in error_data:
+ error_msg += f": {error_data['message']}"
+ except Exception:
+ pass
+ return {
+ "formatted": error_msg,
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ data = response.json()
+
+ # Check if it's a file
+ if data.get("type") != "file":
+ return {
+ "formatted": f"Path {path} is not a file (type: {data.get('type')})",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ # Decode content
+ content_b64 = data.get("content", "")
+ if content_b64:
+ content_b64 = content_b64.replace("\n", "").replace(" ", "")
+ content = base64.b64decode(content_b64).decode("utf-8", errors="replace")
+ else:
+ # For large files, fetch raw content
+ raw_headers = {
+ "Accept": "application/vnd.github.raw",
+ "X-GitHub-Api-Version": "2022-11-28",
+ "Authorization": f"Bearer {token}",
+ }
+ raw_response = requests.get(
+ url, headers=raw_headers, params=params, timeout=30
+ )
+ if raw_response.status_code != 200:
+ return {
+ "formatted": "Failed to fetch file content",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+ content = raw_response.text
+
+ if path.lower().endswith(".ipynb"):
+ content = _convert_ipynb_to_markdown(content)
+
+ # Process line ranges
+ lines = content.split("\n")
+ total_lines = len(lines)
+
+ truncated = False
+
+ if line_start is None and line_end is None:
+ # No range specified
+ if total_lines > 300:
+ line_start = 1
+ line_end = 300
+ truncated = True
+ else:
+ line_start = 1
+ line_end = total_lines
+ else:
+ # Range specified
+ if line_start is None:
+ line_start = 1
+ if line_end is None:
+ line_end = total_lines
+
+ # Validate range
+ line_start = max(1, line_start)
+ line_end = min(total_lines, line_end)
+ if line_start > line_end:
+ return {
+ "formatted": f"Invalid range: line_start ({line_start}) > line_end ({line_end})",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ # Extract lines
+ selected_lines = lines[line_start - 1 : line_end]
+ selected_content = "\n".join(selected_lines)
+
+ # Format output
+ lines_output = [f"**Reading file from repo: {repo}, path: {path}**"]
+
+ if ref and ref != "HEAD":
+ lines_output.append(f"Ref: {ref}")
+
+ lines_output.append("\n**File content:**")
+ lines_output.append("```")
+ lines_output.append(selected_content)
+ lines_output.append("```")
+ if truncated:
+ lines_output.append(
+ f"Currently showing lines {line_start}-{line_end} out of {total_lines} total lines. Use line_start and line_end to view more lines."
+ )
+ return {
+ "formatted": "\n".join(lines_output),
+ "totalResults": 1,
+ "resultsShared": 1,
+ }
+
+ except requests.exceptions.RequestException as e:
+ return {
+ "formatted": f"Failed to connect to GitHub API: {str(e)}",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+
+ # Tool specification
+ GITHUB_READ_FILE_TOOL_SPEC = {
+ "name": "github_read_file",
+ "description": (
+ "Read file contents from any GitHub repository with line range support.\n\n"
+ "Fetches exact file contents in the given line range (default: first 300 lines; use line_start/line_end to adjust).\n\n"
+ "## When to use this tool\n\n"
+ "- When reading example code, implementations, or documentation in a specific GitHub file\n"
+ "- When you have found a file via github_list_repos or github_find_examples and need its contents\n"
+ "- When investigating specific code sections with line ranges\n"
+ "- When reading from specific branches, tags, or commits\n\n"
+ "## When NOT to use this tool\n\n"
+ "- When you don't know the exact file path beforehand (use github_search_code or github_find_examples first)\n\n"
+ "## Examples\n\n"
+ "<example>\n"
+ "// ML Workflow Step: Reading example code for GRPO training with TRL\n"
+ "// Use case: Read trainer class to understand API and methods\n"
+ "{\n"
+ " repo: 'huggingface/trl',\n"
+ " path: 'trl/trainer/grpo_trainer.py',\n"
+ " line_start: 1,\n"
+ " line_end: 200\n"
+ "}\n"
+ "// Read class definition and constructor to understand parameters\n"
+ "</example>\n\n"
+ "<example>\n"
+ "// ML Workflow Step: Study complete training script\n"
+ "// Use case: Learn end-to-end VLM fine-tuning with GRPO\n"
+ "{\n"
+ " repo: 'huggingface/trl',\n"
+ " path: 'examples/scripts/grpo_vlm.py'\n"
+ "}\n"
+ "// Returns first 300 lines of the file\n"
+ "</example>\n\n"
+ "<example>\n"
+ "// ML Workflow Step: Check configuration patterns\n"
+ "// Use case: Learn how to structure training configs\n"
+ "{\n"
+ " repo: 'huggingface/transformers',\n"
+ " path: 'examples/pytorch/language-modeling/run_clm.py',\n"
+ " line_start: 50,\n"
+ " line_end: 150\n"
+ "}\n"
+ "// Read argument parsing and config setup section\n"
+ "</example>"
+ ),
+ "parameters": {
+ "type": "object",
+ "properties": {
+ "repo": {
+ "type": "string",
+ "description": "Repository in format 'owner/repo' (e.g., 'github/github-mcp-server'). Required.",
+ },
+ "path": {
+ "type": "string",
+ "description": "Path to file in repository (e.g., 'src/index.js'). Required.",
+ },
+ "ref": {
+ "type": "string",
+ "description": "Git reference - branch name, tag, or commit SHA. Default: 'HEAD'.",
+ },
+ "line_start": {
+ "type": "integer",
+ "description": "Starting line number (1-indexed, inclusive). Optional.",
+ },
+ "line_end": {
+ "type": "integer",
+ "description": "Ending line number (1-indexed, inclusive). Optional.",
+ },
+ },
+ "required": ["repo", "path"],
+ },
+ }
+
+
+ async def github_read_file_handler(arguments: Dict[str, Any]) -> tuple[str, bool]:
+ """Handler for agent tool router"""
+ try:
+ result = read_file(
+ repo=arguments["repo"],
+ path=arguments["path"],
+ ref=arguments.get("ref", "HEAD"),
+ line_start=arguments.get("line_start"),
+ line_end=arguments.get("line_end"),
+ )
+ return result["formatted"], not result.get("isError", False)
+ except Exception as e:
+ return f"Error reading file: {str(e)}", False
agent/tools/github_search_code.py ADDED
@@ -0,0 +1,453 @@
+ """
+ GitHub Code Search Tool - Search code across GitHub with intelligent filtering
+
+ Maps user-friendly patterns to GitHub's Code Search API capabilities.
+ """
+
+ import fnmatch
+ import os
+ import re
+ from typing import Any, Dict, Optional
+
+ import requests
+
+ from agent.tools.types import ToolResult
+
+
+ def _glob_match(text: str, pattern: str) -> bool:
+ """Check if text matches glob pattern, supporting ** for multi-level paths"""
+ if "**" in pattern:
+ regex_pattern = pattern.replace("**", "<<<DOUBLESTAR>>>")
+ regex_pattern = fnmatch.translate(regex_pattern)
+ regex_pattern = regex_pattern.replace("<<<DOUBLESTAR>>>", ".*")
+ return re.match(regex_pattern, text) is not None
+ return fnmatch.fnmatch(text, pattern)
+
+
+ def _parse_repo_filter(repo_pattern: str) -> tuple[Optional[str], Optional[str]]:
+ """
+ Parse repository pattern into GitHub API filter and client-side glob pattern.
+
+ Returns: (api_filter, client_glob)
+ - api_filter: GitHub API filter string (e.g., "org:huggingface")
+ - client_glob: Pattern for client-side filtering (e.g., "huggingface/trl*")
+
+ Examples:
+ "huggingface/trl" → ("repo:huggingface/trl", None)
+ "huggingface/*" → ("org:huggingface", "huggingface/*")
+ "huggingface/trl*" → ("org:huggingface", "huggingface/trl*")
+ "huggingface" → ("org:huggingface", None)
+ "*/*" → (None, "*/*")
+ """
+ if not repo_pattern:
+ return None, None
+
+ # Pattern: owner/repo (exact match)
+ if "/" in repo_pattern and "*" not in repo_pattern and "?" not in repo_pattern:
+ return f"repo:{repo_pattern}", None
+
+ # Pattern: owner/* or owner/prefix* (org + client filter)
+ if "/" in repo_pattern and ("*" in repo_pattern or "?" in repo_pattern):
+ org_name = repo_pattern.split("/")[0]
+ if "*" not in org_name and "?" not in org_name:
+ return f"org:{org_name}", repo_pattern
+ # Org name has wildcards - can't filter server-side
+ return None, repo_pattern
+
+ # Pattern: owner (just org name, no wildcards)
+ if "*" not in repo_pattern and "?" not in repo_pattern:
+ return f"org:{repo_pattern}", None
+
+ # Pattern: */* or other complex patterns (client-side only)
+ return None, repo_pattern
+
+
+ def _parse_path_filter(path_pattern: str) -> tuple[Optional[str], Optional[str]]:
+ """
+ Parse path pattern into GitHub API filter and client-side glob pattern.
+
+ Returns: (api_filter, client_glob)
+
+ Examples:
+ "*.py" → ("extension:py", None)
+ "**/*.py" → ("extension:py", None)
+ "src/**/*.py" → ("extension:py", "src/**/*.py")
+ "test_*.py" → ("extension:py", "test_*.py")
+ "src/main.py" → ("path:src/main.py", None)
+ """
+ if not path_pattern:
+ return None, None
+
+ # Exact path (no wildcards)
+ if "*" not in path_pattern and "?" not in path_pattern:
+ return f"path:{path_pattern}", None
+
+ # Extract extension if present
+ ext_match = re.search(r"\*\.(\w+)$", path_pattern)
+ if ext_match:
+ extension = ext_match.group(1)
+ api_filter = f"extension:{extension}"
+
+ # Check if there's a directory prefix that needs client-side filtering
+ # e.g., "src/**/*.py" needs client filter, "**/*.py" doesn't
+ if path_pattern in [f"*.{extension}", f"**/*.{extension}"]:
+ # Simple patterns - API filter is enough
+ return api_filter, None
+ else:
+ # Complex pattern - need client-side filter too
+ return api_filter, path_pattern
+
+ # Pattern like "test_*.py" or "README*" - use filename with client filter
+ # GitHub's filename: doesn't support wildcards, so we rely on client-side
+ if "/" not in path_pattern:
+ # Try to extract extension for API filtering
+ if "." in path_pattern:
+ parts = path_pattern.rsplit(".", 1)
+ if "*" not in parts[-1] and "?" not in parts[-1]:
+ # Extension is clean
+ return f"extension:{parts[-1]}", path_pattern
+ # No extension or complex - client-side only
+ return None, path_pattern
+
+ # Complex path pattern - client-side only
+ return None, path_pattern
+
+
+ def search_code(
+ query: str,
+ repo_pattern: Optional[str] = None,
+ path_pattern: Optional[str] = None,
+ regex: bool = False,
+ max_results: int = 20,
+ ) -> ToolResult:
+ """
+ Search for code across GitHub with intelligent pattern matching.
+
+ This tool intelligently maps user patterns to GitHub's Code Search API capabilities:
+
+ Repository Patterns:
+ - "owner/repo" → Searches exact repository
+ - "owner/*" or "owner" → Searches all repos in organization
+ - "*/*" → Searches all GitHub (no repo filter)
+ - Wildcards trigger client-side filtering when needed
+
+ Path Patterns:
+ - "*.py" → Searches all Python files
+ - "**/*.js" → Searches all JavaScript files (any directory)
+ - "src/**/*.py" → Python files in src/ (uses client-side filtering)
+ - "test_*.py" → Files matching pattern (client-side filtering)
+ - "path/to/file.py" → Exact file path
+
+ Args:
+ query: Search term or pattern to find in code
+ repo_pattern: Repository pattern (e.g., "huggingface/trl", "huggingface/*", "huggingface")
+ path_pattern: File path pattern (e.g., "*.py", "src/**/*.js")
+ regex: If True, treat query as regular expression
+ max_results: Maximum number of results to return (default 20)
+
+ Returns:
+ ToolResult with code matches and snippets
+ """
+ token = os.environ.get("GITHUB_TOKEN")
+ if not token:
+ return {
+ "formatted": "Error: GITHUB_TOKEN environment variable is required",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ # Build GitHub API query
+ query_parts = []
+
+ # Add search term
+ if regex:
+ query_parts.append(f"/{query}/")
+ else:
+ query_parts.append(f'"{query}"' if " " in query else query)
+
+ # Parse repository filter
+ repo_api_filter, repo_client_glob = _parse_repo_filter(repo_pattern)
+ if repo_api_filter:
+ query_parts.append(repo_api_filter)
+
+ # Parse path filter
+ path_api_filter, path_client_glob = _parse_path_filter(path_pattern)
+ if path_api_filter:
+ query_parts.append(path_api_filter)
+
+ github_query = " ".join(query_parts)
+
+ headers = {
+ "Accept": "application/vnd.github.text-match+json",
+ "X-GitHub-Api-Version": "2022-11-28",
+ "Authorization": f"Bearer {token}",
+ }
+
+ all_matches = []
+ page = 1
+ per_page = min(100, max_results)
+
+ try:
+ while len(all_matches) < max_results:
+ params = {
+ "q": github_query,
+ "page": page,
+ "per_page": per_page,
+ }
+
+ response = requests.get(
+ "https://api.github.com/search/code",
+ headers=headers,
+ params=params,
+ timeout=30,
+ )
+
+ if response.status_code == 403:
+ error_data = response.json()
+ return {
+ "formatted": f"GitHub API rate limit or permission error: {error_data.get('message', 'Unknown error')}",
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ if response.status_code != 200:
+ error_msg = f"GitHub API error (status {response.status_code})"
+ try:
+ error_data = response.json()
+ if "message" in error_data:
+ error_msg += f": {error_data['message']}"
+ except Exception:
+ pass
+ return {
+ "formatted": error_msg,
+ "totalResults": 0,
+ "resultsShared": 0,
+ "isError": True,
+ }
+
+ data = response.json()
+ items = data.get("items", [])
+
+ if not items:
+ break
+
+ for item in items:
+ repo_name = item.get("repository", {}).get("full_name", "unknown")
+ file_path = item.get("path", "")
+ sha = item.get("sha", "")
+
+ # Apply client-side filtering
+ if repo_client_glob and not _glob_match(repo_name, repo_client_glob):
+ continue
+ if path_client_glob and not _glob_match(file_path, path_client_glob):
+ continue
+
+ # Extract text matches
+ text_matches = item.get("text_matches", [])
+ if text_matches:
+ for text_match in text_matches:
+ fragment = text_match.get("fragment", "")
+ lines = fragment.split("\n")
+ line_count = len([line for line in lines if line.strip()])
+
+ all_matches.append(
+ {
+ "repo": repo_name,
+ "path": file_path,
+ "ref": sha,
+ "line_start": 1,
+ "line_end": line_count,
+ "snippet": fragment.strip(),
+ "url": item.get("html_url", ""),
+ }
+ )
+ else:
+ all_matches.append(
+ {
+ "repo": repo_name,
+ "path": file_path,
+ "ref": sha,
+ "line_start": 1,
+ "line_end": 1,
+ "snippet": "(snippet not available)",
275
+ "url": item.get("html_url", ""),
276
+ }
277
+ )
278
+
279
+ if len(all_matches) >= data.get("total_count", 0):
280
+ break
281
+
282
+ page += 1
283
+
284
+ except requests.exceptions.RequestException as e:
285
+ return {
286
+ "formatted": f"Failed to connect to GitHub API: {str(e)}",
287
+ "totalResults": 0,
288
+ "resultsShared": 0,
289
+ "isError": True,
290
+ }
291
+
292
+ results = all_matches[:max_results]
293
+
294
+ if not results:
295
+ return {
296
+ "formatted": f"No code matches found for query: {query}",
297
+ "totalResults": 0,
298
+ "resultsShared": 0,
299
+ }
300
+
301
+ # Format output
302
+ lines_output = [f"**Found {len(results)} code matches:**\n"]
303
+
304
+ for i, match in enumerate(results, 1):
305
+ lines_output.append(f"{i}. **{match['repo']}:{match['path']}**")
306
+ lines_output.append(
307
+ f" Lines: {match['line_start']}-{match['line_end']} | Ref: {match['ref'][:7]}"
308
+ )
309
+ lines_output.append(f" URL: {match['url']}")
310
+
311
+ # Copyable parameters for read_file tool
312
+ read_params = f"{{'repo': '{match['repo']}', 'path': '{match['path']}', 'ref': '{match['ref'][:7]}'}}"
313
+ lines_output.append(f" To read, use: {read_params}")
314
+
315
+ # Show snippet (first 5 lines)
316
+ snippet_lines = match["snippet"].split("\n")[:5]
317
+ if snippet_lines:
318
+ lines_output.append(" ```")
319
+ for line in snippet_lines:
320
+ lines_output.append(f" {line}")
321
+ if len(match["snippet"].split("\n")) > 5:
322
+ lines_output.append(" ...")
323
+ lines_output.append(" ```")
324
+ lines_output.append("")
325
+
326
+ return {
327
+ "formatted": "\n".join(lines_output),
328
+ "totalResults": len(results),
329
+ "resultsShared": len(results),
330
+ }
331
+
332
+
333
+# Tool specification
+GITHUB_SEARCH_CODE_TOOL_SPEC = {
+    "name": "github_search_code",
+    "description": (
+        "Search for code patterns across GitHub repositories with intelligent pattern matching.\n\n"
+        "Searches for specific code patterns, functions, classes, or implementations across GitHub. "
+        "Intelligently maps patterns to GitHub's Code Search API for efficient server-side filtering, "
+        "with automatic client-side filtering for complex patterns. Returns code snippets with context.\n\n"
+        "## When to use this tool\n\n"
+        "- When searching for specific code patterns, functions, or classes across repositories\n"
+        "- When looking for implementation examples of specific methods or APIs\n"
+        "- When you need to find where specific code exists across multiple files or repos\n"
+        "- When investigating how a feature is implemented in different repositories\n"
+        "- When searching for TODO comments, specific patterns, or code structures\n"
+        "- Use this for searching actual implementation code (not examples - use github_find_examples for those)\n\n"
+        "## When NOT to use this tool\n\n"
+        "- When looking for example files or tutorials (use github_find_examples instead)\n"
+        "- When you already know the exact file path (use github_read_file directly)\n"
+        "- When you need to list repositories (use github_list_repos instead)\n\n"
+        "## Repository Patterns\n\n"
+        "- **Exact repo**: `'huggingface/trl'` → Searches only that repository\n"
+        "- **Organization**: `'huggingface'` or `'huggingface/*'` → All repos in organization\n"
+        "- **All GitHub**: `'*/*'` or omit repo_pattern → Searches across all GitHub\n"
+        "- **Wildcards**: `'huggingface/trl*'` → Automatic client-side filtering for complex patterns\n\n"
+        "## Path Patterns\n\n"
+        "- **Extension**: `'*.py'` or `'**/*.py'` → All Python files\n"
+        "- **Directory**: `'src/**/*.js'` → JavaScript files in src/ directory (client-filtered)\n"
+        "- **Pattern**: `'test_*.py'` → Files matching pattern (client-filtered)\n"
+        "- **Exact path**: `'README.md'` → Specific file\n\n"
+        "## How it works\n\n"
+        "1. Parses repository and path patterns\n"
+        "2. Converts to GitHub API filters when possible (server-side, fast)\n"
+        "3. Falls back to client-side filtering for complex patterns\n"
+        "4. Returns code snippets with line numbers, URLs, and file refs\n"
+        "5. Results can be used directly with github_read_file tool\n\n"
+        "## Examples\n\n"
+        "<example>\n"
+        "// ML Workflow Step: Find how AutoModelForCausalLM is used\n"
+        "// Use case: Learning best practices for loading LLMs in TRL\n"
+        "{\n"
+        " query: 'AutoModelForCausalLM.from_pretrained',\n"
+        " repo_pattern: 'huggingface/trl',\n"
+        " path_pattern: '*.py'\n"
+        "}\n"
+        "// Finds all model loading patterns with quantization, device_map, etc.\n"
+        "</example>\n\n"
+        "<example>\n"
+        "// ML Workflow Step: Discover TrainingArguments configurations\n"
+        "// Use case: Setting up training hyperparameters correctly\n"
+        "{\n"
+        " query: 'TrainingArguments',\n"
+        " repo_pattern: 'huggingface/transformers',\n"
+        " path_pattern: 'examples/**/*.py',\n"
+        " max_results: 10\n"
+        "}\n"
+        "// Shows various TrainingArguments setups across different tasks\n"
+        "</example>\n\n"
+        "<example>\n"
+        "// ML Workflow Step: Find dataset preprocessing patterns\n"
+        "// Use case: Learning how to prepare data for instruction tuning\n"
+        "{\n"
+        " query: 'map(tokenize',\n"
+        " repo_pattern: 'huggingface',\n"
+        " path_pattern: '*.py'\n"
+        "}\n"
+        "// Discovers tokenization and dataset mapping patterns\n"
+        "</example>\n\n"
+        "<example>\n"
+        "// ML Workflow Step: Find all Trainer class implementations\n"
+        "// Use case: Understanding available trainer variants for different tasks\n"
+        "{\n"
+        " query: 'class \\\\w+Trainer\\\\(',\n"
+        " repo_pattern: 'huggingface/trl',\n"
+        " path_pattern: 'trl/trainer/**/*.py',\n"
+        " regex: true\n"
+        "}\n"
+        "// Lists: GRPOTrainer, DPOTrainer, PPOTrainer, RewardTrainer, etc.\n"
+        "</example>"
+    ),
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "query": {
+                "type": "string",
+                "description": "Search term or pattern to find in code. Required.",
+            },
+            "repo_pattern": {
+                "type": "string",
+                "description": "Repository pattern: 'owner/repo' (exact), 'owner' (org), 'owner/*' (org with filter), '*/*' (all). Optional.",
+            },
+            "path_pattern": {
+                "type": "string",
+                "description": "File path pattern: '*.ext' (extension), 'dir/**/*.ext' (directory), 'pattern*.ext' (name pattern). Optional.",
+            },
+            "regex": {
+                "type": "boolean",
+                "description": "If true, treat query as regular expression. Default: false.",
+            },
+            "max_results": {
+                "type": "integer",
+                "description": "Maximum number of results to return. Default: 20.",
+            },
+        },
+        "required": ["query"],
+    },
+}
+
+
+async def github_search_code_handler(arguments: Dict[str, Any]) -> tuple[str, bool]:
+    """Handler for agent tool router"""
+    try:
+        result = search_code(
+            query=arguments["query"],
+            repo_pattern=arguments.get("repo_pattern"),
+            path_pattern=arguments.get("path_pattern"),
+            regex=arguments.get("regex", False),
+            max_results=arguments.get("max_results", 20),
+        )
+        return result["formatted"], not result.get("isError", False)
+    except Exception as e:
+        return f"Error searching code: {str(e)}", False
agent/tools/jobs_tool.py CHANGED
@@ -7,6 +7,7 @@ Refactored to use official huggingface-hub library instead of custom HTTP client
 import asyncio
 import base64
 import os
+import re
 from typing import Any, Dict, Literal, Optional
 
 from huggingface_hub import HfApi
@@ -40,6 +41,20 @@ GPU_FLAVORS = [
     "h100",
     "h100x8",
 ]
+
+# Detailed specs for display (vCPU/RAM/GPU VRAM)
+CPU_FLAVORS_DESC = (
+    "cpu-basic(2vCPU/16GB), cpu-upgrade(8vCPU/32GB), cpu-performance, cpu-xl"
+)
+GPU_FLAVORS_DESC = (
+    "t4-small(4vCPU/15GB/GPU 16GB), t4-medium(8vCPU/30GB/GPU 16GB), "
+    "l4x1(8vCPU/30GB/GPU 24GB), l4x4(48vCPU/186GB/GPU 96GB), "
+    "l40sx1(8vCPU/62GB/GPU 48GB), l40sx4(48vCPU/382GB/GPU 192GB), l40sx8(192vCPU/1534GB/GPU 384GB), "
+    "a10g-small(4vCPU/14GB/GPU 24GB), a10g-large(12vCPU/46GB/GPU 24GB), "
+    "a10g-largex2(24vCPU/92GB/GPU 48GB), a10g-largex4(48vCPU/184GB/GPU 96GB), "
+    "a100-large(12vCPU/142GB/GPU 80GB), h100(23vCPU/240GB/GPU 80GB), h100x8(184vCPU/1920GB/GPU 640GB), "
+    "zero-a10g(dynamic alloc)"
+)
 SPECIALIZED_FLAVORS = ["inf2x6"]
 ALL_FLAVORS = CPU_FLAVORS + GPU_FLAVORS + SPECIALIZED_FLAVORS
 
@@ -62,6 +77,44 @@ OperationType = Literal[
 UV_DEFAULT_IMAGE = "ghcr.io/astral-sh/uv:python3.12-bookworm"
 
 
+def _filter_uv_install_output(logs: list[str]) -> list[str]:
+    """
+    Filter out UV package installation output from logs.
+
+    Replaces installation details with "[installs truncated]" and keeps
+    the "Installed X packages in Y ms/s" summary line.
+
+    Args:
+        logs: List of log lines
+
+    Returns:
+        Filtered list of log lines
+    """
+    if not logs:
+        return logs
+
+    # Regex pattern to match: "Installed X packages in Y ms" or "Installed X package in Y s"
+    install_pattern = re.compile(
+        r"^Installed\s+\d+\s+packages?\s+in\s+\d+(?:\.\d+)?\s*(?:ms|s)$"
+    )
+
+    # Find the index of the "Installed X packages" line
+    install_line_idx = None
+    for idx, line in enumerate(logs):
+        if install_pattern.match(line.strip()):
+            install_line_idx = idx
+            break
+
+    # If pattern found, replace installation details with truncation message
+    if install_line_idx is not None and install_line_idx > 0:
+        # Keep logs from the "Installed X packages" line onward
+        # Add truncation message before the "Installed" line
+        return ["[installs truncated]"] + logs[install_line_idx:]
+
+    # If pattern not found, return original logs
+    return logs
+
+
 def _add_environment_variables(params: Dict[str, Any] | None) -> Dict[str, Any]:
     token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_HUB_TOKEN") or ""
 
@@ -375,8 +428,11 @@ class HfJobsTool:
             namespace=self.namespace,
         )
 
+        # Filter out UV package installation output
+        filtered_logs = _filter_uv_install_output(all_logs)
+
        # Format all logs for the agent
-        log_text = "\n".join(all_logs) if all_logs else "(no logs)"
+        log_text = "\n".join(filtered_logs) if filtered_logs else "(no logs)"
 
         response = f"""{job_type} job completed!
 
@@ -741,12 +797,12 @@ HF_JOBS_TOOL_SPEC = {
         "1. **Python mode:** Provide 'script' + 'dependencies' → auto-handles pip install\n"
         "2. **Docker mode:** Provide 'image' + 'command' → full control\n"
         "(script and command are mutually exclusive)\n\n"
-        "## Hardware:\n"
-        "CPU: cpu-basic (default), cpu-upgrade, cpu-performance, cpu-xl\n"
-        "GPU: t4-small, t4-medium, l4x1, a10g-small, a10g-large, a100-large, h100\n\n"
+        "## Available Hardware (vCPU/RAM/GPU):\n"
+        f"CPU: {CPU_FLAVORS_DESC}\n"
+        f"GPU: {GPU_FLAVORS_DESC}\n"
         "## Examples:\n\n"
         "**Fine-tune LLM and push to Hub:**\n"
-        "{'operation': 'run', 'script': 'from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer\\nmodel = AutoModelForCausalLM.from_pretrained(\"gpt2\")\\n# ... training code ...\\nmodel.push_to_hub(\"user-name/my-finetuned-model\")', 'dependencies': ['transformers', 'torch', 'datasets'], 'hardware_flavor': 'a10g-large', 'timeout': '4h', 'env': {'CUSTOM_VAR': 'value'}}\n\n"
+        "{'operation': 'run', 'script': 'from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer\\nmodel = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen3-4B-Thinking-2507\")\\n# ... training code ...\\nmodel.push_to_hub(\"user-name/my-finetuned-model\")', 'dependencies': ['transformers', 'torch', 'datasets'], 'hardware_flavor': 'a10g-large', 'timeout': '4h', 'env': {'CUSTOM_VAR': 'value'}}\n\n"
         "**Generate dataset daily and upload:**\n"
         "{'operation': 'scheduled run', 'script': 'from datasets import Dataset\\nimport pandas as pd\\n# scrape/generate data\\ndf = pd.DataFrame(data)\\nds = Dataset.from_pandas(df)\\nds.push_to_hub(\"user-name/daily-dataset\")', 'dependencies': ['datasets', 'pandas'], 'schedule': '@daily'}\n\n"
         "**Run custom training with Docker:**\n"
@@ -807,7 +863,7 @@
             # Hardware and environment
             "hardware_flavor": {
                 "type": "string",
-                "description": "Hardware type. CPU: cpu-basic (default), cpu-upgrade, cpu-performance, cpu-xl. GPU: t4-small, t4-medium, l4x1, a10g-small, a10g-large, a100-large, h100. Use with 'run'/'scheduled run'.",
+                "description": f"Hardware type. Available CPU flavors: {CPU_FLAVORS}. Available GPU flavors: {GPU_FLAVORS}. Use with 'run'/'scheduled run'.",
             },
             "timeout": {
                 "type": "string",
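The `_filter_uv_install_output` hunk in jobs_tool.py hinges entirely on its regex matching only the UV install summary line. A standalone check of that pattern (the regex is copied from the diff; the sample log lines are made up for illustration):

```python
# Standalone check of the summary-line regex from _filter_uv_install_output.
# Sample lines are illustrative, not real job output.
import re

install_pattern = re.compile(
    r"^Installed\s+\d+\s+packages?\s+in\s+\d+(?:\.\d+)?\s*(?:ms|s)$"
)

samples = [
    "Installed 68 packages in 251ms",    # plural, integer ms
    "Installed 1 package in 50ms",       # singular "package"
    "Installed 25 packages in 125.5ms",  # decimal duration
    "Resolved 68 packages in 1.01s",     # resolution summary, should NOT match
]
matches = [bool(install_pattern.match(s)) for s in samples]
print(matches)  # → [True, True, True, False]
```

The negative case matters: UV's "Resolved N packages" line precedes the install summary, and the filter deliberately truncates it along with the per-package download noise.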
agent/tools/utilities.py CHANGED
@@ -2,8 +2,10 @@
 Utility functions for Hugging Face tools
 
 Ported from: hf-mcp-server/packages/mcp/src/jobs/formatters.ts
+Includes GPU memory validation for job submissions
 """
 
+import json
 from datetime import datetime
 from typing import Any, Dict, List, Optional
 
@@ -126,7 +128,6 @@ def format_scheduled_jobs_table(jobs: List[Dict[str, Any]]) -> str:
 
 def format_job_details(jobs: Any) -> str:
     """Format job details as JSON in a markdown code block"""
-    import json
 
     job_array = jobs if isinstance(jobs, list) else [jobs]
     json_str = json.dumps(job_array, indent=2)
@@ -135,7 +136,6 @@ def format_job_details(jobs: Any) -> str:
 
 def format_scheduled_job_details(jobs: Any) -> str:
     """Format scheduled job details as JSON in a markdown code block"""
-    import json
 
     job_array = jobs if isinstance(jobs, list) else [jobs]
     json_str = json.dumps(job_array, indent=2)
agent/tools/utils_tools.py CHANGED
@@ -4,14 +4,9 @@ Utils Tools - General utility operations
 Provides system information like current date/time with timezone support.
 """
 
-import asyncio
+import zoneinfo
 from datetime import datetime
-from typing import Any, Dict, Literal, Optional
-
-try:
-    import zoneinfo
-except ImportError:
-    from backports import zoneinfo
+from typing import Any, Dict, Literal
 
 from agent.tools.types import ToolResult
 
@@ -123,7 +118,9 @@ Common timezones: Europe/Paris, America/New_York, America/Los_Angeles, Asia/Toky
     date_str = now.strftime("%d-%m-%Y")
 
     # Format time as HH:MM:SS.mmm
-    time_str = now.strftime("%H:%M:%S.%f")[:-3]  # Remove last 3 digits to keep only milliseconds
+    time_str = now.strftime("%H:%M:%S.%f")[
+        :-3
+    ]  # Remove last 3 digits to keep only milliseconds
 
     # Get timezone abbreviation/offset
     tz_offset = now.strftime("%z")
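The utils_tools.py hunk above only reflows the millisecond truncation across lines; its behavior is easy to confirm with a fixed timestamp (the timestamp below is made up for reproducibility):

```python
# Check of the millisecond formatting used in utils_tools.py:
# strftime("%f") yields six microsecond digits, so dropping the last
# three characters leaves milliseconds.
from datetime import datetime

now = datetime(2024, 5, 1, 13, 7, 9, 123456)  # fixed, illustrative timestamp
time_str = now.strftime("%H:%M:%S.%f")[:-3]
print(time_str)  # → 13:07:09.123
```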
configs/main_agent_config.json CHANGED
@@ -1,5 +1,7 @@
 {
-  "model_name": "anthropic/claude-sonnet-4-5-20250929",
+  "model_name": "anthropic/claude-opus-4-5-20251101",
+  "save_sessions": true,
+  "session_dataset_repo": "smolagents/hf-agent-sessions",
   "mcpServers": {
     "hf-mcp-server": {
       "transport": "http",
pyproject.toml CHANGED
@@ -5,22 +5,41 @@ description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.12"
 dependencies = [
-    "numpy>=1.24.0",
-    "requests>=2.32.5",
+    "datasets>=4.4.1",
+    # Core dependencies (always required)
     "pydantic>=2.12.3",
-    "litellm>=1.0.0",
-    "tenacity>=8.0.0",
-    "pandas>=2.3.3",
     "python-dotenv>=1.2.1",
-    "datasets>=4.3.0",
+]
+
+[project.optional-dependencies]
+# Agent runtime dependencies
+agent = [
+    "requests>=2.32.5",
+    "litellm>=1.0.0",
     "huggingface-hub>=1.0.1",
     "fastmcp>=2.4.0",
+    "lmnr>=0.7.23",  # Note: Using base package to avoid torch/transformers from [all] extra
+    "prompt-toolkit>=3.0.0",
+    "thefuzz>=0.22.1",
+    "nbconvert>=7.16.6",
+    "nbformat>=5.10.4",
+    "datasets>=4.3.0",  # For session logging to HF datasets
+]
+
+# Evaluation/benchmarking dependencies
+eval = [
     "inspect-ai>=0.3.149",
-    "lmnr[all]>=0.7.23",
-    "transformers>=2.3.0",
-    "torch>=2.9.1",
+    "pandas>=2.3.3",
+    "datasets>=4.3.0",
+    "tenacity>=8.0.0",
+]
+
+# Development and testing dependencies
+dev = [
     "pytest>=9.0.2",
-    "prompt-toolkit>=3.0.0",
-    "ipykernel>=7.1.0",
-    "ipywidgets>=8.1.8",
+]
+
+# All dependencies (agent + eval + dev)
+all = [
+    "hf-agent[agent,eval,dev]",
 ]
tests/unit/tools/test_jobs_tool.py CHANGED
@@ -452,3 +452,86 @@ async def test_list_jobs_with_status_filter():
     assert "job-3" in result["formatted"]
     assert "job-1" not in result["formatted"]
     assert result["resultsShared"] == 1
+
+
+def test_filter_uv_install_output():
+    """Test filtering of UV package installation output"""
+    from agent.tools.jobs_tool import _filter_uv_install_output
+
+    # Test case 1: Logs with UV installation output
+    logs_with_install = [
+        "Resolved 68 packages in 1.01s",
+        "Installed 68 packages in 251ms",
+        "Hello from the script!",
+        "Script execution completed",
+    ]
+
+    filtered = _filter_uv_install_output(logs_with_install)
+    assert len(filtered) == 4
+    assert filtered[0] == "[installs truncated]"
+    assert filtered[1] == "Installed 68 packages in 251ms"
+    assert filtered[2] == "Hello from the script!"
+    assert filtered[3] == "Script execution completed"
+
+    # Test case 2: Logs without UV installation output
+    logs_without_install = [
+        "Script started",
+        "Processing data...",
+        "Done!",
+    ]
+
+    filtered = _filter_uv_install_output(logs_without_install)
+    assert len(filtered) == 3
+    assert filtered == logs_without_install
+
+    # Test case 3: Empty logs
+    assert _filter_uv_install_output([]) == []
+
+    # Test case 4: Different time formats (ms vs s)
+    logs_with_seconds = [
+        "Downloading packages...",
+        "Installed 10 packages in 2s",
+        "Running main.py",
+    ]
+
+    filtered = _filter_uv_install_output(logs_with_seconds)
+    assert len(filtered) == 3
+    assert filtered[0] == "[installs truncated]"
+    assert filtered[1] == "Installed 10 packages in 2s"
+    assert filtered[2] == "Running main.py"
+
+    # Test case 5: Single package
+    logs_single_package = [
+        "Resolving dependencies",
+        "Installed 1 package in 50ms",
+        "Import successful",
+    ]
+
+    filtered = _filter_uv_install_output(logs_single_package)
+    assert len(filtered) == 3
+    assert filtered[0] == "[installs truncated]"
+    assert filtered[1] == "Installed 1 package in 50ms"
+    assert filtered[2] == "Import successful"
+
+    # Test case 6: Decimal time values
+    logs_decimal_time = [
+        "Starting installation",
+        "Installed 25 packages in 125.5ms",
+        "All dependencies ready",
+    ]
+
+    filtered = _filter_uv_install_output(logs_decimal_time)
+    assert len(filtered) == 3
+    assert filtered[0] == "[installs truncated]"
+    assert filtered[1] == "Installed 25 packages in 125.5ms"
+    assert filtered[2] == "All dependencies ready"
+
+    # Test case 7: "Installed" line is first (no truncation needed)
+    logs_install_first = [
+        "Installed 5 packages in 100ms",
+        "Running script...",
+    ]
+
+    filtered = _filter_uv_install_output(logs_install_first)
+    # No truncation message if "Installed" is the first line
+    assert filtered == logs_install_first
uv.lock CHANGED
The diff for this file is too large to render. See raw diff