|
|
---
title: BirdScope AI - MCP Multi-Agent System
emoji: 🐦
colorFrom: green
colorTo: blue
sdk: gradio
python_version: 3.11
app_file: app.py
pinned: false
---
|
|
|
|
|
# 🐦 BirdScope AI - Multi-Agent Bird Identification System

**AI-powered bird identification with specialized MCP agents**

Built for the [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)

---
|
|
|
|
|
## 🎯 Overview

BirdScope AI is a production-ready multi-agent system that combines **Modal GPU classification** with the **Nuthatch species database** to provide comprehensive bird identification and exploration. Users can upload photos, search species, explore taxonomic families, and access rich multimedia content (images, audio recordings, conservation data).

**Two Agent Modes:**

1. **Specialized Subagents (3 Specialists)** - A router orchestrates an image identifier, a species explorer, and a taxonomy specialist
2. **Audio Finder Agent** - A specialized agent for discovering bird audio recordings

---
|
|
|
|
|
## ✨ Features

- 🔍 **Image Classification**: Upload bird photos for instant GPU-powered identification
- 📸 **Reference Images**: High-quality Unsplash photos for each species
- 🎵 **Audio Recordings**: Bird calls and songs from xeno-canto.org
- 🌍 **Conservation Data**: IUCN status and taxonomic information
- 🧠 **Multi-Agent Architecture**: Specialized agents with focused tool subsets
- 📡 **Dual Streaming**: Separate outputs for chat responses and tool execution logs
- 🤖 **Multi-Provider**: OpenAI (GPT-4), Anthropic (Claude), HuggingFace (Qwen)

---
|
|
|
|
|
## 🚀 Quick Start (For Users)

### Option 1: OpenAI (Recommended)

1. Get your OpenAI API key from [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
2. Select **OpenAI** as the provider in the sidebar
3. Enter your API key
4. Model used: `gpt-4o-mini`

### Option 2: Anthropic (Claude)

1. Get your Anthropic API key from [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys)
2. Select **Anthropic** as the provider
3. Enter your API key
4. Model used: `claude-sonnet-4-5`

### Option 3: HuggingFace

⚠️ **Note**: The HuggingFace Inference API has limited function calling support. OpenAI or Anthropic is recommended for full functionality.

---
|
|
|
|
|
## 🛠️ Environment Setup (For Developers)

### Prerequisites

- Python 3.11+
- Modal account (for the GPU classifier)
- Nuthatch API key
- LLM API key (OpenAI, Anthropic, or HuggingFace)

---
|
|
|
|
|
### 💻 Local Development Setup

#### Step 1: Clone and Install

```bash
cd ~/Desktop/hackathon/hackathon_draft

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```
|
|
|
|
|
#### Step 2: Configure Environment Variables

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Edit `.env` with your API keys:

```bash
# ================================================
# REQUIRED: Modal Bird Classifier (GPU)
# ================================================
MODAL_MCP_URL=https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY=your-modal-api-key-here

# ================================================
# REQUIRED: Nuthatch Species Database
# ================================================
NUTHATCH_API_KEY=your-nuthatch-api-key-here
NUTHATCH_BASE_URL=https://nuthatch.lastelm.software/v2  # Default, can omit

# Nuthatch Transport Mode (STDIO or HTTP)
NUTHATCH_USE_STDIO=true  # Recommended for local development

# Only needed if NUTHATCH_USE_STDIO=false:
# NUTHATCH_MCP_URL=http://localhost:8001/mcp
# NUTHATCH_MCP_AUTH_KEY=your-auth-key-here

# ================================================
# LLM Provider (Choose ONE)
# ================================================
# OpenAI (Recommended)
OPENAI_API_KEY=sk-your-openai-key-here
DEFAULT_OPENAI_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.0

# OR Anthropic
# ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here
# DEFAULT_ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
# ANTHROPIC_TEMPERATURE=0.0

# OR HuggingFace (Limited function calling support)
# HF_API_KEY=hf_your-huggingface-token-here
# DEFAULT_HF_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
# HF_TEMPERATURE=0.1
```
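As a sanity check, the required and optional variables above can be validated before launch. A minimal sketch, assuming variable names match the `.env` template; the `check_config` helper is illustrative, not part of the codebase (the app's real loader lives in `langgraph_agent/config.py`):

```python
# Illustrative config check -- a sketch, not the app's actual loader.
import os

REQUIRED = ["MODAL_MCP_URL", "BIRD_CLASSIFIER_API_KEY", "NUTHATCH_API_KEY"]

def check_config(env=None):
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    # Optional variables fall back to the documented defaults
    return {
        "nuthatch_base_url": env.get(
            "NUTHATCH_BASE_URL", "https://nuthatch.lastelm.software/v2"
        ),
        "use_stdio": env.get("NUTHATCH_USE_STDIO", "true").lower() == "true",
    }
```

Failing fast here gives a clearer error than a tool call dying mid-conversation.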
|
|
|
|
|
#### Step 3: Understanding Nuthatch Transport Modes

**STDIO Mode (Recommended for Local):**

- The Nuthatch MCP server runs as a subprocess
- Automatically started by the app
- No separate server process needed
- Set `NUTHATCH_USE_STDIO=true`

**HTTP Mode (Alternative for Local):**

- The Nuthatch MCP server runs as a separate HTTP server
- Useful for debugging or serving multiple clients
- Requires running the server in a separate terminal

To use HTTP mode:

```bash
# Terminal 1: Run the Nuthatch MCP server
python nuthatch_tools.py --http --port 8001

# Terminal 2: Run the app
# Set in .env:
# NUTHATCH_USE_STDIO=false
# NUTHATCH_MCP_URL=http://localhost:8001/mcp
python app.py
```
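The toggle between the two modes boils down to two different connection specs. A hedged sketch of the idea -- the dict layout follows common MCP client conventions rather than a specific API, and the app's actual wiring lives in `mcp_clients.py`:

```python
import os

def nuthatch_server_config(env=None):
    """Return an illustrative MCP connection spec for the Nuthatch server."""
    env = os.environ if env is None else env
    if env.get("NUTHATCH_USE_STDIO", "true").lower() == "true":
        # STDIO: the app spawns nuthatch_tools.py itself as a subprocess
        return {"transport": "stdio", "command": "python", "args": ["nuthatch_tools.py"]}
    # HTTP: connect to the server started separately in Terminal 1
    return {
        "transport": "streamable_http",
        "url": env.get("NUTHATCH_MCP_URL", "http://localhost:8001/mcp"),
    }
```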
|
|
|
|
|
#### Step 4: Run the App

```bash
# With STDIO mode (default, easiest):
python app.py

# Or using the Gradio CLI:
gradio app.py
```

The app will be available at `http://127.0.0.1:7860`.

---
|
|
|
|
|
### ☁️ HuggingFace Spaces Deployment

#### Step 1: Create a New Space

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose:
   - **SDK**: Gradio
   - **Hardware**: CPU Basic (free) or CPU Upgrade (faster)
   - **Visibility**: Public or Private
|
|
|
#### Step 2: Upload Your Code

**Option A: Using `upload_to_space.py` (Recommended)**

```bash
# 1. Install the HuggingFace Hub library (includes the CLI)
pip install huggingface_hub

# 2. Login
huggingface-cli login

# 3. Update upload_to_space.py with your Space name
# Edit the line with repo_id:
# repo_id="YOUR-USERNAME/YOUR-SPACE-NAME"

# 4. Upload
python upload_to_space.py
```

**Option B: Using Git**

```bash
git remote add hf-space https://huggingface.co/spaces/YOUR-USERNAME/YOUR-SPACE-NAME
git push hf-space main
```
|
|
|
|
|
#### Step 3: Configure Secrets in HuggingFace Spaces

⚠️ **CRITICAL**: Spaces use **Secrets**, not `.env` files!

Go to your Space → **Settings** → **Variables and secrets**

**Add these secrets:**

```bash
# REQUIRED: Modal Bird Classifier
MODAL_MCP_URL = https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY = your-modal-api-key-here

# REQUIRED: Nuthatch Species Database
NUTHATCH_API_KEY = your-nuthatch-api-key-here
NUTHATCH_BASE_URL = https://nuthatch.lastelm.software/v2  # Optional
NUTHATCH_USE_STDIO = true  # MUST be "true" for Spaces

# OPTIONAL: Backend-provided LLM keys (users can provide their own)
# Only add these if you want to provide default keys:
# OPENAI_API_KEY = sk-your-key-here
# ANTHROPIC_API_KEY = sk-ant-your-key-here
```

**Important Notes:**

- ✅ **ALWAYS** use `NUTHATCH_USE_STDIO=true` on Spaces (subprocess mode)
- ✅ HTTP mode is not supported on Spaces (port binding restrictions)
- ✅ Users can provide their own LLM keys via the UI
- ✅ Environment variables from Spaces **do not** auto-inherit to subprocesses
- The app explicitly passes `NUTHATCH_API_KEY` and `NUTHATCH_BASE_URL` to the subprocess (see `mcp_clients.py`)
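The subprocess point above can be sketched as follows. This is a hypothetical version of the pass-through (the real code is in `mcp_clients.py`); the idea is that only an explicit allowlist of variables reaches the child process:

```python
import os

# Variables the Nuthatch subprocess needs; Spaces secrets do not
# propagate automatically, so they are copied over explicitly.
PASS_THROUGH = ("NUTHATCH_API_KEY", "NUTHATCH_BASE_URL")

def subprocess_env(parent=None):
    parent = os.environ if parent is None else parent
    env = {"PATH": parent.get("PATH", "")}  # keep the interpreter findable
    for key in PASS_THROUGH:
        if parent.get(key):
            env[key] = parent[key]
    return env
```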
|
|
|
|
|
#### Step 4: Verify Deployment

1. Wait for the Space to build (2-5 minutes)
2. Check the **Logs** tab for errors
3. Try the app - upload a bird photo or ask about species

---
|
|
|
|
|
## 📁 Project Structure

```
hackathon_draft/
├── app.py                    # Main Gradio app
├── upload_to_space.py        # HF Spaces upload script
├── requirements.txt          # Python dependencies
├── .env.example              # Environment template
├── langgraph_agent/
│   ├── __init__.py
│   ├── agents.py             # Agent factory (single/multi-agent)
│   ├── config.py             # Configuration loader
│   ├── mcp_clients.py        # MCP client setup
│   ├── subagent_config.py    # Agent mode definitions
│   ├── prompts.py            # System prompts
│   └── structured_output.py  # Response formatting
├── nuthatch_tools.py         # Nuthatch MCP server
└── agent_cache.py            # Session-based agent caching
```

---
|
|
|
|
|
## 🏗️ Architecture

### MCP Servers

**1. Modal Bird Classifier (GPU)**

- Hosted on Modal (serverless GPU)
- ResNet50 trained on 555 bird species
- Tools: `classify_from_url`, `classify_from_base64`
- Transport: Streamable HTTP

**2. Nuthatch Species Database**

- Species reference API (1000+ birds)
- Tools: `search_birds`, `get_bird_info`, `get_bird_images`, `get_bird_audio`, `search_by_family`, `filter_by_status`, `get_all_families`
- Transport: **STDIO** (subprocess on Spaces), STDIO or HTTP (local)
- Data sources: Unsplash (images), xeno-canto (audio)

### Agent Modes

**Mode 1: Specialized Subagents (3 Specialists)**

- A **router** orchestrates 3 specialized agents:
  1. **Image Identifier**: classify images, show reference photos
  2. **Species Explorer**: search by name, provide multimedia
  3. **Taxonomy Specialist**: conservation status, family search
- Each specialist has a focused tool subset

**Mode 2: Audio Finder Agent**

- A single specialized agent for finding bird audio
- Tools: `search_birds`, `get_bird_info`, `get_bird_audio`
- Optimized workflow for xeno-canto recordings
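The "focused tool subset" idea can be sketched as a simple mode-to-tools mapping. The tool names come from the lists above; the Audio Finder subset is stated in this README, while the Image Identifier, Species Explorer, and Taxonomy Specialist subsets are plausible guesses, and the dict layout itself is illustrative:

```python
# Illustrative mode -> tool-subset mapping (not the actual
# subagent_config.py); only audio_finder's tools are documented exactly.
AGENT_TOOLSETS = {
    "image_identifier": ["classify_from_url", "classify_from_base64", "get_bird_images"],
    "species_explorer": ["search_birds", "get_bird_info", "get_bird_images", "get_bird_audio"],
    "taxonomy_specialist": ["search_by_family", "filter_by_status", "get_all_families"],
    "audio_finder": ["search_birds", "get_bird_info", "get_bird_audio"],
}

def tools_for(mode: str) -> list[str]:
    return AGENT_TOOLSETS.get(mode, [])
```

Limiting each specialist's tools keeps prompts short and reduces the chance of the LLM picking an irrelevant tool.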
|
|
|
|
|
### Tech Stack

- **Frontend**: Gradio 6.0 with custom CSS (cloud/sky theme)
- **Agent Framework**: LangGraph with streaming
- **MCP Integration**: FastMCP client library
- **LLM Support**: OpenAI, Anthropic, HuggingFace
- **Session Management**: In-memory agent caching
- **Output Parsing**: LlamaIndex Pydantic + regex (optimized)

---
|
|
|
|
|
## 🎨 Special Features

### Dual Streaming Output

- **Chat Panel**: LLM responses with markdown rendering
- **Tool Log Panel**: Real-time tool execution traces (inputs/outputs)

### Dynamic Examples

- Examples change based on the selected agent mode
- Photo examples are always visible
- Text examples adapt to Audio Finder vs. Multi-Agent

### Structured Output

- Automatic image/audio URL extraction
- Markdown formatting for media
- xeno-canto audio links (browser-friendly)
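For the browser-friendly audio links, xeno-canto serves the raw audio file at the recording URL plus `/download` (the same suffix the Troubleshooting section checks for). An illustrative helper, not taken from the codebase, that normalizes a recording URL:

```python
def to_download_url(recording_url: str) -> str:
    """Append /download so the link points at the audio file itself."""
    url = recording_url.rstrip("/")
    return url if url.endswith("/download") else url + "/download"
```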
|
|
|
|
|
---
|
|
|
|
|
## 🔑 API Key Sources

| Service | Get Key From | Purpose |
|---------|-------------|---------|
| **Modal** | [modal.com](https://modal.com) | GPU bird classifier |
| **Nuthatch** | [nuthatch.lastelm.software](https://nuthatch.lastelm.software) | Species database |
| **OpenAI** | [platform.openai.com/api-keys](https://platform.openai.com/api-keys) | LLM (recommended) |
| **Anthropic** | [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) | LLM (Claude) |
| **HuggingFace** | [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) | LLM (limited support) |

---
|
|
|
|
|
## 🐛 Troubleshooting

### Space stuck on "Building"

- Check the **Logs** tab for errors
- Verify all required secrets are set
- Try a Factory Reboot (Settings → Factory Reboot)

### "Invalid API key" errors

- Ensure secrets are set correctly (no quotes needed)
- Check that secret names match exactly (case-sensitive)

### HuggingFace provider fails with "function calling not supported"

- The HuggingFace Inference API has limited tool calling
- Use OpenAI or Anthropic instead

### Nuthatch server not starting (local)

- Check that `NUTHATCH_API_KEY` is set in `.env`
- Verify the API key is valid
- Try STDIO mode: `NUTHATCH_USE_STDIO=true`

### Audio links broken

- Check that `AUDIO_FINDER_PROMPT` is working
- Verify xeno-canto URLs include `/download`
- Check the structured output parsing logs

---
|
|
|
|
|
## 📚 Documentation

For detailed implementation docs, see:

- `project_docs/implementation/phase_5_final.md` - Complete agent architecture
- `project_docs/commands_guide/git_spaces_cheatsheet.md` - Deployment guide

---
|
|
|
|
|
## 🙏 Credits

- **Bird Species Data**: [Nuthatch API](https://nuthatch.lastelm.software) by Last Elm Software
- **Bird Audio**: [xeno-canto.org](https://xeno-canto.org) - Community bird recordings
- **Reference Images**: [Unsplash](https://unsplash.com) + curated collections
- **MCP Protocol**: [Anthropic Model Context Protocol](https://github.com/anthropics/mcp)
- **Hackathon**: [HuggingFace MCP-1st-Birthday](https://huggingface.co/MCP-1st-Birthday)

---

## 📄 License

MIT License - Built for educational and research purposes
|
|
|