# [dev_251222_01] API Integration Guide

**Date:** 2025-12-22
**Type:** 🔨 Development
**Status:** 🔄 In Progress
**Related Dev:** N/A (Initial documentation)

## Problem Description

As a beginner learning API integration, I needed comprehensive documentation of the GAIA scoring API to understand how to interact with each endpoint. The existing code uses only 2 of the 4 available endpoints and lacks the file download functionality that many GAIA questions require.

---

## API Overview

**Base URL:** `https://agents-course-unit4-scoring.hf.space`

**Purpose:** GAIA benchmark evaluation system that provides test questions, accepts agent answers, calculates scores, and maintains leaderboards.

**Documentation Format:** FastAPI with Swagger UI (OpenAPI specification)

**Authentication:** None required (public API)

## Complete Endpoint Reference

### Endpoint 1: GET /questions

**Purpose:** Retrieve complete list of all GAIA test questions

**Request:**

```python
import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
response = requests.get(f"{api_url}/questions", timeout=15)
response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body
questions = response.json()
```

**Parameters:** None

**Response Format:**

```json
[
  {
    "task_id": "string",
    "question": "string",
    "level": "integer (1-3)",
    "file_name": "string or null",
    "file_path": "string or null",
    ...additional metadata...
  }
]
```

**Response Codes:**

- 200: Success - Returns array of question objects
- 500: Server error

**Key Fields:**

- `task_id`: Unique identifier for each question (required for submission)
- `question`: The actual question text your agent needs to answer
- `level`: Difficulty level (1=easy, 2=medium, 3=hard)
- `file_name`: Name of attached file if question includes one (null if no file)
- `file_path`: Path to file on server (null if no file)
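
To get a feel for the question list before running an agent, the fields above can be tallied. The sketch below is a minimal example, assuming each question is a dict shaped like the response format above and that `level` is an integer as documented. `summarize_questions` is a hypothetical helper, not part of app.py, and it treats an empty `file_name` the same as null.

```python
from collections import Counter


def summarize_questions(questions):
    """Count questions per level and collect task_ids that have a file."""
    level_counts = Counter(q.get("level") for q in questions)
    with_files = [q["task_id"] for q in questions if q.get("file_name")]
    return dict(level_counts), with_files


# Hand-written sample data in the documented shape (not real API output).
sample = [
    {"task_id": "task_001", "question": "...", "level": 1, "file_name": None},
    {"task_id": "task_002", "question": "...", "level": 2, "file_name": "data.csv"},
    {"task_id": "task_003", "question": "...", "level": 1, "file_name": ""},
]
levels, with_files = summarize_questions(sample)
print(levels)      # {1: 2, 2: 1}
print(with_files)  # ['task_002']
```

In practice the `sample` list would be replaced by the parsed response from GET /questions.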

**Current Implementation:** ✅ Already implemented in app.py:41-73

**Usage in Your Code:**

```python
# Existing code location: app.py:54-66
response = requests.get(questions_url, timeout=15)
response.raise_for_status()
questions_data = response.json()
```

---

### Endpoint 2: GET /random-question

**Purpose:** Get single random question for testing/debugging

**Request:**

```python
import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
response = requests.get(f"{api_url}/random-question", timeout=15)
question = response.json()
```

**Parameters:** None

**Response Format:**

```json
{
  "task_id": "string",
  "question": "string",
  "level": "integer",
  "file_name": "string or null",
  "file_path": "string or null"
}
```

**Response Codes:**

- 200: Success - Returns single question object
- 404: No questions available
- 500: Server error

**Current Implementation:** ❌ Not implemented

**Use Cases:**

- Quick testing during agent development
- Debugging specific question types
- Iterative development without processing all questions

**Example Implementation:**

```python
import requests


def test_agent_on_random_question(agent):
    """Test agent on a single random question"""
    api_url = "https://agents-course-unit4-scoring.hf.space"
    response = requests.get(f"{api_url}/random-question", timeout=15)

    if response.status_code == 404:
        return "No questions available"

    response.raise_for_status()
    question_data = response.json()

    task_id = question_data.get("task_id")
    question_text = question_data.get("question")

    answer = agent(question_text)
    print(f"Task: {task_id}")
    print(f"Question: {question_text}")
    print(f"Agent Answer: {answer}")

    return answer
```

---

### Endpoint 3: POST /submit

**Purpose:** Submit all agent answers for evaluation and receive score

**Request:**

```python
import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
submission_data = {
    "username": "your-hf-username",
    "agent_code": "https://huggingface.co/spaces/your-space/tree/main",
    "answers": [
        {"task_id": "task_001", "submitted_answer": "42"},
        {"task_id": "task_002", "submitted_answer": "Paris"}
    ]
}

response = requests.post(
    f"{api_url}/submit",
    json=submission_data,
    timeout=60
)
result = response.json()
```

**Request Body Schema:**

```json
{
  "username": "string (required)",
  "agent_code": "string (min 10 chars, required)",
  "answers": [
    {
      "task_id": "string (required)",
      "submitted_answer": "string | number | integer (required)"
    }
  ]
}
```

**Field Requirements:**

- `username`: Your Hugging Face username (obtained from OAuth profile)
- `agent_code`: URL to your agent's source code (typically HF Space repo URL)
- `answers`: Array of answer objects, one per question
  - `task_id`: Must match task_id from /questions endpoint
  - `submitted_answer`: Can be string, integer, or number depending on question
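
The field requirements above can be checked locally before sending anything, which avoids a round trip that ends in a 400. The following is a minimal sketch, assuming answers are collected in a dict keyed by `task_id`. `build_submission` is a hypothetical helper, and the rule that at least one answer must be present is an assumption rather than something the schema states.

```python
def build_submission(username, agent_code, answers_by_task):
    """Assemble a /submit payload and check the documented constraints."""
    username = username.strip()
    if not username:
        raise ValueError("username is required")
    if len(agent_code) < 10:
        raise ValueError("agent_code must be at least 10 characters")
    # Assumption: submitting an empty answer list is never intended.
    if not answers_by_task:
        raise ValueError("at least one answer is required")

    answers = [
        {"task_id": task_id, "submitted_answer": answer}
        for task_id, answer in answers_by_task.items()
    ]
    return {"username": username, "agent_code": agent_code, "answers": answers}


payload = build_submission(
    " your-hf-username ",
    "https://huggingface.co/spaces/your-space/tree/main",
    {"task_001": "42", "task_002": "Paris"},
)
print(len(payload["answers"]))  # 2
```

The returned dict can be passed directly as the `json=` argument of `requests.post`.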

**Response Format:**

```json
{
  "username": "string",
  "score": 85.0,
  "correct_count": 17,
  "total_attempted": 20,
  "message": "Submission successful!",
  "timestamp": "2025-12-22T10:30:00.123Z"
}
```

**Response Codes:**

- 200: Success - Returns score and statistics
- 400: Invalid input (missing fields, wrong format)
- 404: One or more task_ids not found
- 500: Server error

**Current Implementation:** ✅ Already implemented in app.py:112-161

**Usage in Your Code:**

```python
# Existing code location: app.py:112-135
submission_data = {
    "username": username.strip(),
    "agent_code": agent_code,
    "answers": answers_payload,
}
response = requests.post(submit_url, json=submission_data, timeout=60)
response.raise_for_status()
result_data = response.json()
```

**Important Notes:**

- Timeout set to 60 seconds (longer than /questions because scoring takes time)
- All answers must be submitted together in single request
- Score is calculated immediately and returned in response
- Results also update the public leaderboard

---

### Endpoint 4: GET /files/{task_id}

**Purpose:** Download files attached to questions (images, PDFs, data files, etc.)

**Request:**

```python
import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
task_id = "task_001"
response = requests.get(f"{api_url}/files/{task_id}", timeout=30)
response.raise_for_status()  # avoid writing an error body to disk

# Save file to disk
with open(f"downloaded_{task_id}.file", "wb") as f:
    f.write(response.content)
```

**Parameters:**

- `task_id` (string, required, path parameter): The task_id of the question

**Response Format:**

- Binary file content (could be image, PDF, CSV, JSON, etc.)
- Content-Type header indicates file type
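
Since the Content-Type header is the only type hint in the response itself, one option is to let the standard library translate it into a file extension. This is a sketch under assumptions: `extension_for` is a hypothetical helper, the `.bin` fallback is an arbitrary choice, and the header values the server actually sends have not been verified here.

```python
import mimetypes


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime_type = content_type.split(";")[0].strip().lower()
    if not mime_type:
        return default
    return mimetypes.guess_extension(mime_type) or default


print(extension_for("image/png"))                      # .png
print(extension_for("application/pdf"))                # .pdf
print(extension_for("application/x-not-a-real-type"))  # .bin
```

When the question metadata includes `file_name`, its extension is likely a more reliable hint than the header.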

**Response Codes:**

- 200: Success - Returns file content
- 403: Access denied (path traversal attempt blocked)
- 404: Task not found OR task has no associated file
- 500: Server error

**Current Implementation:** ❌ Not implemented. This is a critical gap.

**Why This Matters:**
Many GAIA questions include attached files that contain essential information for answering the question. Without downloading these files, your agent cannot answer those questions correctly.

**Detection Logic:**

```python
# Check if question has an attached file
question_data = {
    "task_id": "task_001",
    "question": "What is shown in the image?",
    "file_name": "image.png",      # Not null = file exists
    "file_path": "/files/task_001" # Path to file
}

has_file = bool(question_data.get("file_name"))  # None and "" both mean no file
```

**Example Implementation:**

```python
import os

import requests


def download_task_file(task_id, save_dir="input/"):
    """Download the file associated with a task_id; return its path or None."""
    api_url = "https://agents-course-unit4-scoring.hf.space"
    file_url = f"{api_url}/files/{task_id}"

    try:
        response = requests.get(file_url, timeout=30)
        response.raise_for_status()
    except requests.exceptions.HTTPError as e:
        # 404 means the task is unknown or has no file; other errors propagate
        if e.response is not None and e.response.status_code == 404:
            print(f"No file found for task {task_id}")
            return None
        raise

    # Determine file extension from Content-Type, ignoring parameters such as
    # "; charset=utf-8"; fall back to a generic extension
    content_type = response.headers.get("Content-Type", "")
    mime_type = content_type.split(";")[0].strip().lower()
    extension_map = {
        "image/png": ".png",
        "image/jpeg": ".jpg",
        "application/pdf": ".pdf",
        "text/csv": ".csv",
        "application/json": ".json",
    }
    extension = extension_map.get(mime_type, ".file")

    # Save file, creating the target directory if it does not exist yet
    os.makedirs(save_dir, exist_ok=True)
    file_path = os.path.join(save_dir, f"{task_id}{extension}")
    with open(file_path, "wb") as f:
        f.write(response.content)

    print(f"Downloaded file for {task_id}: {file_path}")
    return file_path
```

**Integration Example:**

```python
# Enhanced agent workflow
for item in questions_data:
    task_id = item.get("task_id")
    question_text = item.get("question")
    file_name = item.get("file_name")

    # Download file if question has one
    file_path = None
    if file_name:
        file_path = download_task_file(task_id)

    # Pass both question and file to agent
    answer = agent(question_text, file_path=file_path)
```

---

## API Request Flow Diagram

```
Student Agent Workflow:
┌─────────────────────────────────────────────────────────────┐
│ 1. Fetch Questions                                          │
│    GET /questions                                           │
│    → Receive list of all questions with metadata           │
└────────────────────┬────────────────────────────────────────┘
                     ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. Process Each Question                                    │
│    For each question:                                       │
│    a) Check if file_name exists                            │
│    b) If yes: GET /files/{task_id}                         │
│       → Download and save file                             │
│    c) Pass question + file to agent                        │
│    d) Agent generates answer                               │
└────────────────────┬────────────────────────────────────────┘
                     ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. Submit All Answers                                       │
│    POST /submit                                             │
│    → Send username, agent_code, and all answers            │
│    → Receive score and statistics                          │
└─────────────────────────────────────────────────────────────┘
```
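
Step 2 of the diagram can be written so that the agent and the downloader are passed in as arguments, which makes the loop testable without network access. This is a minimal sketch: `run_evaluation`, `fake_agent`, and `fake_download` are hypothetical names, and the agent is assumed to accept a `file_path` keyword.

```python
def run_evaluation(questions, agent, download_file):
    """Answer every question, downloading the attached file when present.

    `agent(question, file_path=None)` and `download_file(task_id)` are
    injected callables, so the loop can be exercised offline.
    """
    answers = []
    for item in questions:
        task_id = item.get("task_id")
        question_text = item.get("question")
        if not task_id or question_text is None:
            continue  # skip malformed entries
        file_path = download_file(task_id) if item.get("file_name") else None
        answer = agent(question_text, file_path=file_path)
        answers.append({"task_id": task_id, "submitted_answer": answer})
    return answers


# Stand-ins for the real agent and downloader (for illustration only).
def fake_agent(question, file_path=None):
    return f"answer to {question!r} using {file_path}"


def fake_download(task_id):
    return f"input/{task_id}.bin"


questions = [
    {"task_id": "task_001", "question": "What is 6 x 7?", "file_name": None},
    {"task_id": "task_002", "question": "What is shown?", "file_name": "image.png"},
]
answers = run_evaluation(questions, fake_agent, fake_download)
print(answers[1]["submitted_answer"])
```

The returned list has the shape expected by the `answers` field of POST /submit.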

## Error Handling Best Practices

### Connection Errors

```python
try:
    response = requests.get(url, timeout=15)
    response.raise_for_status()
except requests.exceptions.Timeout:
    print("Request timed out")
except requests.exceptions.ConnectionError:
    print("Network connection error")
except requests.exceptions.HTTPError as e:
    print(f"HTTP error: {e.response.status_code}")
```

### Response Validation

```python
# Always validate response format
response = requests.get(questions_url, timeout=15)
response.raise_for_status()

try:
    data = response.json()
except requests.exceptions.JSONDecodeError:
    print("Invalid JSON response")
    print(f"Response text: {response.text[:500]}")
```

### Timeout Recommendations

- GET /questions: 15 seconds (fetching list)
- GET /random-question: 15 seconds (single question)
- GET /files/{task_id}: 30 seconds (file download may be larger)
- POST /submit: 60 seconds (scoring all answers takes time)
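
Keeping these values in one place avoids scattering magic numbers through the code. The sketch below assumes a hypothetical `timeout_for` helper and a fallback of 15 seconds for any endpoint not listed above; neither comes from app.py.

```python
# Timeouts in seconds, keyed by (method, path template), from the list above.
TIMEOUTS = {
    ("GET", "/questions"): 15,
    ("GET", "/random-question"): 15,
    ("GET", "/files/{task_id}"): 30,
    ("POST", "/submit"): 60,
}
DEFAULT_TIMEOUT = 15  # assumption: fall back to the shortest documented value


def timeout_for(method, path):
    """Return the recommended timeout for an endpoint."""
    if path.startswith("/files/"):
        path = "/files/{task_id}"  # collapse concrete task ids onto the template
    return TIMEOUTS.get((method.upper(), path), DEFAULT_TIMEOUT)


print(timeout_for("get", "/files/task_001"))  # 30
print(timeout_for("POST", "/submit"))         # 60
```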

## Current Implementation Status

### ✅ Implemented Endpoints

1. **GET /questions** - Fully implemented in app.py:41-73
2. **POST /submit** - Fully implemented in app.py:112-161

### ❌ Missing Endpoints

1. **GET /random-question** - Not implemented (useful for testing)
2. **GET /files/{task_id}** - Not implemented (CRITICAL - many questions need files)

### 🚨 Critical Gap Analysis

**Impact of Missing /files Endpoint:**

- Questions with attached files cannot be answered correctly
- Agent will only see question text, not the actual content to analyze
- Significantly reduces potential score on GAIA benchmark

**Example Questions That Need Files:**

- "What is shown in this image?" → Needs image file
- "What is the total in column B?" → Needs spreadsheet file
- "Summarize this document" → Needs PDF/text file
- "What patterns do you see in this data?" → Needs CSV/JSON file

**Estimated Impact:**

- GAIA benchmark: ~30-40% of questions include files
- Without file handling: Maximum achievable score ~60-70%
- With file handling: Can potentially achieve 100%
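
The ceiling quoted above follows from simple arithmetic: if a fraction of the questions cannot be answered at all, the best possible score is 100% minus that fraction. The sketch below makes this explicit, assuming every question carries equal weight and that the 30-40% share is this document's estimate rather than a measured value. `max_score_without_files` is a hypothetical name.

```python
def max_score_without_files(file_fraction):
    """Upper bound on the score (in percent) if every file question is missed."""
    if not 0.0 <= file_fraction <= 1.0:
        raise ValueError("file_fraction must be between 0 and 1")
    return round(100.0 * (1.0 - file_fraction), 1)


for fraction in (0.30, 0.40):
    print(fraction, max_score_without_files(fraction))
```

This is an upper bound only; it also assumes the agent answers every question without a file correctly.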

## Next Steps for Implementation

### Priority 1: Add File Download Support

1. Detect questions with files (check `file_name` field)
2. Download files using GET /files/{task_id}
3. Save files to input/ directory
4. Modify BasicAgent to accept file_path parameter
5. Update agent logic to process files
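
Step 4 mostly comes down to a signature change. A minimal sketch of what that could look like follows; the class body is a placeholder rather than the actual `BasicAgent` from app.py, and reporting the file size stands in for real file processing.

```python
from pathlib import Path


class BasicAgent:
    """Placeholder agent that accepts an optional attached file."""

    def __call__(self, question, file_path=None):
        context = ""
        if file_path is not None:
            path = Path(file_path)
            if path.exists():
                # Placeholder: a real agent would parse the file by type
                # (image, PDF, spreadsheet, ...) instead of reporting its size.
                context = f" [file: {path.name}, {path.stat().st_size} bytes]"
            else:
                context = f" [file missing: {path.name}]"
        return f"default answer{context}"


agent = BasicAgent()
print(agent("What is 6 x 7?"))
print(agent("What is shown in the image?", file_path="input/does_not_exist.png"))
```

Making `file_path` optional keeps existing calls of the form `agent(question_text)` working unchanged.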

### Priority 2: Add Testing Endpoint

1. Implement GET /random-question for quick testing
2. Create test script in test/ directory
3. Enable iterative development without full evaluation runs

### Priority 3: Enhanced Error Handling

1. Add retry logic for network failures
2. Validate file downloads (check file size, type)
3. Handle partial failures gracefully
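
For the retry logic in step 1, one common approach is exponential backoff: wait 1 second after the first failure, 2 after the second, and so on. The sketch below is a minimal example; `call_with_retry` is a hypothetical helper, and the attempt count and delays are arbitrary defaults. The `sleep` parameter is injectable so the example runs instantly.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**n and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demonstration with a stand-in that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

When wrapping `requests` calls, pass `retry_on=(requests.exceptions.ConnectionError, requests.exceptions.Timeout)` explicitly, because those classes do not inherit from the built-in `ConnectionError` and `TimeoutError`.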

## How to Read FastAPI Swagger Documentation

### Understanding the Swagger UI

FastAPI APIs use Swagger UI for interactive documentation. Here's how to read it systematically:

### Main UI Components

#### 1. Header Section

```
Agent Evaluation API  [0.1.0] [OAS 3.1]
/openapi.json
```

**What you learn:**

- **API Name:** Service identification
- **Version:** `0.1.0` - API version (important for tracking changes)
- **OAS 3.1:** OpenAPI Specification standard version
- **Link:** `/openapi.json` - raw machine-readable specification
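
Because `/openapi.json` is machine-readable, the endpoint list can also be extracted programmatically. The sketch below walks the `paths` object that every OpenAPI document contains. `list_operations` is a hypothetical helper, and the `spec` dict is a hand-written fragment, not the real content served by this API.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_operations(spec):
    """Return (METHOD, path, summary) for every operation in an OpenAPI dict."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("summary", "")))
    return sorted(operations, key=lambda op: (op[1], op[0]))


# Hand-written fragment in OpenAPI shape (summaries are illustrative).
spec = {
    "openapi": "3.1.0",
    "info": {"title": "Agent Evaluation API", "version": "0.1.0"},
    "paths": {
        "/questions": {"get": {"summary": "Get all questions"}},
        "/submit": {"post": {"summary": "Submit answers"}},
    },
}
for method, path, summary in list_operations(spec):
    print(f"{method:5} {path:12} {summary}")
```

To run this against the live API, the `spec` dict would come from parsing the JSON at `/openapi.json`.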

#### 2. API Description

High-level summary of what the service provides

#### 3. Endpoints Section (Expandable List)

**HTTP Method Colors:**

- **Blue "GET"** = Retrieve/fetch data (read-only, safe to call multiple times)
- **Green "POST"** = Submit/create data (writes data, may change state)
- **Orange "PUT"** = Update existing data
- **Red "DELETE"** = Remove data

**Each endpoint shows:**

- Path (URL structure)
- Short description
- Click to expand for details

#### 4. Expanded Endpoint Details

When you click an endpoint, you get:

**Section A: Description**

- Detailed explanation of functionality
- Use cases and purpose

**Section B: Parameters**

- **Path Parameters:** Variables in URL like `/files/{task_id}`
- **Query Parameters:** Key-value pairs after `?` like `?level=1&limit=10`
- Each parameter shows:
  - Name
  - Type (string, integer, boolean, etc.)
  - Required vs Optional
  - Description
  - Example values
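
The difference between the two parameter kinds is easiest to see when building a URL by hand. This is a minimal sketch with a hypothetical `build_url` helper. The `/items` path and the `level` and `limit` query parameters are illustrative only; the four endpoints documented in this guide take no query parameters.

```python
from urllib.parse import quote, urlencode

API_URL = "https://agents-course-unit4-scoring.hf.space"


def build_url(path_template, path_params=None, query_params=None):
    """Fill {placeholders} in the path, then append ?key=value pairs."""
    path = path_template
    for name, value in (path_params or {}).items():
        # quote(..., safe="") also escapes "/" so a value cannot add path segments.
        path = path.replace("{" + name + "}", quote(str(value), safe=""))
    url = API_URL + path
    if query_params:
        url += "?" + urlencode(query_params)
    return url


print(build_url("/files/{task_id}", path_params={"task_id": "task_001"}))
# https://agents-course-unit4-scoring.hf.space/files/task_001
print(build_url("/items", query_params={"level": 1, "limit": 10}))
# https://agents-course-unit4-scoring.hf.space/items?level=1&limit=10
```

With `requests`, query parameters are usually passed through the `params=` argument instead, which performs the same encoding.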

**Section C: Request Body** (POST/PUT only)

- JSON structure to send
- Field names and types
- Required vs optional fields
- Example payload
- Schema button shows structure

**Section D: Responses**

- Status codes (200, 400, 404, 500)
- Response structure for each code
- Example responses
- What each status means

**Section E: Try It Out Button**

- Test API directly in browser
- Fill parameters and send real requests
- See actual responses

#### 5. Schemas Section (Bottom)

Reusable data structures used across endpoints:

```
Schemas
  ├─ AnswerItem
  ├─ ErrorResponse
  ├─ ScoreResponse
  └─ Submission
```

Click each to see:

- All fields in the object
- Field types and constraints
- Required vs optional
- Descriptions
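
Schemas map naturally onto typed classes in client code. The sketch below mirrors three of the four schemas as dataclasses, using only the fields shown in this guide. Treating `timestamp` as optional is an assumption, and `ErrorResponse` is omitted because its fields are not documented here.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Union


@dataclass
class AnswerItem:
    task_id: str
    submitted_answer: Union[str, int, float]


@dataclass
class Submission:
    username: str
    agent_code: str
    answers: List[AnswerItem] = field(default_factory=list)


@dataclass
class ScoreResponse:
    username: str
    score: float
    correct_count: int
    total_attempted: int
    message: str
    timestamp: Optional[str] = None  # assumption: treated as optional here


submission = Submission(
    username="your-hf-username",
    agent_code="https://huggingface.co/spaces/your-space/tree/main",
    answers=[AnswerItem("task_001", "42")],
)
payload = asdict(submission)  # nested dataclasses become plain dicts
print(payload["answers"])  # [{'task_id': 'task_001', 'submitted_answer': '42'}]
```

A parsed 200 response could be loaded with `ScoreResponse(**result)`, provided the response contains exactly these keys.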

### Step-by-Step: Reading One Endpoint

**Example: POST /submit**

**Step 1:** Click the endpoint to expand

**Step 2:** Read description
*"Submit agent answers, calculate scores, and update leaderboard"*

**Step 3:** Check Parameters

- Path parameters? None (URL is just `/submit`)
- Query parameters? None

**Step 4:** Check Request Body

```json
{
  "username": "string (required)",
  "agent_code": "string, min 10 chars (required)",
  "answers": [
    {
      "task_id": "string (required)",
      "submitted_answer": "string | number | integer (required)"
    }
  ]
}
```

**Step 5:** Check Responses

**200 Success:**

```json
{
  "username": "string",
  "score": 85.0,
  "correct_count": 17,
  "total_attempted": 20,
  "message": "Success!"
}
```

**Other codes:**

- 400: Invalid input
- 404: Task ID not found
- 500: Server error

**Step 6:** Write Python code

```python
url = "https://agents-course-unit4-scoring.hf.space/submit"
payload = {
    "username": "your-username",
    "agent_code": "https://...",
    "answers": [
        {"task_id": "task_001", "submitted_answer": "42"}
    ]
}
response = requests.post(url, json=payload, timeout=60)
result = response.json()
```

### Information Extraction Checklist

For each endpoint, extract:

**Basic Info:**

- HTTP method (GET, POST, PUT, DELETE)
- Endpoint path (URL)
- One-line description

**Request Details:**

- Path parameters (variables in URL)
- Query parameters (after ? in URL)
- Request body structure (POST/PUT)
- Required vs optional fields
- Data types and constraints

**Response Details:**

- Success response structure (200)
- Success response example
- All possible status codes
- Error response structures
- What each status code means

**Additional Info:**

- Authentication requirements
- Rate limits
- Example requests
- Related schemas

### Pro Tips

**Tip 1: Start with GET endpoints**
Simpler (no request body) and safe to test

**Tip 2: Use "Try it out" button**
Best way to learn - send real requests and see responses

**Tip 3: Check Schemas section**
Understanding schemas helps decode complex structures

**Tip 4: Copy examples**
Most Swagger UIs have example values - use them!

**Tip 5: Required vs Optional**
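Requests with missing required fields are rejected. This API documents 400 for invalid input; FastAPI's built-in validation normally returns 422.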

**Tip 6: Read error responses**
They tell you what went wrong and how to fix it

### Practice Exercise

**Try reading GET /files/{task_id}:**

1. What HTTP method? → GET
2. What's the path parameter? → `task_id` (string, required)
3. What does it return? → File content (binary)
4. What status codes? → 200, 403, 404, 500
5. Python code? → `requests.get(f"{api_url}/files/{task_id}", timeout=30)`

## Learning Resources

**Understanding REST APIs:**

- REST = Representational State Transfer
- APIs communicate using HTTP methods: GET (retrieve), POST (submit), PUT (update), DELETE (remove)
- Data typically exchanged in JSON format

**Key Concepts:**

- **Endpoint:** Specific URL path that performs one action (/questions, /submit)
- **Request:** Data you send to the API (parameters, body)
- **Response:** Data the API sends back (JSON, files, status codes)
- **Status Codes:**
  - 200 = Success
  - 400 = Bad request (your input was wrong)
  - 404 = Not found
  - 500 = Server error

**Python Requests Library:**

```python
# GET request - retrieve data
response = requests.get(url, params={...}, timeout=15)

# POST request - submit data
response = requests.post(url, json={...}, timeout=60)

# Always check status
response.raise_for_status()  # Raises error if status >= 400

# Parse JSON response
data = response.json()
```

---

## Key Decisions

- **Documentation Structure:** Organized by endpoint with complete examples for each
- **Learning Approach:** Beginner-friendly explanations with code examples
- **Priority Focus:** Highlighted critical missing functionality (file downloads)
- **Practical Examples:** Included copy-paste ready code snippets

## Outcome

Created comprehensive API integration guide documenting all 4 endpoints of the GAIA scoring API, identified critical gap in current implementation (missing file download support), and provided actionable examples for enhancement.

**Deliverables:**

- `dev/dev_251222_01_api_integration_guide.md` - Complete API reference documentation

## Changelog

**What was changed:**

- Created new documentation file: dev_251222_01_api_integration_guide.md
- Documented all 4 API endpoints with request/response formats
- Added code examples for each endpoint
- Identified critical missing functionality (file downloads)
- Provided implementation roadmap for enhancements