
[dev_251222_01] API Integration Guide

Date: 2025-12-22
Type: 🔨 Development
Status: 🔄 In Progress
Related Dev: N/A (Initial documentation)

Problem Description

As a beginner learning API integration, I needed comprehensive documentation of the GAIA scoring API to understand how to interact with all of its endpoints. The existing code uses only 2 of the 4 available endpoints and lacks the file download functionality that many GAIA questions require.


API Overview

Base URL: https://agents-course-unit4-scoring.hf.space

Purpose: GAIA benchmark evaluation system that provides test questions, accepts agent answers, calculates scores, and maintains leaderboards.

Documentation Format: FastAPI with Swagger UI (OpenAPI specification)

Authentication: None required (public API)

Complete Endpoint Reference

Endpoint 1: GET /questions

Purpose: Retrieve complete list of all GAIA test questions

Request:

import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
response = requests.get(f"{api_url}/questions", timeout=15)
questions = response.json()

Parameters: None

Response Format:

[
  {
    "task_id": "string",
    "question": "string",
    "level": "integer (1-3)",
    "file_name": "string or null",
    "file_path": "string or null",
    ...additional metadata...
  }
]

Response Codes:

  • 200: Success - Returns array of question objects
  • 500: Server error

Key Fields:

  • task_id: Unique identifier for each question (required for submission)
  • question: The actual question text your agent needs to answer
  • level: Difficulty level (1=easy, 2=medium, 3=hard)
  • file_name: Name of attached file if question includes one (null if no file)
  • file_path: Path to file on server (null if no file)
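
The fields above are enough to triage the question list before running an agent. The following is a minimal sketch of that idea, not code from app.py: the helper names `split_by_attachment` and `count_by_level` are hypothetical, and the sample records are made up to match the documented response shape.

```python
from collections import Counter


def split_by_attachment(questions):
    """Split question dicts into (with_file, without_file) using file_name."""
    with_file = [q for q in questions if q.get("file_name")]
    without_file = [q for q in questions if not q.get("file_name")]
    return with_file, without_file


def count_by_level(questions):
    """Count questions per difficulty level."""
    return Counter(q.get("level") for q in questions)


# Made-up sample records shaped like the /questions response
sample_questions = [
    {"task_id": "task_001", "question": "What is 6 x 7?", "level": 1, "file_name": None},
    {"task_id": "task_002", "question": "What is shown in the image?", "level": 2, "file_name": "image.png"},
    {"task_id": "task_003", "question": "What is the total in column B?", "level": 2, "file_name": "data.csv"},
]

with_file, without_file = split_by_attachment(sample_questions)
levels = count_by_level(sample_questions)
print(len(with_file), len(without_file), dict(levels))
```

With real data, `sample_questions` would be replaced by the parsed JSON from GET /questions.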

Current Implementation: ✅ Already implemented in app.py:41-73

Usage in Your Code:

# Existing code location: app.py:54-66
response = requests.get(questions_url, timeout=15)
response.raise_for_status()
questions_data = response.json()

Endpoint 2: GET /random-question

Purpose: Get single random question for testing/debugging

Request:

import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
response = requests.get(f"{api_url}/random-question", timeout=15)
question = response.json()

Parameters: None

Response Format:

{
  "task_id": "string",
  "question": "string",
  "level": "integer",
  "file_name": "string or null",
  "file_path": "string or null"
}

Response Codes:

  • 200: Success - Returns single question object
  • 404: No questions available
  • 500: Server error

Current Implementation: ❌ Not implemented

Use Cases:

  • Quick testing during agent development
  • Debugging specific question types
  • Iterative development without processing all questions

Example Implementation:

import requests


def test_agent_on_random_question(agent):
    """Test agent on a single random question."""
    api_url = "https://agents-course-unit4-scoring.hf.space"
    response = requests.get(f"{api_url}/random-question", timeout=15)

    # 404 means the server currently has no questions to serve
    if response.status_code == 404:
        return "No questions available"

    response.raise_for_status()
    question_data = response.json()

    task_id = question_data.get("task_id")
    question_text = question_data.get("question")

    answer = agent(question_text)
    print(f"Task: {task_id}")
    print(f"Question: {question_text}")
    print(f"Agent Answer: {answer}")

    return answer

Endpoint 3: POST /submit

Purpose: Submit all agent answers for evaluation and receive score

Request:

import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
submission_data = {
    "username": "your-hf-username",
    "agent_code": "https://huggingface.co/spaces/your-space/tree/main",
    "answers": [
        {"task_id": "task_001", "submitted_answer": "42"},
        {"task_id": "task_002", "submitted_answer": "Paris"}
    ]
}

response = requests.post(
    f"{api_url}/submit",
    json=submission_data,
    timeout=60
)
result = response.json()

Request Body Schema:

{
  "username": "string (required)",
  "agent_code": "string (min 10 chars, required)",
  "answers": [
    {
      "task_id": "string (required)",
      "submitted_answer": "string | number | integer (required)"
    }
  ]
}

Field Requirements:

  • username: Your Hugging Face username (obtained from OAuth profile)
  • agent_code: URL to your agent's source code (typically HF Space repo URL)
  • answers: Array of answer objects, one per question
    • task_id: Must match task_id from /questions endpoint
    • submitted_answer: Can be string, integer, or number depending on question
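
The field requirements above can be checked locally before sending anything to the server. This is a minimal sketch under that assumption: `build_submission` is a hypothetical helper (it does not exist in app.py), and it enforces only the constraints listed in the schema (non-empty username, `agent_code` of at least 10 characters, answers that are strings or numbers).

```python
def build_submission(username, agent_code, answers_by_task):
    """Build a /submit payload from a {task_id: answer} mapping.

    Raises ValueError when a documented constraint is violated.
    """
    username = (username or "").strip()
    if not username:
        raise ValueError("username is required")
    if not isinstance(agent_code, str) or len(agent_code) < 10:
        raise ValueError("agent_code must be a string of at least 10 characters")

    answers = []
    for task_id, answer in answers_by_task.items():
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(answer, bool) or not isinstance(answer, (str, int, float)):
            raise ValueError(
                f"unsupported answer type for {task_id}: {type(answer).__name__}"
            )
        answers.append({"task_id": task_id, "submitted_answer": answer})

    return {"username": username, "agent_code": agent_code, "answers": answers}


payload = build_submission(
    " your-hf-username ",
    "https://huggingface.co/spaces/your-space/tree/main",
    {"task_001": "42", "task_002": 7},
)
print(payload)
```

Failing early with a `ValueError` avoids spending a 60-second request on a payload the server would reject anyway.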

Response Format:

{
  "username": "string",
  "score": 85.0,
  "correct_count": 17,
  "total_attempted": 20,
  "message": "Submission successful!",
  "timestamp": "2025-12-22T10:30:00.123Z"
}

Response Codes:

  • 200: Success - Returns score and statistics
  • 400: Invalid input (missing fields, wrong format)
  • 404: One or more task_ids not found
  • 500: Server error

Current Implementation: ✅ Already implemented in app.py:112-161

Usage in Your Code:

# Existing code location: app.py:112-135
submission_data = {
    "username": username.strip(),
    "agent_code": agent_code,
    "answers": answers_payload,
}
response = requests.post(submit_url, json=submission_data, timeout=60)
response.raise_for_status()
result_data = response.json()

Important Notes:

  • Timeout set to 60 seconds (longer than /questions because scoring takes time)
  • All answers must be submitted together in single request
  • Score is calculated immediately and returned in response
  • Results also update the public leaderboard

Endpoint 4: GET /files/{task_id}

Purpose: Download files attached to questions (images, PDFs, data files, etc.)

Request:

import requests

api_url = "https://agents-course-unit4-scoring.hf.space"
task_id = "task_001"
response = requests.get(f"{api_url}/files/{task_id}", timeout=30)
response.raise_for_status()  # Avoid saving an error body as if it were the file

# Save file to disk
with open(f"downloaded_{task_id}.file", "wb") as f:
    f.write(response.content)

Parameters:

  • task_id (string, required, path parameter): The task_id of the question

Response Format:

  • Binary file content (could be image, PDF, CSV, JSON, etc.)
  • Content-Type header indicates file type
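
Because the Content-Type header indicates the file type, it can be turned into a file extension with the standard library instead of a hand-written table. A minimal sketch, assuming the header may carry parameters such as a charset; `extension_for` is a hypothetical helper name and `.file` is the generic fallback used elsewhere in this guide.

```python
import mimetypes


def extension_for(content_type, default=".file"):
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=utf-8" before the lookup
    mime_type = (content_type or "").split(";")[0].strip().lower()
    if not mime_type:
        return default
    return mimetypes.guess_extension(mime_type) or default


print(extension_for("image/png"))
print(extension_for("application/pdf; charset=binary"))
print(extension_for(""))
```

The exact extension returned for some types can vary between Python versions, so the result is best treated as a hint rather than a guarantee.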

Response Codes:

  • 200: Success - Returns file content
  • 403: Access denied (path traversal attempt blocked)
  • 404: Task not found OR task has no associated file
  • 500: Server error

Current Implementation: ❌ Not implemented - THIS IS A CRITICAL GAP

Why This Matters: Many GAIA questions include attached files that contain essential information for answering the question. Without downloading these files, your agent cannot answer those questions correctly.

Detection Logic:

# Check if question has an attached file
question_data = {
    "task_id": "task_001",
    "question": "What is shown in the image?",
    "file_name": "image.png",      # Not null = file exists
    "file_path": "/files/task_001" # Path to file
}

has_file = question_data.get("file_name") is not None

Example Implementation:

import os

import requests


def download_task_file(task_id, save_dir="input/"):
    """Download the file associated with a task_id; return its path or None."""
    api_url = "https://agents-course-unit4-scoring.hf.space"
    file_url = f"{api_url}/files/{task_id}"

    try:
        response = requests.get(file_url, timeout=30)
        response.raise_for_status()

        # Determine file extension from Content-Type or use generic.
        # Strip parameters such as "; charset=utf-8" before the lookup.
        content_type = response.headers.get('Content-Type', '')
        mime_type = content_type.split(';')[0].strip().lower()
        extension_map = {
            'image/png': '.png',
            'image/jpeg': '.jpg',
            'application/pdf': '.pdf',
            'text/csv': '.csv',
            'application/json': '.json',
        }
        extension = extension_map.get(mime_type, '.file')

        # Save file, creating the target directory if it does not exist
        os.makedirs(save_dir, exist_ok=True)
        file_path = os.path.join(save_dir, f"{task_id}{extension}")
        with open(file_path, 'wb') as f:
            f.write(response.content)

        print(f"Downloaded file for {task_id}: {file_path}")
        return file_path

    except requests.exceptions.HTTPError as e:
        # 404 covers both an unknown task and a task without a file
        if e.response.status_code == 404:
            print(f"No file found for task {task_id}")
            return None
        raise

Integration Example:

# Enhanced agent workflow
for item in questions_data:
    task_id = item.get("task_id")
    question_text = item.get("question")
    file_name = item.get("file_name")

    # Download file if question has one
    file_path = None
    if file_name:
        file_path = download_task_file(task_id)

    # Pass both question and file to agent
    answer = agent(question_text, file_path=file_path)

API Request Flow Diagram

Student Agent Workflow:
┌─────────────────────────────────────────────────────────────┐
│ 1. Fetch Questions                                          │
│    GET /questions                                           │
│    → Receive list of all questions with metadata            │
└────────────────────┬────────────────────────────────────────┘
                     ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. Process Each Question                                    │
│    For each question:                                       │
│    a) Check if file_name exists                             │
│    b) If yes: GET /files/{task_id}                          │
│       → Download and save file                              │
│    c) Pass question + file to agent                         │
│    d) Agent generates answer                                │
└────────────────────┬────────────────────────────────────────┘
                     ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. Submit All Answers                                       │
│    POST /submit                                             │
│    → Send username, agent_code, and all answers             │
│    → Receive score and statistics                           │
└─────────────────────────────────────────────────────────────┘

Error Handling Best Practices

Connection Errors

try:
    response = requests.get(url, timeout=15)
    response.raise_for_status()
except requests.exceptions.Timeout:
    print("Request timed out")
except requests.exceptions.ConnectionError:
    print("Network connection error")
except requests.exceptions.HTTPError as e:
    print(f"HTTP error: {e.response.status_code}")

Response Validation

# Always validate response format
response = requests.get(questions_url, timeout=15)
response.raise_for_status()

try:
    data = response.json()
except requests.exceptions.JSONDecodeError:
    # Body was not valid JSON (for example, an HTML error page)
    data = None
    print("Invalid JSON response")
    print(f"Response text: {response.text[:500]}")

Timeout Recommendations

  • GET /questions: 15 seconds (fetching list)
  • GET /random-question: 15 seconds (single question)
  • GET /files/{task_id}: 30 seconds (file download may be larger)
  • POST /submit: 60 seconds (scoring all answers takes time)
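
One way to keep these recommendations in a single place is a small lookup table. This is a sketch only: `TIMEOUTS`, `DEFAULT_TIMEOUT`, and `timeout_for` are hypothetical names, and the fallback value for unlisted endpoints is an assumption rather than something the API documents.

```python
# Timeouts in seconds, keyed by (HTTP method, path template)
TIMEOUTS = {
    ("GET", "/questions"): 15,
    ("GET", "/random-question"): 15,
    ("GET", "/files/{task_id}"): 30,
    ("POST", "/submit"): 60,
}
DEFAULT_TIMEOUT = 15  # assumption: fall back to the shortest recommended value


def timeout_for(method, path):
    """Return the recommended timeout for an endpoint."""
    # Treat any /files/<id> path as the /files/{task_id} template
    template = "/files/{task_id}" if path.startswith("/files/") else path
    return TIMEOUTS.get((method.upper(), template), DEFAULT_TIMEOUT)


print(timeout_for("GET", "/files/task_001"))
print(timeout_for("POST", "/submit"))
```

A call site would then read `requests.get(url, timeout=timeout_for("GET", "/questions"))`.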

Current Implementation Status

✅ Implemented Endpoints

  1. GET /questions - Fully implemented in app.py:41-73
  2. POST /submit - Fully implemented in app.py:112-161

❌ Missing Endpoints

  1. GET /random-question - Not implemented (useful for testing)
  2. GET /files/{task_id} - Not implemented (CRITICAL - many questions need files)

🚨 Critical Gap Analysis

Impact of Missing /files Endpoint:

  • Questions with attached files cannot be answered correctly
  • Agent will only see question text, not the actual content to analyze
  • Significantly reduces potential score on GAIA benchmark

Example Questions That Need Files:

  • "What is shown in this image?" → Needs image file
  • "What is the total in column B?" → Needs spreadsheet file
  • "Summarize this document" → Needs PDF/text file
  • "What patterns do you see in this data?" → Needs CSV/JSON file

Estimated Impact:

  • GAIA benchmark: ~30-40% of questions include files
  • Without file handling: Maximum achievable score ~60-70%
  • With file handling: Can potentially achieve 100%
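
The arithmetic behind these estimates can be checked against the actual question list. The sketch below assumes the score is the percentage of correct answers and that every question with a file is answered incorrectly when files are ignored; the function names and the sample data are hypothetical.

```python
def file_share(questions):
    """Fraction of questions that have an attached file (file_name set)."""
    if not questions:
        return 0.0
    with_file = sum(1 for q in questions if q.get("file_name"))
    return with_file / len(questions)


def max_score_without_files(questions):
    """Upper bound on the score (in percent) if every file question is missed."""
    return round(100.0 * (1.0 - file_share(questions)), 1)


# Made-up example: 7 of 20 questions carry a file
sample = [{"file_name": "f.png"}] * 7 + [{"file_name": None}] * 13
share = file_share(sample)
ceiling = max_score_without_files(sample)
print(share, ceiling)
```

Running the same two functions on the real output of GET /questions would replace the rough 30-40% estimate with a measured value.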

Next Steps for Implementation

Priority 1: Add File Download Support

  1. Detect questions with files (check file_name field)
  2. Download files using GET /files/{task_id}
  3. Save files to input/ directory
  4. Modify BasicAgent to accept file_path parameter
  5. Update agent logic to process files
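
Steps 4 and 5 could start from an agent whose call signature accepts an optional file. The class below is a placeholder sketch, not the real `BasicAgent` from app.py (whose body is not shown in this guide): it only demonstrates the `file_path` parameter and returns a description instead of a real answer.

```python
import os


class BasicAgent:
    """Placeholder agent that accepts an optional file alongside the question."""

    def __call__(self, question, file_path=None):
        context = self._describe_file(file_path) if file_path else "no attachment"
        # A real agent would reason over the question and the file contents here
        return f"Received question ({len(question)} chars) with {context}"

    @staticmethod
    def _describe_file(file_path):
        """Summarise the attachment so the agent can choose how to process it."""
        if not os.path.isfile(file_path):
            return f"missing attachment {file_path}"
        extension = os.path.splitext(file_path)[1].lower() or "(no extension)"
        return f"attachment of type {extension}, {os.path.getsize(file_path)} bytes"


agent = BasicAgent()
print(agent("What is 2+2?"))
print(agent("What is shown in the image?", file_path="input/does_not_exist.png"))
```

Keeping `file_path` optional means existing calls of the form `agent(question_text)` continue to work.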

Priority 2: Add Testing Endpoint

  1. Implement GET /random-question for quick testing
  2. Create test script in test/ directory
  3. Enable iterative development without full evaluation runs

Priority 3: Enhanced Error Handling

  1. Add retry logic for network failures
  2. Validate file downloads (check file size, type)
  3. Handle partial failures gracefully
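
For the retry logic in step 1, a common approach is exponential backoff. The following is a minimal sketch of that technique, not an existing part of app.py: `retry` is a hypothetical helper, the delay schedule is an assumption, and the demonstration uses a stand-in function so it runs without network access.

```python
import time


def retry(operation, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


# Offline demonstration with a stand-in operation that fails twice
calls = {"count": 0}
delays = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


result = retry(flaky, attempts=3, base_delay=0.5, retry_on=(ConnectionError,), sleep=delays.append)
print(result, delays)
```

With the requests library, `operation` could be a lambda wrapping `requests.get(...)` and `retry_on` could be `(requests.exceptions.ConnectionError, requests.exceptions.Timeout)`, so that HTTP errors such as 404 are not retried.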

How to Read FastAPI Swagger Documentation

Understanding the Swagger UI

FastAPI APIs use Swagger UI for interactive documentation. Here's how to read it systematically:

Main UI Components

1. Header Section

Agent Evaluation API  [0.1.0] [OAS 3.1]
/openapi.json

What you learn:

  • API Name: Service identification
  • Version: 0.1.0 - API version (important for tracking changes)
  • OAS 3.1: OpenAPI Specification standard version
  • Link: /openapi.json - raw machine-readable specification
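
The machine-readable specification can also be inspected from code. This sketch lists the operations found under the `paths` key of an OpenAPI document; `list_operations` is a hypothetical helper, and the `sample_spec` fragment (including its summaries) is hand-written for illustration rather than copied from the real /openapi.json.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec):
    """Return sorted (METHOD, path, summary) tuples from an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                operations.append((method.upper(), path, operation.get("summary", "")))
    return sorted(operations, key=lambda op: (op[1], op[0]))


# Hand-written fragment for illustration only
sample_spec = {
    "openapi": "3.1.0",
    "info": {"title": "Agent Evaluation API", "version": "0.1.0"},
    "paths": {
        "/questions": {"get": {"summary": "Get Questions"}},
        "/submit": {"post": {"summary": "Submit"}},
        "/files/{task_id}": {"get": {"summary": "Get Task File"}},
    },
}

operations = list_operations(sample_spec)
for method, path, summary in operations:
    print(f"{method:6} {path}  {summary}")
```

To run it against the live API, `sample_spec` would be replaced by the parsed JSON from the /openapi.json link.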

2. API Description

High-level summary of what the service provides

3. Endpoints Section (Expandable List)

HTTP Method Colors:

  • Blue "GET" = Retrieve/fetch data (read-only, safe to call multiple times)
  • Green "POST" = Submit/create data (writes data, may change state)
  • Orange "PUT" = Update existing data
  • Red "DELETE" = Remove data

Each endpoint shows:

  • Path (URL structure)
  • Short description
  • Click to expand for details

4. Expanded Endpoint Details

When you click an endpoint, you get:

Section A: Description

  • Detailed explanation of functionality
  • Use cases and purpose

Section B: Parameters

  • Path Parameters: Variables in URL like /files/{task_id}
  • Query Parameters: Key-value pairs after ? like ?level=1&limit=10
  • Each parameter shows:
    • Name
    • Type (string, integer, boolean, etc.)
    • Required vs Optional
    • Description
    • Example values
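
The difference between path and query parameters is easiest to see when building a URL by hand. A minimal sketch using only the standard library; `build_url` is a hypothetical helper, and the `level`/`limit` query string is the generic illustration from the list above, not a parameter set this API is documented to accept.

```python
from urllib.parse import quote, urlencode

API_URL = "https://agents-course-unit4-scoring.hf.space"


def build_url(base_url, path_template, path_params=None, query_params=None):
    """Fill {placeholders} in the path and append an encoded query string."""
    # Percent-encode every path value, including "/", so it stays one segment
    encoded = {name: quote(str(value), safe="") for name, value in (path_params or {}).items()}
    url = base_url.rstrip("/") + path_template.format(**encoded)
    if query_params:
        url += "?" + urlencode(query_params)
    return url


# Path parameter: the value becomes part of the URL path
print(build_url(API_URL, "/files/{task_id}", path_params={"task_id": "task_001"}))

# Query parameters: key-value pairs appended after "?" (generic example host)
print(build_url("https://example.com", "/items", query_params={"level": 1, "limit": 10}))
```

When calling `requests.get`, query parameters are normally passed through its `params` argument, which performs the same encoding.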

Section C: Request Body (POST/PUT only)

  • JSON structure to send
  • Field names and types
  • Required vs optional fields
  • Example payload
  • Schema button shows structure

Section D: Responses

  • Status codes (200, 400, 404, 500)
  • Response structure for each code
  • Example responses
  • What each status means

Section E: Try It Out Button

  • Test API directly in browser
  • Fill parameters and send real requests
  • See actual responses

5. Schemas Section (Bottom)

Reusable data structures used across endpoints:

Schemas
  ├─ AnswerItem
  ├─ ErrorResponse
  ├─ ScoreResponse
  └─ Submission

Click each to see:

  • All fields in the object
  • Field types and constraints
  • Required vs optional
  • Descriptions
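
The same field information is available under `components.schemas` in the OpenAPI document. The sketch below extracts field types and required flags for one schema; `describe_schema` is a hypothetical helper, and the `Submission` fragment is hand-written from the request body documented in this guide, so the real schema may contain more detail.

```python
def describe_schema(spec, name):
    """Return {field: (type, required)} for one schema under components.schemas."""
    schema = spec["components"]["schemas"][name]
    required = set(schema.get("required", []))
    return {
        field: (details.get("type", "unspecified"), field in required)
        for field, details in schema.get("properties", {}).items()
    }


# Hand-written fragment for illustration only
sample_spec = {
    "components": {
        "schemas": {
            "Submission": {
                "type": "object",
                "required": ["username", "agent_code", "answers"],
                "properties": {
                    "username": {"type": "string"},
                    "agent_code": {"type": "string", "minLength": 10},
                    "answers": {"type": "array"},
                },
            }
        }
    }
}

fields = describe_schema(sample_spec, "Submission")
for field, (field_type, is_required) in fields.items():
    print(field, field_type, "required" if is_required else "optional")
```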

Step-by-Step: Reading One Endpoint

Example: POST /submit

Step 1: Click the endpoint to expand

Step 2: Read description "Submit agent answers, calculate scores, and update leaderboard"

Step 3: Check Parameters

  • Path parameters? None (URL is just /submit)
  • Query parameters? None

Step 4: Check Request Body

{
  "username": "string (required)",
  "agent_code": "string, min 10 chars (required)",
  "answers": [
    {
      "task_id": "string (required)",
      "submitted_answer": "string | number | integer (required)"
    }
  ]
}

Step 5: Check Responses

200 Success:

{
  "username": "string",
  "score": 85.0,
  "correct_count": 17,
  "total_attempted": 20,
  "message": "Success!"
}

Other codes:

  • 400: Invalid input
  • 404: Task ID not found
  • 500: Server error

Step 6: Write Python code

url = "https://agents-course-unit4-scoring.hf.space/submit"
payload = {
    "username": "your-username",
    "agent_code": "https://...",
    "answers": [
        {"task_id": "task_001", "submitted_answer": "42"}
    ]
}
response = requests.post(url, json=payload, timeout=60)
result = response.json()

Information Extraction Checklist

For each endpoint, extract:

Basic Info:

  • HTTP method (GET, POST, PUT, DELETE)
  • Endpoint path (URL)
  • One-line description

Request Details:

  • Path parameters (variables in URL)
  • Query parameters (after ? in URL)
  • Request body structure (POST/PUT)
  • Required vs optional fields
  • Data types and constraints

Response Details:

  • Success response structure (200)
  • Success response example
  • All possible status codes
  • Error response structures
  • What each status code means

Additional Info:

  • Authentication requirements
  • Rate limits
  • Example requests
  • Related schemas

Pro Tips

Tip 1: Start with GET endpoints. They are simpler (no request body) and safe to test.

Tip 2: Use the "Try it out" button. It is the best way to learn: send real requests and see the responses.

Tip 3: Check the Schemas section. Understanding schemas helps decode complex structures.

Tip 4: Copy examples. Most Swagger UIs include example values, so use them.

Tip 5: Know required vs optional fields. A missing required field is rejected with a client error: 400 according to this API's documentation, although FastAPI's built-in request validation returns 422 by default.

Tip 6: Read error responses. They tell you what went wrong and how to fix it.

Practice Exercise

Try reading GET /files/{task_id}:

  1. What HTTP method? → GET
  2. What's the path parameter? → task_id (string, required)
  3. What does it return? → File content (binary)
  4. What status codes? → 200, 403, 404, 500
  5. Python code? → requests.get(f"{api_url}/files/{task_id}")

Learning Resources

Understanding REST APIs:

  • REST = Representational State Transfer
  • APIs communicate using HTTP methods: GET (retrieve), POST (submit), PUT (update), DELETE (remove)
  • Data typically exchanged in JSON format

Key Concepts:

  • Endpoint: Specific URL path that performs one action (/questions, /submit)
  • Request: Data you send to the API (parameters, body)
  • Response: Data the API sends back (JSON, files, status codes)
  • Status Codes:
    • 200 = Success
    • 400 = Bad request (your input was wrong)
    • 404 = Not found
    • 500 = Server error
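
Status codes fall into ranges, which is often more useful to branch on than individual values. A small sketch using the standard library's `http.HTTPStatus`; `describe_status` is a hypothetical helper name.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '404 Not Found (client error)'."""
    if 200 <= code < 300:
        category = "success"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # code is not a registered HTTP status
        phrase = "Unknown"
    return f"{code} {phrase} ({category})"


for code in (200, 400, 404, 500):
    print(describe_status(code))
```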

Python Requests Library:

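import requests

api_url = "https://agents-course-unit4-scoring.hf.space"

# GET request - retrieve data (add params={"key": "value"} for query parameters)
response = requests.get(f"{api_url}/questions", timeout=15)

# Always check status
response.raise_for_status()  # Raises HTTPError if status >= 400

# Parse JSON response
data = response.json()

# POST request - submit data as a JSON body
payload = {
    "username": "your-hf-username",
    "agent_code": "https://huggingface.co/spaces/your-space/tree/main",
    "answers": [{"task_id": "task_001", "submitted_answer": "42"}],
}
response = requests.post(f"{api_url}/submit", json=payload, timeout=60)
response.raise_for_status()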

Key Decisions

  • Documentation Structure: Organized by endpoint with complete examples for each
  • Learning Approach: Beginner-friendly explanations with code examples
  • Priority Focus: Highlighted critical missing functionality (file downloads)
  • Practical Examples: Included copy-paste ready code snippets

Outcome

Created comprehensive API integration guide documenting all 4 endpoints of the GAIA scoring API, identified critical gap in current implementation (missing file download support), and provided actionable examples for enhancement.

Deliverables:

  • dev/dev_251222_01_api_integration_guide.md - Complete API reference documentation

Changelog

What was changed:

  • Created new documentation file: dev_251222_01_api_integration_guide.md
  • Documented all 4 API endpoints with request/response formats
  • Added code examples for each endpoint
  • Identified critical missing functionality (file downloads)
  • Provided implementation roadmap for enhancements