open-navigator / website /docs /development /api-logging-errors.md
jcbowyer's picture
Clean HuggingFace deployment without binary files
61d29fc
metadata
sidebar_position: 5

API Logging & Error Handling Implementation

Summary of Changes

1. Comprehensive Logging Configuration (api/main.py)

Added dual-output logging that appears in both files and container logs:

# Console output (shows in HuggingFace container logs)
logger.add(
    sys.stderr,
    format="<green>{time:YYYY-MM-DD HH:mm:ss}</green> | <level>{level: <8}</level> | ...",
    level=settings.log_level
)

# File output with rotation and retention
logger.add(
    settings.log_file,
    rotation="500 MB",      # New file when size exceeds 500MB
    retention="10 days",    # Auto-delete logs older than 10 days
    level=settings.log_level
)

Benefits:

  • βœ… Logs visible in HuggingFace Spaces container logs
  • βœ… Automatic rotation prevents disk space issues
  • βœ… 10-day retention for compliance and debugging

2. Automatic Request Logging Middleware (api/main.py)

Every API request is automatically logged with:

  • Request method & path
  • Client IP address
  • Response status code
  • Request duration (milliseconds)
  • Response size (bytes)

Example Log Output:

➑️  GET /api/bills?state=ma - Client: 192.168.1.1
βœ… GET /api/bills?state=ma - Status: 200 - Duration: 45.32ms - Size: 12834 bytes
➑️  POST /api/search/ - Client: 10.0.0.5
⚠️  POST /api/search/ - Status: 404 - Duration: 12.45ms
➑️  GET /api/stats - Client: 172.16.0.1
❌ GET /api/stats - Status: 500 - Duration: 234.12ms

3. Startup Data Validation (api/main.py)

API now validates data availability on startup:

================================================================================
πŸš€ STARTING Open Navigator API
================================================================================
Configuration: oral_health.policy_analysis
Log Level: INFO
Log File: logs/oral-health-policy-pulse.log

πŸ“Š VALIDATING DATA AVAILABILITY...
--------------------------------------------------------------------------------
  βœ… reference/jurisdictions_cities.parquet: 19,502 records (2.45 MB)
  βœ… reference/jurisdictions_counties.parquet: 3,143 records (0.34 MB)
  βœ… reference/causes_ntee_codes.parquet: 645 records (0.02 MB)

πŸ“ STATE DATA AVAILABILITY:
  βœ… AL: nonprofits, officials, events
  βœ… AK: nonprofits, officials
  ... and 42 more states

  πŸ“Š Total states with data: 50

================================================================================
βœ… API READY - 3/3 critical files available
================================================================================

Benefits:

  • Catch missing data files before users encounter errors
  • Clear visibility into what data is available
  • Early warning for data pipeline issues

4. Structured Error Responses (api/models/errors.py)

Instead of raw error dumps, users now receive helpful, structured errors:

Old Response (Bad):

{
  "detail": "HTTP Error: HTTP GET error on 'https://huggingface.co/datasets/...' (HTTP 404 Not Found)..."
}

New Response (Good):

{
  "message": "No bills data available for MA",
  "error_type": "data_not_found",
  "technical_details": "Dataset 'CommunityOne/states-ma-bills-bills' not found on HuggingFace.\n\nFull error: HTTP GET error...",
  "suggestions": [
    "Try a different state - we have data for 50+ states",
    "Check /api/bills/map to see which states have bills data",
    "Contact support if you believe this data should be available"
  ],
  "metadata": {
    "dataset": "CommunityOne/states-ma-bills-bills",
    "state": "MA",
    "data_type": "bills"
  }
}

Error Types Handled:

  • data_not_found - Missing datasets or files
  • network_error - Timeouts and connection issues
  • query_error - Invalid SQL/DuckDB queries
  • validation_error - Invalid parameters
  • server_error - Unexpected errors

Updated Endpoints:

  • /api/bills - Bill search
  • /api/bills/sessions - Session list
  • /api/bills/map - Map data
  • /api/search - Unified search
  • /api/search/suggest - Suggestions

Frontend Integration

The frontend can now show errors like this:

// Error response structure
interface ErrorDetail {
  message: string;              // User-friendly message (always show)
  error_type: string;          // Error category
  technical_details?: string;  // Full error (show in expandable section)
  suggestions?: string[];      // Helpful tips
  metadata?: Record<string, any>;
}

// Example usage in React
try {
  const response = await api.get('/api/bills?state=MA');
} catch (error) {
  if (error.response?.data?.message) {
    // Show user-friendly message
    showError(error.response.data.message);
    
    // Option to expand technical details
    if (error.response.data.technical_details) {
      showExpandableDetails(error.response.data.technical_details);
    }
    
    // Show suggestions
    if (error.response.data.suggestions) {
      showSuggestions(error.response.data.suggestions);
    }
  }
}

UI Example:

❌ No bills data available for MA

πŸ’‘ Suggestions:
  β€’ Try a different state - we have data for 50+ states
  β€’ Check /api/bills/map to see which states have bills data

[Show Technical Details β–Ό]

Expanded:

❌ No bills data available for MA

πŸ’‘ Suggestions:
  β€’ Try a different state - we have data for 50+ states
  β€’ Check /api/bills/map to see which states have bills data

[Hide Technical Details β–²]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Dataset 'CommunityOne/states-ma-bills-bills' not found on HuggingFace.

Full error: HTTP GET error on 'https://huggingface.co/datasets/...'

HuggingFace Container Logs

Yes, logs will appear in HuggingFace Spaces container logs!

The logger.add(sys.stderr, ...) configuration ensures all logs are written to stderr, which Docker/HuggingFace captures and displays in the container log console.

You'll see:

  1. βœ… Startup validation logs
  2. βœ… Request/response logs for every API call
  3. βœ… Detailed error logs with context

Configuration

Logging is controlled by environment variables in .env:

# Logging Configuration
LOG_LEVEL=INFO        # DEBUG, INFO, WARNING, ERROR
LOG_FILE=logs/oral-health-policy-pulse.log

Log Rotation:

  • New file created when size exceeds 500 MB
  • Format: oral-health-policy-pulse.log, oral-health-policy-pulse.log.1, etc.

Log Retention:

  • Files older than 10 days are automatically deleted
  • Prevents disk space issues in production

Testing

Test the new error responses:

# Test missing data error
curl https://www.communityone.com/api/bills?state=ZZ

# Expected response:
{
  "message": "No bills data available for ZZ",
  "error_type": "data_not_found",
  "suggestions": ["Try a different state - we have data for 50+ states", ...],
  "metadata": {"state": "ZZ", "data_type": "bills"}
}
# View container logs (HuggingFace Spaces)
# Look for:
# - πŸš€ STARTING... header
# - ➑️ Request logs
# - βœ…/❌ Response logs
# - πŸ“Š Data validation results

Benefits

  1. Better User Experience

    • Clear, actionable error messages
    • Suggestions for next steps
    • Option to see technical details
  2. Easier Debugging

    • All requests logged with timing
    • Structured error context
    • Full stack traces in logs
  3. Production Ready

    • Automatic log rotation
    • Configurable retention
    • Visible in container logs
  4. Compliance

    • Complete audit trail of API usage
    • Automatic cleanup of old logs
    • Configurable log levels