
AI Audio Detector - FastAPI

A FastAPI-based REST API that classifies speech audio as AI-generated or human.

Features

  • 🚀 Fast and efficient inference with PyTorch
  • 🎯 Base64 encoded audio support
  • 📁 Direct file upload support
  • 🔒 CORS enabled for cross-origin requests
  • 📚 Auto-generated API documentation (Swagger UI)
  • 🐳 Docker support

Installation

Option 1: Local Installation

# Install dependencies
pip install -r requirements_api.txt

# Run the API
python api.py

The API will be available at http://localhost:8000

Option 2: Using uvicorn directly

pip install -r requirements_api.txt
# --reload restarts the server on code changes; omit it outside development
uvicorn api:app --host 0.0.0.0 --port 8000 --reload

Option 3: Docker

# Build the Docker image
docker build -f Dockerfile.api -t ai-audio-detector-api .

# Run the container
docker run -p 8000:8000 ai-audio-detector-api

API Endpoints

1. Health Check

GET /health

Returns API status and model information.

Response:

{
  "status": "healthy",
  "device": "cuda",
  "model_loaded": true,
  "threshold": 0.5
}
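A client can poll /health before sending audio, so that requests are only made once the model is reported as loaded. The following is a minimal sketch using only the standard library. The helper names fetch_health and wait_until_ready, the retry count, and the delay are illustrative assumptions and are not part of the API.

```python
import json
import time
import urllib.request


def fetch_health(base_url="http://localhost:8000"):
    """Call GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(fetch=fetch_health, attempts=10, delay=1.0):
    """Poll the health endpoint until it reports a loaded model.

    `fetch` is injectable so the loop can be exercised without a server.
    """
    for _ in range(attempts):
        try:
            health = fetch()
            if health.get("status") == "healthy" and health.get("model_loaded"):
                return health
        except OSError:
            # Connection refused or timed out: the server may still be starting.
            pass
        time.sleep(delay)
    raise TimeoutError("API did not report a loaded model in time")
```

Calling wait_until_ready() with no arguments polls http://localhost:8000/health up to ten times, one second apart.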

2. Detect from Base64

POST /detect/base64

Detects whether base64-encoded audio contains AI-generated speech.

Request Body:

{
  "audio_base64": "base64_encoded_audio_string",
  "audio_format": "mp3",
  "language": "English",
  "threshold": 0.5
}

Response:

{
  "classification": "AI",
  "confidence": 0.8523,
  "explanation": "High confidence AI detection...",
  "language": "English",
  "threshold_used": 0.5
}
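The request body for this endpoint can be assembled from raw audio bytes. The sketch below is one way to do it with the standard library. The function name build_detect_payload is hypothetical, and the check that threshold lies between 0 and 1 is an assumption about what the API accepts.

```python
import base64


def build_detect_payload(audio_bytes, audio_format="mp3", language="English", threshold=None):
    """Build the JSON body for POST /detect/base64 from raw audio bytes."""
    payload = {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
        "language": language,
    }
    # Only send a threshold when the caller sets one, so the server
    # can otherwise apply its own default.
    if threshold is not None:
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1 (assumed range)")
        payload["threshold"] = threshold
    return payload
```

The returned dictionary can be passed directly as the json argument of requests.post.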

3. Detect from File Upload

POST /detect/file

Detects whether an uploaded audio file contains AI-generated speech.

Form Data:

  • file: Audio file (mp3, wav, etc.)
  • language: Language (optional, default: "English")
  • threshold: Detection threshold (optional)

Response:

{
  "classification": "Human",
  "confidence": 0.2341,
  "explanation": "High confidence human detection...",
  "language": "English",
  "threshold_used": 0.5
}
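The two sample responses suggest that confidence is the estimated probability that the audio is AI-generated, with the label chosen by comparing it against the threshold: 0.8523 gives "AI" and 0.2341 gives "Human" at a threshold of 0.5. That reading is an assumption drawn from the examples, not something confirmed by api.py. Under it, the decision rule could be sketched as follows. The function name classify is hypothetical, and whether a score exactly equal to the threshold counts as "AI" is also a guess.

```python
def classify(ai_probability, threshold=0.5):
    """Label a score as "AI" or "Human" by comparing it with the threshold.

    Assumes `ai_probability` is the probability that the audio is
    AI-generated, which matches the sample responses but is not confirmed.
    """
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("ai_probability must be between 0 and 1")
    return "AI" if ai_probability >= threshold else "Human"
```

Under this rule, raising the threshold makes "AI" verdicts rarer, and lowering it makes them more common.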

Usage Examples

Python with requests

import requests
import base64

# Health check
response = requests.get("http://localhost:8000/health")
print(response.json())

# Base64 detection
with open("audio.mp3", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode()

payload = {
    "audio_base64": audio_base64,
    "audio_format": "mp3",
    "language": "English"
}
response = requests.post("http://localhost:8000/detect/base64", json=payload)
print(response.json())

# File upload
with open("audio.mp3", "rb") as f:
    files = {"file": f}
    response = requests.post("http://localhost:8000/detect/file", files=files)
print(response.json())

cURL

# Health check
curl http://localhost:8000/health

# File upload
curl -X POST "http://localhost:8000/detect/file" \
  -F "file=@audio.mp3" \
  -F "language=English"

# Base64 (encode first; strip newlines so the JSON string stays on one line)
base64 < audio.mp3 | tr -d '\n' > audio_base64.txt
curl -X POST "http://localhost:8000/detect/base64" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_base64": "'"$(cat audio_base64.txt)"'",
    "audio_format": "mp3",
    "language": "English"
  }'

JavaScript/Fetch

// File upload (audioFile is a File or Blob, e.g. from an <input type="file"> element)
const formData = new FormData();
formData.append('file', audioFile);
formData.append('language', 'English');

fetch('http://localhost:8000/detect/file', {
  method: 'POST',
  body: formData
})
.then(response => response.json())
.then(data => console.log(data));

// Base64
const reader = new FileReader();
reader.onload = function() {
  const base64Audio = reader.result.split(',')[1];
  
  fetch('http://localhost:8000/detect/base64', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({
      audio_base64: base64Audio,
      audio_format: 'mp3',
      language: 'English'
    })
  })
  .then(response => response.json())
  .then(data => console.log(data));
};
reader.readAsDataURL(audioFile);

Testing

Run the test script:

python test_api.py

Or visit the interactive API documentation at:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
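The contents of test_api.py are not shown here. As an illustration of the kind of check a test could make, the sketch below validates a detection response against the fields documented under API Endpoints. The function name check_detection_response is hypothetical, and the assumptions that classification is always "AI" or "Human" and that confidence lies in [0, 1] are inferred from the sample responses.

```python
NUMBER = (int, float)

# Field names and types taken from the documented sample responses.
EXPECTED_FIELDS = {
    "classification": str,
    "confidence": NUMBER,
    "explanation": str,
    "language": str,
    "threshold_used": NUMBER,
}


def check_detection_response(data):
    """Return a list of problems found in a detection response (empty if none)."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif not isinstance(data[name], expected_type):
            problems.append(f"wrong type for {name}: {type(data[name]).__name__}")
    # Assumed label set, based on the two sample responses.
    if data.get("classification") not in (None, "AI", "Human"):
        problems.append(f"unexpected classification: {data['classification']!r}")
    confidence = data.get("confidence")
    if isinstance(confidence, NUMBER) and not 0 <= confidence <= 1:
        problems.append("confidence outside [0, 1]")
    return problems
```

A test can then assert that check_detection_response(response.json()) returns an empty list.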

Model Files

Place your trained model files in the same directory as api.py:

  • best_model.pt - Trained model checkpoint
  • optimal_threshold.txt - Optimal detection threshold

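If these files are not found, the API falls back to randomly initialized heads and a default threshold of 0.5. It will still start and respond, but predictions from untrained heads are not meaningful, so check that both files are present before relying on the results.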
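The threshold fallback described here could look roughly like the sketch below. It assumes optimal_threshold.txt holds a single number as plain text, which this README does not confirm, and the function name load_threshold is hypothetical rather than taken from api.py.

```python
from pathlib import Path

DEFAULT_THRESHOLD = 0.5


def load_threshold(path="optimal_threshold.txt", default=DEFAULT_THRESHOLD):
    """Read the detection threshold from a text file, or fall back to the default.

    Assumes the file contains one floating-point number, e.g. "0.37".
    """
    try:
        return float(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing or unreadable file, or contents that are not a number.
        return default
```

Falling back silently keeps the API running, but logging the fallback would make a missing file easier to notice.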

Configuration

Edit the constants at the top of api.py to adjust:

  • Audio processing parameters
  • Model architecture settings
  • Ensemble weights
  • Detection threshold
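How api.py combines its model heads is not documented here. As a generic illustration of what ensemble weights usually control, the sketch below computes a weighted average of per-head probabilities. The function name weighted_ensemble and the example values are hypothetical and may not match the actual implementation.

```python
def weighted_ensemble(probabilities, weights):
    """Combine per-head probabilities into one score by weighted average."""
    if len(probabilities) != len(weights):
        raise ValueError("probabilities and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total keeps the result in [0, 1] even when the
    # weights do not sum to 1.
    return sum(p * w for p, w in zip(probabilities, weights)) / total
```

With scores 0.9 and 0.5 and weights 3 and 1, the combined score is about 0.8, so the more heavily weighted head dominates the result.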

Deployment

Production with Gunicorn

pip install gunicorn
gunicorn api:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
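The same options can be kept in a Gunicorn configuration file, which is itself a Python module. The sketch below mirrors the command above. The timeout value is an assumption added for illustration, on the guess that inference on long clips may exceed Gunicorn's 30-second default.

```python
# gunicorn.conf.py: picked up automatically when Gunicorn starts in this directory

# Address and port to listen on (same as --bind 0.0.0.0:8000).
bind = "0.0.0.0:8000"

# Number of worker processes (same as -w 4).
workers = 4

# Run the ASGI app through Uvicorn's Gunicorn worker (same as -k ...).
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds a worker may stay silent before being restarted.
# Assumed value; tune to the longest expected inference time.
timeout = 120
```

With this file in place, the server can be started with gunicorn api:app. Each worker process loads its own copy of the model unless preloading is enabled, so memory use grows with the worker count.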

Hugging Face Spaces

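  1. Create a new Space with the Docker SDK
  2. Upload api.py, requirements_api.txt, and Dockerfile.api. Docker Spaces build from a file named Dockerfile, so rename Dockerfile.api to Dockerfile
  3. Add the model files: best_model.pt, optimal_threshold.txt
  4. The API starts automatically once the build finishes. Docker Spaces route traffic to port 7860 by default, so if the container listens on port 8000, set app_port: 8000 in the Space's README.md metadata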

License

MIT License