# AI Audio Detector - FastAPI

FastAPI-based REST API for detecting AI-generated vs human speech.

## Features

- 🚀 Fast and efficient inference with PyTorch
- 🎯 Base64 encoded audio support
- 📁 Direct file upload support
- 🔒 CORS enabled for cross-origin requests
- 📚 Auto-generated API documentation (Swagger UI)
- 🐳 Docker support

## Installation

### Option 1: Local Installation

```bash
# Install dependencies
pip install -r requirements_api.txt

# Run the API
python api.py
```

The API will be available at `http://localhost:8000`.

### Option 2: Using uvicorn directly

```bash
pip install -r requirements_api.txt

# --reload restarts the server on code changes; omit it in production
uvicorn api:app --host 0.0.0.0 --port 8000 --reload
```

### Option 3: Docker

```bash
# Build the Docker image
docker build -f Dockerfile.api -t ai-audio-detector-api .

# Run the container
docker run -p 8000:8000 ai-audio-detector-api
```

## API Endpoints

### 1. Health Check

```bash
GET /health
```

Returns API status and model information.

**Response:**

```json
{
  "status": "healthy",
  "device": "cuda",
  "model_loaded": true,
  "threshold": 0.5
}
```

### 2. Detect from Base64

```bash
POST /detect/base64
```

Detects AI voice from base64 encoded audio.

**Request Body:**

```json
{
  "audio_base64": "base64_encoded_audio_string",
  "audio_format": "mp3",
  "language": "English",
  "threshold": 0.5
}
```

**Response:**

```json
{
  "classification": "AI",
  "confidence": 0.8523,
  "explanation": "High confidence AI detection...",
  "language": "English",
  "threshold_used": 0.5
}
```

### 3. Detect from File Upload

```bash
POST /detect/file
```

Detects AI voice from an uploaded audio file.

**Form Data:**

- `file`: Audio file (mp3, wav, etc.)
- `language`: Language (optional, default: "English")
- `threshold`: Detection threshold (optional)

**Response:**

```json
{
  "classification": "Human",
  "confidence": 0.2341,
  "explanation": "High confidence human detection...",
  "language": "English",
  "threshold_used": 0.5
}
```

## Usage Examples

### Python with requests

```python
import base64

import requests

# Health check
response = requests.get("http://localhost:8000/health")
print(response.json())

# Base64 detection
with open("audio.mp3", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode()

payload = {
    "audio_base64": audio_base64,
    "audio_format": "mp3",
    "language": "English"
}
response = requests.post("http://localhost:8000/detect/base64", json=payload)
print(response.json())

# File upload (the request is sent while the file is still open)
with open("audio.mp3", "rb") as f:
    files = {"file": f}
    response = requests.post("http://localhost:8000/detect/file", files=files)
print(response.json())
```

### cURL

```bash
# Health check
curl http://localhost:8000/health

# File upload
curl -X POST "http://localhost:8000/detect/file" \
  -F "file=@audio.mp3" \
  -F "language=English"

# Base64 (create a single-line base64 string first;
# GNU base64 wraps its output at 76 columns, so strip the newlines)
base64 < audio.mp3 | tr -d '\n' > audio_base64.txt
curl -X POST "http://localhost:8000/detect/base64" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_base64": "'"$(cat audio_base64.txt)"'",
    "audio_format": "mp3",
    "language": "English"
  }'
```

### JavaScript/Fetch

```javascript
// File upload (audioFile is a File object, e.g. from an <input type="file">)
const formData = new FormData();
formData.append('file', audioFile);
formData.append('language', 'English');

fetch('http://localhost:8000/detect/file', {
  method: 'POST',
  body: formData
})
  .then(response => response.json())
  .then(data => console.log(data));

// Base64
const reader = new FileReader();
reader.onload = function() {
  // Strip the "data:...;base64," prefix from the data URL
  const base64Audio = reader.result.split(',')[1];
  fetch('http://localhost:8000/detect/base64', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({
      audio_base64: base64Audio,
      audio_format: 'mp3',
      language: 'English'
    })
  })
    .then(response => response.json())
    .then(data => console.log(data));
};
reader.readAsDataURL(audioFile);
```

## Testing

Run the test script:

```bash
python test_api.py
```

Or visit the interactive API documentation at:

- Swagger UI: `http://localhost:8000/docs`
- ReDoc: `http://localhost:8000/redoc`

## Model Files

Place your trained model files in the same directory as `api.py`:

- `best_model.pt` - Trained model checkpoint
- `optimal_threshold.txt` - Optimal detection threshold

If these files are not found, the API will use randomly initialized heads and a default threshold of 0.5. Predictions from randomly initialized heads are not meaningful, so provide a trained checkpoint before relying on the results.

## Configuration

Edit the constants at the top of `api.py` to adjust:

- Audio processing parameters
- Model architecture settings
- Ensemble weights
- Detection threshold

## Deployment

### Production with Gunicorn

```bash
pip install gunicorn
gunicorn api:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```

### Hugging Face Spaces

1. Create a new Space with Docker SDK
2. Upload: `api.py`, `requirements_api.txt`, `Dockerfile.api`
3. Add model files: `best_model.pt`, `optimal_threshold.txt`
4. The API will start automatically

## License

MIT License
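
## Appendix: Threshold Fallback Sketch

The Model Files section says the API falls back to a default threshold of 0.5 when `optimal_threshold.txt` is missing. The snippet below is a minimal sketch of how such a fallback could look, not the actual code in `api.py`. The function name `load_threshold`, the constant `DEFAULT_THRESHOLD`, the assumption that the file holds a single plain-text float, and the range check are all illustrative.

```python
from pathlib import Path

# Default stated in the Model Files section; the constant name is hypothetical.
DEFAULT_THRESHOLD = 0.5


def load_threshold(path="optimal_threshold.txt", default=DEFAULT_THRESHOLD):
    """Return the threshold stored in `path`, or `default` if it is unusable.

    Assumption: the file contains a single float written as plain text.
    """
    try:
        value = float(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or non-numeric file: fall back to the default.
        return default
    # Assumption: the threshold is a probability, so reject values outside [0, 1].
    # NaN also fails this comparison and falls back to the default.
    if not 0.0 <= value <= 1.0:
        return default
    return value
```

Calling `load_threshold()` once at startup and reporting the result in the `/health` response would make it easy to confirm which threshold is in effect.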