# AI Audio Detector - FastAPI

FastAPI-based REST API for detecting AI-generated vs human speech.
## Features

- Fast and efficient inference with PyTorch
- Base64 encoded audio support
- Direct file upload support
- CORS enabled for cross-origin requests
- Auto-generated API documentation (Swagger UI)
- Docker support
## Installation

### Option 1: Local Installation

```bash
# Install dependencies
pip install -r requirements_api.txt
# Run the API
python api.py
```

The API will be available at `http://localhost:8000`.
### Option 2: Using uvicorn directly

```bash
pip install -r requirements_api.txt
uvicorn api:app --host 0.0.0.0 --port 8000 --reload
```
### Option 3: Docker

```bash
# Build the Docker image
docker build -f Dockerfile.api -t ai-audio-detector-api .
# Run the container
docker run -p 8000:8000 ai-audio-detector-api
```
## API Endpoints

### 1. Health Check

```bash
GET /health
```

Returns API status and model information.

**Response:**

```json
{
  "status": "healthy",
  "device": "cuda",
  "model_loaded": true,
  "threshold": 0.5
}
```
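A client or deployment script may want to wait for the model to load before sending audio. The following is a minimal sketch, not part of the API itself: `wait_until_ready` and its parameters are hypothetical names, and it assumes that `"status": "healthy"` together with `"model_loaded": true` is the right readiness signal. The HTTP call is passed in as a function so the helper works with any client library.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    fetch_health: Callable[[], dict],
    attempts: int = 10,
    delay_s: float = 1.0,
) -> Optional[dict]:
    """Poll the health payload until the model is reported as loaded.

    Returns the health payload on success, or None if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            health = fetch_health()
        except OSError:
            # The server is not accepting connections yet.
            health = None
        # Assumption: these two fields together signal readiness.
        if health and health.get("status") == "healthy" and health.get("model_loaded"):
            return health
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return None
```

With `requests`, it could be called as `wait_until_ready(lambda: requests.get("http://localhost:8000/health", timeout=5).json())`.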
### 2. Detect from Base64

```bash
POST /detect/base64
```

Detect AI voice from base64 encoded audio.

**Request Body:**

```json
{
  "audio_base64": "base64_encoded_audio_string",
  "audio_format": "mp3",
  "language": "English",
  "threshold": 0.5
}
```

**Response:**

```json
{
  "classification": "AI",
  "confidence": 0.8523,
  "explanation": "High confidence AI detection...",
  "language": "English",
  "threshold_used": 0.5
}
```
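The request body above can be assembled from raw audio bytes with a small helper. This is a sketch only: `build_base64_payload` is a hypothetical name, and the check that `threshold` lies between 0 and 1 is an assumption about what the API accepts. Leaving `threshold` out matches the Python usage example in this README, which omits it.

```python
import base64
from typing import Optional


def build_base64_payload(
    audio_bytes: bytes,
    audio_format: str = "mp3",
    language: str = "English",
    threshold: Optional[float] = None,
) -> dict:
    """Build the JSON body for POST /detect/base64 from raw audio bytes."""
    payload = {
        # The API expects a base64 string, not raw bytes.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
        "language": language,
    }
    if threshold is not None:
        # Assumption: thresholds are probabilities in the range [0, 1].
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        payload["threshold"] = threshold
    return payload
```

The resulting dictionary can be passed directly as `json=payload` to `requests.post`.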
### 3. Detect from File Upload

```bash
POST /detect/file
```

Detect AI voice from an uploaded audio file.

**Form Data:**

- `file`: Audio file (mp3, wav, etc.)
- `language`: Language (optional, default: "English")
- `threshold`: Detection threshold (optional)

**Response:**

```json
{
  "classification": "Human",
  "confidence": 0.2341,
  "explanation": "High confidence human detection...",
  "language": "English",
  "threshold_used": 0.5
}
```
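The two sample responses are consistent with a simple rule: a score at or above the threshold is labelled "AI", anything below is labelled "Human". This README does not state the rule, so the sketch below is an assumption inferred from the examples, and `classify` is a hypothetical name rather than a function in `api.py`.

```python
def classify(score: float, threshold: float = 0.5) -> str:
    """Label a score under the assumed rule: at or above the threshold is "AI"."""
    return "AI" if score >= threshold else "Human"


# The sample responses in this README, re-derived under that assumption.
print(classify(0.8523))  # score above the 0.5 threshold
print(classify(0.2341))  # score below the 0.5 threshold
```

Under this reading, raising `threshold` makes the detector more conservative about flagging audio as AI: a score of 0.6 is "AI" at the default threshold but "Human" at 0.7.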
## Usage Examples

### Python with requests

```python
import requests
import base64

# Health check
response = requests.get("http://localhost:8000/health")
print(response.json())

# Base64 detection
with open("audio.mp3", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode()

payload = {
    "audio_base64": audio_base64,
    "audio_format": "mp3",
    "language": "English"
}
response = requests.post("http://localhost:8000/detect/base64", json=payload)
print(response.json())

# File upload (the request must be sent while the file is still open)
with open("audio.mp3", "rb") as f:
    files = {"file": f}
    response = requests.post("http://localhost:8000/detect/file", files=files)
print(response.json())
```
### cURL

```bash
# Health check
curl http://localhost:8000/health

# File upload
curl -X POST "http://localhost:8000/detect/file" \
  -F "file=@audio.mp3" \
  -F "language=English"

# Base64 (encode first; strip line breaks so the string stays valid inside JSON)
base64 < audio.mp3 | tr -d '\n' > audio_base64.txt
curl -X POST "http://localhost:8000/detect/base64" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_base64": "'"$(cat audio_base64.txt)"'",
    "audio_format": "mp3",
    "language": "English"
  }'
```
### JavaScript/Fetch

```javascript
// audioFile is a File or Blob, e.g. taken from an <input type="file"> element

// File upload
const formData = new FormData();
formData.append('file', audioFile);
formData.append('language', 'English');

fetch('http://localhost:8000/detect/file', {
  method: 'POST',
  body: formData
})
  .then(response => response.json())
  .then(data => console.log(data));

// Base64
const reader = new FileReader();
reader.onload = function() {
  // Drop the "data:...;base64," prefix from the data URL
  const base64Audio = reader.result.split(',')[1];
  fetch('http://localhost:8000/detect/base64', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({
      audio_base64: base64Audio,
      audio_format: 'mp3',
      language: 'English'
    })
  })
    .then(response => response.json())
    .then(data => console.log(data));
};
reader.readAsDataURL(audioFile);
```
## Testing

Run the test script:

```bash
python test_api.py
```

Or visit the interactive API documentation at:

- Swagger UI: `http://localhost:8000/docs`
- ReDoc: `http://localhost:8000/redoc`
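When writing additional checks, it can help to validate the shape of a detection response against the fields documented in this README. The sketch below is illustrative and does not reflect the contents of `test_api.py`: `response_problems` is a hypothetical helper, and treating "AI" and "Human" as the only valid labels is an assumption based on the sample responses.

```python
# Field names and types taken from the sample responses in this README.
EXPECTED_FIELDS = {
    "classification": str,
    "confidence": float,
    "explanation": str,
    "language": str,
    "threshold_used": float,
}


def response_problems(data: dict) -> list:
    """Return a list of human-readable problems; an empty list means the shape is OK."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
            continue
        value = data[name]
        if expected_type is float:
            # JSON numbers may arrive as int or float; bool is not a valid number here.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected_type)
        if not ok:
            problems.append(f"wrong type for {name}")
    label = data.get("classification")
    # Assumption: the API only ever returns these two labels.
    if isinstance(label, str) and label not in ("AI", "Human"):
        problems.append("unexpected classification label")
    return problems
```

A test could then assert `response_problems(response.json()) == []` after each call to a detection endpoint.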
## Model Files

Place your trained model files in the same directory as `api.py`:

- `best_model.pt` - Trained model checkpoint
- `optimal_threshold.txt` - Optimal detection threshold

If these files are not found, the API falls back to randomly initialized heads and a default threshold of 0.5, so its predictions will not be meaningful until a trained checkpoint is supplied.
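The threshold fallback described above could look roughly like the following. This is a sketch under assumptions, not the code in `api.py`: `load_threshold` is a hypothetical name, and it assumes `optimal_threshold.txt` contains a single number as plain text.

```python
from pathlib import Path

DEFAULT_THRESHOLD = 0.5


def load_threshold(path: str = "optimal_threshold.txt",
                   default: float = DEFAULT_THRESHOLD) -> float:
    """Read the detection threshold from a text file, falling back to a default.

    Assumption: the file holds one number, e.g. "0.42".
    """
    try:
        return float(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing or unreadable file, or contents that are not a number.
        return default
```

Falling back silently keeps the API bootable without model files, at the cost of hiding a misconfiguration; logging a warning in the `except` branch is a reasonable addition.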
## Configuration

Edit the constants at the top of `api.py` to adjust:

- Audio processing parameters
- Model architecture settings
- Ensemble weights
- Detection threshold
## Deployment

### Production with Gunicorn

```bash
pip install gunicorn
gunicorn api:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```
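The same options can live in a Gunicorn configuration file, which is itself a Python module. The sketch below mirrors the command-line flags above. The `timeout` value is an assumption added for illustration (Gunicorn's default is 30 seconds) and should be tuned to how long inference actually takes.

```python
# gunicorn.conf.py
# Mirrors: gunicorn api:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000

bind = "0.0.0.0:8000"
workers = 4
worker_class = "uvicorn.workers.UvicornWorker"

# Assumption: allow more time than the 30 s default for slow requests.
timeout = 120
```

Start the server with `gunicorn -c gunicorn.conf.py api:app`. Each worker is a separate process that loads its own copy of the model, so memory use grows with `workers`.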
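### Hugging Face Spaces

1. Create a new Space with the Docker SDK
2. Upload: `api.py`, `requirements_api.txt`, `Dockerfile.api`
3. Add model files: `best_model.pt`, `optimal_threshold.txt`
4. The API will start automatically once the build finishes

Note that Docker Spaces build from a file named `Dockerfile` and route traffic to port 7860 unless `app_port` is set in the Space's README metadata, so the file name and port may need adjusting.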
## License

MIT License