title: AudioShield AI Voice Detector
emoji: 🛡️
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
AudioShield AI: Voice Fraud Detection System
Problem Statement 01: AI-Generated Voice Detection for Regional Languages
Overview
AudioShield AI is a high-performance REST API designed to detect AI-generated voice deepfakes. Built for the GUVI Hackathon, it specifically addresses the challenge of identifying synthetic audio in Tamil, English, Hindi, Malayalam, and Telugu.
Unlike standard detectors, AudioShield uses a Multi-Model Voting Ensemble approach, aggregating the intelligence of 4 state-of-the-art Wav2Vec2 models to make a final, highly reliable decision.
Problem It Solves
With the rise of Generative AI, voice scams and deepfakes are becoming indistinguishable from reality. Financial fraud, impersonation, and misinformation are growing threats. AudioShield provides a robust, scalable defense mechanism that can be integrated into calls, messaging apps, and verification systems.
Key Features
- Voting Ensemble Power: Leverages 4 distinct AI models (MelodyMachine, Mo-Creator, Hemgg, Gustking-XLSR) to minimize false positives.
- Multi-Lingual Support: Optimized for Indian regional languages (Tamil, Telugu, Hindi, Malayalam) + English.
- Zero Cold Start: Implements a "Warm-up" routine to ensure the first API request is as fast as the 100th.
- Render-Ready: Configured for seamless deployment on cloud platforms like Render.
- Explainable AI: Provides detailed JSON responses with classification confidence and logic.
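The warm-up idea from the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `warm_up`, the `models` mapping, `fake_model`, and the one-second silent clip are hypothetical, and each model is assumed to be a plain callable that accepts a list of samples.

```python
import time
from typing import Callable, Dict, List

SAMPLE_RATE = 16_000  # assumption: Wav2Vec2 models typically expect 16 kHz audio


def warm_up(models: Dict[str, Callable[[List[float]], float]],
            seconds: float = 1.0) -> Dict[str, float]:
    """Run one dummy inference per model and return how long each took.

    `models` maps a name to any callable that accepts raw samples; in a
    real service these would wrap the loaded Wav2Vec2 pipelines.
    """
    silence = [0.0] * int(SAMPLE_RATE * seconds)  # silent dummy clip
    timings = {}
    for name, predict in models.items():
        start = time.perf_counter()
        predict(silence)  # result is discarded; the point is to pay one-time costs early
        timings[name] = time.perf_counter() - start
    return timings


# Stand-in model so the sketch runs without downloading any weights.
def fake_model(samples: List[float]) -> float:
    return 0.0


timings = warm_up({"fake": fake_model})
```

Calling a routine like this once at startup, before the server accepts traffic, moves lazy initialisation costs out of the first real request.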
System Architecture
The system follows a Microservices-ready, Layered Architecture:
graph TD
    User[Client / Postman] -->|"HTTP POST (Base64)"| API[FastAPI Service]
    API -->|"Async Thread"| Engine[Detection Engine]
    subgraph "Ensemble Committee (The AI Core)"
        Engine -->|Input| M1[MelodyMachine]
        Engine -->|Input| M2[Mo-Creator]
        Engine -->|Input| M3[Hemgg]
        Engine -->|Input| M4["Gustking (XLSR)"]
        M1 -->|Vote| Agg[Weighted Aggregator]
        M2 -->|Vote| Agg
        M3 -->|Vote| Agg
        M4 -->|Vote| Agg
    end
    Agg -->|Final Score| Verdict[Classification Logic]
    Verdict -->|JSON Response| User
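The "Async Thread" edge in the diagram suggests that blocking inference is moved off the event loop. One common way to do this in Python 3.10+ is `asyncio.to_thread`; the sketch below assumes that approach. `run_detection` and `handle_request` are hypothetical stand-ins, not functions taken from `app.py` or `detect.py`.

```python
import asyncio
import time


def run_detection(audio_bytes: bytes) -> dict:
    """Hypothetical blocking call standing in for ensemble inference."""
    time.sleep(0.05)  # simulate CPU/GPU-bound work
    return {"status": "success", "bytes": len(audio_bytes)}


async def handle_request(audio_bytes: bytes) -> dict:
    # Offload the blocking call to a worker thread so the event loop
    # can keep accepting other requests in the meantime.
    return await asyncio.to_thread(run_detection, audio_bytes)


async def main() -> list:
    # Two requests handled concurrently; gather keeps the input order.
    return await asyncio.gather(handle_request(b"abc"), handle_request(b"hello"))


results = asyncio.run(main())
```

In a FastAPI route, the same `await asyncio.to_thread(...)` call could sit inside an `async def` endpoint.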
Core Components
- FastAPI Layer (app.py): Handles HTTP requests, validation, and async processing.
- Detection Engine (detect.py): Manages model loading, inference, and the ensemble voting logic.
- Models:
  - MelodyMachine/Deepfake-audio-detection-V2
  - mo-thecreator/Deepfake-audio-detection
  - Hemgg/Deepfake-audio-detection
  - Gustking/wav2vec2-large-xlsr-deepfake-audio-classification (The "Expert" model)
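The weighted voting performed by the Detection Engine might look something like the sketch below. The weights, the 0.5 threshold, the `aggregate` function, and the `"HUMAN"` label are all assumptions made for illustration; the source only shows the `AI_GENERATED` label and describes the XLSR model as the "Expert", which is why it gets a larger made-up weight here.

```python
from typing import Dict

# Assumed weights, not taken from the project.
WEIGHTS = {
    "MelodyMachine": 1.0,
    "Mo-Creator": 1.0,
    "Hemgg": 1.0,
    "Gustking-XLSR": 2.0,
}
THRESHOLD = 0.5  # assumed decision boundary


def aggregate(fake_probs: Dict[str, float]) -> dict:
    """Combine per-model P(AI-generated) values into one verdict."""
    total_weight = sum(WEIGHTS[name] for name in fake_probs)
    score = sum(WEIGHTS[name] * p for name, p in fake_probs.items()) / total_weight
    votes = sum(1 for p in fake_probs.values() if p >= THRESHOLD)
    is_fake = score >= THRESHOLD
    return {
        # "HUMAN" is a placeholder label; the source only shows "AI_GENERATED".
        "classification": "AI_GENERATED" if is_fake else "HUMAN",
        # Report confidence in whichever class was chosen.
        "confidenceScore": round(score if is_fake else 1.0 - score, 2),
        "explanation": (
            f"Ensemble Analysis: {votes}/{len(fake_probs)} models "
            "flagged this audio as AI-generated."
        ),
    }


verdict = aggregate({
    "MelodyMachine": 0.99,
    "Mo-Creator": 0.97,
    "Hemgg": 0.96,
    "Gustking-XLSR": 0.99,
})
```

A weighted average lets one trusted model pull the score further than a simple majority vote would, while the vote count still feeds the human-readable explanation.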
Tech Stack
- Language: Python 3.10+
- API Framework: FastAPI, Uvicorn
- ML Libraries: PyTorch, Transformers, Librosa, NumPy
- Deployment: Docker-ready, Render-compatible
Installation & Usage
1. Clone the Repository
git clone https://github.com/krish1440/AI-Generated-Voice-Detection.git
cd AI-Generated-Voice-Detection
2. Install Dependencies
pip install -r requirements.txt
3. Run the Server
python app.py
The server will start on port 8000 (or the PORT env var).
Note: On the first run, it will download necessary model weights (approx. 2-3GB).
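The port fallback described above (8000 unless `PORT` is set) can be expressed in a few lines. This is a sketch of one way to do it; `resolve_port` and `DEFAULT_PORT` are hypothetical names, and the real `app.py` may read the variable differently.

```python
import os
from typing import Mapping

DEFAULT_PORT = 8000  # default stated in the text above


def resolve_port(env: Mapping[str, str] = os.environ) -> int:
    """Return the port from PORT if it is set, otherwise the default."""
    return int(env.get("PORT", DEFAULT_PORT))


# A launcher could then pass this to Uvicorn, for example:
#   uvicorn.run(app, host="0.0.0.0", port=resolve_port())
```

Hosting platforms commonly inject `PORT` at runtime, so reading it this way lets the same entry point work both locally and when deployed.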
API Documentation
Detect Voice
Endpoint: POST /api/voice-detection
Request Body (JSON):
{
  "language": "Tamil",
  "audioFormat": "mp3",
  "audioBase64": "<Base64 encoded MP3 string>"
}
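A client could build and send this request body using only the standard library, as in the sketch below. It assumes the server is running locally on port 8000 and needs no extra headers; `build_payload` and `detect_voice` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_payload(audio_bytes: bytes, language: str = "Tamil") -> dict:
    """Build the request body shown above from raw MP3 bytes."""
    return {
        "language": language,
        "audioFormat": "mp3",
        "audioBase64": base64.b64encode(audio_bytes).decode("ascii"),
    }


def detect_voice(audio_bytes: bytes, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to the endpoint and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/api/voice-detection",
        data=json.dumps(build_payload(audio_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the payload needs no running server.
payload = build_payload(b"\x00\x01\x02")
```

With a server running, `detect_voice(open("sample.mp3", "rb").read())` would return the parsed response; `sample.mp3` is a placeholder file name.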
Response (Success):
{
  "status": "success",
  "language": "Tamil",
  "classification": "AI_GENERATED",
  "confidenceScore": 0.98,
  "explanation": "Ensemble Analysis: 4/4 models flagged this audio as AI-generated."
}
Response (Error):
{
  "status": "error",
  "message": "Invalid Base64 encoding."
}
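On the server side, an error like the one above could come from strict Base64 validation. The sketch below shows one way to produce that response shape; `decode_audio` and `error_response` are hypothetical helpers, and the actual validation in `app.py` may differ.

```python
import base64
import binascii


def decode_audio(audio_base64: str) -> bytes:
    """Strictly decode Base64 audio; raises ValueError on bad input."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently skipping them.
        return base64.b64decode(audio_base64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("Invalid Base64 encoding.") from exc


def error_response(message: str) -> dict:
    """Build the error body in the shape documented above."""
    return {"status": "error", "message": message}


try:
    decode_audio("not base64!!")
    result = {"status": "success"}
except ValueError as exc:
    result = error_response(str(exc))
```

Without `validate=True`, `b64decode` discards unexpected characters, so malformed input could slip through and fail later during audio decoding.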
Deployment (Hugging Face Spaces)
This project is Dockerized for Hugging Face Spaces.
- Create a new Space on Hugging Face using the Docker SDK.
- Connect your GitHub repository.
- Hugging Face will automatically build using the Dockerfile.
- The API will be live at https://huggingface.co/spaces/YOUR_USERNAME/SPACE_NAME/api/voice-detection.
Note: The Dockerfile builds ffmpeg and runs as user 1000 for security compliance on Spaces.
Tip: If the build fails with a registry error, try "Factory Reboot" in the Settings tab.
Developed for GUVI Hackathon.
