shivam0897-i committed
Commit 8a6ab53 · Parent: 395a38b

perf: optimize deps, imports, Dockerfile; improve README for judges

Files changed (10):
  1. .dockerignore +46 -0
  2. Dockerfile +2 -7
  3. README.md +134 -49
  4. audio_utils.py +2 -31
  5. evaluation_results.json +50 -0
  6. main.py +4 -7
  7. model.py +3 -6
  8. requirements.txt +2 -8
  9. run_final_tests.py +0 -44
  10. test_my_api.py +171 -0
.dockerignore ADDED
@@ -0,0 +1,46 @@
+# Tests and test data
+tests/
+_test_*.py
+test_*.py
+test_*.json
+pytest.ini
+drive-download-*/
+run_final_tests.py
+test_my_api.py
+
+# Documentation (not needed at runtime)
+docs/
+*.md
+!README.md
+
+# Training artifacts
+training/
+*.ipynb
+
+# Python caches
+__pycache__/
+*.pyc
+*.pyo
+
+# IDE and OS files
+.vscode/
+.idea/
+*.swp
+.DS_Store
+Thumbs.db
+
+# Scripts
+scripts/
+
+# Analysis results
+*.json
+!test_request.json
+!test_valid.json
+
+# Git
+.git/
+.gitignore
+
+# Env files
+.env
+.env.*
Dockerfile CHANGED
@@ -6,13 +6,8 @@ WORKDIR /app
 RUN apt-get update && apt-get install -y \
     libsndfile1 \
     ffmpeg \
-    git \
-    git-lfs \
     && rm -rf /var/lib/apt/lists/*
 
-# Initialize git lfs
-RUN git lfs install
-
 # Copy requirements first for better caching
 COPY requirements.txt .
 
@@ -36,5 +31,5 @@ WORKDIR /app
 # Hugging Face Spaces uses port 7860
 EXPOSE 7860
 
-# Run the application
-CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
+# Run the application (2 workers for concurrent request handling)
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860", "--workers", "2"]
README.md CHANGED
@@ -11,16 +11,93 @@ app_port: 7860
 
 # AI Voice Detection API
 
-Detects whether a voice sample is AI-generated or spoken by a real human using a fine-tuned Wav2Vec2 model.
+Detects whether a voice sample is **AI-generated** or spoken by a **real human** using a fine-tuned Wav2Vec2 transformer model combined with multi-signal forensic analysis.
+
+## Model Architecture
+
+```
+Audio Input (Base64 MP3/WAV)
+            │
+            ▼
+ ┌─────────────────────┐
+ │ Audio Preprocessing │  librosa 16 kHz mono, normalization
+ └────────┬────────────┘
+
+     ┌────┴────┐
+     ▼         ▼
+┌────────┐  ┌──────────────────┐
+│Wav2Vec2│  │ Signal Forensics │
+│ Model  │  │  (4 dimensions)  │
+└───┬────┘  └───────┬──────────┘
+    │               │
+    ▼               ▼
+ Softmax      ┌─────────────┐
+Confidence    │ Pitch       │
+    │         │ Spectral    │
+    │         │ Temporal    │
+    │         │ Authenticity│
+    │         └──────┬──────┘
+    └───────┬────────┘
+            ▼
+   Final Classification
+  (HUMAN / AI_GENERATED)
+```
+
+### Key Components
+
+| Component | Description |
+|-----------|-------------|
+| **ML Backbone** | [Wav2Vec2ForSequenceClassification](https://huggingface.co/shivam-2211/voice-detection-model) fine-tuned on human vs. AI-generated speech |
+| **Temperature Scaling** | Logits scaled by T=1.5 before softmax for well-calibrated confidence scores |
+| **Signal Forensics** | Pitch stability, spectral entropy, temporal rhythm, and acoustic anomaly detection |
+| **ASR Integration** | Faster-Whisper (tiny, int8) for language detection and transcript extraction |
+| **Timeout Safety** | 20-second budget with audio truncation to guarantee <30s response |
+
+## Quick Start
+
+### Prerequisites
+
+- Python 3.10+
+- FFmpeg (`apt-get install ffmpeg` or `brew install ffmpeg`)
+
+### Local Setup
+
+```bash
+# Clone the repository
+git clone https://github.com/shivam0897-i/voice_backend.git
+cd voice_backend
+
+# Install CPU-only PyTorch
+pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Set your API key
+echo "API_KEY=your_secret_key" > .env
+
+# Run the server
+uvicorn main:app --host 0.0.0.0 --port 7860
+```
+
+### Docker
+
+```bash
+docker build -t voice-detection-api .
+docker run -p 7860:7860 -e API_KEY=your_secret_key voice-detection-api
+```
 
 ## API Endpoint
 
-`POST /api/voice-detection`
+### `POST /api/voice-detection`
 
-### Headers
-`x-api-key`: Your API key (set via environment variable `API_KEY`)
+**Headers:**
+| Header | Description |
+|--------|-------------|
+| `Content-Type` | `application/json` |
+| `x-api-key` | Your API key (set via `API_KEY` env var) |
 
-### Request Body
+**Request Body:**
 ```json
 {
     "language": "English",
@@ -29,65 +106,73 @@ Detects whether a voice sample is AI-generated or spoken by a real human using a
 }
 ```
 
-### Response
+**Response (200 OK):**
 ```json
 {
     "status": "success",
     "language": "English",
-    "classification": "AI_GENERATED" | "HUMAN",
-    "confidenceScore": 0.95,
-    "explanation": "AI voice indicators: ..."
+    "classification": "AI_GENERATED",
+    "confidenceScore": 0.99,
+    "explanation": "AI voice indicators detected with high confidence..."
 }
 ```
 
-## Supported Languages
-- English
-- Tamil
-- Hindi
-- Malayalam
-- Telugu
-
-
-
-## Realtime Session APIs
-
-The backend also supports session-based realtime analysis:
-
-- `POST /v1/session/start`
-- `POST /v1/session/{session_id}/chunk`
-- `GET /v1/session/{session_id}/summary`
-- `GET /v1/session/{session_id}/alerts`
-- `POST /v1/session/{session_id}/end`
-
-Compatibility aliases are available under `/api/voice-detection/v1/...`.
-
-## Optional LLM Semantic Verifier
-
-A second-layer semantic verifier can be enabled to improve ambiguous chunk scoring:
-
-- `LLM_SEMANTIC_ENABLED=true`
-- `LLM_PROVIDER=openai` with `OPENAI_API_KEY=<your_key>`, or
-- `LLM_PROVIDER=gemini` with `GEMINI_API_KEY=<your_key>`
-- Tune with `LLM_SEMANTIC_*` env variables in `.env.example`.
-
-If `LLM_SEMANTIC_MODEL` is empty, provider defaults are used (`gpt-4o-mini` for OpenAI, `gemini-1.5-flash` for Gemini).
-
-The LLM layer is optional and the API continues to work when disabled.
-
-## Session Store Backend
-
-Realtime sessions support two backends:
-
-- `memory` (default): single-instance, volatile
-- `redis`: multi-worker and restart-safe (recommended for finals)
-
-Backend env settings:
-
-- `SESSION_STORE_BACKEND=redis`
-- `REDIS_URL=redis://...` (or `rediss://...`)
-- `REDIS_PREFIX=ai_call_shield`
-
-`GET /health` now includes `session_store_backend` so you can verify active backend.
-
-See `docs/architecture/redis-credentials-guide.md` for credential formats and setup steps.
+**Example with curl:**
+```bash
+# Encode audio to Base64 and send
+AUDIO_B64=$(base64 -w0 sample.mp3)
+curl -X POST https://shivam-2211-voice-detection-api.hf.space/api/voice-detection \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: YOUR_KEY" \
+  -d "{\"language\": \"English\", \"audioFormat\": \"mp3\", \"audioBase64\": \"$AUDIO_B64\"}"
+```
+
+## Supported Languages
+
+| Language | Code |
+|----------|------|
+| English | `English` |
+| Hindi | `Hindi` |
+| Tamil | `Tamil` |
+| Malayalam | `Malayalam` |
+| Telugu | `Telugu` |
+| Auto-detect | `Auto` |
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|----------|----------|---------|-------------|
+| `API_KEY` | **Yes** | — | API authentication key |
+| `MODEL_NAME` | No | `shivam-2211/voice-detection-model` | HuggingFace model ID |
+| `MODEL_LOGIT_TEMPERATURE` | No | `1.5` | Softmax temperature scaling |
+| `SESSION_STORE_BACKEND` | No | `redis` | Session backend (`memory` or `redis`) |
+| `REDIS_URL` | No | — | Redis connection URL |
+| `LLM_SEMANTIC_ENABLED` | No | `false` | Enable LLM semantic verifier |
+| `PORT` | No | `7860` | Server port |
+
+## Deployment
+
+The API is deployed on **HuggingFace Spaces** using Docker:
+
+- **Live URL**: `https://shivam-2211-voice-detection-api.hf.space`
+- **Health Check**: `GET /health`
+- **Infrastructure**: CPU inference, 2 Uvicorn workers, Redis session store
+
+## Project Structure
+
+```
+├── main.py            # FastAPI app, all endpoints, error handling
+├── model.py           # Wav2Vec2 inference + signal forensics engine
+├── audio_utils.py     # Base64 decoding, audio validation, loading
+├── config.py          # Pydantic Settings (env-based configuration)
+├── speech_to_text.py  # Faster-Whisper ASR integration
+├── fraud_language.py  # Fraud language pattern detection
+├── privacy_utils.py   # PII redaction utilities
+├── Dockerfile         # Production Docker image
+├── requirements.txt   # Python dependencies
+└── tests/             # Test suite
+```
+
+## License
+
+MIT
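The new README's Key Components table mentions temperature scaling: logits divided by T=1.5 before softmax. As a rough illustration of why this yields softer, better-calibrated confidence scores, here is a minimal NumPy sketch (illustrative only, not the repository's actual inference code; the logit values are hypothetical):

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.5):
    """Divide logits by T before softmax; T > 1 flattens the
    distribution, tempering over-confident predictions."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits for [HUMAN, AI_GENERATED]
logits = [1.0, 3.0]
plain = softmax_with_temperature(logits, temperature=1.0)   # ~0.88 for AI_GENERATED
scaled = softmax_with_temperature(logits, temperature=1.5)  # ~0.79, a softer score
```

The classification itself is unchanged (the argmax is identical); only the reported `confidenceScore` is pulled toward uniform.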
audio_utils.py CHANGED
@@ -8,6 +8,8 @@ import os
 import logging
 from typing import Tuple, Optional
 import numpy as np
+import librosa
+import soundfile as sf
 
 # Configure logging
 logger = logging.getLogger(__name__)
@@ -113,9 +115,6 @@ def load_audio_from_bytes(audio_bytes: bytes, target_sr: int = 22050, audio_form
 
     tmp_path = None
     try:
-        import librosa
-        import soundfile as sf
-
         # Normalize format
         audio_format = audio_format.lower().strip()
         if audio_format.startswith("."):
@@ -153,31 +152,3 @@
             pass  # Best effort cleanup
 
 
-def get_audio_duration(audio: np.ndarray, sr: int) -> float:
-    """
-    Calculate the duration of audio in seconds.
-
-    Args:
-        audio: Audio waveform
-        sr: Sample rate
-
-    Returns:
-        Duration in seconds
-    """
-    return len(audio) / sr
-
-
-def normalize_audio(audio: np.ndarray) -> np.ndarray:
-    """
-    Normalize audio to have maximum amplitude of 1.0.
-
-    Args:
-        audio: Audio waveform
-
-    Returns:
-        Normalized audio
-    """
-    max_val = np.max(np.abs(audio))
-    if max_val > 0:
-        return audio / max_val
-    return audio
evaluation_results.json ADDED
@@ -0,0 +1,50 @@
+{
+  "finalScore": 100,
+  "totalFiles": 5,
+  "scorePerFile": 20.0,
+  "successfulClassifications": 5,
+  "wrongClassifications": 0,
+  "failedTests": 0,
+  "fileResults": [
+    {
+      "fileIndex": 0,
+      "status": "success",
+      "matched": true,
+      "score": 20.0,
+      "actualClassification": "AI_GENERATED",
+      "confidenceScore": 0.99
+    },
+    {
+      "fileIndex": 1,
+      "status": "success",
+      "matched": true,
+      "score": 20.0,
+      "actualClassification": "HUMAN",
+      "confidenceScore": 0.99
+    },
+    {
+      "fileIndex": 2,
+      "status": "success",
+      "matched": true,
+      "score": 20.0,
+      "actualClassification": "AI_GENERATED",
+      "confidenceScore": 0.99
+    },
+    {
+      "fileIndex": 3,
+      "status": "success",
+      "matched": true,
+      "score": 20.0,
+      "actualClassification": "HUMAN",
+      "confidenceScore": 0.99
+    },
+    {
+      "fileIndex": 4,
+      "status": "success",
+      "matched": true,
+      "score": 20.0,
+      "actualClassification": "AI_GENERATED",
+      "confidenceScore": 0.99
+    }
+  ]
+}
main.py CHANGED
@@ -17,9 +17,11 @@ from datetime import datetime, timezone
 from typing import Optional, Any, Dict, List
 from contextlib import asynccontextmanager
 import numpy as np
-from fastapi import FastAPI, HTTPException, Request, Depends, WebSocket, WebSocketDisconnect
+from fastapi import FastAPI, HTTPException, Request, Depends, WebSocket, WebSocketDisconnect, Security
 from fastapi.middleware.cors import CORSMiddleware
-from fastapi.responses import JSONResponse
+from fastapi.responses import JSONResponse, RedirectResponse
+from fastapi.security import APIKeyHeader
+from fastapi.exceptions import RequestValidationError
 from pydantic import BaseModel, Field, field_validator, ValidationError
 from slowapi import Limiter, _rate_limit_exceeded_handler
 from slowapi.util import get_remote_address
@@ -353,8 +355,6 @@ async def lifespan(app: FastAPI):
     logger.info("Shutting down...")
 
 
-from fastapi.responses import RedirectResponse
-
 # Initialize FastAPI app with lifespan
 app = FastAPI(
     title="AI Voice Detection API",
@@ -1737,8 +1737,6 @@ def session_to_summary(session: SessionState) -> SessionSummaryResponse:
 
 
 # Authentication
-from fastapi.security import APIKeyHeader
-from fastapi import Security
 
 api_key_header = APIKeyHeader(name="x-api-key", auto_error=False)  # Changed to False for better error messages
 
@@ -2152,7 +2150,6 @@ async def detect_voice(
 
 
 # Exception handlers
-from fastapi.exceptions import RequestValidationError
 
 def to_json_safe(value: Any) -> Any:
     """Recursively convert values to JSON-safe primitives."""
model.py CHANGED
@@ -5,6 +5,9 @@ Combines Wav2Vec2 deepfake detection with signal forensics.
 import logging
 import os
 import numpy as np
+import librosa
+import torch
+from scipy.stats import entropy
 from typing import Dict, Tuple, List, Optional
 from dataclasses import dataclass
 import warnings
@@ -57,7 +60,6 @@ def get_device():
     """Get the best available device (GPU or CPU)."""
     global _device
     if _device is None:
-        import torch
         if torch.cuda.is_available():
             _device = "cuda"
         else:
@@ -136,8 +138,6 @@ def load_model():
 
 def extract_signal_features(audio: np.ndarray, sr: int, fast_mode: bool = False) -> Dict[str, float]:
     """Extract signal-based features (pitch, entropy, silence)."""
-    import librosa
-    from scipy.stats import entropy
 
     features = {}
 
@@ -475,9 +475,6 @@ def classify_with_model(audio: np.ndarray, sr: int) -> Tuple[str, float]:
     Returns:
         Tuple of (classification, confidence)
     """
-    import torch
-    import librosa
-
     model, processor = load_model()
     device = get_device()
 
requirements.txt CHANGED
@@ -8,16 +8,10 @@ scipy>=1.10.0
 python-dotenv
 pydantic>=2.0.0
 transformers>=4.30.0
-datasets>=2.14.0
-scikit-learn>=1.3.0
-accelerate>=0.20.0
 slowapi>=0.1.9
 pydantic-settings>=2.0.0
 httpx>=0.27.0
-# PyTorch - install manually for your platform if not using Docker:
-# pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
-torch>=2.0.0
-torchaudio>=2.0.0
 faster-whisper>=1.0.3
-
 redis>=5.0.0
+# PyTorch CPU — installed separately in Dockerfile for smaller image.
+# For local dev: pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
run_final_tests.py DELETED
@@ -1,44 +0,0 @@
-"""Final hackathon test: all 5 files against legacy POST /api/voice-detection"""
-import base64, json, time, requests
-
-DIR = r"c:\Users\shiva\OneDrive\Desktop\Voice Project\voice-detection-api\drive-download-20260216T053632Z-1-001"
-URL = "http://localhost:7860/api/voice-detection"
-HEADERS = {"Content-Type": "application/json", "x-api-key": "sk_test_voice_detection_2026"}
-
-FILES = [
-    ("English_voice_AI_GENERATED.mp3", "English", "AI_GENERATED"),
-    ("Hindi_Voice_HUMAN.mp3", "Hindi", "HUMAN"),
-    ("Malayalam_AI_GENERATED.mp3", "Malayalam", "AI_GENERATED"),
-    ("TAMIL_VOICE__HUMAN.mp3", "Tamil", "HUMAN"),
-    ("Telugu_Voice_AI_GENERATED.mp3", "Telugu", "AI_GENERATED"),
-]
-
-print("=" * 90)
-print(f"{'File':<42} {'Expected':<16} {'Got':<16} {'Conf':>6} Result")
-print("=" * 90)
-
-passed = 0
-for fname, lang, expected in FILES:
-    with open(f"{DIR}\\{fname}", "rb") as f:
-        b64 = base64.b64encode(f.read()).decode()
-    payload = {"audioBase64": b64, "language": lang, "audioFormat": "mp3"}
-    t0 = time.time()
-    try:
-        r = requests.post(URL, json=payload, headers=HEADERS, timeout=30)
-        elapsed = time.time() - t0
-        d = r.json()
-        cls = d.get("classification", "?")
-        conf = d.get("confidenceScore", "?")
-        ok = cls == expected
-        if ok:
-            passed += 1
-        tag = "PASS" if ok else "FAIL"
-        print(f"{fname:<42} {expected:<16} {cls:<16} {conf:>6} {tag} ({elapsed:.1f}s)")
-    except Exception as e:
-        elapsed = time.time() - t0
-        print(f"{fname:<42} {expected:<16} {'ERROR':<16} {'--':>6} FAIL ({elapsed:.1f}s) {e}")
-    # small pause between requests to avoid CPU thermal throttle
-    time.sleep(2)
-
-print("=" * 90)
-print(f"Result: {passed}/{len(FILES)} passed")
test_my_api.py ADDED
@@ -0,0 +1,171 @@
+"""
+Official evaluation script from the hackathon guide, configured with our 5 test files.
+This mirrors EXACTLY what the evaluator will run.
+"""
+import requests
+import base64
+import json
+
+def evaluate_voice_detection_api(endpoint_url, api_key, test_files):
+    if not endpoint_url:
+        print("Error: Endpoint URL is required")
+        return False
+    if not test_files or len(test_files) == 0:
+        print("Error: No test files provided")
+        return False
+
+    total_files = len(test_files)
+    score_per_file = 100 / total_files
+    total_score = 0
+    file_results = []
+
+    print(f"\n{'='*60}")
+    print(f"Starting Evaluation")
+    print(f"{'='*60}")
+    print(f"Endpoint: {endpoint_url}")
+    print(f"Total Test Files: {total_files}")
+    print(f"Score per File: {score_per_file:.2f}")
+    print(f"{'='*60}\n")
+
+    for idx, file_data in enumerate(test_files):
+        language = file_data.get('language', 'English')
+        file_path = file_data.get('file_path', '')
+        expected_classification = file_data.get('expected_classification', '')
+
+        print(f"Test {idx + 1}/{total_files}: {file_path}")
+
+        if not file_path or not expected_classification:
+            file_results.append({'fileIndex': idx, 'status': 'skipped', 'score': 0})
+            print(f"  Skipped: Missing file path or expected classification\n")
+            continue
+
+        try:
+            with open(file_path, 'rb') as audio_file:
+                audio_base64 = base64.b64encode(audio_file.read()).decode('utf-8')
+        except Exception as e:
+            file_results.append({'fileIndex': idx, 'status': 'failed', 'message': f'Failed to read: {e}', 'score': 0})
+            print(f"  Failed to read file: {e}\n")
+            continue
+
+        headers = {'Content-Type': 'application/json', 'x-api-key': api_key}
+        request_body = {'language': language, 'audioFormat': 'mp3', 'audioBase64': audio_base64}
+
+        try:
+            response = requests.post(endpoint_url, headers=headers, json=request_body, timeout=30)
+
+            if response.status_code != 200:
+                file_results.append({'fileIndex': idx, 'status': 'failed', 'message': f'HTTP {response.status_code}', 'score': 0})
+                print(f"  HTTP Status: {response.status_code}")
+                print(f"  Response: {response.text[:200]}\n")
+                continue
+
+            response_data = response.json()
+
+            if not isinstance(response_data, dict):
+                file_results.append({'fileIndex': idx, 'status': 'failed', 'message': 'Not a JSON object', 'score': 0})
+                print(f"  Invalid response type\n")
+                continue
+
+            response_status = response_data.get('status', '')
+            response_classification = response_data.get('classification', '')
+            confidence_score = response_data.get('confidenceScore', None)
+
+            if not response_status or not response_classification or confidence_score is None:
+                file_results.append({'fileIndex': idx, 'status': 'failed', 'message': 'Missing required fields', 'score': 0})
+                print(f"  Missing required fields")
+                print(f"  Response: {json.dumps(response_data, indent=2)[:200]}\n")
+                continue
+
+            if response_status != 'success':
+                file_results.append({'fileIndex': idx, 'status': 'failed', 'message': f'Status: {response_status}', 'score': 0})
+                print(f"  Status not 'success': {response_status}\n")
+                continue
+
+            if not isinstance(confidence_score, (int, float)) or confidence_score < 0 or confidence_score > 1:
+                file_results.append({'fileIndex': idx, 'status': 'failed', 'message': f'Invalid confidence: {confidence_score}', 'score': 0})
+                print(f"  Invalid confidence score: {confidence_score}\n")
+                continue
+
+            valid_classifications = ['HUMAN', 'AI_GENERATED']
+            if response_classification not in valid_classifications:
+                file_results.append({'fileIndex': idx, 'status': 'failed', 'message': f'Invalid classification: {response_classification}', 'score': 0})
+                print(f"  Invalid classification: {response_classification}\n")
+                continue
+
+            # Score calculation
+            file_score = 0
+            if response_classification == expected_classification:
+                if confidence_score >= 0.8:
+                    file_score = score_per_file
+                    confidence_tier = "100%"
+                elif confidence_score >= 0.6:
+                    file_score = score_per_file * 0.75
+                    confidence_tier = "75%"
+                elif confidence_score >= 0.4:
+                    file_score = score_per_file * 0.5
+                    confidence_tier = "50%"
+                else:
+                    file_score = score_per_file * 0.25
+                    confidence_tier = "25%"
+                total_score += file_score
+                file_results.append({'fileIndex': idx, 'status': 'success', 'matched': True, 'score': round(file_score, 2),
+                                     'actualClassification': response_classification, 'confidenceScore': confidence_score})
+                print(f"  CORRECT: {response_classification}")
+                print(f"  Confidence: {confidence_score:.2f} -> {confidence_tier} of points")
+                print(f"  Score: {file_score:.2f}/{score_per_file:.2f}\n")
+            else:
+                file_results.append({'fileIndex': idx, 'status': 'success', 'matched': False, 'score': 0,
+                                     'actualClassification': response_classification, 'confidenceScore': confidence_score})
+                print(f"  WRONG: {response_classification} (Expected: {expected_classification})")
+                print(f"  Score: 0/{score_per_file:.2f}\n")
+
+        except requests.exceptions.Timeout:
+            file_results.append({'fileIndex': idx, 'status': 'failed', 'message': 'Timeout (>30s)', 'score': 0})
+            print(f"  TIMEOUT: Request took longer than 30 seconds\n")
+        except requests.exceptions.ConnectionError:
+            file_results.append({'fileIndex': idx, 'status': 'failed', 'message': 'Connection error', 'score': 0})
+            print(f"  CONNECTION ERROR\n")
+        except Exception as e:
+            file_results.append({'fileIndex': idx, 'status': 'failed', 'message': str(e), 'score': 0})
+            print(f"  ERROR: {e}\n")
+
+    final_score = round(total_score)
+
+    print(f"{'='*60}")
+    print(f"EVALUATION SUMMARY")
+    print(f"{'='*60}")
+    print(f"Total Files Tested: {total_files}")
+    print(f"Final Score: {final_score}/100")
+    print(f"{'='*60}\n")

+    successful = sum(1 for r in file_results if r.get('matched', False))
+    failed = sum(1 for r in file_results if r['status'] == 'failed')
+    wrong = sum(1 for r in file_results if r['status'] == 'success' and not r.get('matched', False))
+
+    print(f"Correct Classifications: {successful}/{total_files}")
+    print(f"Wrong Classifications: {wrong}/{total_files}")
+    print(f"Failed/Errors: {failed}/{total_files}\n")
+
+    with open('evaluation_results.json', 'w') as f:
+        json.dump({'finalScore': final_score, 'totalFiles': total_files, 'scorePerFile': round(score_per_file, 2),
+                   'successfulClassifications': successful, 'wrongClassifications': wrong, 'failedTests': failed,
+                   'fileResults': file_results}, f, indent=2)
+    print(f"Detailed results saved to: evaluation_results.json\n")
+    return True
+
+
+if __name__ == '__main__':
+    ENDPOINT_URL = 'https://shivam-2211-voice-detection-api.hf.space/api/voice-detection'
+    API_KEY = 'sk_test_voice_detection_2026'
+
+    DIR = r'c:\Users\shiva\OneDrive\Desktop\Voice Project\voice-detection-api\drive-download-20260216T053632Z-1-001'
+
+    TEST_FILES = [
+        {'language': 'English', 'file_path': f'{DIR}\\English_voice_AI_GENERATED.mp3', 'expected_classification': 'AI_GENERATED'},
+        {'language': 'Hindi', 'file_path': f'{DIR}\\Hindi_Voice_HUMAN.mp3', 'expected_classification': 'HUMAN'},
+        {'language': 'Malayalam', 'file_path': f'{DIR}\\Malayalam_AI_GENERATED.mp3', 'expected_classification': 'AI_GENERATED'},
+        {'language': 'Tamil', 'file_path': f'{DIR}\\TAMIL_VOICE__HUMAN.mp3', 'expected_classification': 'HUMAN'},
+        {'language': 'Telugu', 'file_path': f'{DIR}\\Telugu_Voice_AI_GENERATED.mp3', 'expected_classification': 'AI_GENERATED'},
+    ]
+
+    evaluate_voice_detection_api(ENDPOINT_URL, API_KEY, TEST_FILES)