Spaces: Alaaharoun/faster-whisper-api


Alaaharoun committed on Jul 28, 2025

Commit 9e4d788 (verified) · 1 Parent(s): dcf548c

Upload 7 files

Files changed (7)

.dockerignore +25 -0
Dockerfile +35 -0
README.md +152 -12
app.py +306 -0
config.json +4 -0
docker-compose.yml +18 -0
requirements.txt +5 -0

.dockerignore ADDED

	@@ -0,0 +1,25 @@

+.git
+.gitignore
+README.md
+.env
+*.log
+__pycache__
+*.pyc
+*.pyo
+*.pyd
+.Python
+env
+pip-log.txt
+pip-delete-this-directory.txt
+.tox
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.log
+.git
+.mypy_cache
+.pytest_cache
+.hypothesis

Dockerfile ADDED

	@@ -0,0 +1,35 @@

+# Use Python 3.9 slim image as base
+FROM python:3.9-slim
+# Set working directory
+WORKDIR /app
+# Install system dependencies including FFmpeg
+RUN apt-get update && apt-get install -y \
+    ffmpeg \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+# Copy requirements first for better caching
+COPY requirements.txt .
+# Install Python dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+# Copy application code
+COPY app.py .
+# Create a non-root user for security
+RUN useradd --create-home --shell /bin/bash app \
+    && chown -R app:app /app
+USER app
+# Expose port
+EXPOSE 7860
+# Health check
+HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+    CMD curl -f http://localhost:7860/health || exit 1
+# Run the application
+CMD ["python", "app.py"]

README.md CHANGED

@@ -1,12 +1,152 @@
----
-title: Faster Whisper Api
-emoji: 😻
-colorFrom: purple
-colorTo: gray
-sdk: docker
-pinned: false
-license: apache-2.0
-short_description: Alaaharoun/faster-whisper-api
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+---
+title: "Faster Whisper API"
+emoji: "🎤"
+colorFrom: "blue"
+colorTo: "purple"
+sdk: "docker"
+sdk_version: "latest"
+app_file: "app.py"
+pinned: false
+---
+# 🎤 Faster Whisper API - Fixed Version
+## 🆕 Latest Fixes Applied:
+### ✅ Critical Bug Fixes:
+- **Fixed "name 'traceback' is not defined" error** - Removed problematic traceback import
+- **Improved error handling** - Better error messages and logging
+- **Enhanced CORS middleware** - Better browser compatibility
+- **Added detailed logging** - For easier debugging on Hugging Face Spaces
+### 🔧 Performance Improvements:
+- **Better file validation** - 25MB file size limit
+- **Enhanced VAD support** - Voice Activity Detection with fallback
+- **Improved model loading** - Better error handling during startup
+- **Added health check endpoint** - For monitoring service status
+## 🚀 Quick Start:
+### Health Check:
+```bash
+curl https://alaaharoun-faster-whisper-api.hf.space/health
+```
+### Transcribe Audio (without VAD):
+```bash
+curl -X POST \
+  -F "file=@audio.wav" \
+  -F "language=en" \
+  -F "task=transcribe" \
+  https://alaaharoun-faster-whisper-api.hf.space/transcribe
+```
+### Transcribe Audio (with VAD):
+```bash
+curl -X POST \
+  -F "file=@audio.wav" \
+  -F "language=en" \
+  -F "task=transcribe" \
+  -F "vad_filter=true" \
+  -F "vad_parameters=threshold=0.5" \
+  https://alaaharoun-faster-whisper-api.hf.space/transcribe
+```
+## 📊 Supported Parameters:
+- **`file`**: Audio file (WAV, MP3, M4A, FLAC, OGG, WEBM)
+- **`language`**: Language code (optional, e.g., "en", "ar", "es")
+- **`task`**: "transcribe" or "translate" (default: "transcribe")
+- **`vad_filter`**: Enable Voice Activity Detection (default: false)
+- **`vad_parameters`**: VAD parameters (default: "threshold=0.5")
+## 🔧 Response Format:
+### Success Response:
+```json
+{
+  "success": true,
+  "text": "Transcribed text here",
+  "language": "en",
+  "language_probability": 0.95,
+  "vad_enabled": false,
+  "vad_threshold": null
+}
+```
+### Error Response:
+```json
+{
+  "error": "Error message",
+  "error_type": "ExceptionType",
+  "success": false
+}
+```
+## 🛠️ Local Development:
+```bash
+# Install dependencies
+pip install -r requirements.txt
+# Run the server
+python app.py
+```
+Or with uvicorn:
+```bash
+uvicorn app:app --host 0.0.0.0 --port 7860
+```
+## 📝 Important Notes:
+- **Maximum file size**: 25MB
+- **Supported formats**: WAV, MP3, M4A, FLAC, OGG, WEBM
+- **VAD support**: Configurable threshold with fallback mechanism
+- **Language detection**: Automatic if not specified
+- **Error handling**: Detailed error messages for debugging
+## 🔍 Troubleshooting:
+### Common Issues:
+1. **500 Internal Server Error**:
+   - Check if the model is loaded properly
+   - Verify file format and size
+   - Check server logs for detailed error messages
+2. **VAD Issues**:
+   - The service will automatically fall back to standard transcription
+   - Check VAD parameters format
+3. **File Upload Issues**:
+   - Ensure file size is under 25MB
+   - Check file format compatibility
+## 🌐 Service URLs:
+- **Main Service**: https://alaaharoun-faster-whisper-api.hf.space
+- **Health Check**: https://alaaharoun-faster-whisper-api.hf.space/health
+- **API Documentation**: https://alaaharoun-faster-whisper-api.hf.space/docs
+## 📈 Performance:
+- **Model**: Whisper base model with int8 quantization
+- **Processing**: Optimized for real-time transcription
+- **Memory**: Efficient memory usage for Hugging Face Spaces
+- **Concurrency**: Supports multiple concurrent requests
+## 🔒 Security:
+- **CORS**: Configured for cross-origin requests
+- **File Validation**: 25MB size limit enforced; the file type is not checked before decoding
+- **Error Handling**: Errors are returned as JSON with the exception message and type
+- **Authentication**: Optional API token support (currently disabled)
+## 📞 Support:
+For issues or questions:
+1. Check the health endpoint first
+2. Review server logs for detailed error messages
+3. Test with a simple audio file
+4. Verify file format and size requirements
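
Note on the README's Quick Start: the `curl` calls above can be mirrored from Python. The following is a minimal client sketch using only the standard library. The base URL, form field names, and response keys come from the README; the function names (`encode_multipart`, `parse_result`, `transcribe`) are hypothetical helpers introduced for illustration, and the sketch assumes the service answers with a JSON body on both success and failure.

```python
import json
import os
import urllib.error
import urllib.request
import uuid

# Base URL as listed in the README; the helpers below are illustrative only.
BASE_URL = "https://alaaharoun-faster-whisper-api.hf.space"


def encode_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def parse_result(payload):
    """Return the transcribed text, or raise if the service reported a failure."""
    if not payload.get("success"):
        raise RuntimeError(f"{payload.get('error_type', 'Error')}: {payload.get('error')}")
    return payload["text"]


def transcribe(path, language=None, vad_threshold=None):
    """POST an audio file to /transcribe and return the transcribed text."""
    fields = {"task": "transcribe"}
    if language:
        fields["language"] = language
    if vad_threshold is not None:
        fields["vad_filter"] = "true"
        fields["vad_parameters"] = f"threshold={vad_threshold}"
    with open(path, "rb") as audio:
        body, content_type = encode_multipart(fields, os.path.basename(path), audio.read())
    request = urllib.request.Request(
        f"{BASE_URL}/transcribe", data=body, headers={"Content-Type": content_type}
    )
    try:
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        # Assumption: 4xx/5xx responses carry the JSON error body from the README.
        payload = json.load(err)
    return parse_result(payload)
```

Calling `transcribe("audio.wav", language="en")` would correspond to the first `curl` example, and passing `vad_threshold=0.5` to the second. If the Space is asleep and returns a non-JSON page, `json.load` raises instead of returning a result.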

app.py ADDED

	@@ -0,0 +1,306 @@

+from fastapi import FastAPI, UploadFile, File, Form, HTTPException, Depends
+from fastapi.responses import JSONResponse
+from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
+from fastapi.middleware.cors import CORSMiddleware
+from faster_whisper import WhisperModel
+import shutil
+import os
+import tempfile
+import sys
+from typing import Optional
+# Create FastAPI app
+app = FastAPI(
+    title="Faster Whisper Service",
+    description="High-performance speech-to-text service using Faster Whisper",
+    version="1.0.0"
+)
+# Add CORS middleware
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+# Security
+security = HTTPBearer(auto_error=False)
+# Configuration
+API_TOKEN = ""
+REQUIRE_AUTH = False
+# Global model variable
+model = None
+def load_model():
+    """Load the Whisper model"""
+    global model
+    try:
+        print("🔄 Loading Whisper model...")
+        model = WhisperModel("base", compute_type="int8")
+        print("✅ Model loaded successfully")
+        return True
+    except Exception as e:
+        print(f"❌ Error loading model: {e}")
+        print(f"Python version: {sys.version}")
+        print(f"Current working directory: {os.getcwd()}")
+        model = None
+        return False
+def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
+    """Verify API token if authentication is required"""
+    if REQUIRE_AUTH:
+        if not credentials:
+            raise HTTPException(
+                status_code=401,
+                detail="API token required",
+                headers={"WWW-Authenticate": "Bearer"},
+            )
+        if credentials.credentials != API_TOKEN:
+            raise HTTPException(
+                status_code=403,
+                detail="Invalid API token",
+                headers={"WWW-Authenticate": "Bearer"},
+            )
+    return credentials
+@app.on_event("startup")
+async def startup_event():
+    """Load model on startup"""
+    load_model()
+@app.get("/")
+async def root():
+    """Root endpoint"""
+    return {"message": "Faster Whisper Service is running"}
+@app.get("/health")
+async def health_check(credentials: HTTPAuthorizationCredentials = Depends(verify_token)):
+    """Health check endpoint"""
+    return {
+        "status": "healthy",
+        "model_loaded": model is not None,
+        "service": "faster-whisper",
+        "auth_required": REQUIRE_AUTH,
+        "auth_configured": bool(API_TOKEN),
+        "vad_support": True,
+        "python_version": sys.version
+    }
+@app.post("/transcribe")
+async def transcribe(
+    file: UploadFile = File(...),
+    language: Optional[str] = Form(None),
+    task: Optional[str] = Form("transcribe"),
+    vad_filter: Optional[bool] = Form(False),
+    vad_parameters: Optional[str] = Form("threshold=0.5"),
+    credentials: HTTPAuthorizationCredentials = Depends(verify_token)
+):
+    """
+    Transcribe audio file to text with optional VAD support
+    """
+    temp_path = None
+    try:
+        print(f"🎵 Starting transcription for file: {file.filename}")
+        # Check if model is loaded
+        if model is None:
+            print("❌ Model not loaded")
+            return JSONResponse(
+                status_code=500,
+                content={"error": "Model not loaded", "success": False}
+            )
+        # Validate file
+        if not file.filename:
+            print("❌ No file provided")
+            return JSONResponse(
+                status_code=400,
+                content={"error": "No file provided", "success": False}
+            )
+        # Validate file size (25MB limit)
+        file.file.seek(0, 2)
+        file_size = file.file.tell()
+        file.file.seek(0)
+        print(f"📁 File size: {file_size} bytes")
+        if file_size > 25 * 1024 * 1024:  # 25MB
+            print("❌ File too large")
+            return JSONResponse(
+                status_code=400,
+                content={"error": "File too large. Maximum size is 25MB", "success": False}
+            )
+        # Create temporary file
+        print("📝 Creating temporary file...")
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.wav') as temp_file:
+            shutil.copyfileobj(file.file, temp_file)
+            temp_path = temp_file.name
+        print(f"✅ Temporary file created: {temp_path}")
+        # Parse VAD parameters
+        vad_threshold = 0.5  # default
+        if vad_filter and vad_parameters:
+            try:
+                for param in vad_parameters.split(','):
+                    if '=' in param:
+                        key, value = param.split('=', 1)
+                        if key.strip() == 'threshold':
+                            vad_threshold = float(value)
+            except Exception as e:
+                print(f"⚠️ Warning: Failed to parse VAD parameters: {e}")
+        # Transcribe audio
+        print("🎤 Starting transcription...")
+        if vad_filter:
+            print(f"🔊 Using VAD with threshold: {vad_threshold}")
+            try:
+                if language:
+                    segments, info = model.transcribe(
+                        temp_path,
+                        language=language,
+                        task=task,
+                        vad_filter=True,
+                        vad_parameters={"threshold": vad_threshold}
+                    )
+                else:
+                    segments, info = model.transcribe(
+                        temp_path,
+                        task=task,
+                        vad_filter=True,
+                        vad_parameters={"threshold": vad_threshold}
+                    )
+            except Exception as vad_error:
+                print(f"⚠️ VAD transcription failed, falling back to standard: {vad_error}")
+                if language:
+                    segments, info = model.transcribe(temp_path, language=language, task=task)
+                else:
+                    segments, info = model.transcribe(temp_path, task=task)
+        else:
+            if language:
+                segments, info = model.transcribe(temp_path, language=language, task=task)
+            else:
+                segments, info = model.transcribe(temp_path, task=task)
+        # Collect transcription results
+        transcription = " ".join(seg.text.strip() for seg in segments)
+        print(f"✅ Transcription completed: {len(transcription)} characters")
+        print(f"🌍 Detected language: {info.language} (probability: {info.language_probability:.2f})")
+        # Prepare response
+        response = {
+            "success": True,
+            "text": transcription,
+            "language": info.language,
+            "language_probability": info.language_probability,
+            "vad_enabled": vad_filter,
+            "vad_threshold": vad_threshold if vad_filter else None
+        }
+        return JSONResponse(content=response)
+    except Exception as e:
+        error_msg = str(e)
+        error_type = type(e).__name__
+        print(f"❌ Transcription error ({error_type}): {error_msg}")
+        return JSONResponse(
+            status_code=500,
+            content={
+                "error": error_msg,
+                "error_type": error_type,
+                "success": False
+            }
+        )
+    finally:
+        # Clean up temporary file
+        if temp_path and os.path.exists(temp_path):
+            try:
+                os.unlink(temp_path)
+                print(f"🧹 Cleaned up temporary file: {temp_path}")
+            except Exception as e:
+                print(f"⚠️ Warning: Failed to delete temp file: {e}")
+@app.post("/detect-language")
+async def detect_language(
+    file: UploadFile = File(...),
+    credentials: HTTPAuthorizationCredentials = Depends(verify_token)
+):
+    """
+    Detect the language of an audio file
+    """
+    temp_path = None
+    try:
+        print(f"🌍 Starting language detection for file: {file.filename}")
+        # Check if model is loaded
+        if model is None:
+            print("❌ Model not loaded")
+            return JSONResponse(
+                status_code=500,
+                content={"error": "Model not loaded", "success": False}
+            )
+        # Validate file
+        if not file.filename:
+            print("❌ No file provided")
+            return JSONResponse(
+                status_code=400,
+                content={"error": "No file provided", "success": False}
+            )
+        # Create temporary file
+        print("📝 Creating temporary file...")
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.wav') as temp_file:
+            shutil.copyfileobj(file.file, temp_file)
+            temp_path = temp_file.name
+        print(f"✅ Temporary file created: {temp_path}")
+        # Detect language
+        print("🌍 Detecting language...")
+        segments, info = model.transcribe(temp_path)
+        print(f"✅ Language detected: {info.language} (probability: {info.language_probability:.2f})")
+        return JSONResponse(content={
+            "success": True,
+            "language": info.language,
+            "language_probability": info.language_probability
+        })
+    except Exception as e:
+        error_msg = str(e)
+        error_type = type(e).__name__
+        print(f"❌ Language detection error ({error_type}): {error_msg}")
+        return JSONResponse(
+            status_code=500,
+            content={
+                "error": error_msg,
+                "error_type": error_type,
+                "success": False
+            }
+        )
+    finally:
+        # Clean up temporary file
+        if temp_path and os.path.exists(temp_path):
+            try:
+                os.unlink(temp_path)
+                print(f"🧹 Cleaned up temporary file: {temp_path}")
+            except Exception as e:
+                print(f"⚠️ Warning: Failed to delete temp file: {e}")
+# For Hugging Face Spaces compatibility
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=7860)
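
Note on `transcribe` in app.py: the four near-identical `model.transcribe(...)` calls differ only in whether `language` and the VAD options are passed. One way to collapse them is to build the keyword arguments once. The sketch below is not part of the commit; `parse_vad_parameters` and `build_transcribe_kwargs` are hypothetical helper names, and the whitelist of option names is an assumption that should be checked against the installed faster-whisper version. It relies on `vad_parameters` being passed to faster-whisper as a dictionary (or `VadOptions`) rather than as a string.

```python
from typing import Any, Dict, Optional

# Assumed whitelist of VAD option names and their types; verify against the
# installed faster-whisper version before extending it.
VAD_OPTION_TYPES = {"threshold": float, "min_silence_duration_ms": int}


def parse_vad_parameters(raw: Optional[str]) -> Dict[str, Any]:
    """Parse "key=value,key=value" into a dict, skipping unknown or malformed entries."""
    options: Dict[str, Any] = {}
    for item in (raw or "").split(","):
        key, sep, value = item.partition("=")
        key = key.strip()
        if not sep or key not in VAD_OPTION_TYPES:
            continue
        try:
            options[key] = VAD_OPTION_TYPES[key](value.strip())
        except ValueError:
            continue  # malformed value: leave the library default in place
    return options


def build_transcribe_kwargs(
    language: Optional[str],
    task: str = "transcribe",
    vad_filter: bool = False,
    vad_parameters: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a single model.transcribe(...) call."""
    kwargs: Dict[str, Any] = {"task": task}
    if language:
        kwargs["language"] = language  # omit to let the model detect the language
    if vad_filter:
        kwargs["vad_filter"] = True
        kwargs["vad_parameters"] = parse_vad_parameters(vad_parameters)
    return kwargs
```

With helpers like these, the endpoint body could reduce to one call such as `segments, info = model.transcribe(temp_path, **build_transcribe_kwargs(language, task, vad_filter, vad_parameters))`, keeping the existing fallback to a non-VAD call in the `except` branch.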

config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "sdk": "docker",
+  "app_file": "app.py"
+}

docker-compose.yml ADDED

	@@ -0,0 +1,18 @@

+version: '3.8'
+services:
+  faster-whisper-api:
+    build: .
+    ports:
+      - "7860:7860"
+    environment:
+      - PYTHONUNBUFFERED=1
+    volumes:
+      - .:/app
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:7860/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 40s
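
Note on the health checks: `curl -f` only fails on an HTTP error status, and `/health` in app.py returns `"status": "healthy"` with HTTP 200 even when `model_loaded` is false. A stricter readiness probe could inspect the JSON body as well. The sketch below is one possible approach using only the standard library; `is_ready` and `check` are hypothetical names, and the URL is the one used by the compose health check.

```python
import json
import urllib.request


def is_ready(payload) -> bool:
    """Treat the service as ready only when the model has actually been loaded."""
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def check(url: str = "http://localhost:7860/health", timeout: float = 10.0) -> bool:
    """Fetch the health endpoint and report readiness; any failure counts as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_ready(json.load(response))
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts; ValueError covers invalid JSON.
        return False
```

A small wrapper script that exits non-zero when `check()` returns `False` could then stand in for the `curl` command, at the cost of keeping Python available inside the health check.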

requirements.txt ADDED

	@@ -0,0 +1,5 @@

+fastapi==0.104.1
+uvicorn==0.24.0
+faster-whisper==0.9.0
+python-multipart==0.0.6
+python-multipart