
title: Faster Whisper API
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false

🎤 Faster Whisper API - Fixed Version

🆕 Latest Fixes Applied:

✅ Critical Bug Fixes:

Fixed "name 'traceback' is not defined" error - Removed problematic traceback import
Improved error handling - Better error messages and logging
Enhanced CORS middleware - Better browser compatibility
Added detailed logging - For easier debugging on Hugging Face Spaces

🔧 Performance Improvements:

Better file validation - 25MB file size limit
Enhanced VAD support - Voice Activity Detection with fallback
Improved model loading - Better error handling during startup
Added health check endpoint - For monitoring service status

🚀 Quick Start:

Health Check:

curl https://alaaharoun-faster-whisper-api.hf.space/health
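
The same check can be scripted. The sketch below is a minimal example using only the Python standard library. It assumes that an HTTP 200 from /health means the service is up, because the body of the health response is not documented here. The `is_healthy` name and the `opener` parameter are illustrative and not part of the service.

```python
import urllib.error
import urllib.request

HEALTH_URL = "https://alaaharoun-faster-whisper-api.hf.space/health"


def is_healthy(url=HEALTH_URL, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if the health endpoint answers with HTTP 200.

    `opener` is injectable so the function can be exercised without
    network access; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection failures, timeouts and HTTP error statuses all land here.
        return False
```

Calling `is_healthy()` before uploading audio avoids sending a large file to a service that is not ready.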

Transcribe Audio (without VAD):

curl -X POST \
  -F "file=@audio.wav" \
  -F "language=en" \
  -F "task=transcribe" \
  https://alaaharoun-faster-whisper-api.hf.space/transcribe

Transcribe Audio (with VAD):

curl -X POST \
  -F "file=@audio.wav" \
  -F "language=en" \
  -F "task=transcribe" \
  -F "vad_filter=true" \
  -F "vad_parameters=threshold=0.5" \
  https://alaaharoun-faster-whisper-api.hf.space/transcribe
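
For callers that prefer Python over curl, the request above can be approximated with the standard library alone. This is a sketch, not the service's official client: `encode_multipart` and `build_transcribe_request` are hypothetical helper names, and the file part is sent as `application/octet-stream` on the assumption that the server does not require a specific audio MIME type.

```python
import uuid
import urllib.request

TRANSCRIBE_URL = "https://alaaharoun-faster-whisper-api.hf.space/transcribe"


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Return (body, content_type) for a multipart/form-data POST."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # One part per plain form field (language, task, vad_filter, ...).
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        parts.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    # The audio goes in the "file" field, matching -F "file=@audio.wav".
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcribe_request(filename, file_bytes, fields, url=TRANSCRIBE_URL):
    """Build (but do not send) the POST request for /transcribe."""
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

To send it, read the audio with `open("audio.wav", "rb").read()`, build the request with fields such as `{"language": "en", "task": "transcribe"}`, and pass the result to `urllib.request.urlopen`.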

📊 Supported Parameters:

file: Audio file (WAV, MP3, M4A, FLAC, OGG, WEBM)
language: Language code (optional, e.g., "en", "ar", "es")
task: "transcribe" or "translate" (default: "transcribe")
vad_filter: Enable Voice Activity Detection (default: false)
vad_parameters: VAD parameters (default: "threshold=0.5")
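
One way to assemble these parameters on the client side is sketched below. `build_form_fields` is a hypothetical helper, not part of the API. It assumes that optional fields can simply be omitted so the server defaults apply, and that `vad_parameters` takes the single `threshold=<value>` form shown above.

```python
VALID_TASKS = {"transcribe", "translate"}


def build_form_fields(language=None, task="transcribe",
                      vad_filter=False, vad_threshold=0.5):
    """Return the non-file form fields for a /transcribe request."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")
    fields = {"task": task}
    if language:
        # Leaving language out lets the service detect it automatically.
        fields["language"] = language
    if vad_filter:
        fields["vad_filter"] = "true"
        fields["vad_parameters"] = f"threshold={vad_threshold}"
    return fields
```

For example, `build_form_fields(language="ar", vad_filter=True)` yields the same fields as the VAD curl command, with `ar` as the language.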

🔧 Response Format:

Success Response:

{
  "success": true,
  "text": "Transcribed text here",
  "language": "en",
  "language_probability": 0.95,
  "vad_enabled": false,
  "vad_threshold": null
}

Error Response:

{
  "error": "Error message",
  "error_type": "ExceptionType",
  "success": false
}
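
A client can branch on the `success` flag to separate the two shapes. The following is a minimal sketch. `TranscriptionError` and `parse_response` are illustrative names, and the code assumes every response body is JSON with the fields shown above.

```python
import json


class TranscriptionError(Exception):
    """Raised when the service reports success: false."""

    def __init__(self, message, error_type=None):
        super().__init__(message)
        self.error_type = error_type


def parse_response(raw):
    """Return the decoded payload on success, raise TranscriptionError otherwise."""
    payload = json.loads(raw)
    if payload.get("success"):
        return payload
    raise TranscriptionError(
        payload.get("error", "unknown error"),
        payload.get("error_type"),
    )
```

On success, `parse_response(raw)["text"]` holds the transcript. On failure, the exception carries the `error` message and the `error_type` reported by the service.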

🛠️ Local Development:

# Install dependencies
pip install -r requirements.txt

# Run the server
python app.py

Or with uvicorn:

uvicorn app:app --host 0.0.0.0 --port 7860

📝 Important Notes:

Maximum file size: 25MB
Supported formats: WAV, MP3, M4A, FLAC, OGG, WEBM
VAD support: Configurable threshold with fallback mechanism
Language detection: Automatic if not specified
Error handling: Detailed error messages for debugging
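
The size and format limits can be checked before uploading to avoid a wasted request. The sketch below is a client-side approximation only: it assumes 25MB means 25 * 1024 * 1024 bytes and that the file extension is a reasonable proxy for the format. The server's own validation may differ. `check_upload` is a hypothetical helper.

```python
from pathlib import Path

MAX_BYTES = 25 * 1024 * 1024  # assumption: "25MB" counted in binary megabytes
ALLOWED_SUFFIXES = {".wav", ".mp3", ".m4a", ".flac", ".ogg", ".webm"}


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_BYTES})")
    return problems
```

For a file on disk, `check_upload(path, os.path.getsize(path))` gives the list of problems to show the user before any network call is made.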

🔍 Troubleshooting:

Common Issues:

500 Internal Server Error:
- Check that the model loaded properly
- Verify the file format and size
- Check the server logs for detailed error messages

VAD Issues:
- The service automatically falls back to standard transcription if VAD fails
- Check the format of vad_parameters (for example, "threshold=0.5")

File Upload Issues:
- Ensure the file is under 25MB
- Check that the file format is supported

🌐 Service URLs:

Main Service: https://alaaharoun-faster-whisper-api.hf.space
Health Check: https://alaaharoun-faster-whisper-api.hf.space/health
API Documentation: https://alaaharoun-faster-whisper-api.hf.space/docs

📈 Performance:

Model: Whisper base model with int8 quantization
Processing: Optimized for real-time transcription
Memory: Efficient memory usage for Hugging Face Spaces
Concurrency: Supports multiple concurrent requests

🔒 Security:

CORS: Configured for cross-origin requests
File Validation: Strict file type and size validation
Error Handling: No sensitive information in error messages
Authentication: Optional API token support (currently disabled)

📞 Support:

For issues or questions:

Check the health endpoint first
Review server logs for detailed error messages
Test with a simple audio file
Verify file format and size requirements