Spaces:
Sleeping
Sleeping
metadata
title: Faster Whisper API
emoji: π€
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false
π€ Faster Whisper API - Fixed Version
π Latest Fixes Applied:
β Critical Bug Fixes:
- Fixed "name 'traceback' is not defined" error - Removed problematic traceback import
- Improved error handling - Better error messages and logging
- Enhanced CORS middleware - Better browser compatibility
- Added detailed logging - For easier debugging on Hugging Face Spaces
π§ Performance Improvements:
- Better file validation - 25MB file size limit
- Enhanced VAD support - Voice Activity Detection with fallback
- Improved model loading - Better error handling during startup
- Added health check endpoint - For monitoring service status
π Quick Start:
Health Check:
curl https://alaaharoun-faster-whisper-api.hf.space/health
Transcribe Audio (without VAD):
curl -X POST \
-F "file=@audio.wav" \
-F "language=en" \
-F "task=transcribe" \
https://alaaharoun-faster-whisper-api.hf.space/transcribe
Transcribe Audio (with VAD):
curl -X POST \
-F "file=@audio.wav" \
-F "language=en" \
-F "task=transcribe" \
-F "vad_filter=true" \
-F "vad_parameters=threshold=0.5" \
https://alaaharoun-faster-whisper-api.hf.space/transcribe
π Supported Parameters:
file: Audio file (WAV, MP3, M4A, FLAC, OGG, WEBM)language: Language code (optional, e.g., "en", "ar", "es")task: "transcribe" or "translate" (default: "transcribe")vad_filter: Enable Voice Activity Detection (default: false)vad_parameters: VAD parameters (default: "threshold=0.5")
π§ Response Format:
Success Response:
{
"success": true,
"text": "Transcribed text here",
"language": "en",
"language_probability": 0.95,
"vad_enabled": false,
"vad_threshold": null
}
Error Response:
{
"error": "Error message",
"error_type": "ExceptionType",
"success": false
}
π οΈ Local Development:
# Install dependencies
pip install -r requirements.txt
# Run the server
python app.py
Or with uvicorn:
uvicorn app:app --host 0.0.0.0 --port 7860
π Important Notes:
- Maximum file size: 25MB
- Supported formats: WAV, MP3, M4A, FLAC, OGG, WEBM
- VAD support: Configurable threshold with fallback mechanism
- Language detection: Automatic if not specified
- Error handling: Detailed error messages for debugging
π Troubleshooting:
Common Issues:
500 Internal Server Error:
- Check if the model is loaded properly
- Verify file format and size
- Check server logs for detailed error messages
VAD Issues:
- The service will automatically fallback to standard transcription
- Check VAD parameters format
File Upload Issues:
- Ensure file size is under 25MB
- Check file format compatibility
π Service URLs:
- Main Service: https://alaaharoun-faster-whisper-api.hf.space
- Health Check: https://alaaharoun-faster-whisper-api.hf.space/health
- API Documentation: https://alaaharoun-faster-whisper-api.hf.space/docs
π Performance:
- Model: Whisper base model with int8 quantization
- Processing: Optimized for real-time transcription
- Memory: Efficient memory usage for Hugging Face Spaces
- Concurrency: Supports multiple concurrent requests
π Security:
- CORS: Configured for cross-origin requests
- File Validation: Strict file type and size validation
- Error Handling: No sensitive information in error messages
- Authentication: Optional API token support (currently disabled)
π Support:
For issues or questions:
- Check the health endpoint first
- Review server logs for detailed error messages
- Test with a simple audio file
- Verify file format and size requirements