title: whisper.api
emoji: 😶‍🌫️
colorFrom: purple
colorTo: gray
sdk: docker
app_file: Dockerfile
app_port: 7860
Whisper API - Speech to Text Transcription
This open-source project provides a self-hostable API for speech-to-text transcription using a fine-tuned Whisper ASR model. The API converts audio files to text over HTTP requests, which makes it a good fit for adding speech recognition to your applications.
Key features:
- Uses a fine-tuned Whisper model for accurate speech recognition
- Simple HTTP API for audio file transcription
- User-level access with API keys for managing usage
- Self-hostable code for your own speech transcription service
- Quantized model optimization for fast and efficient inference
- Asynchronous Processing: Non-blocking transcription for high availability
- Concurrency Control: Built-in request queuing to prevent server overload
- Open source implementation for customization and transparency
This repository contains code to deploy the API server, along with code for fine-tuning and quantizing models. See the documentation to get started.
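The asynchronous processing and concurrency control features listed above can be pictured with a small sketch. This is not the project's actual implementation; it only illustrates one common way to cap parallel jobs, using an `asyncio.Semaphore`. The names `run_limited` and `fake_transcribe` are hypothetical, and the limit of 2 is an assumption chosen to match the documented default of `MAX_CONCURRENT_TRANSCRIPTIONS`.

```python
import asyncio

MAX_CONCURRENT = 2  # assumption: matches the documented default limit


async def fake_transcribe(name: str, state: dict) -> str:
    """Hypothetical stand-in for a real transcription job."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # simulate non-blocking work
    state["active"] -= 1
    return f"transcript of {name}"


async def run_limited(names, limit: int = MAX_CONCURRENT):
    """Run every job, but let at most `limit` of them execute at once."""
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}

    async def guarded(name: str) -> str:
        # Jobs beyond the limit wait here instead of overloading the server.
        async with semaphore:
            return await fake_transcribe(name, state)

    results = await asyncio.gather(*(guarded(n) for n in names))
    return list(results), state["peak"]


if __name__ == "__main__":
    texts, peak = asyncio.run(run_limited(["a.wav", "b.wav", "c.wav", "d.wav"]))
    print(texts, "peak concurrency:", peak)
```

Requests that arrive while the limit is reached are queued at the semaphore rather than rejected, which is roughly the behavior the request-queuing feature describes.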
Installation
To install the necessary dependencies and set up the Whisper binary, follow these steps:
1. System Dependencies
Install ffmpeg for audio processing and build tools (make, cmake, g++) for compiling Whisper.
# Ubuntu/Debian
sudo apt install ffmpeg git make cmake g++
# macOS
brew install ffmpeg cmake
2. Python Dependencies
Install the required Python packages.
pip install -r requirements.txt
3. Setup Environment
Copy the example environment file and configure it:
cp .env.example .env
# Edit .env with your database credentials and settings
# Optional: Set MAX_CONCURRENT_TRANSCRIPTIONS (default: 2) in .env to control parallel jobs
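As an illustration of how a setting such as `MAX_CONCURRENT_TRANSCRIPTIONS` might be read with its default of 2, here is a minimal sketch. It assumes the values from `.env` have already been loaded into the process environment (how the project does this is not shown here), and the function name `max_concurrent_transcriptions` is hypothetical.

```python
import os

DEFAULT_MAX_CONCURRENT = 2  # default stated in the setup notes


def max_concurrent_transcriptions(env=None) -> int:
    """Return the configured job limit, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("MAX_CONCURRENT_TRANSCRIPTIONS", "").strip()
    if not raw:
        return DEFAULT_MAX_CONCURRENT
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 1:
        raise ValueError("MAX_CONCURRENT_TRANSCRIPTIONS must be at least 1")
    return value


if __name__ == "__main__":
    print("Parallel transcription jobs allowed:", max_concurrent_transcriptions())
```

Failing loudly on a non-numeric or non-positive value is a design choice of this sketch; a real service might instead log a warning and use the default.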
4. Setup Whisper
Run the setup script to clone, build, and configure the Whisper binary.
chmod +x setup_whisper.sh
./setup_whisper.sh
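If the setup script fails, it can help to confirm that the tools from step 1 are on your PATH. The following is a small optional check, not part of the repository; the helper name `missing_tools` is hypothetical, and the tool list is taken from the Ubuntu/Debian install command above.

```python
import shutil

# Tools named in step 1; git is included because the Ubuntu/Debian command installs it.
REQUIRED_TOOLS = ["ffmpeg", "git", "make", "cmake", "g++"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All build and audio tools found.")
```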
Running the Project
Run Locally (without Docker)
To run the project locally (e.g., inside a Conda environment or virtualenv):
# Ensure your environment is active (e.g., conda activate whisper-api)
uvicorn app.main:app --host 0.0.0.0 --port 7860 --reload
Docker (Production)
To run the project using Docker:
# Build the image
docker build -t whisper-api .
# Run the container (ensure env vars are passed or secrets used)
# For local testing with .env file:
docker run --env-file .env -p 7860:7860 whisper-api
Get Your Token
To get your token, use the following command:
curl -X 'POST' \
'http://localhost:7860/api/v1/users/get_token' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"email": "example@domain.com",
"password": "password"
}'
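The same token request can be made from Python. The sketch below uses only the standard library and mirrors the curl command's path, headers, and JSON body. The base URL is an assumption (port 7860, as passed to uvicorn and `docker run` earlier), the helper name `build_token_request` is hypothetical, and the response format is not documented here, so the example prints it unparsed.

```python
import json
import urllib.request

# Assumption: the server listens on the port used in the run commands above.
BASE_URL = "http://localhost:7860"


def build_token_request(email: str, password: str, base_url: str = BASE_URL):
    """Build the POST request for the get_token endpoint without sending it."""
    payload = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/users/get_token",
        data=payload,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_token_request("example@domain.com", "password")
    with urllib.request.urlopen(request) as response:
        # The response schema is not specified in this README; print it as-is.
        print(response.read().decode("utf-8"))
```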
Example to Transcribe a File
To upload a file and transcribe it, choose one of the models listed here and run the command that follows the list. Note: the token in the example is a dummy and will not work; use the token provided by the admin.
Here are the available models:
- tiny.en
- tiny.en.q5
- base.en.q5
# Modify the token and audioFilePath
curl -X 'POST' \
'http://localhost:7860/api/v1/transcribe/?model=tiny.en.q5' \
-H 'accept: application/json' \
-H 'Authentication: e9b7658aa93342c492fa64153849c68b8md9uBmaqCwKq4VcgkuBD0G54FmsE8JT' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@audioFilePath.wav;type=audio/wav'
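For callers that prefer Python over curl, the upload can be sketched with the standard library alone. The code below hand-builds a single-file `multipart/form-data` body and sets the same `Authentication` header and `model` query parameter as the command above. The base URL (port 7860), the helper names `encode_multipart` and `build_transcribe_request`, and the placeholder token are assumptions for illustration; the response format is not documented here.

```python
import urllib.request
import uuid

# Assumption: the server listens on the port used in the run commands above.
BASE_URL = "http://localhost:7860"


def encode_multipart(field: str, filename: str, data: bytes, content_type: str):
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_transcribe_request(token: str, audio: bytes, filename: str = "audio.wav",
                             model: str = "tiny.en.q5", base_url: str = BASE_URL):
    """Build the POST request for the transcribe endpoint without sending it."""
    body, content_type = encode_multipart("file", filename, audio, "audio/wav")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/transcribe/?model={model}",
        data=body,
        headers={
            "accept": "application/json",
            "Authentication": token,  # header name taken from the curl example
            "Content-Type": content_type,
        },
        method="POST",
    )


if __name__ == "__main__":
    with open("audioFilePath.wav", "rb") as handle:  # replace with your file
        request = build_transcribe_request("YOUR_TOKEN", handle.read())
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))
```

A third-party HTTP client would shorten this considerably; the standard library is used here only to keep the sketch free of extra dependencies.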
License
Reference & Credits
- https://github.com/openai/whisper
- https://openai.com/blog/whisper/
- https://github.com/ggerganov/whisper.cpp
Authors
About Me
Just try to be a developer!
Support
For support, email vedgupta@protonmail.com