---
title: Whisper Large V3 Turbo
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.46.0
pinned: false
license: mit
short_description: Whisper LV3Turbo for ASR -API
---
# Whisper Large V3 Turbo - Speech Recognition
This Space provides a complete speech recognition solution built on OpenAI's Whisper Large V3 Turbo model. It includes a modern web interface with direct transcription and downloadable results, plus API endpoints for external integration.
## Features
- 🚀 **Fast Inference**: Uses Whisper Large V3 Turbo (reduced from 32 to 4 decoding layers)
- 🎯 **High Accuracy**: State-of-the-art speech recognition
- 🌍 **Multilingual**: Supports 99 languages
- 🎨 **Modern UI**: Beautiful, responsive Gradio interface
- 📥 **Download Support**: Save transcriptions as text files
- 🔄 **API Endpoints**: RESTful API for external integration
- 🔒 **CORS Enabled**: Works with any frontend
## Usage
### Web Interface
Simply visit the Space and use the intuitive web interface:
1. **Upload** an audio/video file or **record** directly
2. Click **Transcribe Audio** to process
3. View results in the scrollable area
4. **Download** the transcription as a text file
### API Endpoints
#### Transcribe Audio
```bash
# POST /transcribe
# Content-Type: multipart/form-data
# Send the audio file as form data
curl -X POST "https://your-space.hf.space/transcribe" \
  -F "file=@audio.mp3"
```
#### Health Check
```bash
# GET /health
curl "https://your-space.hf.space/health"
```
## API Reference
### POST /transcribe
Transcribe an audio/video file.
**Request:**
- Method: POST
- Content-Type: multipart/form-data
- Body: File upload
**Response:**
```json
{
  "text": "Transcribed text here...",
  "success": true
}
```
### GET /health
Check API health status.
**Response:**
```json
{
  "status": "healthy",
  "model_loaded": true
}
```
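
Because the health response reports whether the model is loaded, a client may want to wait for readiness before uploading audio. The following is a minimal sketch, not part of this Space's code: the helper names `is_ready` and `wait_until_ready`, the retry count, and the delay are all assumptions, and `https://your-space.hf.space` is the same placeholder used elsewhere in this README.

```python
import json
import time
import urllib.request

BASE_URL = "https://your-space.hf.space"  # placeholder, replace with your Space URL


def is_ready(payload) -> bool:
    """Return True when a /health payload reports a healthy, loaded model."""
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def wait_until_ready(base_url: str = BASE_URL, attempts: int = 5, delay: float = 2.0) -> bool:
    """Poll GET /health until the model is loaded or the attempts run out.

    The attempt count and delay are illustrative defaults, not documented values.
    """
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
                if is_ready(json.load(resp)):
                    return True
        except (OSError, ValueError):
            pass  # network error or non-JSON body: treat as "not ready yet"
        time.sleep(delay)
    return False
```

Calling `wait_until_ready()` before the first `POST /transcribe` avoids sending a large upload to a Space that is still starting up.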
## Usage Examples
### cURL Example
```bash
curl -X POST "https://your-space.hf.space/transcribe" \
  -F "file=@audio.mp3"
```
### Python Example
```python
import requests

# Transcribe an audio file
with open('audio.mp3', 'rb') as f:
    files = {'file': ('audio.mp3', f, 'audio/mpeg')}
    response = requests.post('https://your-space.hf.space/transcribe', files=files)

result = response.json()
if result.get('success'):
    print(result['text'])
else:
    # Failed requests carry an 'error' message instead of 'text'
    print(f"Error: {result.get('error', 'unknown error')}")
```
### JavaScript Example
```javascript
// audioFile is a File or Blob, e.g. taken from an <input type="file"> element
const formData = new FormData();
formData.append('file', audioFile);

fetch('https://your-space.hf.space/transcribe', {
  method: 'POST',
  body: formData
})
  .then(response => response.json())
  .then(result => {
    if (result.success) {
      console.log(result.text);
    } else {
      console.error(result.error);
    }
  });
```
## Supported File Formats
### Audio Formats
- MP3, WAV, FLAC, M4A, OGG
### Video Formats
- MP4, AVI, MOV, MKV
*Note: For video files, only the audio track will be processed.*
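
A client can check a file's extension against the formats listed above before uploading. This is a rough sketch only: it assumes the file extension matches the actual container format, and the helper name `is_supported` is hypothetical. How the server itself validates uploads is not documented here.

```python
from pathlib import Path

# Extensions taken from the format lists above
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def is_supported(filename: str) -> bool:
    """Return True if the filename has one of the listed audio/video extensions."""
    suffix = Path(filename).suffix.lower()  # case-insensitive comparison
    return suffix in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS
```

For example, `is_supported("talk.MP3")` is true while `is_supported("notes.txt")` is false, so unsupported files can be rejected before any network request is made.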
## Model Details
- **Model**: Whisper Large V3 Turbo (809M parameters)
- **Speed**: ~4x faster than standard Large V3
- **GPU**: Uses ZeroGPU for efficient inference
- **Languages**: Supports 99 languages
- **Accuracy**: Maintains high accuracy despite speed optimizations
## Example Response
```json
{
  "text": "Hello, this is a transcription of the audio file.",
  "success": true
}
```
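## Troubleshooting
When transcription fails, the API returns `"success": false` together with an `error` message describing the problem: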
```json
{
  "error": "Transcription failed: [error message]",
  "success": false
}
```
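
One way to handle both response shapes in client code is to turn a failed response into an exception. The sketch below assumes only the `text`, `success`, and `error` fields shown in this README; `TranscriptionError` and `extract_text` are hypothetical names, not part of the API.

```python
class TranscriptionError(Exception):
    """Raised when the API reports success == false."""


def extract_text(result: dict) -> str:
    """Return the transcription text, or raise with the API's error message."""
    if result.get("success"):
        return result.get("text", "")
    # Fall back to a generic message if the 'error' field is missing
    raise TranscriptionError(result.get("error", "unknown error"))
```

Passing the parsed JSON from `POST /transcribe` to `extract_text` lets calling code use ordinary `try`/`except` handling instead of checking `success` at every call site.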