metadata
title: Whisper Large V3 Turbo
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.46.0
pinned: false
license: mit
short_description: Whisper LV3Turbo for ASR -API

Whisper Large V3 Turbo - Speech Recognition

This Space provides a complete speech recognition solution built on OpenAI's Whisper Large V3 Turbo model. It offers a modern web interface with direct transcription and download capabilities, plus HTTP endpoints for use from other applications.

Features

  • 🚀 Fast Inference: Uses Whisper Large V3 Turbo (decoder reduced from 32 to 4 layers)
  • 🎯 High Accuracy: State-of-the-art speech recognition
  • 🌍 Multilingual: Supports 99 languages
  • 🎨 Modern UI: Beautiful, responsive Gradio interface
  • 📥 Download Support: Save transcriptions as text files
  • 🔄 API Endpoints: RESTful API for external integration
  • 🔒 CORS Enabled: Works with any frontend

Usage

Web Interface

Simply visit the Space and use the intuitive web interface:

  1. Upload an audio/video file or record directly
  2. Click Transcribe Audio to process
  3. View results in the scrollable area
  4. Download the transcription as a text file

API Endpoints

Transcribe Audio

POST /transcribe
Content-Type: multipart/form-data

# Send audio file as form data
curl -X POST "https://your-space.hf.space/transcribe" \
  -F "file=@audio.mp3"

Health Check

GET /health

curl "https://your-space.hf.space/health"

API Reference

POST /transcribe

Transcribe an audio/video file.

Request:

  • Method: POST
  • Content-Type: multipart/form-data
  • Body: File upload

Response:

{
  "text": "Transcribed text here...",
  "success": true
}

GET /health

Check API health status.

Response:

{
  "status": "healthy",
  "model_loaded": true
}
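
Because the health response reports `model_loaded`, a client can poll `/health` before sending its first transcription request. The following is a minimal sketch of that idea using only the standard library. The names `fetch_health` and `wait_until_ready`, the retry count, and the delay are illustrative assumptions and not part of this Space's API.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)


def wait_until_ready(
    fetch: Callable[[], dict],
    attempts: int = 10,
    delay: float = 3.0,
) -> bool:
    """Poll until the health payload reports a loaded model.

    `fetch` is any zero-argument callable returning the /health payload,
    so the polling logic can be exercised without a network connection.
    """
    for attempt in range(attempts):
        try:
            status = fetch()
        except (OSError, ValueError):
            # Network error or non-JSON body: treat the Space as not ready yet.
            status = {}
        if status.get("status") == "healthy" and status.get("model_loaded"):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A call might look like `wait_until_ready(lambda: fetch_health("https://your-space.hf.space"))`, with the placeholder URL replaced by your own Space.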

Usage Examples

Python Example

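import requests

# Replace with your own Space's URL
URL = 'https://your-space.hf.space/transcribe'

# Upload the audio file as multipart/form-data
with open('audio.mp3', 'rb') as f:
    files = {'file': ('audio.mp3', f, 'audio/mpeg')}
    response = requests.post(URL, files=files)

result = response.json()

# Failures are reported as "success": false with an "error" message
if result.get('success'):
    print(result['text'])
else:
    print(f"Error: {result.get('error', 'unknown error')}")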

JavaScript Example

// audioFile is a File or Blob, e.g. from an <input type="file"> element
const formData = new FormData();
formData.append('file', audioFile);

fetch('https://your-space.hf.space/transcribe', {
    method: 'POST',
    body: formData
})
.then(response => response.json())
.then(result => {
    if (result.success) {
        console.log(result.text);
    } else {
        console.error(result.error);
    }
})
.catch(err => console.error('Request failed:', err));

Supported File Formats

Audio Formats

  • MP3, WAV, FLAC, M4A, OGG

Video Formats

  • MP4, AVI, MOV, MKV

Note: For video files, only the audio track will be processed.
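
A client can check a file's extension against the formats listed above before uploading, which avoids a round trip for files that are unlikely to be accepted. This is a minimal sketch: the helper names `is_supported` and `media_kind` are hypothetical, and the sets only mirror this README, so the server itself may accept or reject other formats.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the lists in this README (an assumption about
# what the server accepts, not a guarantee).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def media_kind(path: str) -> Optional[str]:
    """Return "audio", "video", or None based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None


def is_supported(path: str) -> bool:
    """True if the extension appears in either list above."""
    return media_kind(path) is not None
```

The check is case-insensitive, so `talk.MP3` and `talk.mp3` are treated the same.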

Model Details

  • Model: Whisper Large V3 Turbo (809M parameters)
  • Speed: ~4x faster than standard Large V3
  • GPU: Uses ZeroGPU for efficient inference
  • Languages: Supports 99 languages
  • Accuracy: Stays close to Large V3, trading a minor quality loss for the speed gain

Example Response

{
  "text": "Hello, this is a transcription of the audio file.",
  "success": true
}

Troubleshooting

If transcription fails, the API returns "success": false together with an "error" message:

{
  "error": "Transcription failed: [error message]",
  "success": false
}
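
On the client side, the success and error shapes shown in this README can be handled in one place. The sketch below assumes the response has already been decoded into a dictionary. `TranscriptionError` and `extract_text` are illustrative names, not something the API provides.

```python
class TranscriptionError(Exception):
    """Raised when the API response carries "success": false."""


def extract_text(result: dict) -> str:
    """Return the transcription, or raise with the server's error message."""
    if result.get("success"):
        return result.get("text", "")
    # Fall back to a generic message if the "error" field is missing.
    raise TranscriptionError(result.get("error", "unknown error"))
```

Calling code can then wrap `extract_text(response.json())` in a `try`/`except TranscriptionError` block and decide whether to retry or report the message to the user.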