
title: Whisper Large V3 Turbo
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.46.0
pinned: false
license: mit
short_description: Whisper LV3Turbo for ASR -API

Whisper Large V3 Turbo - Speech Recognition

This Space provides speech recognition using OpenAI's Whisper Large V3 Turbo model. It includes a modern web interface with direct transcription and download capabilities, plus an HTTP API for external integration.

Features

🚀 Fast Inference: Uses Whisper Large V3 Turbo (reduced from 32 to 4 decoding layers)
🎯 High Accuracy: State-of-the-art speech recognition
🌍 Multilingual: Supports 99 languages
🎨 Modern UI: Beautiful, responsive Gradio interface
📥 Download Support: Save transcriptions as text files
🔄 API Endpoints: RESTful API for external integration
🔒 CORS Enabled: Works with any frontend

Usage

Web Interface

Simply visit the Space and use the intuitive web interface:

Upload an audio/video file or record directly
Click Transcribe Audio to process
View results in the scrollable area
Download the transcription as a text file

API Endpoints

Transcribe Audio

POST /transcribe
Content-Type: multipart/form-data

# Send audio file as form data
curl -X POST "https://your-space.hf.space/transcribe" \
  -F "file=@audio.mp3"

Health Check

GET /health

curl "https://your-space.hf.space/health"

API Reference

POST /transcribe

Transcribe an audio/video file.

Request:

Method: POST
Content-Type: multipart/form-data
Body: File upload

Response:

{
  "text": "Transcribed text here...",
  "success": true
}

GET /health

Check API health status.

Response:

{
  "status": "healthy",
  "model_loaded": true
}
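Because the health response reports whether the model is loaded, a client can poll it before sending audio. The following is a minimal sketch using only the standard library. The URL is the placeholder used throughout this README, and the helper names, retry count, and delay are illustrative assumptions, not part of the Space.

```python
import json
import time
import urllib.request
from typing import Callable

# Placeholder URL from this README; replace with your Space's URL.
HEALTH_URL = "https://your-space.hf.space/health"


def fetch_health(url: str = HEALTH_URL, timeout: float = 10.0) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_ready(payload: dict) -> bool:
    """True only when both documented fields report a healthy, loaded model."""
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def wait_until_ready(
    fetch: Callable[[], dict] = fetch_health,
    attempts: int = 5,   # arbitrary default (assumption)
    delay: float = 2.0,  # seconds between polls (assumption)
) -> bool:
    """Poll until the API reports ready, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            if is_ready(fetch()):
                return True
        except (OSError, ValueError):
            # Network errors (URLError is an OSError) and invalid JSON
            # are treated as "not ready yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_ready()` before the first upload returns `True` once the endpoint reports a loaded model, and `False` if it never does within the given attempts.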

Usage Examples

Python Example

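import requests

# Replace with your Space's URL
URL = 'https://your-space.hf.space/transcribe'

# Transcribe audio file
with open('audio.mp3', 'rb') as f:
    files = {'file': ('audio.mp3', f, 'audio/mpeg')}
    # Long recordings can take a while; the timeout value here is arbitrary
    response = requests.post(URL, files=files, timeout=300)

result = response.json()

if result.get('success'):
    print(result['text'])
else:
    # Error responses carry an 'error' message instead of 'text'
    print(f"Error: {result.get('error', 'Unknown error')}")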
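To mirror the web interface's download feature from a script, the returned text can be written to a file next to the audio. This is a small sketch. The helper name `save_transcript` and the choice of a `.txt` file alongside the input are assumptions for illustration.

```python
from pathlib import Path


def save_transcript(audio_path: str, text: str) -> Path:
    """Write `text` to a .txt file with the same name as the audio file."""
    out_path = Path(audio_path).with_suffix(".txt")
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

For example, `save_transcript('audio.mp3', result['text'])` writes the transcription to `audio.txt` in the same directory.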

JavaScript Example

// audioFile is a File or Blob, e.g. from an <input type="file"> element
const formData = new FormData();
formData.append('file', audioFile);

fetch('https://your-space.hf.space/transcribe', {
    method: 'POST',
    body: formData
})
.then(response => response.json())
.then(result => {
    if (result.success) {
        console.log(result.text);
    } else {
        console.error(result.error);
    }
})
.catch(err => console.error('Request failed:', err));

Supported File Formats

Audio Formats

MP3, WAV, FLAC, M4A, OGG

Video Formats

MP4, AVI, MOV, MKV

Note: For video files, only the audio track will be processed.
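A client can check a file's extension against the formats listed above before uploading. The sketch below is a client-side convenience only. The function name is hypothetical, and the server may accept or reject files independently of this check.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the format lists in this README.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def media_kind(path: str) -> Optional[str]:
    """Return 'audio', 'video', or None for an unlisted extension."""
    suffix = Path(path).suffix.lower()  # case-insensitive match
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

A caller might skip the upload and show a message when `media_kind(path)` returns `None`.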

Model Details

Model: Whisper Large V3 Turbo (809M parameters)
Speed: ~4x faster than standard Large V3
GPU: Uses ZeroGPU for efficient inference
Languages: Supports 99 languages
Accuracy: Maintains high accuracy despite speed optimizations

Example Response

{
  "text": "Hello, this is a transcription of the audio file.",
  "success": true
}

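Troubleshooting

If transcription fails, the API returns an error response with success set to false and an error message: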

{
  "error": "Transcription failed: [error message]",
  "success": false
}
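The success and error shapes shown in this README can be handled in one place on the client. The following is a minimal sketch. The `TranscriptionError` class and `extract_text` function are hypothetical names, and the fallback message is an assumption for responses that lack an `error` field.

```python
class TranscriptionError(Exception):
    """Raised when the API reports failure or returns no text."""


def extract_text(payload: dict) -> str:
    """Return the transcription, or raise with the API's error message."""
    if payload.get("success") and isinstance(payload.get("text"), str):
        return payload["text"]
    # Fall back to a generic message if the response has no 'error' field.
    raise TranscriptionError(payload.get("error", "Unknown error"))
```

With this helper, calling code can use a single `try`/`except TranscriptionError` block instead of checking the `success` flag at each call site.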