API Usage Examples

This document shows how to call the Whisper Uzbek STT API programmatically from Python, JavaScript, and cURL.

Prerequisites

Install the Gradio client:

pip install gradio-client

Python Examples

Basic Usage

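from gradio_client import Client, handle_file

# Connect to your Space
client = Client("YOUR_USERNAME/whisper-uzbek-stt")

# Transcribe an audio file. handle_file() uploads the local file to the Space;
# recent gradio_client releases expect it for file inputs.
result = client.predict(
    handle_file("path/to/audio.mp3"),
    api_name="/predict"
)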

print(result)

Advanced Usage with Error Handling

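from gradio_client import Client, handle_file
import os

def transcribe_audio(audio_path, space_url):
    """Transcribe audio with error handling"""

    # Fail early with a clear message if the file is missing
    if not os.path.exists(audio_path):
        raise FileNotFoundError(f"Audio file not found: {audio_path}")

    try:
        client = Client(space_url)
        # handle_file() uploads the local file (expected by recent gradio_client releases)
        result = client.predict(handle_file(audio_path), api_name="/predict")
        return result
    except Exception as e:
        # Connection and API failures are reported and signalled with None
        print(f"Transcription error: {e}")
        return None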

# Usage
space_url = "YOUR_USERNAME/whisper-uzbek-stt"
transcription = transcribe_audio("uzbek_speech.wav", space_url)

if transcription:
    print(f"Transcription: {transcription}")

Batch Processing

from gradio_client import Client, handle_file

def batch_transcribe(audio_files, space_url):
    """Transcribe multiple audio files"""

    # Reuse one client for every file instead of reconnecting each time
    client = Client(space_url)
    results = {}

    for audio_file in audio_files:
        try:
            print(f"Processing: {audio_file}")
            # handle_file() uploads the local file (expected by recent gradio_client releases)
            result = client.predict(handle_file(audio_file), api_name="/predict")
            results[audio_file] = result
            print(f"✓ Done: {audio_file}")
        except Exception as e:
            print(f"✗ Failed: {audio_file} - {e}")
            results[audio_file] = None

    return results

# Usage
audio_files = [
    "audio1.mp3",
    "audio2.wav",
    "audio3.m4a"
]

space_url = "YOUR_USERNAME/whisper-uzbek-stt"
results = batch_transcribe(audio_files, space_url)

# Print results
for file, transcription in results.items():
    print(f"\n{file}:")
    print(f"  {transcription}")
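The file list above is written out by hand. When the recordings live in one folder, the list could instead be built with pathlib, as in the minimal sketch below. The helper name collect_audio_files and the extension set are assumptions made for illustration (the extensions are the formats listed under Best Practices); neither is part of the Space's API.

```python
from pathlib import Path

# Assumed extension set, taken from the formats listed under Best Practices.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}

def collect_audio_files(directory):
    """Return sorted paths (as strings) of audio files directly inside `directory`."""
    root = Path(directory)
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

The returned list can be passed straight to batch_transcribe, for example batch_transcribe(collect_audio_files("recordings"), space_url), where "recordings" is a hypothetical folder name. Subfolders are not searched.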

JavaScript/Node.js Example

const axios = require('axios');

async function transcribeAudio(audioPath, spaceUrl) {
    try {
        // Send the same JSON body as the cURL example below.
        // Note: the path is sent as a string; the file itself is not uploaded.
        const response = await axios.post(
            `${spaceUrl}/api/predict`,
            { data: [audioPath] },
            {
                headers: { 'Content-Type': 'application/json' }
            }
        );

        return response.data.data[0];
    } catch (error) {
        console.error('Error:', error.message);
        return null;
    }
}

// Usage
// Use the Space's direct hf.space URL (as in the cURL examples), not the huggingface.co page URL
const spaceUrl = 'https://YOUR_USERNAME-whisper-uzbek-stt.hf.space';
const audioPath = './audio.mp3';

transcribeAudio(audioPath, spaceUrl)
    .then(result => console.log('Transcription:', result));

cURL Example

Upload and Transcribe

curl -X POST "https://YOUR_USERNAME-whisper-uzbek-stt.hf.space/api/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "data": ["path/to/audio.mp3"]
  }'

Using a File Upload

# Path to a local audio file
audio_file="sample.mp3"

# Make API request
curl -X POST "https://YOUR_USERNAME-whisper-uzbek-stt.hf.space/api/predict" \
  -F "data=@${audio_file}"

Response Format

The API returns JSON with the following structure:

{
  "data": ["Transcribed text in Uzbek"],
  "duration": 2.5,
  "is_generating": false
}
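When calling the HTTP endpoint directly rather than through gradio_client, the transcription has to be pulled out of this JSON body by hand. A minimal sketch, assuming the body has exactly the structure shown above; the helper name extract_transcription is hypothetical.

```python
import json

def extract_transcription(response_text):
    """Return the first item of the "data" list from a raw JSON response body."""
    payload = json.loads(response_text)
    data = payload.get("data") or []
    if not data:
        raise ValueError("Response contains no 'data' entries")
    return data[0]

# Sample body copied from the structure documented above.
sample = '{"data": ["Transcribed text in Uzbek"], "duration": 2.5, "is_generating": false}'
print(extract_transcription(sample))
```

The "duration" and "is_generating" fields are ignored here; read them from the parsed payload if you need them.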

Error Handling

Possible error responses:

No Audio Provided

{
  "data": ["⚠️ No audio provided. Please upload or record audio."]
}

Processing Error

{
  "data": ["❌ Error during transcription: <error message>"]
}
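Both error responses arrive as ordinary strings inside "data", so a plain truthiness check such as `if transcription:` treats them as successful transcriptions. One way to catch them is to test for the leading symbols, as in the sketch below. It assumes the Space always prefixes its errors with the two symbols shown above; the helper name is_error_message is hypothetical.

```python
# Leading symbols of the two documented error strings, written as escapes
# so the check does not depend on how the emoji are encoded in this file:
# "\u26a0" is the warning sign, "\u274c" is the cross mark.
ERROR_PREFIXES = ("\u26a0", "\u274c")

def is_error_message(text):
    """Return True if `text` looks like one of the documented error strings."""
    return isinstance(text, str) and text.lstrip().startswith(ERROR_PREFIXES)
```

A caller could then branch on is_error_message(result) before storing or displaying the text.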

Rate Limiting

Hugging Face Spaces may have rate limits. For production use:

  • Implement retry logic with exponential backoff
  • Consider caching results
  • Monitor your Space's usage metrics
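The retry advice above can be sketched as a small wrapper around any call. This is an illustration under stated assumptions, not a feature of gradio_client: the name call_with_backoff, the default of four attempts, the one-second base delay, and the 10% jitter are all choices made for this example.

```python
import random
import time

def call_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `func()` and retry on any exception, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait base_delay * 2^attempt seconds, plus up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

To use it, wrap the request in a function with no arguments, for example call_with_backoff(lambda: client.predict(audio, api_name="/predict")). The sleep parameter exists only so the waiting can be replaced in tests.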

Best Practices

  1. File Formats: Supported formats include MP3, WAV, M4A, FLAC
  2. File Size: Keep files under 25MB for best performance
  3. Sample Rate: Any sample rate works (automatically resampled to 16kHz)
  4. Audio Quality: Clear recordings with little background noise generally transcribe more accurately
  5. Language: Optimized for Uzbek
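The format and size guidelines above can be checked locally before a file is sent. A minimal sketch, assuming the four listed formats are the only ones you want to accept and treating 25MB as 25 * 1024 * 1024 bytes; the helper name check_audio_file is hypothetical.

```python
import os

# Assumed limits, taken from the list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}
MAX_SIZE_BYTES = 25 * 1024 * 1024

def check_audio_file(path):
    """Return a list of problems found with `path`; an empty list means it looks fine."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported format: {extension or '(none)'}")
    if not os.path.isfile(path):
        problems.append("File not found")
    elif os.path.getsize(path) > MAX_SIZE_BYTES:
        problems.append("File is larger than 25MB")
    return problems
```

The check looks only at the file name and size. It does not confirm that the contents are valid audio.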

Troubleshooting

Connection Issues

# Add a timeout (in seconds) for the underlying HTTP requests
from gradio_client import Client

client = Client("YOUR_SPACE_URL", httpx_kwargs={"timeout": 60})

Large Files

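# Upload the file through the client instead of passing a bare path
from gradio_client import Client, handle_file

client = Client("YOUR_USERNAME/whisper-uzbek-stt")
result = client.predict(handle_file("large_audio.mp3"), api_name="/predict")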

Support

For issues or questions:

  • Check the Space logs on Hugging Face
  • Review the error messages in the response
  • Ensure your audio file is valid and accessible