# API Usage Examples

This document shows how to call the Whisper Uzbek speech-to-text (STT) API from Python, JavaScript/Node.js, and cURL.

## Prerequisites

Install the Gradio client:
```bash
pip install gradio-client
```

---

## Python Examples

### Basic Usage

```python
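from gradio_client import Client, handle_file

# Connect to your Space
client = Client("YOUR_USERNAME/whisper-uzbek-stt")

# Transcribe an audio file
# handle_file() marks the argument as a local file to upload
# (assumes gradio_client 1.x)
result = client.predict(
    handle_file("path/to/audio.mp3"),
    api_name="/predict"
)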

print(result)
```

### Advanced Usage with Error Handling

```python
from gradio_client import Client, handle_file
import os

def transcribe_audio(audio_path, space_url):
    """Transcribe audio with error handling"""

    # Fail early on a bad path instead of waiting for the remote call
    if not os.path.exists(audio_path):
        raise FileNotFoundError(f"Audio file not found: {audio_path}")

    try:
        client = Client(space_url)
        # handle_file() uploads the local file (assumes gradio_client 1.x)
        result = client.predict(handle_file(audio_path), api_name="/predict")
        return result
    except Exception as e:
        print(f"Transcription error: {e}")
        return None

# Usage
space_url = "YOUR_USERNAME/whisper-uzbek-stt"
transcription = transcribe_audio("uzbek_speech.wav", space_url)

if transcription:
    print(f"Transcription: {transcription}")
```

### Batch Processing

```python
from gradio_client import Client, handle_file

def batch_transcribe(audio_files, space_url):
    """Transcribe multiple audio files"""

    # Reuse one client connection for every file
    client = Client(space_url)
    results = {}

    for audio_file in audio_files:
        try:
            print(f"Processing: {audio_file}")
            # handle_file() uploads the local file (assumes gradio_client 1.x)
            result = client.predict(handle_file(audio_file), api_name="/predict")
            results[audio_file] = result
            print(f"✓ Done: {audio_file}")
        except Exception as e:
            print(f"✗ Failed: {audio_file} - {e}")
            results[audio_file] = None

    return results

# Usage
audio_files = [
    "audio1.mp3",
    "audio2.wav",
    "audio3.m4a"
]

space_url = "YOUR_USERNAME/whisper-uzbek-stt"
results = batch_transcribe(audio_files, space_url)

# Print results
for file, transcription in results.items():
    print(f"\n{file}:")
    print(f"  {transcription}")
```

---

## JavaScript/Node.js Example

```javascript
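const axios = require('axios');
const FormData = require('form-data');

async function transcribeAudio(audioPath, spaceUrl) {
    // Caveat: audioPath is sent as a plain string. A local path is only
    // meaningful on this machine, and the payload the endpoint expects for
    // audio depends on the Space's Gradio version.
    const form = new FormData();
    form.append('data', JSON.stringify([audioPath]));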

    try {
        const response = await axios.post(
            `${spaceUrl}/api/predict`,
            form,
            {
                headers: form.getHeaders()
            }
        );

        return response.data.data[0];
    } catch (error) {
        console.error('Error:', error.message);
        return null;
    }
}

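// Usage
// Use the Space's direct app URL (the same host as in the cURL examples),
// not the huggingface.co/spaces/... page URL
const spaceUrl = 'https://YOUR_USERNAME-whisper-uzbek-stt.hf.space';
const audioPath = './audio.mp3';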

transcribeAudio(audioPath, spaceUrl)
    .then(result => console.log('Transcription:', result));
```

---

## cURL Example

### Upload and Transcribe

```bash
curl -X POST "https://YOUR_USERNAME-whisper-uzbek-stt.hf.space/api/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "data": ["path/to/audio.mp3"]
  }'
```

### Using a File Upload

```bash
# Save audio file first
audio_file="sample.mp3"

# Make API request
curl -X POST "https://YOUR_USERNAME-whisper-uzbek-stt.hf.space/api/predict" \
  -F "data=@${audio_file}"
```

---

## Response Format

The API returns JSON with the following structure:

```json
{
  "data": ["Transcribed text in Uzbek"],
  "duration": 2.5,
  "is_generating": false
}
```

---

## Error Handling

Possible error responses:

### No Audio Provided
```json
{
  "data": ["⚠️ No audio provided. Please upload or record audio."]
}
```

### Processing Error
```json
{
  "data": ["❌ Error during transcription: <error message>"]
}
```
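
Both error cases arrive as ordinary strings inside `data`, so client code has to inspect the text rather than rely on an exception or status code. The following is a minimal sketch, assuming the two emoji prefixes shown above are the only error markers the Space emits. `TranscriptionError` and `extract_transcription` are hypothetical names introduced for illustration, not part of the API.

```python
class TranscriptionError(Exception):
    """Raised when the Space reports a failure inside the `data` field."""


# Prefixes copied from the error responses documented above (an assumption:
# the Space may emit other messages that this check does not recognize).
ERROR_PREFIXES = ("⚠️", "❌")


def extract_transcription(payload):
    """Return the transcribed text from a response payload, or raise."""
    data = payload.get("data")
    if not isinstance(data, list) or not data or not isinstance(data[0], str):
        raise TranscriptionError("Response has no text in 'data'")

    text = data[0].strip()
    if text.startswith(ERROR_PREFIXES):
        # The Space signalled an error through the text itself
        raise TranscriptionError(text)
    return text
```

When calling through the Python client, `predict` hands back the value directly instead of the full JSON object, so the same prefix check can be applied to the returned string.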

---

## Rate Limiting

Hugging Face Spaces may have rate limits. For production use:
- Implement retry logic with exponential backoff
- Consider caching results
- Monitor your Space's usage metrics
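
The first two points can be sketched as follows. This is one possible approach, not a prescribed one: `retry_with_backoff`, `file_digest`, and `cached_transcribe` are hypothetical helpers, the attempt count and delays are arbitrary defaults, and the cache is a plain in-memory dict keyed by a hash of the file contents.

```python
import hashlib
import random
import time
from pathlib import Path


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delays grow 1x, 2x, 4x ...; a little jitter avoids synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)


def file_digest(path):
    """SHA-256 of the file contents, used as the cache key."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def cached_transcribe(path, transcribe, cache):
    """Return the cached result for identical audio; call transcribe(path) only on a miss."""
    key = file_digest(path)
    if key not in cache:
        cache[key] = retry_with_backoff(lambda: transcribe(path))
    return cache[key]
```

Here `transcribe` is any callable that takes a path and returns text, for example a small function wrapping whichever `client.predict(...)` call you already use. Hashing the contents rather than the file name means a renamed copy of the same audio still hits the cache.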

---

## Best Practices

1. **File Formats**: Supported formats include MP3, WAV, M4A, FLAC
2. **File Size**: Keep files under 25MB for best performance
3. **Sample Rate**: Any sample rate works (automatically resampled to 16kHz)
4. **Audio Quality**: Cleaner, higher-quality audio generally yields better transcriptions
5. **Language**: Optimized for the Uzbek language
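
The format and size guidelines above can be checked locally before any request is sent. This is a rough sketch under stated assumptions: the extension set mirrors the four formats named in the list (which says "include", so the Space may accept others), 25MB is interpreted as 25 MiB, and `check_audio_file` is a hypothetical helper name.

```python
from pathlib import Path

# Values taken from the list above; adjust if the Space's actual limits differ.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # 25 MB, treated here as 25 MiB


def check_audio_file(path, max_bytes=MAX_FILE_BYTES):
    """Return a list of problems found; an empty list means the file looks fine."""
    p = Path(path)
    if not p.is_file():
        return [f"File not found: {p}"]

    problems = []
    # Only the extension is checked, not the actual audio encoding
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported extension: {p.suffix or '(none)'}")
    if p.stat().st_size > max_bytes:
        problems.append(f"File is larger than {max_bytes} bytes")
    return problems
```

Returning a list instead of raising lets a batch job report every problem for every file in one pass and skip only the files that fail.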

---

## Troubleshooting

### Connection Issues
```python
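# Add a timeout (in seconds). Client has no `timeout` argument; recent
# gradio_client releases forward `httpx_kwargs` to the underlying httpx calls.
from gradio_client import Client

client = Client("YOUR_USERNAME/whisper-uzbek-stt", httpx_kwargs={"timeout": 60})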
```

### Large Files
```python
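# Pass the file by path with handle_file(); the client uploads it for you
# (assumes gradio_client 1.x; open file objects are not accepted as inputs)
from gradio_client import Client, handle_file

client = Client("YOUR_USERNAME/whisper-uzbek-stt")
result = client.predict(handle_file("large_audio.mp3"), api_name="/predict")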
```

---

## Support

For issues or questions:
- Check the Space logs on Hugging Face
- Review the error messages in the response
- Ensure your audio file is valid and accessible