LiamKhoaLe

Upd name

0b9b474 2 months ago

	---
	title: Whisper Large V3 Turbo
	emoji: 🎤
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.46.0
	pinned: false
	license: mit
	short_description: Whisper LV3Turbo for ASR -API
	---

	# Whisper Large V3 Turbo - Speech Recognition

	This Space provides a complete speech recognition solution using OpenAI's Whisper Large V3 Turbo model. It features a modern web interface with direct transcription and download capabilities.

	## Features

	- 🚀 Fast Inference: Uses Whisper Large V3 Turbo (decoder reduced from 32 to 4 layers)
	- 🎯 High Accuracy: State-of-the-art speech recognition
	- 🌍 Multilingual: Supports 99 languages
	- 🎨 Modern UI: Beautiful, responsive Gradio interface
	- 📥 Download Support: Save transcriptions as text files
	- 🔄 API Endpoints: RESTful API for external integration
	- 🔒 CORS Enabled: Works with any frontend

	## Usage

	### Web Interface
	Visit the Space and use the web interface:
	1. Upload an audio/video file or record directly
	2. Click Transcribe Audio to process
	3. View results in the scrollable area
	4. Download the transcription as a text file

	### API Endpoints

	#### Transcribe Audio
	```bash
	# POST /transcribe (Content-Type: multipart/form-data)
	# Send the audio file as form data
	curl -X POST "https://your-space.hf.space/transcribe" \
	  -F "file=@audio.mp3"
	```

	#### Health Check
	```bash
	# GET /health
	curl "https://your-space.hf.space/health"
	```

	## API Endpoints

	### POST /transcribe
	Transcribe an audio/video file.

	Request:
	- Method: POST
	- Content-Type: multipart/form-data
	- Body: File upload

	Response:
	```json
	{
	  "text": "Transcribed text here...",
	  "success": true
	}
	```

	### GET /health
	Check API health status.

	Response:
	```json
	{
	  "status": "healthy",
	  "model_loaded": true
	}
	```
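
Because `/health` reports whether the model is loaded, a client can poll it before sending audio. The following is a minimal sketch, not part of the Space itself: `fetch_health` and `wait_until_ready` are hypothetical helper names, the retry count and delay are arbitrary, and the response fields are assumed to match the example above.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET <base_url>/health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)


def wait_until_ready(base_url, fetch=fetch_health, attempts=5, delay=2.0):
    """Return True once /health reports a loaded model, False if attempts run out."""
    for attempt in range(attempts):
        try:
            health = fetch(base_url)
            # Field names are assumed from the documented /health response.
            if health.get("status") == "healthy" and health.get("model_loaded"):
                return True
        except (OSError, ValueError):
            # Network failures (URLError is an OSError) and invalid JSON
            # are both treated as "not ready yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_ready("https://your-space.hf.space")` before the first upload may help avoid requests that arrive while the Space is still starting. The `fetch` parameter exists only so the polling logic can be exercised without a network connection.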

	## Usage Examples

	### cURL Example
	```bash
	curl -X POST "https://your-space.hf.space/transcribe" \
	  -F "file=@audio.mp3"
	```

	### Python Example
	```python
	import requests

	# Transcribe an audio file
	with open('audio.mp3', 'rb') as f:
	    files = {'file': ('audio.mp3', f, 'audio/mpeg')}
	    response = requests.post('https://your-space.hf.space/transcribe', files=files)

	result = response.json()

	if result['success']:
	    print(result['text'])
	else:
	    print(f"Error: {result['error']}")
	```

	### JavaScript Example
	```javascript
	// audioFile is a File or Blob, e.g. from an <input type="file"> element
	const formData = new FormData();
	formData.append('file', audioFile);

	fetch('https://your-space.hf.space/transcribe', {
	  method: 'POST',
	  body: formData
	})
	  .then(response => response.json())
	  .then(result => {
	    if (result.success) {
	      console.log(result.text);
	    } else {
	      console.error(result.error);
	    }
	  });
	```

	## Supported File Formats

	### Audio Formats
	- MP3, WAV, FLAC, M4A, OGG

	### Video Formats
	- MP4, AVI, MOV, MKV

	Note: For video files, only the audio track will be processed.
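
A client may want to check a file's extension against the lists above before uploading. The sketch below assumes each listed format corresponds to its usual lowercase extension. `classify_media` and the two constant names are hypothetical, and the server may accept or reject files independently of this check.

```python
from pathlib import Path

# Extensions derived from the format lists above (an assumption: one
# conventional extension per listed format).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def classify_media(filename):
    """Return "audio", "video", or None for an unlisted extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

A caller could skip the upload when `classify_media(path)` returns `None`, which saves a round trip for files the API is not documented to handle.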

	## Model Details

	- Model: Whisper Large V3 Turbo (809M parameters)
	- Speed: ~4x faster than standard Large V3
	- GPU: Uses ZeroGPU for efficient inference
	- Languages: Supports 99 languages
	- Accuracy: Maintains high accuracy despite speed optimizations

	## Example Response

	```json
	{
	  "text": "Hello, this is a transcription of the audio file.",
	  "success": true
	}
	```

	## Troubleshooting

	If transcription fails, the API returns an error response with `success` set to `false`:

	```json
	{
	  "error": "Transcription failed: [error message]",
	  "success": false
	}
	```
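
One way to handle both response shapes in client code is to turn a failed response into an exception. This is a sketch under the assumption that every response carries the `success` field shown above; `TranscriptionError` and `extract_text` are hypothetical names, not part of the API.

```python
class TranscriptionError(Exception):
    """Raised when a /transcribe response reports success=false."""


def extract_text(payload):
    """Return the transcription from a parsed /transcribe response, or raise."""
    if payload.get("success"):
        return payload.get("text", "")
    # Fall back to a generic message if the "error" field is missing.
    raise TranscriptionError(payload.get("error", "Unknown error"))
```

With `requests`, this could be used as `extract_text(response.json())`, so the calling code only deals with a string or an exception.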