metadata
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
🎙️ Whisper German ASR
Fine-tuned Whisper model for German Automatic Speech Recognition (ASR).
Description
This Space provides an interactive interface for transcribing German audio using a fine-tuned version of OpenAI's Whisper-small model. The model has been specifically optimized for German speech recognition.
How to Use
- Upload Audio: Click the audio input area to upload an audio file (WAV, MP3, FLAC, etc.)
- Or Record Audio: Use the microphone button to record audio directly
- Transcribe: Click the "Transcribe" button to generate the transcription
- View Results: The transcription appears on the right side
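The steps above could be wired together roughly as follows. This is a minimal sketch of a hypothetical app.py, not the Space's actual code: `MODEL_ID` is a placeholder (the fine-tuned checkpoint ID is not given here, so the base model is used), and the names `get_asr`, `transcribe`, and `build_demo` are illustrative. It assumes the `gradio` (4.x or later) and `transformers` packages are installed; both are imported lazily so the module loads without them.

```python
MODEL_ID = "openai/whisper-small"  # placeholder: substitute the fine-tuned checkpoint
_asr = None  # speech-recognition pipeline, created on first use


def get_asr():
    """Load the Transformers ASR pipeline once and reuse it."""
    global _asr
    if _asr is None:
        from transformers import pipeline  # heavy dependency, imported lazily
        _asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return _asr


def transcribe(audio_path):
    """Return the German transcription of the audio file at audio_path."""
    if not audio_path:  # nothing uploaded or recorded yet
        return ""
    result = get_asr()(
        audio_path,
        generate_kwargs={"language": "german", "task": "transcribe"},
    )
    return result["text"].strip()


def build_demo():
    """Audio input and button on the left, transcription on the right."""
    import gradio as gr  # imported lazily, see note above

    with gr.Blocks(title="Whisper German ASR") as demo:
        with gr.Row():
            with gr.Column():
                audio = gr.Audio(
                    sources=["upload", "microphone"],  # upload a file or record
                    type="filepath",
                    label="Audio",
                )
                button = gr.Button("Transcribe")
            with gr.Column():
                text = gr.Textbox(label="Transcription")
        button.click(fn=transcribe, inputs=audio, outputs=text)
    return demo
```

Calling `build_demo().launch()` would start the interface locally.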
Model Details
- Base Model: OpenAI Whisper-small (242M parameters)
- Fine-tuned on: German subset of the MINDS14 dataset
- Language: German (de)
- Task: Transcription
- Performance: ~13% Word Error Rate (WER)
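To make the WER figure concrete, here is a small sketch of how word error rate is typically computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the lowercase normalization are assumptions for illustration; the normalization behind the reported ~13% is not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "ich möchte bitte heute mein konto sperren lassen"
hypothesis = "ich möchte bitte heute mein konto sparen lassen"
print(word_error_rate(reference, hypothesis))  # 0.125 (1 substitution / 8 words)
```

A WER of about 13% therefore corresponds to roughly one word-level error per eight reference words.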
Features
- ✅ Upload audio files in various formats
- ✅ Record audio directly from microphone
- ✅ Real-time transcription
- ✅ Optimized for the German language
- ✅ Support for audio up to 30 seconds
Technical Specifications
- Sample Rate: 16kHz
- Max Duration: 30 seconds
- Beam Search: 5 beams
- Device: CPU/GPU auto-detection
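The specifications above can be expressed as a short preprocessing sketch. All names here are hypothetical, and the resampler is a naive linear interpolation chosen only to keep the example dependency-free; a real app would more likely rely on a library resampler with proper anti-aliasing. Device detection falls back to the CPU when PyTorch is not installed.

```python
TARGET_SAMPLE_RATE = 16_000  # Hz
MAX_DURATION_S = 30          # seconds
NUM_BEAMS = 5

# Decoding options as they might be passed to a Whisper generate call.
GENERATE_KWARGS = {"num_beams": NUM_BEAMS, "language": "german", "task": "transcribe"}


def pick_device() -> str:
    """Return "cuda" when PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def resample_linear(samples, source_rate, target_rate=TARGET_SAMPLE_RATE):
    """Naive linear-interpolation resampler (no anti-aliasing filter)."""
    if source_rate == target_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * target_rate / source_rate)
    out = []
    for i in range(n_out):
        pos = i * source_rate / target_rate       # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)   # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def prepare_audio(samples, source_rate):
    """Resample to 16 kHz and keep at most the first 30 seconds."""
    resampled = resample_linear(samples, source_rate)
    return resampled[: TARGET_SAMPLE_RATE * MAX_DURATION_S]
```

For example, `prepare_audio(samples, 44_100)` would return at most 480,000 samples (30 seconds at 16 kHz).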
Tips for Best Results
- Speak clearly and at a moderate pace
- Minimize background noise
- Ensure the audio is in German
- Keep audio clips between 1 and 30 seconds for optimal results
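The recommended clip length can be checked before uploading. The sketch below uses Python's standard `wave` module, so it only covers WAV files; the function names are hypothetical, and the helper that builds a silent clip exists purely for demonstration.

```python
import io
import wave

MIN_DURATION_S = 1.0
MAX_DURATION_S = 30.0


def wav_duration_seconds(source) -> float:
    """Duration of a WAV file given as a path or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(source) -> bool:
    """True when the clip falls in the recommended 1 to 30 second range."""
    return MIN_DURATION_S <= wav_duration_seconds(source) <= MAX_DURATION_S


def make_silent_wav(seconds, rate=16_000):
    """Build an in-memory mono 16-bit WAV of silence (demo helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer


print(is_recommended_length(make_silent_wav(2)))   # True
print(is_recommended_length(make_silent_wav(31)))  # False
```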
License
MIT License
Acknowledgments
- OpenAI Whisper for the base model
- Hugging Face for the Transformers library
- PolyAI for the MINDS14 dataset