---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---
# 🎙️ Whisper German ASR

Fine-tuned Whisper model for German Automatic Speech Recognition (ASR).

## Description

This Space provides an interactive interface for transcribing German audio using a fine-tuned version of OpenAI's Whisper-small model. The model has been specifically optimized for German speech recognition.
## How to Use

1. **Upload Audio**: Click the audio input area to upload an audio file (WAV, MP3, FLAC, etc.), **or**
2. **Record Audio**: Use the microphone button to record audio directly.
3. **Transcribe**: Click the "Transcribe" button to generate the transcription.
4. **View Results**: The transcription appears on the right side.
## Model Details

- **Base Model**: OpenAI Whisper-small (242M parameters)
- **Fine-tuned on**: German (de-DE) subset of the MINDS14 dataset
- **Language**: German (de)
- **Task**: Transcription
- **Performance**: ~13% Word Error Rate (WER)
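To make the WER figure concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not taken from this Space's evaluation code or from MINDS14, and the evaluation behind the reported number may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word out of eight.
reference = "ich möchte bitte meine karte sofort sperren lassen"
hypothesis = "ich möchte bitte meine karte sofort sparen lassen"
print(f"{word_error_rate(reference, hypothesis):.1%}")  # prints 12.5%
```

A WER of about 13% therefore corresponds to roughly one word error in every eight reference words.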
## Features

- ✅ Upload audio files in various formats
- ✅ Record audio directly from microphone
- ✅ Real-time transcription
- ✅ Optimized for German language
- ✅ Support for audio up to 30 seconds
## Technical Specifications

- **Sample Rate**: 16 kHz
- **Max Duration**: 30 seconds
- **Beam Search**: 5 beams
- **Device**: CPU/GPU auto-detection
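The settings above could be wired together roughly as follows. This is a sketch under stated assumptions, not the Space's actual `app.py`: it assumes the Transformers `pipeline` API with `librosa` for loading audio, uses the placeholder model ID from the Links section, and the helper names (`pick_device`, `transcribe`) are hypothetical.

```python
SAMPLE_RATE = 16_000        # Hz
MAX_DURATION_S = 30         # seconds
NUM_BEAMS = 5
MAX_SAMPLES = SAMPLE_RATE * MAX_DURATION_S
MODEL_ID = "YOUR_USERNAME/whisper-small-german"  # placeholder, replace with the real model ID


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def transcribe(path: str) -> str:
    """Transcribe the first 30 seconds of an audio file as German text."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import librosa
    from transformers import pipeline

    # Resample to 16 kHz mono and keep at most MAX_DURATION_S seconds.
    audio, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True, duration=MAX_DURATION_S)

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID, device=pick_device())
    result = asr(
        {"raw": audio, "sampling_rate": SAMPLE_RATE},
        # Assumption: the checkpoint accepts Whisper's language/task generation options.
        generate_kwargs={"num_beams": NUM_BEAMS, "language": "german", "task": "transcribe"},
    )
    return result["text"]
```

In a real app the pipeline would normally be created once at startup rather than on every call; it is built inside `transcribe` here only to keep the sketch short.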
## Tips for Best Results

- Speak clearly and at a moderate pace
- Minimize background noise
- Ensure the audio is in German
- Keep audio clips between 1 and 30 seconds for optimal results
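If you want to check a clip's length before uploading, a small standard-library sketch like the one below is enough for uncompressed WAV files. The function names and return strings are illustrative assumptions; MP3 or FLAC files would need a different reader.

```python
import wave

MIN_SECONDS = 1.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_clip(path: str) -> str:
    """Classify a WAV clip against the recommended 1 to 30 second range."""
    duration = wav_duration_seconds(path)
    if duration < MIN_SECONDS:
        return f"too short ({duration:.1f} s)"
    if duration > MAX_SECONDS:
        return f"too long ({duration:.1f} s)"
    return "ok"
```

For example, `check_clip("recording.wav")` returns `"ok"` for a 12-second recording, where `recording.wav` is a hypothetical file name.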
## Links

- [GitHub Repository](https://github.com/YOUR_USERNAME/whisper-german-asr)
- [Model Card](https://huggingface.co/YOUR_USERNAME/whisper-small-german)

## License

MIT License
## Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the base model
- [Hugging Face](https://huggingface.co/) for the Transformers library
- [PolyAI](https://huggingface.co/datasets/PolyAI/minds14) for the MINDS14 dataset