---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 🎙️ Whisper German ASR

Fine-tuned Whisper model for German Automatic Speech Recognition (ASR).

## Description

This Space provides an interactive interface for transcribing German audio using a fine-tuned version of OpenAI's Whisper-small model. The model has been fine-tuned specifically for German speech recognition.

## How to Use

1. **Upload Audio**: Click the audio input area to upload an audio file (WAV, MP3, FLAC, etc.)

   - OR -

2. **Record Audio**: Use the microphone button to record audio directly
3. **Transcribe**: Click the "Transcribe" button to generate the transcription
4. **View Results**: The transcription appears on the right side

## Model Details

- **Base Model**: OpenAI Whisper-small (242M parameters)
- **Fine-tuned on**: German subset of the MINDS14 dataset
- **Language**: German (de)
- **Task**: Transcription
- **Performance**: ~13% Word Error Rate (WER)

## Features

- ✅ Upload audio files in various formats
- ✅ Record audio directly from the microphone
- ✅ Real-time transcription
- ✅ Optimized for German
- ✅ Support for audio up to 30 seconds

## Technical Specifications

- **Sample Rate**: 16 kHz
- **Max Duration**: 30 seconds
- **Beam Search**: 5 beams
- **Device**: CPU/GPU auto-detection

## Tips for Best Results

- Speak clearly and at a moderate pace
- Minimize background noise
- Make sure the audio is in German
- Keep audio clips between 1 and 30 seconds for optimal results

## Links

- [GitHub Repository](https://github.com/YOUR_USERNAME/whisper-german-asr)
- [Model Card](https://huggingface.co/YOUR_USERNAME/whisper-small-german)

## License

MIT License

## Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the base model
- [Hugging Face](https://huggingface.co/) for the Transformers library
- [PolyAI](https://huggingface.co/datasets/PolyAI/minds14) for the MINDS14 dataset
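
## Example: How WER Is Computed

The Model Details section reports roughly 13% Word Error Rate. As a reminder of what that metric means, here is a minimal pure-Python sketch: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions made for illustration; the evaluation behind the reported figure may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "ich möchte mein konto sperren lassen bitte sofort"
hypothesis = "ich möchte dein konto sperren lassen bitte sofort"
print(word_error_rate(reference, hypothesis))  # 0.125 (1 error in 8 words)
```

One wrong word in an eight-word utterance already gives 12.5%, which puts the reported ~13% in perspective for short clips.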
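
## Example: Local Transcription Sketch

The settings under Technical Specifications (16 kHz audio, 30-second limit, 5 beams, CPU/GPU auto-detection) can be reproduced outside the Space roughly as follows. This is a sketch, not the code in `app.py`: the helper names `clip_to_max_duration` and `transcribe` are hypothetical, the model ID is the unfilled placeholder from the Links section, and passing `language="german"` through `generate_kwargs` assumes a recent Transformers release and a checkpoint with an up-to-date generation config.

```python
SAMPLE_RATE = 16_000  # Hz, from "Sample Rate"
MAX_SECONDS = 30      # from "Max Duration"
NUM_BEAMS = 5         # from "Beam Search"

# Placeholder copied from the Links section; replace with the real model ID.
MODEL_ID = "YOUR_USERNAME/whisper-small-german"


def clip_to_max_duration(samples, sample_rate=SAMPLE_RATE, max_seconds=MAX_SECONDS):
    """Keep only the first `max_seconds` of a 1-D sequence of samples."""
    return samples[: sample_rate * max_seconds]


def transcribe(samples, model_id=MODEL_ID):
    """Transcribe mono 16 kHz audio given as a 1-D float NumPy array.

    Heavy imports stay inside the function so the helper above can be
    used without torch or transformers installed.
    """
    import torch
    from transformers import pipeline

    # CPU/GPU auto-detection.
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    asr = pipeline("automatic-speech-recognition", model=model_id, device=device)

    result = asr(
        {"raw": clip_to_max_duration(samples), "sampling_rate": SAMPLE_RATE},
        # Assumption: the checkpoint accepts these Whisper generation options.
        generate_kwargs={"language": "german", "task": "transcribe", "num_beams": NUM_BEAMS},
    )
    return result["text"].strip()
```

Calling `transcribe(samples)` downloads the model on first use, so the placeholder in `MODEL_ID` has to be replaced with a real repository name before it will run.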