---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---
# 🎙️ Whisper German ASR
Fine-tuned Whisper model for German Automatic Speech Recognition (ASR).
## Description
This Space provides an interactive interface for transcribing German audio using a fine-tuned version of OpenAI's Whisper-small model. The model has been specifically optimized for German speech recognition.
## How to Use
1. **Provide Audio**: Click the audio input area to upload an audio file (WAV, MP3, FLAC, etc.), or use the microphone button to record audio directly.
2. **Transcribe**: Click the "Transcribe" button to generate the transcription.
3. **View Results**: The transcription appears on the right side.
## Model Details
- **Base Model**: OpenAI Whisper-small (242M parameters)
- **Fine-tuned on**: German MINDS14 dataset
- **Language**: German (de)
- **Task**: Transcription
- **Performance**: ~13% Word Error Rate (WER)
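
For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions made for the example; the normalization behind the ~13% figure above is not specified here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One missing word out of five reference words.
print(word_error_rate("ich möchte mein konto sperren",
                      "ich möchte konto sperren"))
```

Under this definition, a WER of roughly 13% corresponds to about 13 word errors per 100 reference words.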
## Features
- ✅ Upload audio files in various formats
- ✅ Record audio directly from microphone
- ✅ Real-time transcription
- ✅ Optimized for German language
- ✅ Support for audio up to 30 seconds
## Technical Specifications
- **Sample Rate**: 16 kHz
- **Max Duration**: 30 seconds
- **Beam Search**: 5 beams
- **Device**: CPU/GPU auto-detection
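
The settings above could be collected in code roughly as follows. This is a sketch under stated assumptions, not the contents of `app.py`: the helper names `detect_device` and `clip_to_max_duration` are hypothetical, and `GENERATE_KWARGS` only illustrates keyword arguments that a Transformers Whisper `generate` call typically accepts.

```python
SAMPLE_RATE = 16_000   # Hz, from the specification list
MAX_DURATION_S = 30    # seconds, from the specification list
NUM_BEAMS = 5          # beam width, from the specification list


def detect_device() -> str:
    """Return "cuda" when PyTorch reports a GPU, otherwise "cpu"."""
    try:
        import torch  # optional here; fall back to CPU if it is missing
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def clip_to_max_duration(samples, sample_rate=SAMPLE_RATE,
                         max_duration_s=MAX_DURATION_S):
    """Drop any samples beyond the maximum duration (hypothetical helper)."""
    max_samples = sample_rate * max_duration_s
    return samples[:max_samples]


# Assumed decoding options for German transcription with beam search.
GENERATE_KWARGS = {
    "num_beams": NUM_BEAMS,
    "language": "de",
    "task": "transcribe",
}

# 30 seconds at 16 kHz is 480,000 samples; longer input is cut off.
print(len(clip_to_max_duration([0.0] * 500_000)))
print(detect_device())
```

Clipping is only one way to enforce the 30-second limit; an application could instead reject longer input or split it into chunks.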
## Tips for Best Results
- Speak clearly and at a moderate pace
- Minimize background noise
- Ensure the audio is in German
- Keep audio clips between 1 and 30 seconds for optimal results
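
To check a clip against the recommended length before uploading, the duration of a WAV file can be read with Python's standard `wave` module. This is a small sketch for uncompressed WAV files only; the function names and the `clip.wav` path are hypothetical, and other formats such as MP3 or FLAC would need a different reader.

```python
import wave

MIN_SECONDS = 1.0   # lower bound recommended in the tips
MAX_SECONDS = 30.0  # upper bound recommended in the tips


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_recommended_length(path: str) -> bool:
    """True when the clip falls inside the recommended 1 to 30 second range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

For example, `is_recommended_length("clip.wav")` returns `False` for a 45-second recording.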
## Links
- [GitHub Repository](https://github.com/YOUR_USERNAME/whisper-german-asr)
- [Model Card](https://huggingface.co/YOUR_USERNAME/whisper-small-german)
## License
MIT License
## Acknowledgments
- [OpenAI Whisper](https://github.com/openai/whisper) for the base model
- [Hugging Face](https://huggingface.co/) for the Transformers library
- [PolyAI](https://huggingface.co/datasets/PolyAI/minds14) for the MINDS14 dataset