---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 🎙️ Whisper German ASR

Fine-tuned Whisper model for German Automatic Speech Recognition (ASR).

## Description

This Space provides an interactive interface for transcribing German audio using a fine-tuned version of OpenAI's Whisper-small model. The model has been specifically optimized for German speech recognition.

## How to Use

1. **Provide audio** in one of two ways:
   - **Upload**: Click on the audio input area to upload an audio file (WAV, MP3, FLAC, etc.)
   - **Record**: Use the microphone button to record audio directly
2. **Transcribe**: Click the "Transcribe" button to generate the transcription
3. **View Results**: The transcription will appear on the right side
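
For readers curious how an upload/record/transcribe flow like this can be wired up, here is a minimal sketch using the Gradio 4 Blocks API. It is not the contents of this Space's `app.py`: the helper names `make_transcriber` and `build_demo`, the layout, and the empty-input message are all assumptions made for illustration. `asr` stands for any callable that takes an audio file path and returns a dict with a `"text"` key, which is the shape a Transformers speech-recognition pipeline returns.

```python
def make_transcriber(asr):
    """Wrap an ASR callable (hypothetical) into a function Gradio can call."""

    def transcribe(audio_path):
        # Gradio passes None when the user clicks the button without audio.
        if audio_path is None:
            return "Please upload or record audio first."
        return asr(audio_path)["text"].strip()

    return transcribe


def build_demo(transcribe):
    """Build a two-column UI: audio input on the left, text on the right."""
    # Imported here so make_transcriber can be used without Gradio installed.
    import gradio as gr

    with gr.Blocks(title="Whisper German ASR") as demo:
        with gr.Row():
            with gr.Column():
                audio = gr.Audio(
                    sources=["upload", "microphone"],  # upload or record
                    type="filepath",                   # hand the function a path
                    label="Audio",
                )
                button = gr.Button("Transcribe")
            with gr.Column():
                text = gr.Textbox(label="Transcription")
        button.click(fn=transcribe, inputs=audio, outputs=text)
    return demo
```

Calling `build_demo(make_transcriber(asr)).launch()` would then start the interface locally, given a suitable `asr` object.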

## Model Details

- **Base Model**: OpenAI Whisper-small (242M parameters)
- **Fine-tuned on**: German (`de-DE`) subset of the MINDS-14 dataset
- **Language**: German (de)
- **Task**: Transcription
- **Performance**: ~13% Word Error Rate (WER)
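
To make the WER figure concrete: word error rate is the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below computes it with a word-level edit distance. The README does not say how the reported number was computed or which text normalization was applied, so the lowercasing here is an assumption, and the example sentence is made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: compare case-insensitively, splitting on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of eight gives 12.5% WER.
wer = word_error_rate(
    "ich möchte mein konto sperren lassen bitte sofort",
    "ich möchte mein konto sperren lasse bitte sofort",
)
print(f"{wer:.1%}")  # 12.5%
```

A WER of roughly 13% therefore corresponds to about one word error in every eight reference words.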

## Features

- ✅ Upload audio files in various formats
- ✅ Record audio directly from microphone
- ✅ Real-time transcription
- ✅ Optimized for German language
- ✅ Support for audio up to 30 seconds

## Technical Specifications

- **Sample Rate**: 16 kHz
- **Max Duration**: 30 seconds
- **Beam Search**: 5 beams
- **Device**: CPU/GPU auto-detection
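
The values above can be collected in one place, as in the following sketch. It is not taken from this Space's code: `DecodeSettings`, `pick_device`, and `truncate` are hypothetical names, and truncating long clips (rather than rejecting them) is an assumption, since the README only states the 30-second limit. The `language` and `task` entries follow the Model Details section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodeSettings:
    """Hypothetical container for the specifications listed above."""
    sample_rate: int = 16_000      # 16 kHz
    max_duration_s: float = 30.0   # 30 seconds
    num_beams: int = 5             # beam search width
    language: str = "german"
    task: str = "transcribe"

    @property
    def max_samples(self) -> int:
        # Number of samples in a clip of maximum length.
        return int(self.sample_rate * self.max_duration_s)

    def generate_kwargs(self) -> dict:
        # Options intended for Whisper's generate() call.
        return {
            "num_beams": self.num_beams,
            "language": self.language,
            "task": self.task,
        }


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def truncate(samples, settings: DecodeSettings):
    """Keep at most max_duration_s of audio (assumed policy: cut, not reject)."""
    return samples[: settings.max_samples]


settings = DecodeSettings()
print(settings.max_samples)  # 480000
```

With the Transformers speech-recognition pipeline, `pick_device()` could be passed as the `device` argument and `settings.generate_kwargs()` as `generate_kwargs` at call time.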

## Tips for Best Results

- Speak clearly and at a moderate pace
- Minimize background noise
- Make sure the audio is in German
- Keep audio clips between 1 and 30 seconds long for optimal results

## Links

- [GitHub Repository](https://github.com/YOUR_USERNAME/whisper-german-asr)
- [Model Card](https://huggingface.co/YOUR_USERNAME/whisper-small-german)

## License

MIT License

## Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the base model
- [Hugging Face](https://huggingface.co/) for the Transformers library
- [PolyAI](https://huggingface.co/datasets/PolyAI/minds14) for the MINDS14 dataset