---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 🎙️ Whisper German ASR

Fine-tuned Whisper model for German Automatic Speech Recognition (ASR).

## Description

This Space provides an interactive interface for transcribing German audio using a fine-tuned version of OpenAI's Whisper-small model. The model has been specifically optimized for German speech recognition.

## How to Use

1. **Provide Audio**: Click the audio input area to upload an audio file (WAV, MP3, FLAC, etc.), or use the microphone button to record audio directly
2. **Transcribe**: Click the "Transcribe" button to generate the transcription
3. **View Results**: The transcription appears on the right side

## Model Details

- **Base Model**: OpenAI Whisper-small (242M parameters)
- **Fine-tuned on**: German subset of the MINDS14 dataset
- **Language**: German (de)
- **Task**: Transcription
- **Performance**: ~13% Word Error Rate (WER)
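
For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the transcription into the reference, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation script used for this model. The lower-casing step and the example sentences are assumptions made for the illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumption: lower-casing and whitespace splitting stand in for the
    # text normalization a real evaluation would apply.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of eight reference words.
reference = "ich möchte mein konto sperren lassen bitte sofort"
hypothesis = "ich möchte mein konto sperren lassen bitte sofern"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

A WER of about 13% therefore corresponds to roughly one word error in every eight reference words.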

## Features

- ✅ Upload audio files in various formats
- ✅ Record audio directly from microphone
- ✅ Real-time transcription
- ✅ Optimized for German language
- ✅ Support for audio up to 30 seconds

## Technical Specifications

- **Sample Rate**: 16 kHz
- **Max Duration**: 30 seconds
- **Beam Search**: 5 beams
- **Device**: CPU/GPU auto-detection
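
The settings above could be wired together roughly as follows with the Transformers `pipeline` API. This is a sketch under assumptions, not the Space's actual `app.py`: the helper names are hypothetical, `MODEL_ID` is the unfilled placeholder from the Links section, and `samples` is assumed to be a one-dimensional float NumPy array sampled at 16 kHz.

```python
SAMPLE_RATE = 16_000   # Hz
MAX_DURATION_S = 30    # seconds
NUM_BEAMS = 5

# Placeholder repository name; replace with the real model ID.
MODEL_ID = "YOUR_USERNAME/whisper-small-german"


def pick_device(cuda_available: bool) -> str:
    """Choose the GPU when one is available, otherwise fall back to CPU."""
    return "cuda:0" if cuda_available else "cpu"


def truncate(samples, sample_rate=SAMPLE_RATE, max_seconds=MAX_DURATION_S):
    """Keep at most `max_seconds` of audio from the start of the clip."""
    return samples[: sample_rate * max_seconds]


def generation_kwargs() -> dict:
    """Decoding options: 5-beam search, German, transcription (not translation)."""
    return {"num_beams": NUM_BEAMS, "language": "german", "task": "transcribe"}


def transcribe(samples) -> str:
    """Transcribe a 1-D float NumPy array of 16 kHz audio samples."""
    # Imported here so the helpers above work without torch/transformers.
    import torch
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        device=pick_device(torch.cuda.is_available()),
    )
    result = asr(
        {"raw": truncate(samples), "sampling_rate": SAMPLE_RATE},
        generate_kwargs=generation_kwargs(),
    )
    return result["text"]
```

In a long-running app the pipeline would normally be created once at startup rather than on every call; it is built inside `transcribe` here only to keep the sketch short.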

## Tips for Best Results

- Speak clearly and at a moderate pace
- Minimize background noise
- Make sure the audio is in German
- Keep audio clips between 1 and 30 seconds long for optimal results
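
To check a clip against the duration guideline before uploading, something like the following could be used. It is a minimal sketch that relies only on Python's standard `wave` module, so it handles uncompressed WAV files only. The function name and the returned keys are hypothetical.

```python
import wave

MIN_SECONDS = 1.0
MAX_SECONDS = 30.0
EXPECTED_RATE = 16_000  # Hz


def inspect_wav(path: str) -> dict:
    """Report duration and sample rate of a WAV file against the guidelines."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return {
        "duration_s": duration,
        "sample_rate": rate,
        "duration_ok": MIN_SECONDS <= duration <= MAX_SECONDS,
        # A different sample rate is not an error, but the audio would have
        # to be resampled to 16 kHz before it reaches the model.
        "needs_resampling": rate != EXPECTED_RATE,
    }
```

For example, `inspect_wav("clip.wav")["duration_ok"]` is `False` for a 45-second recording, which would be better split into shorter clips.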

## Links

- [GitHub Repository](https://github.com/YOUR_USERNAME/whisper-german-asr)
- [Model Card](https://huggingface.co/YOUR_USERNAME/whisper-small-german)

## License

MIT License

## Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the base model
- [Hugging Face](https://huggingface.co/) for the Transformers library
- [PolyAI](https://huggingface.co/datasets/PolyAI/minds14) for the MINDS14 dataset