---
language: kl
license: mit
model_name: VoiceLessQ/whisper-tiny-kalaallisut
tags:
- whisper
- fine-tuning
- kalaallisut
- speech-recognition
- openai-whisper
model_type: speech-to-text
widget:
- src: path_to_sample_audio_file.wav
---
Status: this model still produces gibberish and is not yet good enough. I will keep adding to it for a while to see whether it improves.
# Whisper Tiny Fine-Tuned on Kalaallisut (Greenlandic)
This is a fine-tuned version of the [Whisper Tiny](https://huggingface.co/openai/whisper-tiny) model by OpenAI, adapted to the **Kalaallisut** (Greenlandic) language. The model has been trained and optimized to handle transcriptions specifically for this language, which is historically underrepresented in speech recognition models.
### Training Process
This model was carefully trained on a dataset of **Kalaallisut** audio files paired with transcriptions. Special care was taken to avoid overfitting, which occurred in earlier versions of this fine-tuning process. After reworking the training approach, including tweaking hyperparameters and employing early stopping to monitor model performance, the final **Word Error Rate (WER)** was reduced significantly to:
1.81%
### Features and Improvements
- **Reduced Overfitting**: This version addresses overfitting by employing early stopping with fine-tuned patience and threshold settings to halt training when improvements stalled, ensuring the model generalized better to unseen data.
- **Kalaallisut Language Support**: Whisper's multi-lingual capabilities are fine-tuned specifically for the unique phonetics and structure of Kalaallisut.
- **Optimized for Whisper Tiny**: Even though this model is based on the smallest variant of Whisper (Tiny), it still achieves strong performance in transcription tasks for Kalaallisut.
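The early-stopping idea mentioned above (a patience count plus an improvement threshold) can be sketched in plain Python. This is a minimal illustration and not the author's training code: the `EarlyStopper` class, the `patience` and `threshold` values, and the loss history are all hypothetical.

```python
class EarlyStopper:
    """Signal a stop once the metric fails to improve for `patience` evaluations.

    Hypothetical helper for illustration; lower metric values are better
    (e.g. validation loss or WER).
    """

    def __init__(self, patience=3, threshold=0.0):
        self.patience = patience    # evaluations to wait without improvement
        self.threshold = threshold  # minimum decrease that counts as improvement
        self.best = None
        self.bad_evals = 0

    def should_stop(self, metric):
        if self.best is None or metric < self.best - self.threshold:
            # Improved by more than the threshold: record it and reset the counter.
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Made-up validation losses, purely illustrative.
history = [1.20, 0.95, 0.90, 0.91, 0.905, 0.92, 0.93]
stopper = EarlyStopper(patience=3, threshold=0.01)
stopped_at = None
for step, loss in enumerate(history):
    if stopper.should_stop(loss):
        stopped_at = step
        break
print(stopped_at, stopper.best)
```

When training with the Hugging Face `Trainer`, `transformers.EarlyStoppingCallback` exposes the same two knobs as `early_stopping_patience` and `early_stopping_threshold`; whether this model was trained that way is not stated here.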
### Performance Metrics
- **Word Error Rate (WER)**: 1.81%
- **Train Loss**: 0.77 after 50 epochs
Training is usually halted by the early-stopping criteria built into the training code.
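For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between the reference and the model output, divided by the number of reference words. The sketch below is a generic implementation for illustration, not the evaluation script used for this model; the function name `word_error_rate` and the placeholder tokens are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

On this scale, a WER of 1.81% corresponds to roughly 1.8 word errors per 100 reference words on the evaluated data.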
### How to Use
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
import torch

# Load the processor and model
processor = WhisperProcessor.from_pretrained("VoiceLessQ/whisper-tiny-kalaallisut")
model = WhisperForConditionalGeneration.from_pretrained("VoiceLessQ/whisper-tiny-kalaallisut")

# Load audio as a 16 kHz mono waveform; the processor expects audio samples, not a file path
audio_file = "path_to_audio_file.wav"
waveform, sampling_rate = librosa.load(audio_file, sr=16000)
input_features = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt").input_features

# Generate transcription
with torch.no_grad():
    generated_ids = model.generate(input_features)

transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
```