Model Card for whisper-afrikaans-optimized
This model converts Afrikaans speech to Afrikaans text.
Model Details
Model Description
This model is a fine-tuned Whisper Tiny model optimized for Afrikaans automatic speech recognition (ASR). It was trained on Mozilla Common Voice Afrikaans data and later converted using faster-whisper / CTranslate2 for efficient inference.
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: rareRabbit
- Funded by [optional]: Open-source community
- Shared by [optional]: Open-source community
- Model type: WhisperForConditionalGeneration
- Language(s) (NLP): Afrikaans
- License: Apache 2.0
- Finetuned from model [optional]: openai/whisper-tiny
Model Sources [optional]
- Repository: rareRabbit/whisper-afrikaans-optimized on the Hugging Face Hub
- Paper [optional]: Robust Speech Recognition via Large-Scale Weak Supervision (Whisper), https://arxiv.org/abs/2212.04356
- Demo [optional]: Not provided
Uses
This model transcribes spoken Afrikaans into text. It is suitable for accessibility tools, language documentation, subtitles, meeting transcription, and educational use cases involving Afrikaans speech.
Direct Use
Use the model directly for Afrikaans speech-to-text transcription with Hugging Face Transformers or faster-whisper for optimized inference.
Downstream Use [optional]
The model can be integrated into ASR pipelines, voice assistants, transcription platforms, or fine-tuned further for domain-specific Afrikaans speech.
Out-of-Scope Use
The model is not intended for real-time safety-critical systems, multilingual transcription, or highly accented/non-Afrikaans speech.
Bias, Risks, and Limitations
The model may underperform on strong regional accents, noisy audio, code-switching, or domains not represented in Common Voice. Biases present in the dataset may be reflected in transcription quality.
Recommendations
Use high-quality audio, apply text normalization, and validate outputs before production use—especially for legal or medical applications.
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.
How to Get Started with the Model
Load the model using transformers or faster-whisper and enforce Afrikaans decoding for best results.
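A minimal sketch of inference with faster-whisper. The repository id, CPU device, and int8 compute type are illustrative choices, not fixed requirements of this model:

```python
def afrikaans_decode_options() -> dict:
    # Force Afrikaans decoding instead of relying on automatic
    # language detection; beam search tends to help on short clips.
    return {"language": "af", "beam_size": 5}


def transcribe(audio_path: str,
               model_dir: str = "rareRabbit/whisper-afrikaans-optimized") -> str:
    # Lazy import so the options helper works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_dir, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, **afrikaans_decode_options())
    return " ".join(seg.text.strip() for seg in segments)
```

Passing `language="af"` skips language detection entirely, which avoids occasional misdetection on short or noisy clips.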
Training Details
Training Data
The model was trained on the Mozilla Common Voice Afrikaans dataset (cv-corpus-24.0) using only the train.tsv split.
Training Procedure
Preprocessing [optional]
Audio was resampled to 16 kHz. Text was normalized using Whisper’s BasicTextNormalizer during evaluation.
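The two preprocessing steps above can be sketched as follows. `normalize_text` is a rough stand-in for Whisper's BasicTextNormalizer (lowercase, strip punctuation, collapse whitespace), not the exact normalizer used during evaluation:

```python
import re


def normalize_text(text: str) -> str:
    # Rough stand-in for Whisper's BasicTextNormalizer:
    # lowercase, strip punctuation, collapse whitespace.
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def load_resampled(path: str, target_sr: int = 16_000):
    # Load audio and resample to 16 kHz, the rate Whisper expects.
    import torchaudio  # lazy import; requires torchaudio

    waveform, sr = torchaudio.load(path)
    if sr != target_sr:
        waveform = torchaudio.functional.resample(waveform, sr, target_sr)
    return waveform, target_sr
```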
Training Hyperparameters
- Training regime:
- Precision: FP16 mixed precision
- Batch size: 8
- Gradient accumulation: 2
- Max steps: 500
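The hyperparameters above map onto a Seq2SeqTrainingArguments configuration roughly like this (the output directory name is illustrative; other arguments used in training are not documented here):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-tiny-af",       # illustrative path
    per_device_train_batch_size=8,        # batch size: 8
    gradient_accumulation_steps=2,        # effective batch size: 16
    max_steps=500,
    fp16=True,                            # FP16 mixed precision
)
```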
Speeds, Sizes, Times [optional]
Evaluation
Testing Data, Factors & Metrics
Testing Data
Evaluation was performed on a subset of 100 samples from the training data.
Factors
Metrics
Word Error Rate (WER) after text normalization
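For reference, WER is the word-level edit distance between reference and hypothesis divided by the reference length. A stdlib-only sketch (production evaluations typically use a library such as jiwer or evaluate):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    # Word-level Levenshtein distance divided by reference length.
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words
    # into the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```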
Results
- Normalized WER: ~2.35%
- Raw WER: substantially higher, as expected when casing and punctuation differences are counted as errors
- Word accuracy (1 − normalized WER): ~97.6%
Summary
The model demonstrates strong Afrikaans transcription accuracy on clean speech but lacks independent test-set evaluation.
Model Examination [optional]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: Single GPU (consumer-grade)
- Hours used: <10
- Cloud Provider: Google Colab
- Compute Region: Unknown
- Carbon Emitted: Not measured
Technical Specifications [optional]
Model Architecture and Objective
Encoder-decoder Transformer trained for sequence-to-sequence speech recognition using cross-entropy loss.
Compute Infrastructure
Hardware
Single NVIDIA GPU (Colab)
Software
Python, PyTorch, Hugging Face Transformers, CTranslate2, faster-whisper
Citation [optional]
BibTeX:
@article{radford2022whisper,
  title={Robust Speech Recognition via Large-Scale Weak Supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  journal={arXiv preprint arXiv:2212.04356},
  year={2022}
}
APA:
Glossary [optional]
More Information [optional]
Model Card Authors [optional]
rareRabbit
Model Card Contact
Via Hugging Face profile