Whisper Small FR - Radiologie (AfroRad)
This model is a fine-tuned version of openai/whisper-small adapted for medical radiology dictation in the Afro-French context. It was specifically optimized for French-speaking African regions.
Model Description
The model focuses on two main adaptations:
- Acoustic Adaptation: Capturing the phonetic nuances of French-speaking African regions to improve recognition of local accents.
- Medical Terminology: Stabilizing technical radiology terms (Spine, Shoulder, Thorax, Mammography, CT scans) in a dictation context.
It uses LoRA (Low-Rank Adaptation) via the adapters library, applied to the first 4 encoder layers (L0-L3) for acoustic and accent adaptation, and to the full decoder for medical jargon and linguistic structure.
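The layer selection described above can be sketched as a simple name filter. This is only an illustration, not the training code: it assumes the module naming used by Whisper in Hugging Face Transformers (`model.encoder.layers.N...`, `model.decoder.layers.N...`), and the choice of `q_proj` and `v_proj` as LoRA targets is an assumption, since the card does not say which projections were adapted. `gets_lora` is a hypothetical helper.

```python
import re

# "First 4 layers of the Encoder" means encoder layers L0-L3.
NUM_ENCODER_LAYERS_ADAPTED = 4

# Hypothetical selection rule. The q_proj / v_proj targets are an assumption;
# the model card does not state which projections carry LoRA weights.
LORA_TARGET = re.compile(
    r"^model\.(?:"
    r"encoder\.layers\.(?P<enc>\d+)"   # encoder layer, index captured
    r"|decoder\.layers\.\d+"           # any decoder layer
    r")\.(?:self_attn|encoder_attn)\.(?:q_proj|v_proj)$"
)

def gets_lora(module_name: str) -> bool:
    """Return True if this module would receive a LoRA update."""
    match = LORA_TARGET.match(module_name)
    if match is None:
        return False
    enc_index = match.group("enc")
    # Decoder matches leave the "enc" group unset: the full decoder is adapted.
    return enc_index is None or int(enc_index) < NUM_ENCODER_LAYERS_ADAPTED

names = [
    "model.encoder.layers.0.self_attn.q_proj",
    "model.encoder.layers.3.self_attn.v_proj",
    "model.encoder.layers.4.self_attn.q_proj",      # beyond L3, left frozen
    "model.decoder.layers.11.encoder_attn.v_proj",
    "model.decoder.layers.0.fc1",                   # not an attention projection
]
selected = [n for n in names if gets_lora(n)]
```

Here `selected` keeps the two early encoder projections and the decoder cross-attention projection, and drops the other two names.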
Training and Evaluation Data
- Training Dataset: ~4.5 hours of specialized radiology recordings (562 audio clips).
Training Procedure
Training Hyperparameters
- Learning Rate: 3e-5 (Global)
  - Encoder (L0-L3): 8e-6
  - Decoder: 4e-5
- Optimizer: AdamW 8-bit (bnb.optim.AdamW8bit)
- Batch Size: 12 (train), 8 (eval)
- Max Steps: 1,700
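One way to express the per-component learning rates listed above is to sort parameters into optimizer parameter groups by name. The sketch below is an assumption about how such a split could be built, not the author's training script. `lr_for` and `build_param_groups` are hypothetical helpers, and the parameter names assume the Whisper naming used in Hugging Face Transformers.

```python
import re

GLOBAL_LR = 3e-5    # "Learning Rate: 3e-5 (Global)"
ENCODER_LR = 8e-6   # encoder layers L0-L3
DECODER_LR = 4e-5   # decoder

_ENC_LAYER = re.compile(r"(?:^|\.)encoder\.layers\.(\d+)\.")

def lr_for(param_name: str) -> float:
    """Pick a learning rate from the parameter's name."""
    match = _ENC_LAYER.search(param_name)
    if match and int(match.group(1)) <= 3:
        return ENCODER_LR
    if "decoder." in param_name:
        return DECODER_LR
    # Assumption: anything else falls back to the global rate.
    return GLOBAL_LR

def build_param_groups(named_params):
    """Group (name, param) pairs into optimizer-style dicts keyed by rate."""
    groups = {}
    for name, param in named_params:
        groups.setdefault(lr_for(name), []).append(param)
    return [{"params": params, "lr": lr} for lr, params in groups.items()]
```

The returned list follows the parameter-group format accepted by `torch.optim` optimizers. Passing it to `bnb.optim.AdamW8bit` is an assumption about how the training was wired up; the card only names the optimizer and the rates.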
Training Results
| Training Loss | Epoch | Step | Validation Loss | WER (%) |
|---|---|---|---|---|
| No log | 3.03 | 100 | 0.0778 | 133.333 |
| No log | 6.06 | 200 | 0.0579 | 14.2827 |
| No log | 9.09 | 300 | 0.0542 | 15.6211 |
| No log | 12.12 | 400 | 0.0514 | 7.42367 |
| 0.0761 | 15.15 | 500 | 0.0465 | 8.42744 |
| No log | 18.18 | 600 | 0.0450 | 6.41991 |
| No log | 21.21 | 700 | 0.0457 | 6.44082 |
| No log | 24.24 | 800 | 0.0458 | 6.37808 |
| No log | 27.27 | 900 | 0.0458 | 6.29444 |
| 0.0003 | 30.30 | 1000 | 0.0464 | 8.26014 |
| No log | 33.33 | 1100 | 0.0466 | 8.30197 |
| No log | 36.36 | 1200 | 0.0466 | 8.23923 |
| No log | 39.39 | 1300 | 0.0468 | 8.19741 |
| 0.0001 | 42.42 | 1400 | 0.0468 | 8.19741 |
Final Performance:
- Best WER: 6.29% (Step 900)
- Final WER: 8.20% (Step 1400)
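The best and final figures can be read off the evaluation table programmatically. The sketch below copies the step, validation loss, and WER columns from that table and selects the lowest-WER checkpoint. `EVAL_LOG` is a hypothetical name; the card does not say how the published checkpoint was actually chosen.

```python
# (step, validation_loss, wer_percent), copied from the training results table.
EVAL_LOG = [
    (100, 0.0778, 133.333),
    (200, 0.0579, 14.2827),
    (300, 0.0542, 15.6211),
    (400, 0.0514, 7.42367),
    (500, 0.0465, 8.42744),
    (600, 0.0450, 6.41991),
    (700, 0.0457, 6.44082),
    (800, 0.0458, 6.37808),
    (900, 0.0458, 6.29444),
    (1000, 0.0464, 8.26014),
    (1100, 0.0466, 8.30197),
    (1200, 0.0466, 8.23923),
    (1300, 0.0468, 8.19741),
    (1400, 0.0468, 8.19741),
]

# Lowest WER across all evaluations, and the last logged evaluation.
best_step, best_loss, best_wer = min(EVAL_LOG, key=lambda row: row[2])
final_step, final_loss, final_wer = EVAL_LOG[-1]

# Lowest validation loss does not coincide with lowest WER in this log.
lowest_loss_step = min(EVAL_LOG, key=lambda row: row[1])[0]
```

Selecting on WER picks step 900, while selecting on validation loss would pick step 600, so the selection metric matters for this run.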
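The model converges quickly: validation loss settles between 0.045 and 0.047 from step 600 onward. WER is lowest at step 900 (6.29%) and rises to about 8.2% by step 1400 while training loss approaches zero, which suggests that the step-900 checkpoint generalizes best to unseen medical terminology and regional accent variations.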
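For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function below is a minimal reference sketch of that standard definition. The card does not state which tool or text normalization produced its WER values, so this is not necessarily the exact computation used. `word_error_rate` is a hypothetical helper and the French phrases are made-up examples.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using whitespace tokenization."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping one row of the table at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)

exact = word_error_rate("pas de fracture visible", "pas de fracture visible")
one_sub = word_error_rate("pas de fracture visible", "pas de fissure visible")
padded = word_error_rate("thorax normal", "le thorax est normal ici")
```

Because insertions count as errors but the denominator is the reference length, the value can exceed 100% when the hypothesis is much longer than the reference, as in the `padded` example. Values above 100%, such as the one logged at step 100, are therefore possible under this definition.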
Framework Versions
- Transformers 4.47.0+
- Adapters 1.0.0+
- PyTorch 2.6.0+
- Datasets 3.6.0
- Python 3.10+
Citation
If you use this model in your research, please cite:
@misc{whisper-afrorad-fr,
author = {StephaneBah},
title = {Whisper-AfroRad-FR: Medical Radiology ASR for Afro-French Context},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/StephaneBah/Whisper-AfroRad-FR}}
}