Whisper Small FR - Radiologie

This model is a fine-tuned version of leduckhai/MultiMed-ST/asr/whisper-small-french, trained on specialized radiology recordings (the dataset is not named in the card metadata). It achieves the following results on the evaluation set:

Loss: 0.0488
WER: 7.5491
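
The word error rate above appears to be expressed as a percentage, which is the usual convention of the Hugging Face training scripts; the card does not state the unit, so this is an assumption. A minimal sketch of how a percentage WER is derived from word-level edit distance follows. The function name `wer_percent` and the sample sentences are illustrative and are not taken from the evaluation set.

```python
def wer_percent(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: 100 * (S + D + I) / N."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four reference words.
print(wer_percent("pas de fracture visible", "pas de fractures visible"))  # 25.0
```

Real evaluations usually normalize casing and punctuation before scoring; whether that was done here is not stated.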

Model Description

The model focuses on two main adaptations:

Acoustic Adaptation: Capturing the phonetic nuances of French-speaking African regions to improve recognition of local accents.
Medical Terminology: Stabilizing the transcription of technical radiology terms (spine, shoulder, thorax, mammography, CT scans) in a dictation context.

It uses LoRA (Low-Rank Adaptation) via the adapters library, specifically targeting the first 4 layers of the Encoder (for acoustic/accent adaptation) and the full Decoder (for medical jargon and linguistic structure).
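
The layer selection described above can be sketched as a filter over the module names used by the Hugging Face Whisper implementation (`model.encoder.layers.<i>...` and `model.decoder.layers.<i>...`). This is an illustration only: the helper `is_lora_target`, the choice of `q_proj` and `v_proj` as the adapted projections, and the regular expression are assumptions, and the card does not show the actual adapters configuration.

```python
import re

# Assumption: only the attention query/value projections receive LoRA weights.
ADAPTED_PROJECTIONS = ("q_proj", "v_proj")
# From the description: first 4 encoder layers, every decoder layer.
NUM_ENCODER_LAYERS_ADAPTED = 4

_LAYER_RE = re.compile(r"\b(encoder|decoder)\.layers\.(\d+)\.")


def is_lora_target(module_name: str) -> bool:
    """Return True if a Whisper module name falls in the adapted set."""
    if not module_name.endswith(ADAPTED_PROJECTIONS):
        return False
    match = _LAYER_RE.search(module_name)
    if match is None:
        return False
    stack, index = match.group(1), int(match.group(2))
    if stack == "encoder":
        return index < NUM_ENCODER_LAYERS_ADAPTED
    return True  # decoder: all layers are adapted


names = [
    "model.encoder.layers.0.self_attn.q_proj",
    "model.encoder.layers.3.self_attn.v_proj",
    "model.encoder.layers.4.self_attn.q_proj",
    "model.decoder.layers.11.encoder_attn.v_proj",
    "model.decoder.layers.2.fc1",
]
for name in names:
    print(name, is_lora_target(name))
```

Restricting the encoder adaptation to the lowest layers keeps most of the pretrained acoustic representation frozen while still letting the earliest features shift toward the target accents.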

Training and Evaluation Data

Training Dataset: ~4.5 hours of specialized radiology recordings (562 audio clips).

Intended uses & limitations

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 12
eval_batch_size: 8
seed: 3407
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
training_steps: 1000
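
With `lr_scheduler_type: linear` and 1000 training steps, the learning rate decays linearly from its peak to zero. The sketch below reproduces that schedule in plain Python. The card lists no warmup, so `warmup_steps=0` is an assumption, and `linear_lr` is an illustrative helper rather than the Trainer's own code.

```python
def linear_lr(step: int, peak_lr: float = 3e-5, total_steps: int = 1000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, 500, 1000):
    print(step, linear_lr(step))
```

Under these assumptions the rate is half its peak value at step 500 and reaches zero at step 1000.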

Training results

Training Loss	Epoch	Step	Validation Loss	WER
No log	3.0303	100	0.1033	23.0029
No log	6.0606	200	0.0697	14.2200
No log	9.0909	300	0.0556	14.6173
No log	12.1212	400	0.0482	8.4065
0.0838	15.1515	500	0.0479	8.4483
0.0838	18.1818	600	0.0483	8.9502
0.0838	21.2121	700	0.0484	8.6784
0.0838	24.2424	800	0.0483	7.6328
0.0838	27.2727	900	0.0485	8.8666
0.0001	30.3030	1000	0.0488	7.5491
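
The Epoch and Step columns imply the number of optimizer steps per epoch, which in turn bounds the size of the training split. The arithmetic below assumes a single device and no gradient accumulation (neither is stated in the card), so the example count is an estimate only.

```python
step, epoch = 100, 3.0303   # first row of the results table
train_batch_size = 12       # from the hyperparameter list

steps_per_epoch = round(step / epoch)  # 100 / 3.0303 ≈ 33

# The last batch of an epoch may be partial, so the split holds between
# (steps - 1) * batch + 1 and steps * batch examples.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

# Cross-check against the final row of the table (epoch 30.3030 at step 1000).
total_epochs = 1000 / steps_per_epoch

print(steps_per_epoch, min_examples, max_examples, round(total_epochs, 4))
```

Under these assumptions the training split would hold roughly 385 to 396 examples, fewer than the 562 recordings reported, which would be consistent with a held-out evaluation subset; the card does not state the split.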

Framework versions

Transformers 4.51.3
Pytorch 2.8.0+cu126
Datasets 4.4.2
Tokenizers 0.21.4

Citation

If you use this model in your research, please cite:

@misc{med-whisper-afrorad-fr,
  author = {StephaneBah},
  title = {Med-Whisper-AfroRad-FR: Medical Radiology ASR for Afro-French Context},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/StephaneBah/Med-Whisper-AfroRad-FR}}
}

Model size

0.2B params, stored as F32 Safetensors

Evaluation results

The model page lists self-reported Word Error Rate (WER) and, with greedy decoding, WER on the test sets of Common Voice 11.0, Multilingual LibriSpeech (MLS), VoxPopuli, Fleurs, and African Accented French.