Fine-Tuned Whisper-small Model for French ASR

This model is a fine-tuned version of openai/whisper-small, trained on the French subset of the Common Voice 17 (CV17) dataset.

Live demo

Click here (press Restart to run the Space)

  • Either upload a French audio file or record yourself speaking French by clicking the mic and then the orange dot.

  • Hit Submit and the model will output the transcription.

Performance and Evaluation

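  • WER (Word Error Rate): The number of word-level edits (substitutions, deletions, insertions) needed to turn the prediction into the reference, divided by the number of words in the reference.
  • CER (Character Error Rate): The same ratio computed over characters instead of words.

Both metrics are reported below as fractions, so 0.1648 means 16.48%.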
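
As a rough illustration of the two metrics, here is a minimal pure-Python sketch built on Levenshtein edit distance. It is not the evaluation script used for the numbers in this card, and it applies no text normalization (casing, punctuation), which real evaluations often do. The function names and the example sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref item missing from hyp)
                curr[j - 1] + 1,     # insertion (extra item in hyp)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative sentences (not taken from any dataset)
reference = "le chat dort sur le tapis"
hypothesis = "le chat dors sur tapis"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
```

Because insertions count as errors, both metrics can exceed 1.0 when the prediction is much longer than the reference.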

Test Set: CV17 (16k samples)

Model | WER (lower is better) | CER (lower is better)
Whisper Small (baseline) | 0.3405 | 0.1680
Whisper Medium (baseline) | 0.2597 | 0.1264
My Model | 0.1648 | 0.0676

Test Set: MLS (2426 samples)

Model | WER (lower is better) | CER (lower is better)
Whisper Small (baseline) | 0.3271 | 0.1066
Whisper Medium (baseline) | 0.2974 | 0.0919
My Model | 0.3269 | 0.1013
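
To put the two tables side by side, the sketch below computes the relative error reduction of the fine-tuned model against the Whisper Small baseline, using only the numbers reported above. The helper name and the dictionary layout are illustrative choices.

```python
# (WER, CER) pairs copied from the tables above
results = {
    "CV17": {"small": (0.3405, 0.1680), "medium": (0.2597, 0.1264), "fine_tuned": (0.1648, 0.0676)},
    "MLS": {"small": (0.3271, 0.1066), "medium": (0.2974, 0.0919), "fine_tuned": (0.3269, 0.1013)},
}


def relative_reduction(baseline, new):
    """Fraction of the baseline error removed by the new model."""
    return (baseline - new) / baseline


for test_set, scores in results.items():
    base_wer, base_cer = scores["small"]
    new_wer, new_cer = scores["fine_tuned"]
    print(
        f"{test_set}: WER reduced by {relative_reduction(base_wer, new_wer):.1%}, "
        f"CER reduced by {relative_reduction(base_cer, new_cer):.1%} vs Whisper Small"
    )
```

On CV17, the training domain, the fine-tuned model cuts WER roughly in half relative to Whisper Small and also beats Whisper Medium. On MLS its WER is essentially unchanged from Whisper Small, and Whisper Medium remains ahead on both metrics there.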

Usage

import torch

from datasets import load_dataset
from transformers import pipeline

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

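# Load the ASR pipeline with the fine-tuned checkpoint
pipe = pipeline("automatic-speech-recognition", model="nambn0321/ASR_french_3", device=device)

# Force French transcription so the model does not auto-detect the language or translate
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language="fr", task="transcribe")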

# Load data. This streams one Common Voice test clip as an example;
# for your own files, load the audio with torchaudio or librosa first.
ds_mcv_test = load_dataset("mozilla-foundation/common_voice_11_0", "fr", split="test", streaming=True)
test_segment = next(iter(ds_mcv_test))
waveform = test_segment["audio"]

# Run
generated_sentences = pipe(waveform, max_new_tokens=225)["text"]  # greedy
# generated_sentences = pipe(waveform, max_new_tokens=225, generate_kwargs={"num_beams": 5})["text"]  # beam search
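
For your own recordings, one possible approach is sketched below. It assumes librosa is installed and that `pipe` is the pipeline built above. The helper names and the file name `my_recording.wav` are hypothetical. The audio is resampled to 16 kHz mono because that is the rate Whisper's feature extractor works with.

```python
TARGET_SAMPLING_RATE = 16_000  # Whisper's feature extractor works on 16 kHz audio


def load_mono_16k(path):
    """Load an audio file as a mono float waveform resampled to 16 kHz."""
    import librosa  # third-party; imported lazily so the helper below works without it

    samples, sampling_rate = librosa.load(path, sr=TARGET_SAMPLING_RATE, mono=True)
    return samples, sampling_rate


def as_pipeline_input(samples, sampling_rate):
    """Wrap a waveform in the dict layout the ASR pipeline accepts."""
    return {"array": samples, "sampling_rate": sampling_rate}


# Usage (hypothetical file name; `pipe` comes from the snippet above):
# samples, sampling_rate = load_mono_16k("my_recording.wav")
# text = pipe(as_pipeline_input(samples, sampling_rate), max_new_tokens=225)["text"]
```

Whisper processes audio in 30-second windows, so for longer recordings you may want to pass `chunk_length_s=30` when calling the pipeline.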

NOM
