Moonshine-Tiny-FR: French Speech Recognition Model

Fine-tuned Moonshine ASR model for the French language

This is a fine-tuned version of UsefulSensors/moonshine-tiny specifically optimized for French speech recognition. The model achieves state-of-the-art performance for its size (27M parameters) on French ASR tasks.


Usage

Installation

pip install --upgrade pip
pip install --upgrade transformers datasets[audio]

Basic Usage

from transformers import MoonshineForConditionalGeneration, AutoProcessor
import torch
import torchaudio

# Load model and processor
model = MoonshineForConditionalGeneration.from_pretrained('Cornebidouil/moonshine-tiny-fr')
processor = AutoProcessor.from_pretrained('Cornebidouil/moonshine-tiny-fr')

# Load and resample audio to 16kHz
audio, sr = torchaudio.load("french_audio.wav")
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)
audio = audio[0].numpy()  # Take the first channel (mono)

# Prepare inputs
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# Generate transcription
# Calculate max_new_tokens to avoid truncation (5 tokens per second is optimal for French)
audio_duration = len(audio) / 16000
max_new_tokens = int(audio_duration * 5)

generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
transcription = processor.decode(generated_ids[0], skip_special_tokens=True)
print(transcription)

Advanced Usage

For production deployments with:

  • Live transcription with Voice Activity Detection
  • ONNX optimization (20-30% faster)
  • Batch processing scripts
  • Complete inference pipeline

See the included inference.py script in the fine-tuning guide.
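Batch processing of long recordings generally requires splitting the audio before it is fed to the model. Below is a minimal sketch of that chunking step; the 10 s chunk length and 0.5 s overlap are illustrative assumptions, not values taken from inference.py:

```python
def chunk_audio(audio, sr=16000, chunk_s=10.0, overlap_s=0.5):
    """Split a mono sample sequence into overlapping fixed-length chunks."""
    chunk = int(chunk_s * sr)
    step = chunk - int(overlap_s * sr)
    return [audio[i:i + chunk] for i in range(0, max(len(audio) - 1, 1), step)]

# 25 s of audio at 16 kHz -> 3 chunks (two full 10 s chunks, one shorter tail)
chunks = chunk_audio([0.0] * (25 * 16000))
print(len(chunks), len(chunks[0]) / 16000, len(chunks[-1]) / 16000)  # 3 10.0 6.0
```

Each chunk can then go through the processor/model loop from the basic-usage snippet above; the small overlap reduces the chance of cutting a word exactly at a chunk boundary.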

Model Details

Model Description

  • Base Model: UsefulSensors/moonshine-tiny
  • Language: French (fr)
  • Model Size: 27M parameters
  • Fine-tuned on: Multilingual LibriSpeech (MLS) French dataset, re-segmented to meet the Moonshine model's input requirements
  • Training Duration: 8,000 steps
  • Optimizer: Schedule-free AdamW
  • License: MIT

Model Architecture

Moonshine is a compact sequence-to-sequence ASR model designed for efficient on-device inference:

  • Encoder: Convolutional feature extraction + Transformer blocks
  • Decoder: Autoregressive Transformer decoder
  • Parameters: 27M (tiny variant)
  • Input: 16kHz mono audio
  • Output: French text transcription

Performance

Evaluation Metrics

Evaluated on Multilingual LibriSpeech (MLS) French test set:

| Metric                     | Score       |
|----------------------------|-------------|
| Word Error Rate (WER)      | 21.8%       |
| Character Error Rate (CER) | ~10%        |
| Real-Time Factor (RTF)     | 0.11x (CPU) |

Inference Speed: ~9x faster than real-time on CPU, enabling live transcription.
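The RTF figure is simply processing time divided by audio duration. A quick sketch of the arithmetic (the 1.1 s timing below is an illustrative number consistent with the reported 0.11x, not a fresh measurement):

```python
def real_time_factor(processing_s, audio_s):
    """RTF < 1 means faster than real time; speedup is 1 / RTF."""
    return processing_s / audio_s

# e.g. transcribing 10 s of audio in 1.1 s of CPU time:
rtf = real_time_factor(1.1, 10.0)
print(f"RTF = {rtf:.2f}, speedup = {1 / rtf:.0f}x")  # RTF = 0.11, speedup = 9x
```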

Comparison

| Model             | Size | Language     | WER (MLS-FR) |
|-------------------|------|--------------|--------------|
| Whisper-tiny      | 39M  | Multilingual | ~25%         |
| Moonshine-tiny-fr | 27M  | French       | 21.8%        |
| Whisper-base      | 74M  | Multilingual | ~18%         |

Moonshine-tiny-fr achieves competitive performance with ~30% fewer parameters than Whisper-tiny. That said, this model is a proof of concept; more work is needed to build a proper, robust training dataset.

Training Details / Fine-tuning

Please refer to my GitHub repository for the full training procedure.

Use Cases

Primary Applications

French Speech Recognition

  • Real-time transcription
  • Audio file transcription
  • Voice commands
  • Accessibility tools

Resource-Constrained Environments

  • On-device transcription (mobile, edge devices)
  • Low-latency applications
  • Offline transcription

Hogwarts Legacy SpellCaster

Limitations and Biases

Known Limitations of this tiny model

  • Hallucination: Like all seq2seq models, may generate text not present in audio
  • Repetition: May repeat phrases, especially with greedy decoding (use beam search)
  • Short Segments: Performance may degrade on very short audio clips (<0.5s)
  • Domain Specificity: Trained primarily on audiobooks (read speech)
  • Accents: Best performance on metropolitan French; regional accents may have higher WER
  • Background Noise: Performance degrades with significant background noise
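To mitigate the repetition issue mentioned above, beam search and n-gram blocking can be passed to `model.generate` via standard Hugging Face generation options. A minimal sketch; these parameter values are illustrative defaults, not settings validated for this model:

```python
# Standard Hugging Face generation options; the values are illustrative.
generation_kwargs = {
    "num_beams": 4,             # beam search instead of greedy decoding
    "no_repeat_ngram_size": 3,  # block repeated 3-grams
    "repetition_penalty": 1.2,  # discourage re-emitting recent tokens
}
# Combined with the basic-usage snippet:
# generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
#                                **generation_kwargs)
print(sorted(generation_kwargs))
```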

Model Card Author

Pierre Chéneau (Cornebidouil)

Geologist, Developer and maintainer of this fine-tuned French model.


Citations

This Model

@misc{cheneau2026moonshine-tiny-fr,
  author = {Pierre Chéneau (Cornebidouil)},
  title = {Moonshine-Tiny-FR: Fine-tuned French Speech Recognition},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Cornebidouil/moonshine-tiny-fr}
}

Fine tuning Guide

@misc{cheneau2026moonshine-finetune,
  author = {Pierre Chéneau (Cornebidouil)},
  title = {Moonshine ASR Fine-Tuning Guide},
  year = {2026},
  publisher = {GitHub},
  url = {https://github.com/pierre-cheneau/finetune-moonshine-asr}
}

Original Moonshine Model

@misc{jeffries2024moonshinespeechrecognitionlive,
      title={Moonshine: Speech Recognition for Live Transcription and Voice Commands},
      author={Nat Jeffries and Evan King and Manjunath Kudlur and Guy Nicholson and James Wang and Pete Warden},
      year={2024},
      eprint={2410.15608},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2410.15608},
}

Multilingual LibriSpeech Dataset

@inproceedings{pratap2020mls,
  title={MLS: A Large-Scale Multilingual Dataset for Speech Research},
  author={Pratap, Vineel and Xu, Qiantong and Sriram, Anuroop and Synnaeve, Gabriel and Collobert, Ronan},
  booktitle={Interspeech},
  year={2020}
}

License

This model is released under the MIT License, consistent with the base Moonshine model.

MIT License

Copyright (c) 2026 Pierre Chéneau (Cornebidouil)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction...

Acknowledgments

  • Useful Sensors for the original Moonshine architecture and pre-trained model
  • Meta AI for the Multilingual LibriSpeech dataset
  • HuggingFace for the transformers library and model hosting
  • Schedule-Free Learning for the optimizer implementation

Questions? Open an issue on the fine-tuning guide repository or check the documentation.

Want to fine-tune for your language? See the complete fine-tuning guide.
