How to use willopcbeta/lite-whisper-small-fast-ONNX-v2 with Transformers.js:
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('automatic-speech-recognition', 'willopcbeta/lite-whisper-small-fast-ONNX-v2');
lite-whisper-small-fast (ONNX)
This is an ONNX version of willopcbeta/lite-whisper-small-fast. It was automatically converted and uploaded using this Hugging Face Space.
This model repository was exported using optimum-cli. It is compatible with transformers.js v4 and backward compatible with transformers.js v3.
The older version, lite-whisper-small-fast-ONNX, will be deprecated.
The optimal Q4 quantization configuration: using decoder_model with q4f16 results in less ambiguous or nonsensical outputs.
quantization: {
encoder_model: 'q4f16',
decoder_model_merged: 'q4',
},
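As a minimal sketch, the per-module settings above could be passed to Transformers.js through the `dtype` option of `pipeline()`, which accepts a mapping from module name to precision. That the `quantization` block on this card corresponds to `dtype` is an assumption, and `buildOptions` and `loadTranscriber` are hypothetical helper names used only for illustration.

```javascript
// Per-module precision taken from the configuration above.
// Assumption: the card's `quantization` block maps onto the `dtype` option.
const MODEL_ID = 'willopcbeta/lite-whisper-small-fast-ONNX-v2';

// Hypothetical helper: builds the options object, allowing per-module overrides.
function buildOptions(overrides = {}) {
  return {
    dtype: {
      encoder_model: 'q4f16',
      decoder_model_merged: 'q4',
      ...overrides,
    },
  };
}

// Hypothetical helper: imports the library lazily, so this file can be parsed
// even when the package is not installed (npm i @huggingface/transformers).
async function loadTranscriber(overrides = {}) {
  const { pipeline } = await import('@huggingface/transformers');
  return pipeline('automatic-speech-recognition', MODEL_ID, buildOptions(overrides));
}
```

Calling `loadTranscriber()` downloads the model files on first use, so it needs network access.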
Usage with Transformers.js
See the pipeline documentation for automatic-speech-recognition: https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.AutomaticSpeechRecognitionPipeline
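For illustration, a transcription call might look like the sketch below. `transcribe` is a hypothetical wrapper, and the chunking values are illustrative rather than recommendations from this card. The audio input is assumed to be a URL (in the browser) or a `Float32Array` of 16 kHz mono samples (in Node.js, where the audio generally has to be decoded first).

```javascript
// Hypothetical helper: runs an already-created ASR pipeline on one audio input.
// `chunk_length_s`, `stride_length_s` and `return_timestamps` are options of the
// Transformers.js automatic-speech-recognition pipeline; the values are illustrative.
async function transcribe(transcriber, audio, { timestamps = false } = {}) {
  const output = await transcriber(audio, {
    chunk_length_s: 30,   // split long audio into 30-second windows
    stride_length_s: 5,   // overlap between neighbouring windows
    return_timestamps: timestamps,
  });
  // With timestamps, return the full result; otherwise just the trimmed text.
  return timestamps ? output : output.text.trim();
}

// Usage (requires `npm i @huggingface/transformers` and network access):
// const { pipeline } = await import('@huggingface/transformers');
// const pipe = await pipeline('automatic-speech-recognition',
//   'willopcbeta/lite-whisper-small-fast-ONNX-v2');
// console.log(await transcribe(pipe, audioSamples)); // audioSamples: Float32Array at 16 kHz
```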
Lite-Whisper is a compressed version of OpenAI Whisper with LiteASR. See our GitHub repository and paper for details.
This model is a renamed copy of “efficient-speech/lite-whisper-small-fast” with the missing files merged in, to facilitate subsequent conversion to ONNX format or other uses.
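The LiteASR paper cited below is titled as a low-rank approximation method. As a rough sketch of why low-rank factorization reduces size in general (this is plain parameter counting, not LiteASR's actual procedure), replacing one dense weight matrix with two thin factors changes the parameter count as follows. The function names and the layer dimensions in the usage lines are illustrative assumptions.

```javascript
// Parameter count of a dense linear layer (weights only, bias ignored).
function denseParams(dIn, dOut) {
  return dIn * dOut;
}

// Parameter count after factorizing W (dIn x dOut) into A (dIn x rank) and B (rank x dOut).
function lowRankParams(dIn, dOut, rank) {
  return rank * (dIn + dOut);
}

// The factorized layer is smaller than the dense one only when rank is below this value.
function breakEvenRank(dIn, dOut) {
  return (dIn * dOut) / (dIn + dOut);
}

// Illustrative dimensions (assumed, not taken from this card):
console.log(denseParams(768, 3072));        // 2359296
console.log(lowRankParams(768, 3072, 256)); // 983040
```

The benefit depends entirely on the chosen rank: at or above the break-even rank, the factorized form has at least as many parameters as the dense layer.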
Benchmark Results
The following table shows the average word error rate (WER) evaluated on the ESB datasets:
| Model | Average WER (↓) | Encoder Size | Decoder Size |
|---|---|---|---|
| whisper-tiny | 22.01 | 7.63M | 29.55M |
| lite-whisper-tiny-acc | 22.97 | 7.41M | 29.55M |
| lite-whisper-tiny | 23.95 | 7.00M | 29.55M |
| lite-whisper-tiny-fast | 27.09 | 6.48M | 29.55M |
| whisper-base | 17.67 | 19.82M | 52.00M |
| lite-whisper-base-acc | 19.07 | 18.64M | 52.00M |
| lite-whisper-base | 19.71 | 17.44M | 52.00M |
| lite-whisper-base-fast | 23.05 | 16.07M | 52.00M |
| whisper-small | 15.89 | 87.00M | 153.58M |
| lite-whisper-small-acc | 15.37 | 76.99M | 153.58M |
| lite-whisper-small | 14.96 | 70.16M | 153.58M |
| lite-whisper-small-fast | 14.92 | 63.11M | 153.58M |
| whisper-medium | 15.12 | 305.68M | 456.64M |
| lite-whisper-medium-acc | 13.46 | 269.93M | 456.64M |
| lite-whisper-medium | 14.50 | 239.99M | 456.64M |
| lite-whisper-medium-fast | 14.52 | 215.31M | 456.64M |
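For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below computes it with a word-level edit distance. It assumes simple lowercasing and whitespace tokenization, whereas benchmark suites typically apply fuller text normalization, so it will not reproduce the numbers in the table.

```javascript
// Assumption: lowercase + whitespace tokenization (benchmarks normalize more thoroughly).
function tokenize(text) {
  return text.toLowerCase().trim().split(/\s+/).filter(Boolean);
}

// WER = (substitutions + deletions + insertions) / number of reference words.
function wordErrorRate(reference, hypothesis) {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) throw new Error('reference must contain at least one word');

  // Row-by-row Levenshtein distance over words; `prev` holds the previous row.
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion of a reference word
        curr[j - 1] + 1,    // insertion of a hypothesis word
        prev[j - 1] + cost  // match or substitution
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

The function returns a fraction; multiplying by 100 gives a percentage, which is presumably the scale used in the table.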
Citation
If you use LiteASR in your research, please cite the following paper:
@misc{kamahori2025liteasrefficientautomaticspeech,
title={LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation},
author={Keisuke Kamahori and Jungo Kasai and Noriyuki Kojima and Baris Kasikci},
year={2025},
eprint={2502.20583},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2502.20583},
}
Model tree for willopcbeta/lite-whisper-small-fast-ONNX-v2
Base model
openai/whisper-small