# OmniASR v2 (300M) - Optimized 4-bit ONNX
This repository contains the first standalone, 4-bit quantized version of Meta's OmniASR v2, specifically optimized for local mobile inference.
## Key Improvements by Edison dos Santos
- **Zero Dependencies:** Unlike other distributions, this model does NOT require specialized ASR libraries. It runs on pure `onnxruntime`.
- **Mobile Optimized:** The 4-bit quantization targets the `MatMul` and Transformer layers with a block size of 32, tailored for ARM-based chipsets (tested on a Dimensity 6300). A sketch of this quantization step follows the list.
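As a rough illustration, a 4-bit, block-size-32 weight-only quantization like the one described above can be produced with `onnxruntime`'s `MatMul4BitsQuantizer`. This is a minimal sketch of the technique, not the exact script used for this release; the file names are placeholders.

```python
# Sketch: 4-bit block-wise weight quantization of MatMul nodes (block size 32).
# File names are placeholders; this illustrates the general technique only.
import onnx
from onnxruntime.quantization.matmul_4bits_quantizer import MatMul4BitsQuantizer

model = onnx.load("omniasr_ctc_300m_v2_fp32.onnx")  # hypothetical input path

quantizer = MatMul4BitsQuantizer(
    model,
    block_size=32,      # 32 weights share one quantization scale
    is_symmetric=True,  # symmetric int4 range
)
quantizer.process()

onnx.save(quantizer.model.model, "omniasr_ctc_300m_v2_q4.onnx")
```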
## Usage
Run `inference_example.py` (make sure an `audio.wav` file is present in the working directory). See the Ghost Assistant Technical Report for more details and benchmarks. A minimal version of such a script is sketched below.
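The following sketch shows what a pure-`onnxruntime` CTC inference loop can look like. The model file name, the raw-waveform 16 kHz mono input format, and the blank-token index are assumptions; verify them against the actual files in this repository.

```python
# Minimal CTC inference sketch. Assumptions (verify against this repo's files):
# the model file name, a 16 kHz mono raw-waveform input, and blank id 0.
import numpy as np
import onnxruntime as ort
import soundfile as sf

audio, sr = sf.read("audio.wav", dtype="float32")  # 16 kHz mono expected
session = ort.InferenceSession("omniasr_ctc_300m_v2_q4.onnx")

input_name = session.get_inputs()[0].name
logits = session.run(None, {input_name: audio[np.newaxis, :]})[0]  # (1, frames, vocab)

# CTC greedy decoding: argmax per frame, collapse repeats, drop blanks.
ids = logits[0].argmax(axis=-1)
blank_id = 0  # assumption: blank token is index 0
prev = -1
token_ids = []
for i in ids:
    if i != prev and i != blank_id:
        token_ids.append(int(i))
    prev = i
print(token_ids)  # map ids to text with the model's vocabulary file
```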
## Base Model
Quantized from `bezzam/omniasr-ctc-300m-v2`.