---
license: cc-by-4.0
tags:
- disfluency-detection
- token-classification
- speech-processing
- onnx
- quantized
base_model: arielcerdap/modernbert-base-multiclass-disfluency-v2
---

# Mów Disfluency Classifier (ONNX INT8)

Quantized ONNX export of [arielcerdap/modernbert-base-multiclass-disfluency-v2](https://huggingface.co/arielcerdap/modernbert-base-multiclass-disfluency-v2) for on-device disfluency removal in [Mów](https://github.com/krokoko/mow), a macOS voice-to-text app.

## What it does

Tags each word in a speech transcript as one of:

| Label | Meaning | Action |
|-------|---------|--------|
| **O** | Fluent | Keep |
| **FP** | Filled pause (um, uh, er) | Remove |
| **RP** | Repetition (the the) | Remove |
| **RV** | Revision / self-correction | Remove |
| **PW** | Partial word | Remove |

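The Action column amounts to a simple filter over tagged words. The following is a minimal Python sketch of that step, not Mów's actual implementation; the function name `remove_disfluencies` and the sample tagging are illustrative assumptions.

```python
# Labels whose words are dropped, per the Action column of the label table.
REMOVE_LABELS = {"FP", "RP", "RV", "PW"}


def remove_disfluencies(words, labels):
    """Return the transcript with every word not tagged fluent ("O") removed."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    kept = [w for w, lab in zip(words, labels) if lab not in REMOVE_LABELS]
    return " ".join(kept)


# Hypothetical tagging of a short transcript (not real model output).
words = ["um", "I", "I", "want", "to", "go", "home"]
labels = ["FP", "RP", "O", "O", "O", "O", "O"]

cleaned = remove_disfluencies(words, labels)
print(cleaned)
```

Only words tagged `O` survive, so the filled pause and the repeated word are both dropped from the sample.
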
## Model details

- **Base model**: ModernBERT-base (150M parameters)
- **Task**: Token classification (5 classes)
- **Training data**: FluencyBank corpus
- **Accuracy**: 93.2%, F1 0.99 on filled pauses, 0.90 on repetitions
- **Format**: ONNX, INT8 dynamic quantization
- **Size**: ~143 MB
- **Inference**: ~5-50 ms per sentence on Apple Silicon via ONNX Runtime

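Because the model classifies sub-word tokens while the labels apply to whole words, some aggregation step is needed after inference. The sketch below takes the label of each word's first sub-token, which is a common convention but an assumption here, not something this model card confirms. The names `word_labels` and `id2label`, the class order, and the logits are all hypothetical; `word_ids` stands for the per-token word indices a Hugging Face fast tokenizer can report, with `None` for special tokens.

```python
def word_labels(logits, word_ids, id2label):
    """Map per-token logits to one label per word (first sub-token wins)."""
    labels = {}
    for token_logits, word_id in zip(logits, word_ids):
        if word_id is None or word_id in labels:
            continue  # skip special tokens and non-initial sub-tokens
        # Argmax over the class scores of this token.
        best = max(range(len(token_logits)), key=token_logits.__getitem__)
        labels[word_id] = id2label[best]
    return [labels[i] for i in sorted(labels)]


# Assumed class order for illustration only; the real one is in label_map.json.
id2label = {0: "O", 1: "FP", 2: "RP", 3: "RV", 4: "PW"}

# Made-up logits for the tokens: [CLS], "um", "hel", "lo", [SEP].
logits = [
    [0.1, 0.0, 0.0, 0.0, 0.0],
    [0.2, 3.1, 0.0, 0.1, 0.0],
    [2.5, 0.1, 0.3, 0.0, 0.2],
    [0.4, 0.2, 1.9, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.0, 0.0],
]
word_ids = [None, 0, 1, 1, None]

result = word_labels(logits, word_ids, id2label)
print(result)
```

The second sub-token of the second word would score highest for `RP`, but it is ignored because the first sub-token already decided that word's label.
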
## Files

- `DisfluencyClassifier.onnx`: quantized ONNX model
- `tokenizer.json`: Hugging Face tokenizer configuration
- `tokenizer_config.json`: tokenizer metadata
- `label_map.json`: class ID to label mapping

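The layout of `label_map.json` is not documented here. Assuming it is a flat JSON object whose keys are class IDs stored as strings, it could be loaded roughly as follows; the helper name `load_label_map` and the sample IDs are hypothetical, and the sample file is a stand-in written to a temporary directory.

```python
import json
import os
import tempfile


def load_label_map(path):
    """Read a JSON object of class IDs to labels and return it with int keys."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    return {int(class_id): label for class_id, label in raw.items()}


# Stand-in file for illustration; these IDs are assumed, not the real mapping.
sample = {"0": "O", "1": "FP", "2": "RP", "3": "RV", "4": "PW"}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "label_map.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    id2label = load_label_map(sample_path)

print(id2label[1])
```

JSON object keys are always strings, so converting them to integers makes the mapping directly indexable by the model's predicted class ID.
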
## How to regenerate

From the Mów repo root:

```bash
./scripts/export-disfluency-model.sh
```

This downloads the original PyTorch model from Hugging Face, exports it to ONNX, and quantizes it to INT8.

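The contents of the export script are not reproduced in this card. As a rough sketch of what the final quantization step could look like, ONNX Runtime's `quantize_dynamic` applies INT8 dynamic quantization to an already exported model. The wrapper name `quantize_to_int8` and the input path are hypothetical, and the script may use different tooling or options.

```python
def quantize_to_int8(fp32_path, int8_path):
    """Apply ONNX Runtime dynamic quantization with INT8 weights."""
    # Imported lazily so this module loads even without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(fp32_path, int8_path, weight_type=QuantType.QInt8)


# Hypothetical usage; "model_fp32.onnx" is a placeholder for the exported model:
# quantize_to_int8("model_fp32.onnx", "DisfluencyClassifier.onnx")
```

Dynamic quantization stores weights as INT8 and computes activation scales at run time, so it needs no calibration dataset.
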
## License

CC BY 4.0 (same as the base model). Attribution: [arielcerdap/modernbert-base-multiclass-disfluency-v2](https://huggingface.co/arielcerdap/modernbert-base-multiclass-disfluency-v2).