---
license: cc-by-4.0
tags:
- disfluency-detection
- token-classification
- speech-processing
- onnx
- quantized
base_model: arielcerdap/modernbert-base-multiclass-disfluency-v2
---
# Mów Disfluency Classifier (ONNX INT8)
Quantized ONNX export of [arielcerdap/modernbert-base-multiclass-disfluency-v2](https://huggingface.co/arielcerdap/modernbert-base-multiclass-disfluency-v2) for on-device disfluency removal in [Mów](https://github.com/krokoko/mow), a macOS voice-to-text app.
## What it does
Tags each word in a speech transcript as one of:
| Label | Meaning | Action |
|-------|---------|--------|
| **O** | Fluent | Keep |
| **FP** | Filled pause (um, uh, er) | Remove |
| **RP** | Repetition (the the) | Remove |
| **RV** | Revision / self-correction | Remove |
| **PW** | Partial word | Remove |
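The Action column amounts to a filter over (word, label) pairs. A minimal sketch of that cleanup step, assuming the labels have already been predicted for each word; the function name and the sample tagging are hypothetical, not taken from the Mów codebase:

```python
# Labels the table marks "Remove"; anything else (including "O") is kept.
REMOVE_LABELS = {"FP", "RP", "RV", "PW"}


def remove_disfluencies(words, labels):
    """Return the transcript with every word tagged as disfluent dropped."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    kept = [word for word, label in zip(words, labels) if label not in REMOVE_LABELS]
    return " ".join(kept)


# Hypothetical tagging written by hand for illustration (not model output).
words = ["um", "I", "I", "want", "to", "go"]
labels = ["FP", "RP", "O", "O", "O", "O"]
print(remove_disfluencies(words, labels))  # I want to go
```

Which copy of a repeated word receives the RP tag depends on the model; the sample above simply assumes the first one does.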
## Model details
- **Base model**: ModernBERT-base (150M parameters)
- **Task**: Token classification (5 classes)
- **Training data**: FluencyBank corpus
- **Accuracy**: 93.2% accuracy; F1 of 0.99 on filled pauses and 0.90 on repetitions
- **Format**: ONNX, INT8 dynamic quantization
- **Size**: ~143 MB
- **Inference**: ~5-50 ms per sentence on Apple Silicon via ONNX Runtime
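A token classifier scores subword tokens, while the labels are defined per word, so the token predictions have to be collapsed to one label per word. One common way to do that is to take the prediction of each word's first subword token. The sketch below illustrates that rule in plain Python. It is an assumption about how a consumer might do it, not a description of what Mów does, and the class order in `ID2LABEL` is hypothetical (the real order is defined by `label_map.json`).

```python
def argmax(values):
    """Index of the largest value in a non-empty sequence."""
    return max(range(len(values)), key=values.__getitem__)


def word_labels(token_logits, word_ids, id2label):
    """Collapse per-token scores to one label per word (first-subtoken rule).

    token_logits: one list of class scores per token.
    word_ids: for each token, the index of the word it belongs to, or None
              for special tokens added by the tokenizer.
    id2label: class index -> label string.
    """
    labels = {}
    for scores, word_id in zip(token_logits, word_ids):
        if word_id is None or word_id in labels:
            continue  # skip special tokens and non-initial subword pieces
        labels[word_id] = id2label[argmax(scores)]
    return [labels[i] for i in sorted(labels)]


# Hypothetical class order and hand-written scores, for illustration only.
ID2LABEL = {0: "O", 1: "FP", 2: "RP", 3: "RV", 4: "PW"}
token_logits = [
    [0.0, 0.0, 0.0, 0.0, 0.0],  # special token
    [0.1, 3.2, 0.0, 0.0, 0.0],  # word 0
    [2.5, 0.1, 0.0, 0.0, 0.0],  # word 1, first piece
    [0.2, 0.1, 0.0, 0.0, 1.9],  # word 1, second piece (ignored)
    [0.0, 0.0, 0.0, 0.0, 0.0],  # special token
]
word_ids = [None, 0, 1, 1, None]
print(word_labels(token_logits, word_ids, ID2LABEL))  # ['FP', 'O']
```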
## Files
- `DisfluencyClassifier.onnx` — quantized ONNX model
- `tokenizer.json` — HuggingFace tokenizer configuration
- `tokenizer_config.json` — tokenizer metadata
- `label_map.json` — class ID to label mapping
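These files can also be loaded outside the app. The following is a rough sketch using the `onnxruntime` and `tokenizers` Python packages, under several assumptions that should be checked against the actual export: that `label_map.json` is a JSON object keyed by class ID, that the model's inputs are named `input_ids` and `attention_mask`, that its first output holds the logits, and that passing pre-split words to the tokenizer matches how the model was trained. The function names are hypothetical.

```python
import json
from pathlib import Path


def parse_label_map(text):
    """Parse label_map.json content into {class_id: label}.

    Assumes a JSON object keyed by class ID, e.g. {"0": "O", "1": "FP"}.
    """
    return {int(class_id): label for class_id, label in json.loads(text).items()}


def tag_tokens(words, model_dir):
    """Return (word_index, label) for each non-special token of `words`."""
    # Imported lazily so parse_label_map works without these packages.
    import numpy as np
    import onnxruntime as ort
    from tokenizers import Tokenizer

    model_dir = Path(model_dir)
    id2label = parse_label_map((model_dir / "label_map.json").read_text(encoding="utf-8"))
    tokenizer = Tokenizer.from_file(str(model_dir / "tokenizer.json"))
    session = ort.InferenceSession(
        str(model_dir / "DisfluencyClassifier.onnx"),
        providers=["CPUExecutionProvider"],
    )

    encoding = tokenizer.encode(words, is_pretokenized=True)
    candidates = {
        "input_ids": np.array([encoding.ids], dtype=np.int64),
        "attention_mask": np.array([encoding.attention_mask], dtype=np.int64),
    }
    # Feed only the inputs this particular export actually declares.
    declared = {node.name for node in session.get_inputs()}
    feeds = {name: value for name, value in candidates.items() if name in declared}

    logits = session.run(None, feeds)[0][0]  # assumed shape: (tokens, classes)
    class_ids = logits.argmax(axis=-1).tolist()
    return [
        (word_id, id2label[class_id])
        for word_id, class_id in zip(encoding.word_ids, class_ids)
        if word_id is not None
    ]
```

With the four files downloaded into a local directory, `tag_tokens(["um", "I", "want", "to", "go"], "path/to/model")` would return one `(word_index, label)` pair per subword token.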
## How to regenerate
From the Mów repo root:
```bash
./scripts/export-disfluency-model.sh
```
This downloads the original PyTorch model from Hugging Face, exports it to ONNX, and applies INT8 dynamic quantization.
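The script's contents are not reproduced here. For orientation, the final quantization step could be done with ONNX Runtime's `quantize_dynamic`, which stores weights as INT8 and quantizes activations at run time. The sketch below assumes an FP32 ONNX export already exists on disk; the function name and file paths are hypothetical, and the real script may use different options.

```python
from pathlib import Path


def quantize_to_int8(fp32_path, int8_path):
    """Write an INT8 dynamically quantized copy of an FP32 ONNX model.

    Returns the size of the quantized file in megabytes.
    """
    fp32_path, int8_path = Path(fp32_path), Path(int8_path)
    if not fp32_path.is_file():
        raise FileNotFoundError(f"FP32 model not found: {fp32_path}")

    # Imported lazily so the path check above runs without onnxruntime.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=str(fp32_path),
        model_output=str(int8_path),
        weight_type=QuantType.QInt8,  # signed 8-bit weights
    )
    return int8_path.stat().st_size / 1e6
```

For example, `quantize_to_int8("model_fp32.onnx", "DisfluencyClassifier.onnx")` would write the quantized file and report its size.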
## License
CC BY 4.0 (same as the base model). Attribution: [arielcerdap/modernbert-base-multiclass-disfluency-v2](https://huggingface.co/arielcerdap/modernbert-base-multiclass-disfluency-v2).