metadata
license: cc-by-4.0
tags:
  - disfluency-detection
  - token-classification
  - speech-processing
  - onnx
  - quantized
base_model: arielcerdap/modernbert-base-multiclass-disfluency-v2

Mów Disfluency Classifier (ONNX INT8)

Quantized ONNX export of arielcerdap/modernbert-base-multiclass-disfluency-v2 for on-device disfluency removal in Mów, a macOS voice-to-text app.

What it does

Tags each word in a speech transcript as one of:

| Label | Meaning | Action |
|-------|---------|--------|
| O | Fluent | Keep |
| FP | Filled pause (um, uh, er) | Remove |
| RP | Repetition (the the) | Remove |
| RV | Revision / self-correction | Remove |
| PW | Partial word | Remove |
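
Because every label other than O maps to "Remove", cleanup reduces to keeping the words tagged O. A minimal sketch, assuming predictions have already been aligned to one label per word; the function name `remove_disfluencies` is illustrative and not taken from the Mów codebase:

```python
# Labels marked "Remove" in the label list; only "O" words are kept.
REMOVE_LABELS = {"FP", "RP", "RV", "PW"}


def remove_disfluencies(words, labels):
    """Return the transcript with every non-fluent word removed."""
    if len(words) != len(labels):
        raise ValueError("expected exactly one label per word")
    return " ".join(w for w, lab in zip(words, labels) if lab not in REMOVE_LABELS)


words = ["I", "um", "want", "the", "the", "report"]
labels = ["O", "FP", "O", "RP", "O", "O"]
print(remove_disfluencies(words, labels))  # I want the report
```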

Model details

  • Base model: ModernBERT-base (150M parameters)
  • Task: Token classification (5 classes)
  • Training data: FluencyBank corpus
  • Accuracy: 93.2%; F1 of 0.99 on filled pauses and 0.90 on repetitions
  • Format: ONNX, INT8 dynamic quantization
  • Size: ~143 MB
  • Inference: ~5–50 ms per sentence on Apple Silicon via ONNX Runtime
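
A token classifier emits one row of five logits per sub-word token, while the labels apply to whole words. One common way to bridge the two is to take the arg-max class of each word's first sub-token. The sketch below assumes that convention and a class order of O, FP, RP, RV, PW; neither is confirmed by this card, and the real order should be read from label_map.json. `word_ids` follows the Hugging Face tokenizer convention of one word index per token, with `None` for special tokens.

```python
# ASSUMPTION: this class order is illustrative; read the real one from label_map.json.
ID2LABEL = ["O", "FP", "RP", "RV", "PW"]


def word_labels(logits, word_ids, id2label=ID2LABEL):
    """Collapse per-token logits to one label per word (first sub-token wins)."""
    labels = []
    seen = set()
    for row, wid in zip(logits, word_ids):
        if wid is None or wid in seen:  # special token or a later sub-token
            continue
        seen.add(wid)
        best = max(range(len(row)), key=row.__getitem__)  # arg-max over classes
        labels.append(id2label[best])
    return labels


# Toy logits for "[CLS] um hel lo [SEP]": two words spread over five tokens.
logits = [
    [0.0, 0.0, 0.0, 0.0, 0.0],  # [CLS]
    [0.1, 3.2, 0.0, 0.0, 0.0],  # "um"  -> FP
    [2.5, 0.1, 0.0, 0.0, 0.3],  # "hel" -> O
    [0.2, 0.1, 0.0, 0.0, 1.9],  # "lo"  (skipped: not the first sub-token)
    [0.0, 0.0, 0.0, 0.0, 0.0],  # [SEP]
]
word_ids = [None, 0, 1, 1, None]
print(word_labels(logits, word_ids))  # ['FP', 'O']
```

Taking the first sub-token is only one option; averaging logits across a word's sub-tokens is another, and the two can disagree on partial words.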

Files

  • DisfluencyClassifier.onnx — quantized ONNX model
  • tokenizer.json — HuggingFace tokenizer configuration
  • tokenizer_config.json — tokenizer metadata
  • label_map.json — class ID to label mapping
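
The layout of label_map.json is not documented here. Assuming it is a flat JSON object that maps class IDs (strings, since JSON keys are always strings) to label names, it could be loaded as follows; adjust the loader if the file nests or inverts the mapping.

```python
import json
import tempfile
from pathlib import Path


def load_label_map(path):
    """Load a {"0": "O", ...} style JSON file into an int -> str dict."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return {int(class_id): label for class_id, label in raw.items()}


# Demo with a stand-in file; its contents are hypothetical, not copied from the repo.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "label_map.json"
    sample.write_text(json.dumps({"0": "O", "1": "FP"}), encoding="utf-8")
    id2label = load_label_map(sample)

print(id2label[1])  # FP
```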

How to regenerate

From the Mów repo root:

./scripts/export-disfluency-model.sh

This downloads the original PyTorch model from HuggingFace, exports it to ONNX, and quantizes it to INT8.
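
The script itself is not reproduced here. As a rough sketch of the final step only, ONNX Runtime's `quantize_dynamic` converts the weights of an FP32 ONNX file to INT8. The wrapper name and file paths below are placeholders, and the actual script may pass different options.

```python
def quantize_to_int8(fp32_path, int8_path):
    """Apply dynamic INT8 weight quantization to an exported ONNX model."""
    # Imported lazily so this module still loads where onnxruntime is absent.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        weight_type=QuantType.QInt8,  # signed 8-bit weights
    )
    return int8_path


# Example call (placeholder paths; requires onnxruntime and an exported model):
# quantize_to_int8("model_fp32.onnx", "DisfluencyClassifier.onnx")
```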

License

CC BY 4.0 (same as the base model). Attribution: arielcerdap/modernbert-base-multiclass-disfluency-v2.