arielcerdap/modernbert-disfluency-expA-realonly

Token Classification · disfluency-detection · speech-pathology


ModernBERT Disfluency Detection — Exp A (Real Only)

Fine-tuned from answerdotai/ModernBERT-base on real-only FluencyBank Timestamped data.

Dataset

Config: real_only from arielcerdap/disfluency-fluencybank
Train: 2737 segments (100% real)
Val/Test: identical to Exp B for direct comparison

Labels

O · FP (filled pause) · RP (repetition) · RV (revision) · PW (partial word)
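The label set above can be turned into readable spans with a small helper. The following is a minimal sketch, not the author's code: `disfluent_spans` and `tag_text` are hypothetical helpers, the label strings are assumed to match those stored in the model's config, and `tag_text` assumes `transformers` (a release recent enough to include ModernBERT) and `torch` are installed.

```python
from typing import List, Tuple

MODEL_ID = "arielcerdap/modernbert-disfluency-expA-realonly"


def disfluent_spans(words: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive words that share a non-O label into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    prev = "O"
    for word, label in zip(words, labels):
        if label != "O":
            if label == prev:
                # Same disfluency type as the previous word: extend the open span.
                spans[-1] = (label, spans[-1][1] + " " + word)
            else:
                spans.append((label, word))
        prev = label
    return spans


def tag_text(text: str) -> List[Tuple[str, str]]:
    """Run the model through the token-classification pipeline.

    Assumption: the checkpoint's id2label uses the label names on this card.
    """
    from transformers import pipeline  # imported lazily; downloads the model on first use

    tagger = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merges adjacent tokens that share a label
    )
    return [(e["entity_group"], e["word"]) for e in tagger(text)]
```

Calling `disfluent_spans(["um", "I", "I", "went"], ["FP", "RP", "O", "O"])` groups the filled pause and the repetition into separate spans; `tag_text` would produce similar pairs directly from raw text, provided the assumptions above hold.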

Test Results

| Label | P | R | F1 | Support |
| --- | --- | --- | --- | --- |
| O | 0.9783 | 0.9860 | 0.9821 | 3704 |
| FP | 0.9888 | 1.0000 | 0.9944 | 176 |
| RP | 0.7784 | 0.8276 | 0.8022 | 174 |
| RV | 0.3425 | 0.2907 | 0.3145 | 86 |
| PW | 0.9510 | 0.8326 | 0.8879 | 233 |

Macro F1 (4 disfluency classes): 0.7497
Binary F1: 0.8902
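As a sanity check, the macro F1 can be recomputed from the per-class results. This is a small sketch using only the numbers reported on this card; the helper names `f1_from_pr` and `macro_f1` are illustrative. The binary F1 cannot be recomputed this way because it depends on token-level counts that are not listed here.

```python
# Per-class precision, recall and F1 as reported in the test results
# (the O class is excluded from the macro average over disfluency classes).
RESULTS = {
    "FP": (0.9888, 1.0000, 0.9944),
    "RP": (0.7784, 0.8276, 0.8022),
    "RV": (0.3425, 0.2907, 0.3145),
    "PW": (0.9510, 0.8326, 0.8879),
}


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1(results: dict) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1 in results.values()) / len(results)


if __name__ == "__main__":
    for label, (p, r, f1) in RESULTS.items():
        print(label, "reported:", f1, "recomputed:", round(f1_from_pr(p, r), 4))
    print("macro F1:", macro_f1(RESULTS))
```

The unweighted mean reproduces the reported 0.7497 up to rounding of the per-class scores. It also shows how strongly the RV class pulls the average down.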

Hyperparameters

learning_rate: 5e-05
epochs: 15
warmup_steps: 191
weight_decay: 0.1
focal_loss_gamma: 3.0 (adaptive)
class_weights: O=1.0, FP=3.0, RP=6.0, RV=20.0, PW=5.0
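To illustrate how the focal-loss gamma and the class weights interact, here is a per-token sketch in plain Python. It is a reference for the standard class-weighted focal loss formula, not the training code behind this model: the label order is an assumption, gamma is held fixed at 3.0, and the "adaptive" variant noted above is not specified on this card and is not reproduced.

```python
import math
from typing import List

# Values taken from the hyperparameters above; the ordering of labels is an assumption.
CLASS_WEIGHTS = {"O": 1.0, "FP": 3.0, "RP": 6.0, "RV": 20.0, "PW": 5.0}
LABELS = list(CLASS_WEIGHTS)
GAMMA = 3.0


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one token's logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_focal_loss(logits: List[float], target: str, gamma: float = GAMMA) -> float:
    """Loss for one token: -w_c * (1 - p_t)^gamma * log(p_t)."""
    p_t = softmax(logits)[LABELS.index(target)]
    return -CLASS_WEIGHTS[target] * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    uncertain = [0.0, 0.0, 0.0, 0.0, 0.0]   # model has no preference
    confident = [10.0, 0.0, 0.0, 0.0, 0.0]  # model strongly predicts O
    print("uncertain RV token:", weighted_focal_loss(uncertain, "RV"))
    print("confident O token: ", weighted_focal_loss(confident, "O"))
```

With gamma set to 0 and a weight of 1.0 the expression reduces to ordinary cross-entropy. A larger gamma shrinks the loss on tokens the model already classifies confidently, while the class weight scales up errors on rare classes such as RV.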


Safetensors

Model size: 0.1B params
Tensor type: F32
