Speaker diarization for segments under 1s seems inaccurate

#6
by linlinsong - opened

Excellent work, thank you! The timestamps are very precise, and the speaker diarization is also quite good overall. However, it doesn't seem very accurate for brief interjections of approximately one second within sentences. Would it be possible to address this issue by adjusting the inference parameters?

Thanks for your interest!

Short, frequent speaker changes (e.g., ~1s interjections within an ongoing sentence) are still challenging for the current model and typically require targeted training data to improve. From an inference-only perspective, parameter tuning is unlikely to reliably fix this issue at the moment. Sorry about that. o(╥﹏╥)o

That said, we’re actively iterating on the model, and upcoming versions should deliver better performance for this scenario. ヾ(◍°∇°◍)ノ゙
