metadata
title: Diarized Speaker Segments Community-1
emoji: 🎙️
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860
suggested_hardware: t4-small
Diarized Speaker Segments Community-1
This Space combines the repo-style transcription logic from your attached codebase with pyannote/speaker-diarization-community-1 for speaker diarization.
What changed
- transcription now uses the attached repo's logic:
  - custom speech-window detection
  - ffmpeg audio enhancement
  - repo-style Hindi/Hinglish/English prompt
  - repo thresholds and dedupe flow
  - repo segment splitting by words
- diarization remains `community-1`
- cleanup remains exactly:
  - merge only adjacent same-speaker segments
  - otherwise do not touch
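
The cleanup rule above can be sketched as follows. This is a minimal illustration, not the Space's actual code: the `Segment` dataclass and the `merge_adjacent_same_speaker` function are hypothetical names, and the sketch assumes that segments arrive in time order and that "adjacent" means consecutive in that list, with no gap threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    # Hypothetical segment shape; the real Space may use different fields.
    speaker: str
    start: float
    end: float
    text: str = ""


def merge_adjacent_same_speaker(segments: List[Segment]) -> List[Segment]:
    """Merge consecutive segments that share a speaker; leave the rest untouched."""
    merged: List[Segment] = []
    for seg in segments:
        if merged and merged[-1].speaker == seg.speaker:
            prev = merged[-1]
            # Extend the previous segment and join the text with a single space.
            joined = " ".join(t for t in (prev.text, seg.text) if t)
            merged[-1] = Segment(prev.speaker, prev.start, max(prev.end, seg.end), joined)
        else:
            # Copy so the caller's input list is never mutated.
            merged.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return merged
```

Segments from the same speaker that are separated by another speaker's turn stay separate, which matches the "otherwise do not touch" rule.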
Notes
- default ASR model is `medium`
- `large-v3` is available as a dropdown option for evaluation
- default language is `hi` because the attached repo logic is Hindi-biased for Hinglish handling
- you can still test `auto` and `en`