---
title: Diarized Speaker Segments Community-1
emoji: 🎙️
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860
suggested_hardware: t4-small
---
# Diarized Speaker Segments Community-1
This Space combines the **repo-style transcription logic** from the attached codebase with **pyannote/speaker-diarization-community-1** for speaker diarization.
## What changed
- transcription now uses the attached repo's logic:
  - custom speech-window detection
  - ffmpeg audio enhancement
  - repo-style Hindi/Hinglish/English prompt
  - repo thresholds and dedupe flow
  - repo segment splitting by words
- diarization remains `community-1`
- cleanup remains exactly:
  - merge **only adjacent same-speaker** segments
  - otherwise do not touch
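
The cleanup rule can be illustrated with a minimal sketch. This is not the Space's actual code: the `Segment` dataclass, its field names, and the `merge_adjacent_same_speaker` function are hypothetical, and joining the text of merged segments with a space is an assumption. It only shows the stated behaviour: consecutive segments with the same speaker are merged, and everything else passes through unchanged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    # Hypothetical segment shape; the real app may use different fields.
    speaker: str
    start: float
    end: float
    text: str = ""


def merge_adjacent_same_speaker(segments: List[Segment]) -> List[Segment]:
    """Merge only consecutive segments that share a speaker label.

    Segments are assumed to be in chronological order already.
    Nothing is reordered, dropped, or relabelled.
    """
    merged: List[Segment] = []
    for seg in segments:
        if merged and merged[-1].speaker == seg.speaker:
            prev = merged[-1]
            # Extend the previous segment and join the text (assumed behaviour).
            joined = " ".join(t for t in (prev.text, seg.text) if t)
            merged[-1] = Segment(prev.speaker, prev.start, max(prev.end, seg.end), joined)
        else:
            # Different speaker (or first segment): copy through untouched.
            merged.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return merged


if __name__ == "__main__":
    demo = [
        Segment("SPEAKER_00", 0.0, 1.5, "namaste"),
        Segment("SPEAKER_00", 1.5, 3.0, "how are you"),
        Segment("SPEAKER_01", 3.0, 4.0, "theek hoon"),
        Segment("SPEAKER_00", 4.0, 5.0, "good"),
    ]
    for s in merge_adjacent_same_speaker(demo):
        print(s)
```

Note that the last `SPEAKER_00` segment in the demo is not merged with the first two, because a `SPEAKER_01` segment sits between them.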
## Notes
- default ASR model is **medium**
- `large-v3` is available as a dropdown for evaluation
- default language is **hi** because the attached repo logic is Hindi-biased for Hinglish handling
- you can still test `auto` and `en`