title: Diarized Speaker Segments Community-1
emoji: 🎙️
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860
suggested_hardware: t4-small

Diarized Speaker Segments Community-1

This Space combines the transcription logic from the attached codebase with pyannote/speaker-diarization-community-1 for speaker diarization.

What changed

  • transcription now uses the attached repo's logic:
    • custom speech-window detection
    • ffmpeg audio enhancement
    • the repo's Hindi/Hinglish/English prompt
    • the repo's thresholds and deduplication flow
    • the repo's word-based segment splitting
  • diarization still uses community-1
  • cleanup is unchanged and does exactly this:
    • merge only adjacent same-speaker segments
    • leave every other segment untouched
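
The cleanup rule above can be sketched as follows. This is only an illustration, not the Space's actual code: the Segment type, its field names, and the merge_adjacent_same_speaker function are hypothetical, and the sketch assumes the segments arrive in chronological order.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    # Hypothetical segment shape; the real Space may use different fields.
    speaker: str
    start: float
    end: float
    text: str


def merge_adjacent_same_speaker(segments):
    """Merge runs of consecutive segments that share a speaker.

    Segments are never reordered, and a segment whose neighbour has a
    different speaker is passed through unchanged.
    """
    merged = []
    for seg in segments:
        if merged and merged[-1].speaker == seg.speaker:
            prev = merged[-1]
            # Extend the previous segment instead of starting a new one.
            merged[-1] = replace(
                prev,
                end=max(prev.end, seg.end),
                text=f"{prev.text} {seg.text}".strip(),
            )
        else:
            merged.append(seg)
    return merged


if __name__ == "__main__":
    demo = [
        Segment("SPEAKER_00", 0.0, 1.2, "namaste"),
        Segment("SPEAKER_00", 1.2, 2.5, "how are you"),
        Segment("SPEAKER_01", 2.5, 3.0, "fine"),
        Segment("SPEAKER_00", 3.0, 4.0, "good"),
    ]
    for seg in merge_adjacent_same_speaker(demo):
        print(seg)
```

Note that the last SPEAKER_00 segment in the demo is not merged with the first two, because a SPEAKER_01 segment sits between them; only directly adjacent segments are combined.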

Notes

  • default ASR model is medium
  • large-v3 is available as a dropdown for evaluation
  • default language is hi because the attached repo's logic is biased toward Hindi in order to handle Hinglish
  • auto and en are still available for testing