Update README.md
README.md
CHANGED
@@ -10,24 +10,22 @@ suggested_hardware: t4-small

 # Diarized Speaker Segments Community-1

-This Space uses **pyannote/speaker-diarization-community-1**
+This Space uses **repo-style transcription logic** from your attached codebase plus **pyannote/speaker-diarization-community-1**.

-##
--
--
--
--
--
-
-
-
+## What changed
+- transcription now uses the attached repo's logic:
+  - custom speech-window detection
+  - ffmpeg audio enhancement
+  - repo-style Hindi/Hinglish/English prompt
+  - repo thresholds and dedupe flow
+  - repo segment splitting by words
+- diarization remains `community-1`
+- cleanup remains exactly:
+  - merge **only adjacent same-speaker** segments
+  - otherwise do not touch

-
--
--
-
-
-- ASR uses GPU when available
-- diarization uses GPU when available
-- ASR and diarization run sequentially
-- diarization is fed an in-memory waveform dict to avoid file-decoding issues
+## Notes
+- default ASR model is **medium**
+- `large-v3` is available as a dropdown for evaluation
+- default language is **hi** because the attached repo logic is Hindi-biased for Hinglish handling
+- you can still test `auto` and `en`
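
The cleanup rule stated in the new README (merge only adjacent same-speaker segments, otherwise do not touch) could look roughly like the sketch below. This is an illustration under stated assumptions, not the Space's actual code: the `Segment` dataclass, its field names, and `merge_adjacent_same_speaker` are hypothetical, and joining the texts with a single space is an assumption.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    # Hypothetical segment shape; the Space's real fields may differ.
    speaker: str
    start: float
    end: float
    text: str


def merge_adjacent_same_speaker(segments: List[Segment]) -> List[Segment]:
    """Merge consecutive segments that share a speaker; leave all others as-is."""
    merged: List[Segment] = []
    for seg in segments:
        if merged and merged[-1].speaker == seg.speaker:
            prev = merged[-1]
            # Extend the previous segment instead of appending a new one.
            merged[-1] = replace(
                prev,
                end=max(prev.end, seg.end),
                text=f"{prev.text} {seg.text}".strip(),  # assumption: space-joined text
            )
        else:
            # Different speaker: keep the segment untouched.
            merged.append(seg)
    return merged


example = [
    Segment("SPEAKER_00", 0.0, 1.2, "namaste"),
    Segment("SPEAKER_00", 1.3, 2.0, "how are you"),
    Segment("SPEAKER_01", 2.1, 3.0, "I am fine"),
    Segment("SPEAKER_00", 3.1, 4.0, "good"),
]
result = merge_adjacent_same_speaker(example)
```

Note that the last `SPEAKER_00` segment is not merged with the first two, because a `SPEAKER_01` segment sits between them; only directly adjacent segments are combined.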
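
The defaults listed under Notes (model `medium` with `large-v3` as an alternative, language `hi` with `auto` and `en` as alternatives) might be resolved along these lines. The function `resolve_asr_options` and the constant names are hypothetical, and mapping `auto` to `None` is an assumption modeled on Whisper-style APIs, where leaving the language unset triggers detection; the Space may handle this differently.

```python
from typing import Dict, Optional

# Choices and defaults taken from the README notes; the names are hypothetical.
MODEL_CHOICES = ("medium", "large-v3")
DEFAULT_MODEL = "medium"
LANGUAGE_CHOICES = ("hi", "auto", "en")
DEFAULT_LANGUAGE = "hi"


def resolve_asr_options(
    model: Optional[str] = None, language: Optional[str] = None
) -> Dict[str, Optional[str]]:
    """Apply the documented defaults and validate the selected values."""
    model = model or DEFAULT_MODEL
    language = language or DEFAULT_LANGUAGE
    if model not in MODEL_CHOICES:
        raise ValueError(f"unsupported model: {model!r}")
    if language not in LANGUAGE_CHOICES:
        raise ValueError(f"unsupported language: {language!r}")
    # Assumption: "auto" is passed to the ASR backend as None (detect language).
    return {"model": model, "language": None if language == "auto" else language}


defaults = resolve_asr_options()
evaluation = resolve_asr_options(model="large-v3", language="auto")
```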