Voice Activity Detection
pyannote.audio
PyTorch
pyannote
pyannote-audio-model
audio
voice
speech
speaker
speaker-diarization
speaker-change-detection
speaker-segmentation
overlapped-speech-detection
resegmentation
Instructions to use objects76/speaker-diarization-v1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- pyannote.audio
How to use objects76/speaker-diarization-v1 with pyannote.audio:

    from pyannote.audio import Model, Inference
    model = Model.from_pretrained("objects76/speaker-diarization-v1")
    inference = Inference(model)

    # inference on the whole file
    inference("file.wav")

    # inference on an excerpt
    from pyannote.core import Segment
    excerpt = Segment(start=2.0, end=5.0)
    inference.crop("file.wav", excerpt)

- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
CHANGED
    @@ -28,7 +28,7 @@ Consider switching to [pyannoteAI](https://www.pyannote.ai) for better and faste

     This model ingests 10 seconds of mono audio sampled at 16kHz and outputs speaker diarization as a (num_frames, num_classes) matrix where the 7 classes are _non-speech_, _speaker #1_, _speaker #2_, _speaker #3_, _speakers #1 and #2_, _speakers #1 and #3_, and _speakers #2 and #3_.

    -
    +

     ```python
     # waveform (first row)
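
The 7-class output described in the README excerpt can be turned into per-speaker activity by taking the most likely class for each frame and looking up which speakers that class contains. The following is a minimal NumPy sketch, not the model's own post-processing: it assumes the class columns are ordered exactly as listed in the description, and the names `POWERSET_TO_SPEAKERS` and `powerset_to_multilabel` are hypothetical helpers introduced here for illustration. The toy `scores` matrix is made up and stands in for real model output.

```python
import numpy as np

# Assumed class order, taken from the model description:
# 0: non-speech, 1: speaker #1, 2: speaker #2, 3: speaker #3,
# 4: speakers #1 and #2, 5: speakers #1 and #3, 6: speakers #2 and #3
POWERSET_TO_SPEAKERS = np.array(
    [
        [0, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [1, 1, 0],
        [1, 0, 1],
        [0, 1, 1],
    ],
    dtype=np.int8,
)


def powerset_to_multilabel(scores: np.ndarray) -> np.ndarray:
    """Map (num_frames, 7) class scores to (num_frames, 3) binary speaker activity."""
    if scores.ndim != 2 or scores.shape[1] != POWERSET_TO_SPEAKERS.shape[0]:
        raise ValueError("expected a (num_frames, 7) matrix")
    best_class = scores.argmax(axis=1)       # most likely class per frame
    return POWERSET_TO_SPEAKERS[best_class]  # one row of speaker flags per frame


# Toy scores for four frames (illustrative values, not real model output).
scores = np.array(
    [
        [0.90, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01],  # non-speech
        [0.05, 0.80, 0.05, 0.02, 0.05, 0.02, 0.01],  # speaker #1
        [0.05, 0.10, 0.10, 0.02, 0.70, 0.02, 0.01],  # speakers #1 and #2
        [0.05, 0.02, 0.10, 0.10, 0.02, 0.01, 0.70],  # speakers #2 and #3
    ]
)
activity = powerset_to_multilabel(scores)
print(activity)
```

Each row of `activity` holds one flag per speaker, so overlapped speech shows up as two flags set in the same frame.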