Voice Activity Detection
pyannote.audio
PyTorch
pyannote
pyannote-audio-model
audio
voice
speech
speaker
speaker-segmentation
overlapped-speech-detection
resegmentation
How to use zermok/segmentation with pyannote.audio:
from pyannote.audio import Model, Inference

model = Model.from_pretrained("zermok/segmentation")
inference = Inference(model)

# inference on the whole file
inference("file.wav")

# inference on an excerpt
from pyannote.core import Segment
excerpt = Segment(start=2.0, end=5.0)
inference.crop("file.wav", excerpt)
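To make the excerpt in the snippet above concrete, here is a minimal sketch of how a 2.0 s to 5.0 s time range maps onto audio samples. It is not pyannote.audio code: `Excerpt` is a hypothetical stand-in for `pyannote.core.Segment`, `to_sample_range` is an illustrative helper, and the 16 kHz sample rate is an assumption that the source does not state.

```python
from dataclasses import dataclass

SAMPLE_RATE = 16_000  # assumption: 16 kHz audio; not stated in the source


@dataclass(frozen=True)
class Excerpt:
    """Hypothetical stand-in for pyannote.core.Segment (times in seconds)."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def to_sample_range(excerpt: Excerpt, sample_rate: int = SAMPLE_RATE) -> tuple[int, int]:
    """Convert a time range in seconds to (first, last) sample indices."""
    first = round(excerpt.start * sample_rate)
    last = round(excerpt.end * sample_rate)
    return first, last


excerpt = Excerpt(start=2.0, end=5.0)
first, last = to_sample_range(excerpt)
print(first, last, last - first)  # 32000 80000 48000
```

Under that assumption, the excerpt covers 48000 samples, which is the portion of `file.wav` that a cropped inference call would need to read.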
Hervé BREDIN committed on
Commit: 36f7729
Parent(s): 49b252b
feat: add basic config.yaml
config.yaml ADDED (+16 -0)
@@ -0,0 +1,16 @@
+task:
+  _target_: pyannote.audio.tasks.Segmentation
+  duration: 5.0
+  max_num_speakers: 3
+model:
+  _target_: pyannote.audio.models.segmentation.PyanNet
+  sincnet:
+    stride: 10
+  lstm:
+    hidden_size: 128
+    num_layers: 4
+    bidirectional: true
+    monolithic: true
+  linear:
+    hidden_size: 128
+    num_layers: 2