Silero VAD - ONNX
ONNX export of Silero VAD, a lightweight and fast voice activity detection model. Detects speech segments in audio with high accuracy and low latency.
Mirrored for use with inference4j, an inference-only AI library for Java.
Original Source
- Repository: snakers4/silero-vad (Silero Team)
- License: MIT
Usage with inference4j
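// Standard-library imports; the inference4j imports for SileroVAD and VoiceSegment are not shown in the source.
import java.nio.file.Path;
import java.util.List;

// try-with-resources closes the model when the block exits.
try (SileroVAD vad = SileroVAD.fromPretrained("models/silero-vad")) {
    List<VoiceSegment> segments = vad.detect(Path.of("meeting.wav"));
    for (VoiceSegment segment : segments) {
        System.out.printf("Speech: %.2fs - %.2fs%n", segment.start(), segment.end());
    }
}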
Model Details
| Property | Value |
|---|---|
| Architecture | Silero VAD (lightweight CNN + LSTM) |
| Task | Voice activity detection |
| Input | 16 kHz mono audio (float32 waveform, 512-sample chunks) |
| Output | Speech probability per chunk |
| Model size | ~2 MB |
| Original source | snakers4/silero-vad |
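
To make the input and output rows above concrete, here is a minimal sketch of the bookkeeping around the model: splitting a waveform into 512-sample chunks (32 ms each at 16 kHz) and turning per-chunk speech probabilities into time ranges. It does not run the ONNX model and is not inference4j's implementation. The class name `ChunkSegmenter`, its methods, the zero-padding of the last chunk, and the 0.5 threshold are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of inference4j or Silero VAD.
class ChunkSegmenter {
    static final int SAMPLE_RATE = 16_000; // from the table: 16 kHz mono
    static final int CHUNK_SIZE = 512;     // from the table: 512-sample chunks

    /** Splits a waveform into fixed-size chunks; the last chunk is zero-padded (an assumption). */
    public static float[][] toChunks(float[] samples) {
        int count = (samples.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        float[][] chunks = new float[count][CHUNK_SIZE];
        for (int i = 0; i < count; i++) {
            int offset = i * CHUNK_SIZE;
            int length = Math.min(CHUNK_SIZE, samples.length - offset);
            System.arraycopy(samples, offset, chunks[i], 0, length);
        }
        return chunks;
    }

    /** Merges runs of chunks with probability >= threshold into {startSeconds, endSeconds} pairs. */
    public static List<double[]> toSegments(float[] probabilities, float threshold) {
        double chunkSeconds = (double) CHUNK_SIZE / SAMPLE_RATE; // 0.032 s per chunk
        List<double[]> segments = new ArrayList<>();
        int start = -1; // index of the first chunk in the current speech run, or -1 if none
        for (int i = 0; i <= probabilities.length; i++) {
            // Treat the position past the last chunk as silence so an open run is closed.
            boolean speech = i < probabilities.length && probabilities[i] >= threshold;
            if (speech && start < 0) {
                start = i;
            } else if (!speech && start >= 0) {
                segments.add(new double[] {start * chunkSeconds, i * chunkSeconds});
                start = -1;
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // Made-up probabilities standing in for model output, one value per chunk.
        float[] probabilities = {0.1f, 0.9f, 0.8f, 0.2f, 0.7f};
        for (double[] segment : toSegments(probabilities, 0.5f)) {
            System.out.printf("Speech: %.3fs - %.3fs%n", segment[0], segment[1]);
        }
    }
}
```

With the made-up probabilities in `main`, chunks 1 and 2 form one segment and chunk 4 forms another. A production detector would likely also smooth short gaps and drop very short segments, which this sketch leaves out.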
License
This model is licensed under the MIT License. Original model by Silero Team.