Silero VAD - ONNX

ONNX export of Silero VAD, a lightweight and fast voice activity detection model. Detects speech segments in audio with high accuracy and low latency.

Mirrored for use with inference4j, an inference-only AI library for Java.

Original Source: snakers4/silero-vad

Usage with inference4j

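import java.nio.file.Path;
import java.util.List;
// SileroVAD and VoiceSegment come from inference4j (import them from that library).

// try-with-resources closes the model when the block exits.
try (SileroVAD vad = SileroVAD.fromPretrained("models/silero-vad")) {
    // Run detection on the audio file and collect the detected speech segments.
    List<VoiceSegment> segments = vad.detect(Path.of("meeting.wav"));
    for (VoiceSegment segment : segments) {
        System.out.printf("Speech: %.2fs - %.2fs%n", segment.start(), segment.end());
    }
}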

Model Details

| Property        | Value                                                   |
|-----------------|---------------------------------------------------------|
| Architecture    | Silero VAD (lightweight CNN + LSTM)                     |
| Task            | Voice activity detection                                |
| Input           | 16 kHz mono audio (float32 waveform, 512-sample chunks) |
| Output          | Speech probability per chunk                            |
| Model size      | ~2 MB                                                   |
| Original source | snakers4/silero-vad                                     |
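
To make the input and output rows concrete: at 16 kHz, one 512-sample chunk covers 512 / 16000 = 0.032 s, so the model yields one speech probability per 32 ms of audio. The sketch below shows one simple way such per-chunk probabilities could be merged into time segments. It is an illustration only, not inference4j's implementation: the class `ChunkSegments`, the record `Segment`, the method `toSegments`, and the 0.5 threshold are all hypothetical, and real pipelines typically add smoothing and minimum-duration rules on top.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of inference4j: merges per-chunk speech
// probabilities into contiguous speech segments measured in seconds.
final class ChunkSegments {
    static final int SAMPLE_RATE = 16_000;   // 16 kHz mono input (from the table above)
    static final int CHUNK_SAMPLES = 512;    // samples per chunk (from the table above)

    // Start and end of one speech segment, in seconds.
    record Segment(double start, double end) {}

    // A chunk counts as speech when its probability is >= threshold.
    // The threshold is an assumed parameter chosen by the caller.
    static List<Segment> toSegments(float[] probs, float threshold) {
        double chunkSeconds = (double) CHUNK_SAMPLES / SAMPLE_RATE; // 0.032 s
        List<Segment> segments = new ArrayList<>();
        int startChunk = -1; // index of the first chunk of the open segment, or -1

        for (int i = 0; i < probs.length; i++) {
            boolean speech = probs[i] >= threshold;
            if (speech && startChunk < 0) {
                startChunk = i; // a speech run begins
            } else if (!speech && startChunk >= 0) {
                // the speech run ended at the start of chunk i
                segments.add(new Segment(startChunk * chunkSeconds, i * chunkSeconds));
                startChunk = -1;
            }
        }
        if (startChunk >= 0) {
            // audio ended while still in speech; close the segment at the end
            segments.add(new Segment(startChunk * chunkSeconds, probs.length * chunkSeconds));
        }
        return segments;
    }

    public static void main(String[] args) {
        // Made-up probabilities for eight consecutive chunks.
        float[] probs = {0.1f, 0.2f, 0.9f, 0.95f, 0.8f, 0.3f, 0.1f, 0.7f};
        for (Segment s : toSegments(probs, 0.5f)) {
            System.out.printf("Speech: %.3fs - %.3fs%n", s.start(), s.end());
        }
    }
}
```

With the sample values in `main`, chunks 2 to 4 and chunk 7 exceed the threshold, so the sketch reports two segments. A plain threshold like this can flicker on noisy audio, which is why it should be read as a starting point rather than a tuned detector.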

License

This model is licensed under the MIT License. Original model by Silero Team.
