Silero VAD — ONNX

ONNX export of Silero VAD, a lightweight, fast voice activity detection (VAD) model. It detects speech segments in audio with high accuracy and low latency.

Mirrored for use with inference4j, an inference-only AI library for Java.

Original Source

Repository: Silero Team (snakers4)
License: MIT

Usage with inference4j

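// Requires java.nio.file.Path and java.util.List, plus the inference4j
// SileroVAD and VoiceSegment classes.
// Load the model; try-with-resources closes it when the block exits.
try (SileroVAD vad = SileroVAD.fromPretrained("models/silero-vad")) {
    // Run detection over the whole file and collect the speech segments.
    List<VoiceSegment> segments = vad.detect(Path.of("meeting.wav"));
    for (VoiceSegment segment : segments) {
        // Print the start and end of each detected segment.
        System.out.printf("Speech: %.2fs - %.2fs%n", segment.start(), segment.end());
    }
}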

Model Details

Property	Value
Architecture	Silero VAD (lightweight CNN + LSTM)
Task	Voice activity detection
Input	16 kHz mono audio (float32 waveform, 512-sample chunks)
Output	Speech probability per chunk
Model size	~2 MB
Original source	snakers4/silero-vad
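
The Input and Output rows above imply a simple pipeline: split the waveform into 512-sample chunks, score each chunk, then turn the per-chunk probabilities into time ranges. The following is a minimal sketch of the two steps around the model, not the inference4j implementation. The class name VadChunkingSketch, both helper methods, the zero-padding of the final chunk, and the 0.5 threshold are assumptions made for illustration; the model call itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class VadChunkingSketch {
    static final int SAMPLE_RATE = 16_000; // Hz, from the Input row
    static final int CHUNK_SIZE = 512;     // samples per chunk, from the Input row

    /** Splits a mono waveform into 512-sample chunks, zero-padding the last one (assumed policy). */
    static List<float[]> toChunks(float[] waveform) {
        List<float[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < waveform.length; offset += CHUNK_SIZE) {
            float[] chunk = new float[CHUNK_SIZE]; // Java zero-initializes arrays
            int length = Math.min(CHUNK_SIZE, waveform.length - offset);
            System.arraycopy(waveform, offset, chunk, 0, length);
            chunks.add(chunk);
        }
        return chunks;
    }

    /** Merges runs of chunks at or above the threshold into {startSeconds, endSeconds} pairs. */
    static List<double[]> toSegments(float[] probabilities, float threshold) {
        List<double[]> segments = new ArrayList<>();
        double chunkSeconds = (double) CHUNK_SIZE / SAMPLE_RATE; // 0.032 s per chunk
        int start = -1; // index of the first chunk in the current speech run, or -1
        // Iterate one step past the end so a run that reaches the last chunk is closed.
        for (int i = 0; i <= probabilities.length; i++) {
            boolean speech = i < probabilities.length && probabilities[i] >= threshold;
            if (speech && start < 0) {
                start = i;
            } else if (!speech && start >= 0) {
                segments.add(new double[] {start * chunkSeconds, i * chunkSeconds});
                start = -1;
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // One second of silence stands in for real audio.
        System.out.println(toChunks(new float[SAMPLE_RATE]).size() + " chunks");

        // Hypothetical per-chunk probabilities; a real run would take them from the model.
        float[] probabilities = {0.1f, 0.9f, 0.8f, 0.2f, 0.7f};
        for (double[] segment : toSegments(probabilities, 0.5f)) {
            System.out.printf("Speech: %.3fs - %.3fs%n", segment[0], segment[1]);
        }
    }
}
```

At 16 kHz each 512-sample chunk covers 32 ms, so segment boundaries from this sketch are quantized to 32 ms steps. A plain threshold is the simplest possible post-processing; a production detector would typically also smooth the probabilities and enforce minimum speech and silence durations.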

License

This model is licensed under the MIT License. Original model by Silero Team.
