---
library_name: onnx
tags:
- silero
- voice-activity-detection
- vad
- audio
- onnx
- inference4j
license: mit
pipeline_tag: voice-activity-detection
---

# Silero VAD - ONNX

ONNX export of [Silero VAD](https://github.com/snakers4/silero-vad), a lightweight and fast voice activity detection model. It detects speech segments in audio with high accuracy and low latency.

Mirrored for use with [inference4j](https://github.com/inference4j/inference4j), an inference-only AI library for Java.

## Original Source

- **Repository:** [Silero Team (snakers4)](https://github.com/snakers4/silero-vad)
- **License:** MIT

## Usage with inference4j

```java
// try-with-resources closes the model when the block exits
try (SileroVAD vad = SileroVAD.fromPretrained("models/silero-vad")) {
    List<VoiceSegment> segments = vad.detect(Path.of("meeting.wav"));
    // Print the start and end of each detected speech segment
    for (VoiceSegment segment : segments) {
        System.out.printf("Speech: %.2fs - %.2fs%n", segment.start(), segment.end());
    }
}
```

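
Once segments are available, a common follow-up is to total the detected speech time. The sketch below is illustrative only: `Segment` is a hypothetical stand-in record, not the inference4j `VoiceSegment` type, and it assumes start and end are expressed in seconds, as the `printf` format above suggests.

```java
import java.util.List;

class SpeechStats {
    // Hypothetical stand-in for a detected segment (assumed to be in seconds).
    record Segment(double start, double end) {
        double duration() {
            return end - start;
        }
    }

    // Sums the duration of all segments, in the same unit as start/end.
    static double totalSpeech(List<Segment> segments) {
        double total = 0.0;
        for (Segment s : segments) {
            total += s.duration();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(new Segment(0.5, 2.0), new Segment(3.0, 4.5));
        System.out.printf("Total speech: %.2fs%n", totalSpeech(segments));
    }
}
```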
## Model Details

| Property | Value |
|----------|-------|
| Architecture | Silero VAD (lightweight CNN + LSTM) |
| Task | Voice activity detection |
| Input | 16 kHz mono audio (float32 waveform, 512-sample chunks) |
| Output | Speech probability per chunk |
| Model size | ~2 MB |
| Original source | [snakers4/silero-vad](https://github.com/snakers4/silero-vad) |

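
To make the input and output rows concrete, here is a minimal Java sketch of the surrounding arithmetic: converting 16-bit PCM to float32, splitting it into 512-sample chunks (32 ms each at 16 kHz), and grouping per-chunk probabilities into time spans. The class and method names, the zero-padding of the final chunk, and the plain 0.5 threshold are all assumptions made for illustration; the model call itself is omitted, and this is not how inference4j or Silero VAD necessarily post-process the output.

```java
import java.util.ArrayList;
import java.util.List;

class VadChunking {
    static final int SAMPLE_RATE = 16_000; // 16 kHz mono, per the table
    static final int CHUNK_SIZE = 512;     // samples per chunk, per the table

    // Hypothetical result type: a span of speech in seconds.
    record Span(double startSec, double endSec) {}

    // Converts 16-bit PCM to float32 in [-1, 1) and splits it into
    // 512-sample chunks. Zero-padding the last chunk is an assumption.
    static List<float[]> toChunks(short[] pcm) {
        List<float[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < pcm.length; offset += CHUNK_SIZE) {
            float[] chunk = new float[CHUNK_SIZE]; // zero-initialized
            int n = Math.min(CHUNK_SIZE, pcm.length - offset);
            for (int i = 0; i < n; i++) {
                chunk[i] = pcm[offset + i] / 32768.0f;
            }
            chunks.add(chunk);
        }
        return chunks;
    }

    // Groups consecutive chunks whose probability meets the threshold
    // into spans, using the chunk index to derive timestamps.
    static List<Span> toSpans(float[] probs, float threshold) {
        List<Span> spans = new ArrayList<>();
        double chunkSec = (double) CHUNK_SIZE / SAMPLE_RATE; // 0.032 s
        int start = -1;
        for (int i = 0; i <= probs.length; i++) {
            boolean speech = i < probs.length && probs[i] >= threshold;
            if (speech && start < 0) {
                start = i; // a speech run begins
            } else if (!speech && start >= 0) {
                spans.add(new Span(start * chunkSec, i * chunkSec));
                start = -1; // the run has ended
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println("chunks: " + toChunks(new short[1200]).size());
        float[] probs = {0.1f, 0.9f, 0.8f, 0.2f, 0.7f}; // made-up values
        for (Span s : toSpans(probs, 0.5f)) {
            System.out.printf("Speech: %.3fs - %.3fs%n", s.startSec(), s.endSec());
        }
    }
}
```

A production pipeline would typically add smoothing, such as minimum speech and silence durations, rather than rely on a single threshold.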
## License

This model is licensed under the [MIT License](https://opensource.org/licenses/MIT). The original model is by the [Silero Team](https://github.com/snakers4/silero-vad).