---
license: mit
pipeline_tag: voice-activity-detection
tags:
  - onnx
  - audio
  - silero-vad
  - transformers.js
library_name: transformers.js
---
# BricksDisplay/silero-vad-6.2

Transformers.js-compatible Silero VAD v6.2 packaged for ONNX inference.
## Files

- `onnx/model.onnx` – fp32
- `onnx/model_fp16.onnx` – fp16
- `onnx/model_int8.onnx` – dynamically quantized int8
- `onnx/model_uint8.onnx` – dynamically quantized uint8
- `onnx/model_quantized.onnx` – alias of the uint8 model for Transformers.js `dtype: "q8"`
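As an illustration of how the precision variants map to files, the sketch below defines a hypothetical helper (`onnxFileFor` is not part of this package) that resolves a requested dtype to the paths listed above:

```javascript
// Hypothetical helper: resolve a requested precision to the packaged ONNX file.
// The file paths are taken from the Files list; the function itself is illustrative.
function onnxFileFor(dtype) {
  const files = {
    fp32: 'onnx/model.onnx',
    fp16: 'onnx/model_fp16.onnx',
    int8: 'onnx/model_int8.onnx',
    uint8: 'onnx/model_uint8.onnx',
    q8: 'onnx/model_quantized.onnx', // alias used by Transformers.js dtype: "q8"
  };
  if (!(dtype in files)) throw new Error(`unknown dtype: ${dtype}`);
  return files[dtype];
}

console.log(onnxFileFor('q8')); // prints "onnx/model_quantized.onnx"
```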
## Source

- Upstream assets: `snakers4/silero-vad`, tag `v6.2`
- Reference packaging layout: `BricksDisplay/silero-vad`
## Transformers.js

```js
import { AutoModel } from '@huggingface/transformers';

// Pass { dtype: 'q8' } to load the quantized onnx/model_quantized.onnx instead.
const model = await AutoModel.from_pretrained('BricksDisplay/silero-vad-6.2');
```
Inputs expected by the ONNX session:

- `input`: float32 PCM chunk, shape `[1, num_samples]`
- `state`: float32 recurrent state, shape `[2, 1, 128]`
- `sr`: int64 scalar sample rate (`8000` or `16000`)
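A minimal sketch of constructing these three inputs in plain JavaScript, assuming 16 kHz audio and a 512-sample chunk (a chunk size commonly used with Silero VAD at 16 kHz; adjust `numSamples` as needed):

```javascript
// Build buffers matching the shapes in the input spec above.
const numSamples = 512;
const input = new Float32Array(numSamples);   // shape [1, 512]; zero-filled placeholder audio
const state = new Float32Array(2 * 1 * 128);  // shape [2, 1, 128]; zeros at stream start
const sr = BigInt64Array.from([16000n]);      // int64 scalar sample rate

// With onnxruntime-node, these would be wrapped as tensors before session.run, e.g.
// new ort.Tensor('float32', input, [1, numSamples]),
// new ort.Tensor('float32', state, [2, 1, 128]),
// new ort.Tensor('int64', sr, []).
console.log(input.length, state.length, sr[0]);
```

Between consecutive chunks of the same stream, the `state` output of one call should be fed back as the `state` input of the next.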