Reza2kn/Shenava-Rizeh-0.9-onnx-fp16

ONNX FP16 export of Reza2kn/Shenava-Rizeh-0.9 for VisualEars browser/WebGPU inference.

Runtime Contract

  • Input processed_signal: float16, shape [batch, 80, 2005]
  • Input processed_signal_length: int64, shape [batch]
  • Output logits: float16
  • Output encoded_lengths: int64
  • Streaming attention context: [70, 1]
  • Tokenizer: SentencePiece BPE, 1024 tokens plus CTC blank
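
The contract above can be sketched in a few lines of NumPy. This is a minimal illustration, not the exporter's own code: the helper names `pack_features` and `ctc_greedy_ids` are hypothetical, zero padding of unused frames is an assumption, the blank index of 1024 (directly after the 1024 BPE tokens) is an assumption, and the `[frames, vocab]` layout of one item's logits is also an assumption, since the card does not state the output shape.

```python
import numpy as np

N_MELS = 80       # feature dimension from the contract
N_FRAMES = 2005   # fixed time dimension from the contract
BLANK_ID = 1024   # ASSUMPTION: blank follows the 1024 BPE tokens


def pack_features(features):
    """Pad or truncate one [80, T] feature matrix to [1, 80, 2005] float16."""
    if features.ndim != 2 or features.shape[0] != N_MELS:
        raise ValueError(f"expected [80, T] features, got {features.shape}")
    valid = min(features.shape[1], N_FRAMES)
    # ASSUMPTION: frames beyond the valid length are padded with zeros.
    signal = np.zeros((1, N_MELS, N_FRAMES), dtype=np.float16)
    signal[0, :, :valid] = features[:, :valid].astype(np.float16)
    length = np.array([valid], dtype=np.int64)
    return signal, length


def ctc_greedy_ids(logits, encoded_length, blank_id=BLANK_ID):
    """Greedy CTC over [frames, vocab] logits: argmax, collapse repeats, drop blanks."""
    frame_ids = np.asarray(logits)[:encoded_length].argmax(axis=-1)
    ids, prev = [], None
    for frame_id in frame_ids.tolist():
        if frame_id != prev and frame_id != blank_id:
            ids.append(frame_id)
        prev = frame_id
    return ids
```

The two arrays returned by `pack_features` correspond to the inputs `processed_signal` and `processed_signal_length`. The IDs returned by `ctc_greedy_ids` would still need to be mapped to text through `tokens.json`, whose layout is not described here.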

Validation

Parity smoke test on the VisualEars 269 benchmark slice:

40/40 exact transcript matches, frame argmax agreement of 1.000000, and a non-empty ONNX transcript rate of 0.950.

The comparison runs the source NeMo fp32 encoder plus auxiliary CTC head against the exported ONNX FP16 model, with both fed the same preprocessed features.
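
One plausible way to compute the three reported figures is sketched below. The metric definitions are assumptions, since the card does not spell them out: exact match is plain string equality (so two empty transcripts count as a match), frame argmax agreement is the fraction of frames where both models pick the same class, and the non-empty rate is the share of ONNX transcripts that contain any text. The function name `parity_report` is hypothetical.

```python
def parity_report(ref_frames, onnx_frames, ref_texts, onnx_texts):
    """Compare per-utterance frame argmax IDs and decoded transcripts."""
    n = len(onnx_texts)
    if not (len(ref_frames) == len(onnx_frames) == len(ref_texts) == n):
        raise ValueError("all inputs must cover the same utterances")

    # ASSUMPTION: exact match is string equality, including empty == empty.
    exact = sum(r == o for r, o in zip(ref_texts, onnx_texts))

    agree = total = 0
    for ref_ids, onnx_ids in zip(ref_frames, onnx_frames):
        if len(ref_ids) != len(onnx_ids):
            raise ValueError("frame counts differ for an utterance")
        agree += sum(a == b for a, b in zip(ref_ids, onnx_ids))
        total += len(ref_ids)

    non_empty = sum(bool(text.strip()) for text in onnx_texts)
    return {
        "exact_matches": exact,
        "utterances": n,
        "frame_argmax_agreement": agree / total if total else 0.0,
        "non_empty_onnx_rate": non_empty / n if n else 0.0,
    }
```

Under these definitions, a perfect exact-match score together with a non-empty rate below 1 is possible when both models return an empty transcript for the same utterance.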

Files

  • shenava_rizeh_0_9_ctc_fixed2005_len_att70_1_fp16_full_io.onnx - 6,171,937 bytes - sha256 114ba96038eb45d92f449fcc5fdbf67e4c09aa96cbc5b4a4fddb77a52e09df9e
  • shenava_rizeh_0_9_ctc_fixed2005_len_att70_1_fp16_full_io.onnx.data - 52,731,906 bytes - sha256 3261ed46ccef096a213e29c10f4f7a7a629c2a5895412003ada4db0c1fa9940f
  • shenava_rizeh_0_9_ctc_fixed2005_len_att70_1_fp16_full_io_embedded.onnx - 58,875,455 bytes - sha256 7490c7d364f869912c57c1d57f8ee8688fa284af1305ff8d333ac69c65a74c25
  • tokens.json - 15,115 bytes - sha256 895dc224c2b726fa988bdb9f4bd61a5334d594d9f37174399a2f97e68cad91ff
  • preprocessor.json - 1,795 bytes - sha256 b08dcbb46aa54e4414c55306571530b05730b9df0472f7a13a6bcce19cdfaa61
  • mel_filters_slaney_80x257.json - 91,115 bytes - sha256 327ad485dfcf1cbd9405ea6512aa0a788990a5c98ef14ba8585896cdc9749866
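
The sizes and digests listed above can be checked after download with the standard library alone. The sketch below assumes the files sit in the current working directory; the helper names `sha256_of` and `verify_file` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_size, expected_sha256):
    """Return True only if both the byte count and the digest match."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return sha256_of(path) == expected_sha256.lower()
```

For example, `verify_file("tokens.json", 15115, "895dc224c2b726fa988bdb9f4bd61a5334d594d9f37174399a2f97e68cad91ff")` should return `True` for an intact copy of the tokenizer file.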

Generated at 2026-06-23T02:04:27Z.
