metadata
title: VisualEars FA32M FastConformer FP16 WebGPU
emoji: 🎙️
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: FA32M Persian ASR FP16 WebGPU demo
VisualEars FA32M FastConformer Persian ASR WebGPU Demo
Browser WebGPU demo for Reza2kn/visualears-fastconformer-fa32m-streaming-bpe1024-onnx-fp16.
This revision uses the corrected FA32M contract: NeMo-compatible fbank features (normalize=NA, preemphasis, centered STFT, Slaney mel filters) plus the processed_signal_length ONNX input. This fixes the earlier symptom in which the app detected speech and finished utterances but decoded mostly empty transcripts.
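To make two pieces of that contract concrete, here is a minimal NumPy sketch of the pre-emphasis step and of the frame count that would feed processed_signal_length. It is an illustration under stated assumptions, not the demo's actual code: the helper names are hypothetical, and the 0.97 coefficient, the 160-sample hop, and the int64 length dtype are assumed values that should be checked against the exported model's config. The frame count uses the common centered-STFT convention of 1 + n // hop; the convention NeMo applies may differ by one frame.

```python
import numpy as np


def preemphasis(signal: np.ndarray, coeff: float = 0.97) -> np.ndarray:
    """Apply y[0] = x[0], y[t] = x[t] - coeff * x[t-1].

    coeff=0.97 is an assumed value, not one stated by this README.
    """
    signal = np.asarray(signal, dtype=np.float32)
    if signal.size == 0:
        return signal
    out = np.empty_like(signal)
    out[0] = signal[0]
    out[1:] = signal[1:] - coeff * signal[:-1]
    return out


def centered_frame_count(num_samples: int, hop_length: int = 160) -> int:
    """Frame count of a centered STFT under the 1 + n // hop convention.

    hop_length=160 (10 ms at 16 kHz) is an assumption.
    """
    return 1 + num_samples // hop_length


def build_length_input(num_frames: int) -> np.ndarray:
    """Pack the frame count as a batch-of-one length tensor.

    int64 is assumed here; verify the dtype the ONNX graph declares.
    """
    return np.array([num_frames], dtype=np.int64)
```

With these assumptions, one second of 16 kHz audio yields `centered_frame_count(16000)` frames, and that value is what would be passed alongside the features as processed_signal_length.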
Parity gate before the switch: 267/269 exact transcript matches (99.26%) against the source PyTorch NeMo preprocessor + encoder + auxiliary CTC path, and an ONNX non-empty transcript rate of 266/269 (98.88%), both on the short/noisy VisualEars269 set.
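The two gate figures can be computed from paired transcript lists roughly as follows. This is a sketch only: `parity_report` is a hypothetical helper, and it assumes "exact match" means plain string equality with no text normalization, which the README does not specify.

```python
from typing import Dict, Sequence


def parity_report(reference: Sequence[str], candidate: Sequence[str]) -> Dict[str, float]:
    """Compare candidate (e.g. ONNX) transcripts with reference (e.g. PyTorch) ones.

    Exact match is plain string equality (an assumption); a transcript counts
    as non-empty if it contains any non-whitespace character.
    """
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    total = len(reference)
    if total == 0:
        raise ValueError("no transcripts to compare")
    exact = sum(r == c for r, c in zip(reference, candidate))
    non_empty = sum(bool(c.strip()) for c in candidate)
    return {
        "total": total,
        "exact": exact,
        "exact_pct": round(100.0 * exact / total, 2),
        "non_empty": non_empty,
        "non_empty_pct": round(100.0 * non_empty / total, 2),
    }
```

Under this definition, 267 exact matches out of 269 rounds to 99.26% and 266 non-empty transcripts out of 269 rounds to 98.88%, which matches the figures reported above.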