---
title: VisualEars FA32M FastConformer FP16 WebGPU
emoji: 🎙️
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: FA32M Persian ASR FP16 WebGPU demo
---

# VisualEars FA32M FastConformer Persian ASR WebGPU Demo

Browser WebGPU demo for [`Reza2kn/visualears-fastconformer-fa32m-streaming-bpe1024-onnx-fp16`](https://huggingface.co/Reza2kn/visualears-fastconformer-fa32m-streaming-bpe1024-onnx-fp16).

This revision uses the corrected FA32M input contract: NeMo-compatible filterbank (fbank) features (`normalize=NA`, preemphasis, centered STFT, Slaney mel filters) plus the `processed_signal_length` ONNX input. This fixes the earlier symptom in which the app detected speech and finished utterances but decoded mostly empty transcripts.
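
The feature contract above can be sketched offline in NumPy. This is a minimal illustration, not the demo's actual browser code: the function names are hypothetical, and the sample rate, FFT size, window, hop, mel count, preemphasis coefficient, reflect padding, and log floor are all assumed values chosen to resemble common NeMo settings. Check them against the model's own preprocessor config before relying on them.

```python
import numpy as np

# Slaney-style mel scale: linear below 1 kHz, logarithmic above.
F_SP = 200.0 / 3.0
MIN_LOG_HZ = 1000.0
MIN_LOG_MEL = MIN_LOG_HZ / F_SP
LOGSTEP = np.log(6.4) / 27.0


def hz_to_mel_slaney(hz):
    hz = np.asarray(hz, dtype=np.float64)
    safe = np.maximum(hz, MIN_LOG_HZ)  # keeps the log branch well defined
    log_part = MIN_LOG_MEL + np.log(safe / MIN_LOG_HZ) / LOGSTEP
    return np.where(hz >= MIN_LOG_HZ, log_part, hz / F_SP)


def mel_to_hz_slaney(mel):
    mel = np.asarray(mel, dtype=np.float64)
    log_part = MIN_LOG_HZ * np.exp(LOGSTEP * (mel - MIN_LOG_MEL))
    return np.where(mel >= MIN_LOG_MEL, log_part, F_SP * mel)


def slaney_mel_filterbank(sr, n_fft, n_mels, fmin=0.0, fmax=None):
    """Triangular mel filters with Slaney area normalization."""
    fmax = sr / 2.0 if fmax is None else fmax
    fft_freqs = np.linspace(0.0, sr / 2.0, 1 + n_fft // 2)
    mel_pts = np.linspace(float(hz_to_mel_slaney(fmin)),
                          float(hz_to_mel_slaney(fmax)), n_mels + 2)
    edges = mel_to_hz_slaney(mel_pts)
    fdiff = np.diff(edges)
    ramps = edges[:, None] - fft_freqs[None, :]
    weights = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lower = -ramps[i] / fdiff[i]
        upper = ramps[i + 2] / fdiff[i + 1]
        weights[i] = np.maximum(0.0, np.minimum(lower, upper))
    # Slaney normalization: scale each filter by 2 / bandwidth.
    weights *= (2.0 / (edges[2:] - edges[:-2]))[:, None]
    return weights


def nemo_style_fbank(audio, sr=16000, n_fft=512, win_length=400,
                     hop_length=160, n_mels=80, preemph=0.97):
    """Return (features [n_mels, frames], frame count). All defaults are assumptions."""
    x = np.asarray(audio, dtype=np.float64)
    # Preemphasis: y[t] = x[t] - a * x[t-1], first sample kept as-is.
    x = np.concatenate([x[:1], x[1:] - preemph * x[:-1]])
    # Centered STFT: pad n_fft // 2 samples on each side (reflect mode assumed).
    padded = np.pad(x, n_fft // 2, mode="reflect")
    left = (n_fft - win_length) // 2
    window = np.pad(np.hanning(win_length), (left, n_fft - win_length - left))
    n_frames = 1 + (len(padded) - n_fft) // hop_length
    idx = hop_length * np.arange(n_frames)[:, None] + np.arange(n_fft)[None, :]
    power = np.abs(np.fft.rfft(padded[idx] * window, axis=1)) ** 2
    mel = power @ slaney_mel_filterbank(sr, n_fft, n_mels).T
    # Log compression with a small floor; no normalization afterwards.
    feats = np.log(mel + 2.0 ** -24).T.astype(np.float32)
    return feats, n_frames


# One second of a 440 Hz tone at the assumed 16 kHz sample rate.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
features, processed_signal_length = nemo_style_fbank(tone)
print(features.shape, processed_signal_length)  # (80, 101) 101
```

In this sketch the second return value is the number of valid feature frames, which is the quantity a `processed_signal_length` input would carry alongside the feature tensor.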

Parity gate results before the switch: **267/269 exact transcript matches (99.26%)** against the source PyTorch NeMo path (preprocessor + encoder + auxiliary CTC), with an ONNX non-empty transcript rate of **266/269 (98.88%)**, both measured on the short/noisy VisualEars269 set.
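
The two parity metrics above are simple ratios over paired transcripts. The following is a minimal sketch of how such a gate could be computed. `parity_report` is a hypothetical helper, and exact string equality without any text normalization is an assumption about how matches were counted.

```python
def parity_report(reference, candidate):
    """Compare candidate (e.g. ONNX) transcripts against reference (e.g. PyTorch) ones."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    total = len(reference)
    # Exact match: strings compared as-is (assumption: no normalization applied).
    exact = sum(r == c for r, c in zip(reference, candidate))
    # Non-empty: candidate transcript contains at least one non-whitespace character.
    non_empty = sum(bool(c.strip()) for c in candidate)
    return {
        "total": total,
        "exact": exact,
        "exact_rate": exact / total,
        "non_empty": non_empty,
        "non_empty_rate": non_empty / total,
    }


# Toy data for illustration only; these are not VisualEars269 transcripts.
report = parity_report(["a b", "c", "d e"], ["a b", "", "d e"])
print(report["exact"], report["non_empty"])  # 2 2
```

As a check on the reported figures, 267/269 rounds to 99.26% and 266/269 rounds to 98.88%.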