visualears-fastconformer-fa-full-ab-fp16

FP16 reduced-precision NeMo variant of Reza2kn/visualears-fastconformer-fa-full-ab.

Eval: Reza2kn/persian-asr-eval-v0 FLEURS-fa Slice

The comparison uses the same 200 clips as the FP8/NVFP4 checks and compares the FP16 outputs against those of the uncompressed FP base model.

| Variant | WER | CER | Exact transcript match vs base | Rough word-position agreement | Peak VRAM |
|---|---|---|---|---|---|
| FP base | 18.38% | 6.58% | 100.0% | 100.00% | 588 MiB |
| FP16 | 18.42% | 6.60% | 98.0% | 99.92% | 301 MiB |
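
For readers who want to reproduce the WER and CER columns on their own transcripts, here is a minimal sketch of the standard definition: total edit distance divided by total reference length, over words for WER and over characters for CER. The helper names `edit_distance` and `error_rate` are illustrative and are not taken from this repository; the evaluation script behind the table may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="word"):
    """Corpus-level error rate in percent; unit is "word" (WER) or "char" (CER)."""
    errors = total = 0
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else list(ref)
        h = hyp.split() if unit == "word" else list(hyp)
        errors += edit_distance(r, h)
        total += len(r)
    return 100.0 * errors / total
```

Summing errors and reference lengths over the whole corpus, rather than averaging per-clip rates, keeps short clips from dominating the score.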

WER retention vs base: 99.79%. CER retention vs base: 99.69%.
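
The retention figures are not defined in this card. One reading that is consistent with the table is the base error rate divided by the variant error rate, expressed as a percentage. The sketch below assumes that definition; the function name `retention` is hypothetical. Because it starts from the rounded table values, the results can differ from the reported figures in the last decimal place.

```python
def retention(base_error: float, variant_error: float) -> float:
    """Assumed definition: base error rate / variant error rate, in percent."""
    return 100.0 * base_error / variant_error


# Rounded values from the table above.
wer_retention = retention(18.38, 18.42)
cer_retention = retention(6.58, 6.60)
print(f"WER retention: {wer_retention:.2f}%")
print(f"CER retention: {cer_retention:.2f}%")
```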

Files

  • visualears-fastconformer-fa-full-ab-FP16.nemo: FP16 NeMo checkpoint.
  • validation/fp16_vs_base_eval_summary.json: comparison summary.
  • validation/fp16_eval_predictions.jsonl: FP16 predictions for the 200-clip eval slice.
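
The predictions file is in JSON Lines format, so it can be inspected with the standard library alone. The sketch below makes no assumption about the field names in the file; it only loads the records and lists the keys it finds. `read_jsonl` and `field_names` are illustrative helpers, not part of this repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return one dict per non-empty line of a JSON Lines file."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def field_names(records):
    """Collect the sorted set of keys seen across all records."""
    return sorted({key for record in records for key in record})
```

For example, `field_names(read_jsonl("validation/fp16_eval_predictions.jsonl"))` lists the fields available before any further analysis is written against them.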

Usage

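import nemo.collections.asr as nemo_asr

# Restore the FP16 checkpoint, move it to the GPU, and switch to inference mode.
model = nemo_asr.models.ASRModel.restore_from("visualears-fastconformer-fa-full-ab-FP16.nemo").cuda().eval()
# transcribe() takes a list of audio file paths and returns one result per file.
transcripts = model.transcribe(["clip.wav"])
print(transcripts[0])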

Notes

This is a reduced-precision checkpoint for NVIDIA/NeMo experimentation. The exact transcript-parity metric is normalized transcript equality on the fixed 200-clip FLEURS-fa slice, not logit-level equality.
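
As an illustration of what normalized transcript equality and a rough word-position agreement could look like, here is a minimal sketch. The normalization shown (Unicode NFKC, punctuation removal, whitespace collapsing) and the position-wise agreement formula are assumptions for illustration; the exact rules used to produce the table are not specified in this card.

```python
import unicodedata


def normalize(text: str) -> str:
    """Assumed normalization: NFKC, drop punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    kept = [ch for ch in text if not unicodedata.category(ch).startswith("P")]
    return " ".join("".join(kept).split())


def exact_match_rate(base, variant):
    """Percent of clips whose normalized transcripts are identical."""
    pairs = list(zip(base, variant))
    return 100.0 * sum(normalize(b) == normalize(v) for b, v in pairs) / len(pairs)


def word_position_agreement(base, variant):
    """Percent of word positions holding the same word in both transcripts."""
    matched = total = 0
    for b, v in zip(base, variant):
        bw, vw = normalize(b).split(), normalize(v).split()
        total += max(len(bw), len(vw))
        matched += sum(x == y for x, y in zip(bw, vw))
    return 100.0 * matched / total if total else 100.0
```

Both functions compare text only, so two models can score 100% here while still producing different logits.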
