visualears-fastconformer-fa-full-ab-fp16

FP16 reduced-precision NeMo variant of Reza2kn/visualears-fastconformer-fa-full-ab.

Eval — `Reza2kn/persian-asr-eval-v0` FLEURS-fa Slice

The comparison uses the same 200 clips as the FP8/NVFP4 checks and measures each variant against the outputs of the uncompressed FP base model.

Variant	WER	CER	Exact transcript match vs base	Rough word-position agreement	Peak VRAM
FP base	18.38%	6.58%	100.0%	100.00%	588 MiB
FP16	18.42%	6.60%	98.0%	99.92%	301 MiB

WER retention vs base: 99.79%. CER retention vs base: 99.69%.
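The card does not define "retention". One plausible reading, assumed here, is the ratio of the base error rate to the variant error rate. Applied to the rounded table values it gives about 99.78% and 99.70%, within 0.01 percentage points of the figures above; the small gap is presumably due to rounding in the table. The function name `retention_pct` is hypothetical.

```python
def retention_pct(base_error: float, variant_error: float) -> float:
    """Assumed definition: base error rate as a percentage of the variant's.

    100% means the variant is exactly as accurate as the base; lower values
    mean the variant makes proportionally more errors.
    """
    if variant_error <= 0:
        raise ValueError("variant_error must be positive")
    return 100.0 * base_error / variant_error


# Rounded values taken from the table above (percent).
wer_retention = retention_pct(18.38, 18.42)
cer_retention = retention_pct(6.58, 6.60)
print(f"WER retention: {wer_retention:.2f}%")
print(f"CER retention: {cer_retention:.2f}%")
```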

Files

visualears-fastconformer-fa-full-ab-FP16.nemo: FP16 NeMo checkpoint.
validation/fp16_vs_base_eval_summary.json: comparison summary.
validation/fp16_eval_predictions.jsonl: FP16 predictions for the 200-clip eval slice.

Usage

import nemo.collections.asr as nemo_asr

model = nemo_asr.models.ASRModel.restore_from("visualears-fastconformer-fa-full-ab-FP16.nemo").cuda().eval()
transcripts = model.transcribe(["clip.wav"])
print(transcripts[0])
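The return shape of `transcribe` has varied across NeMo releases: depending on the version and model type, it may be a list of strings, a list of hypothesis objects carrying a `.text` attribute, or a `(best, all)` tuple for transducer and hybrid models. The helper below is a minimal sketch for getting plain strings in any of those cases. `extract_texts` is a hypothetical name, not part of NeMo, and the shapes it handles are assumptions to check against the installed version.

```python
def extract_texts(output):
    """Return a list of plain transcript strings from a transcribe() result."""
    # Assumed: some versions return (best_hypotheses, all_hypotheses)
    # for transducer/hybrid models; keep only the best hypotheses.
    if isinstance(output, tuple):
        output = output[0]

    texts = []
    for item in output:
        if isinstance(item, str):
            texts.append(item)
        else:
            # Assumed: a hypothesis-like object exposing `.text`.
            texts.append(item.text)
    return texts
```

With this in place, `print(extract_texts(transcripts)[0])` prints the first transcript as a string regardless of which shape was returned.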

Notes

This is a reduced-precision checkpoint for NVIDIA/NeMo experimentation. The exact transcript-parity metric is normalized transcript equality on the fixed 200-clip FLEURS-fa slice, not logit-level equality.
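As an illustration of what normalized transcript equality can look like, here is a minimal sketch. The normalization steps (Unicode NFKC, folding Arabic yeh and kaf to their Persian forms, collapsing whitespace) are assumptions; the card does not say which normalizer was actually used, and the names `normalize` and `exact_match_pct` are hypothetical.

```python
import unicodedata

# Assumed folding: Arabic yeh (U+064A) and kaf (U+0643) to Persian forms.
_FOLD = str.maketrans({"\u064a": "\u06cc", "\u0643": "\u06a9"})


def normalize(text: str) -> str:
    """Apply NFKC, fold Arabic letter variants, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).translate(_FOLD)
    return " ".join(text.split())


def exact_match_pct(base: list[str], variant: list[str]) -> float:
    """Percentage of clips whose normalized transcripts are identical."""
    if len(base) != len(variant) or not base:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(normalize(b) == normalize(v) for b, v in zip(base, variant))
    return 100.0 * matches / len(base)
```

Under this definition, 196 matching clips out of 200 would give 98.0%, even if the underlying logits differ on every clip.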

Model tree for Reza2kn/visualears-fastconformer-fa-full-ab-fp16

Base model: nvidia/stt_fa_fastconformer_hybrid_large
Finetuned: Reza2kn/visualears-fastconformer-fa-full-ab
Quantized: this model (12 quantized variants listed)