visualears-fastconformer-fa-full-ab-fp16
FP16 reduced-precision NeMo variant of Reza2kn/visualears-fastconformer-fa-full-ab.
Eval: Reza2kn/persian-asr-eval-v0 FLEURS-fa Slice
The comparison uses the same 200 clips as the FP8/NVFP4 checks. The two agreement columns measure each variant's transcripts against the outputs of the uncompressed FP base model.
| Variant | WER | CER | Exact transcript match vs base | Rough word-position agreement | Peak VRAM |
|---|---|---|---|---|---|
| FP base | 18.38% | 6.58% | 100.0% | 100.00% | 588 MiB |
| FP16 | 18.42% | 6.60% | 98.0% | 99.92% | 301 MiB |
WER retention vs base: 99.79%. CER retention vs base: 99.69%.
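The retention figures can be roughly reproduced from the table. The sketch below assumes retention means the base error rate divided by the variant's error rate; the card does not state the formula, so `retention` is a hypothetical helper. With the rounded table values it gives about 99.78% and 99.70%, which differ from the reported numbers in the last digit, possibly because those were computed from unrounded error rates.

```python
def retention(base_error: float, variant_error: float) -> float:
    """Assumed definition: base error rate as a percentage of the variant's error rate."""
    return 100.0 * base_error / variant_error


# Rounded values copied from the table above.
wer_retention = retention(18.38, 18.42)
cer_retention = retention(6.58, 6.60)

# Peak VRAM of the FP16 variant relative to the FP base (301 MiB vs 588 MiB).
vram_ratio = 301 / 588

print(f"WER retention: {wer_retention:.2f}%")
print(f"CER retention: {cer_retention:.2f}%")
print(f"FP16 peak VRAM relative to base: {vram_ratio:.1%}")
```

By the same table, the FP16 variant peaks at a little over half of the base model's VRAM.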
Files
- visualears-fastconformer-fa-full-ab-FP16.nemo: FP16 NeMo checkpoint.
- validation/fp16_vs_base_eval_summary.json: comparison summary.
- validation/fp16_eval_predictions.jsonl: FP16 predictions for the 200-clip eval slice.
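To inspect the predictions file, a small JSON Lines reader is enough. The field names inside the file are not documented here, so this sketch makes no assumption about the schema and only reports the record count and the keys it finds. `parse_jsonl` and `summarize` are hypothetical helper names.

```python
import json
from pathlib import Path


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines strings, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def summarize(records):
    """Return the record count and the sorted set of keys seen across records."""
    keys = set()
    for record in records:
        keys.update(record)
    return len(records), sorted(keys)


if __name__ == "__main__":
    path = Path("validation/fp16_eval_predictions.jsonl")
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            count, keys = summarize(parse_jsonl(handle))
        print(f"{count} records with keys: {keys}")
```

If the file matches its description, the count should be 200, one record per clip in the eval slice.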
Usage
import nemo.collections.asr as nemo_asr
model = nemo_asr.models.ASRModel.restore_from("visualears-fastconformer-fa-full-ab-FP16.nemo").cuda().eval()
transcripts = model.transcribe(["clip.wav"])
print(transcripts[0])
Notes
This is a reduced-precision checkpoint for NVIDIA/NeMo experimentation. The "Exact transcript match vs base" column reports equality of normalized transcripts on the fixed 200-clip FLEURS-fa slice. It does not measure logit-level equality.
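The two agreement columns in the table could be computed along the following lines. This is a minimal sketch, not the evaluation script behind the reported numbers: the normalization rules (NFKC, mapping Arabic yeh and kaf to their Persian forms, collapsing whitespace) and the definition of word-position agreement (matching words at the same index, divided by the longer transcript's length) are assumptions.

```python
import unicodedata

# Assumed normalization: map Arabic yeh/kaf to the Persian forms.
_CHAR_MAP = str.maketrans({"\u064a": "\u06cc", "\u0643": "\u06a9"})


def normalize(text: str) -> str:
    """NFKC-normalize, unify yeh/kaf variants, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).translate(_CHAR_MAP)
    return " ".join(text.split())


def exact_match_rate(base, variant) -> float:
    """Fraction of clips whose normalized transcripts are identical."""
    if len(base) != len(variant):
        raise ValueError("transcript lists must have the same length")
    if not base:
        return 1.0
    hits = sum(normalize(b) == normalize(v) for b, v in zip(base, variant))
    return hits / len(base)


def word_position_agreement(base: str, variant: str) -> float:
    """Share of word positions that agree, relative to the longer transcript."""
    b, v = normalize(base).split(), normalize(variant).split()
    longest = max(len(b), len(v))
    if longest == 0:
        return 1.0
    return sum(x == y for x, y in zip(b, v)) / longest


# Toy transcripts: the second base string uses Arabic yeh/kaf on purpose.
base_outputs = ["سلام دنیا", "این " + "\u064a\u0643" + " آزمون است"]
fp16_outputs = ["سلام  دنیا", "این " + "\u06cc\u06a9" + " آزمایش است"]

print(exact_match_rate(base_outputs, fp16_outputs))
print(word_position_agreement(base_outputs[1], fp16_outputs[1]))
```

Because both measures compare decoded text, two checkpoints can score 100% here while still producing slightly different logits.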
Model tree for Reza2kn/visualears-fastconformer-fa-full-ab-fp16
Base model
nvidia/stt_fa_fastconformer_hybrid_large