The encoder and decoder weights were copied from the base model nvidia/stt_en_fastconformer_transducer_xlarge.

Best validation WER achieved: 6%

Hardware

1x H100 SXM

Training

Trained on a large dataset for 11 epochs (about 210K global steps).

Inference

To transcribe a single audio file or run batch-mode inference, see inference.ipynb.
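As a rough illustration of what single-file and batch transcription can look like outside the notebook, here is a minimal sketch using NVIDIA NeMo. It is not taken from inference.ipynb. It assumes NeMo is installed and that the checkpoint can be loaded by its Hugging Face repository ID through `ASRModel.from_pretrained`. The helper names `load_model` and `transcribe_files` are hypothetical.

```python
from typing import Any, List, Optional

# Repository ID of this model card. Loading it directly by ID is an assumption;
# the notebook may load the checkpoint differently.
MODEL_ID = "SadeghK/stt_fa_fastconformer_transducer_xlarge"


def load_model(model_id: str = MODEL_ID) -> Any:
    """Load the checkpoint with NeMo (imported lazily, so NeMo is only needed here)."""
    import nemo.collections.asr as nemo_asr

    return nemo_asr.models.ASRModel.from_pretrained(model_name=model_id)


def transcribe_files(
    paths: List[str], model: Optional[Any] = None, batch_size: int = 4
) -> Any:
    """Transcribe one file (a one-element list) or many files in batches."""
    if not paths:
        return []
    if model is None:
        model = load_model()
    # The audio list is passed positionally because the keyword name of this
    # argument has differed between NeMo releases.
    return model.transcribe(paths, batch_size=batch_size)
```

For a single recording, pass a one-element list such as `transcribe_files(["sample.wav"])` (the file name is a placeholder). The shape of the returned value (plain strings or hypothesis objects) depends on the installed NeMo version.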

Adapters

An adapter was trained on the ganji dataset, which is split into 10 hours of training audio, 1 hour of validation, and 1 hour of test. Training ran with max_steps set to 10000 (about 40 epochs).
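The step and epoch counts quoted on this card imply a few derived quantities, sketched below. The arithmetic assumes that one epoch is exactly one pass over the training audio and treats the approximate figures (about 40 epochs, about 210K steps) as exact, so the results are estimates only. The function names are illustrative.

```python
def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average number of optimizer steps in one epoch."""
    return total_steps / epochs


def audio_seconds_per_step(train_hours: float, steps_in_epoch: float) -> float:
    """Average audio duration consumed per step, assuming one full pass per epoch."""
    return train_hours * 3600 / steps_in_epoch


# Adapter run: 10000 steps over about 40 epochs on 10 hours of audio.
adapter_steps = steps_per_epoch(10_000, 40)
adapter_audio = audio_seconds_per_step(10, adapter_steps)

# Base run: about 210K global steps over 11 epochs (dataset size not stated).
base_steps = steps_per_epoch(210_000, 11)

print(f"adapter: {adapter_steps:.0f} steps/epoch, {adapter_audio:.0f} s of audio per step")
print(f"base:    {base_steps:.0f} steps/epoch")
```

Under these assumptions the adapter run works out to 250 steps per epoch and roughly 144 seconds of audio per step, and the base run to roughly 19,000 steps per epoch.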

Hardware used to train the adapter module: 1x RTX 6000 Ada

Metric    Without Adapter    With Adapter
WER       31.20%             28.22%
CER       8.81%              6.21%
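For readers unfamiliar with the two metrics, the sketch below shows the standard edit-distance definition of WER (word level) and CER (character level), plus the relative error reduction implied by the figures above. It is a minimal reference implementation and not necessarily how the reported numbers were computed, since text normalization and the evaluation tooling are not stated here.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: substitutions, insertions and deletions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error removed, e.g. 0.10 for a 10% relative gain."""
    return (before - after) / before


# Figures taken from the adapter results above.
print(f"relative WER reduction: {relative_reduction(31.20, 28.22):.1%}")
print(f"relative CER reduction: {relative_reduction(8.81, 6.21):.1%}")
```

With the reported figures, the adapter removes about 9.6% of the word errors and about 29.5% of the character errors in relative terms.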
