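---
datasets:
- DavronSherbaev/uzbekvoice-filtered
- mozilla-foundation/common_voice_17_0
language:
- uz
metrics:
- wer
base_model:
- nvidia/stt_en_fastconformer_transducer_large
pipeline_tag: automatic-speech-recognition
---

WER on Uzbekvoice: 6.23

```
# Quote the extra so shells such as zsh do not expand the brackets
pip install "nemo_toolkit[asr]"
# -O keeps "?download=true" out of the saved file name
wget -O fast_conf_uz_1024_tokens.nemo "https://huggingface.co/Saidakmal/ASR_nvidia_fastconformer/resolve/main/fast_conf_uz_1024_tokens.nemo?download=true"
```

```
import nemo.collections.asr as nemo_asr

# Load the downloaded checkpoint from the current directory
asr_model = nemo_asr.models.EncDecHybridRNNTCTCBPEModel.restore_from(
    "fast_conf_uz_1024_tokens.nemo"
)

# Replace "path_audio" with the path to your audio file
output = asr_model.transcribe(["path_audio"])

# Depending on the NeMo version, transcribe() returns plain strings
# or hypothesis objects that carry the text in a .text attribute
print(output[0] if isinstance(output[0], str) else output[0].text)
```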
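
For readers who want to check transcripts against their own references, the snippet below is a minimal sketch of word error rate (WER), assuming the usual definition: word-level edit distance divided by the number of reference words. It is not the evaluation script behind the figure reported above, and the function name `word_error_rate` and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (an assumption, not part of NeMo or this model card).
    Words are split on whitespace with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis
print(word_error_rate("bugun havo juda yaxshi", "bugun havo yaxshi"))  # 0.25
```

Multiply the result by 100 to express it as a percentage. Scores are sensitive to casing and punctuation, so apply the same normalization to the reference and to the model output before comparing.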