---
datasets:
- DavronSherbaev/uzbekvoice-filtered
- mozilla-foundation/common_voice_17_0
language:
- uz
metrics:
- wer
base_model:
- nvidia/stt_en_fastconformer_transducer_large
pipeline_tag: automatic-speech-recognition
---

| Dataset | WER |
| --- | --- |
| Uzbekvoice | 6.23 |

```
pip install "nemo_toolkit[asr]"
wget -O fast_conf_uz_1024_tokens.nemo "https://huggingface.co/Saidakmal/ASR_nvidia_fastconformer/resolve/main/fast_conf_uz_1024_tokens.nemo?download=true"
```

```
import nemo.collections.asr as nemo_asr

# Load the checkpoint saved by the wget command
asr_model = nemo_asr.models.EncDecHybridRNNTCTCBPEModel.restore_from(
    "fast_conf_uz_1024_tokens.nemo"
)

# Replace "path_audio" with the path to your audio file
output = asr_model.transcribe(["path_audio"])

# Depending on the NeMo version, each result is a plain string
# or a Hypothesis object with a .text field
print(output[0] if isinstance(output[0], str) else output[0].text)
```
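
To compare a transcript against a reference text, word error rate (the `wer` metric listed in the metadata) can be computed as word-level edit distance divided by the number of reference words. The snippet below is a minimal sketch of that definition in plain Python, not the evaluation script used for this model; `word_error_rate` and the example strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example strings, not taken from the evaluation set
reference = "men bugun maktabga bordim"
hypothesis = "men bugun maktabga bordi"
print(f"WER: {100 * word_error_rate(reference, hypothesis):.2f}%")  # prints WER: 25.00%
```

In practice the hypothesis would be the text returned by `asr_model.transcribe`, and both strings are usually normalized the same way (case, punctuation) before scoring.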