NB ASR North Sámi Parakeet

How to Use this Model:

To train, fine-tune, or experiment with the model, you need to install NVIDIA NeMo. We recommend installing it after you have installed the latest version of PyTorch.

pip install -U "nemo_toolkit[asr]"

The model is available for use in the NeMo toolkit [5], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.

Automatically instantiate the model

import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("NbAiLab/nb-asr-north-saami-parakeet")

Transcribing using Python

First, let's get a sample

wget https://huggingface.co/NbAiLab/nb-asr-north-saami-parakeet/resolve/main/00000_000003.wav

Then simply do:

output = asr_model.transcribe(['00000_000003.wav'])
print(output[0].text)
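
When several recordings need transcribing, the same call can be wrapped in a small helper. The sketch below relies only on what the snippet above shows: `transcribe` accepts a list of paths and returns one result per path, each with a `.text` attribute. The function name `transcribe_files` is hypothetical and is not part of NeMo, and the assumption that results come back in input order is ours.

```python
from typing import Dict, List


def transcribe_files(asr_model, paths: List[str]) -> Dict[str, str]:
    """Map each audio path to its transcript.

    Assumes asr_model.transcribe(paths) returns one result per input,
    in the same order, each exposing a .text attribute (as in the
    example above). Illustrative helper, not a NeMo API.
    """
    if not paths:
        # Avoid calling the model with an empty batch.
        return {}
    outputs = asr_model.transcribe(paths)
    return {path: result.text for path, result in zip(paths, outputs)}
```

With the model loaded as above, `transcribe_files(asr_model, ['00000_000003.wav'])` would return a one-entry dictionary keyed by the file name.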

Transcribing with timestamps

To transcribe with timestamps:

output = asr_model.transcribe(['00000_000003.wav'], timestamps=True)
# by default, timestamps are enabled for char, word and segment level
word_timestamps = output[0].timestamp['word'] # word level timestamps for first sample
segment_timestamps = output[0].timestamp['segment'] # segment level timestamps
char_timestamps = output[0].timestamp['char'] # char level timestamps

for stamp in segment_timestamps:
    print(f"{stamp['start']}s - {stamp['end']}s : {stamp['segment']}")
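
Segment timestamps map naturally onto subtitle files. The following is a minimal sketch that turns a list of segment dictionaries into SubRip (SRT) text. It assumes each entry has the `start`, `end`, and `segment` keys used in the loop above, and that `start` and `end` are numbers in seconds. The helper names `format_srt_time` and `segments_to_srt` are illustrative and not part of NeMo.

```python
from typing import Dict, List


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Build SRT text from dicts with 'start', 'end' and 'segment' keys."""
    blocks = []
    for index, stamp in enumerate(segments, start=1):
        start = format_srt_time(stamp["start"])
        end = format_srt_time(stamp["end"])
        # Each cue: running number, time range, then the text.
        blocks.append(f"{index}\n{start} --> {end}\n{stamp['segment']}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Calling `segments_to_srt(segment_timestamps)` and writing the result to a `.srt` file gives subtitles that most video players can load.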

Transcribing long-form audio

# Switch the Fast Conformer encoder's self-attention to local attention,
# with left and right attention context sizes of 256
asr_model.change_attention_model(self_attention_model="rel_pos_local_attn", att_context_size=[256, 256])

output = asr_model.transcribe(['00000_000003.wav'])

print(output[0].text)
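
This model card does not say at what audio length the switch to local attention becomes worthwhile. As a rough sketch, the helper below measures a WAV file's duration with Python's standard `wave` module so that the decision can be made per file. The function names and the threshold are hypothetical; pick a threshold that suits your hardware.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def needs_local_attention(path: str, threshold_secs: float) -> bool:
    """True when the file is longer than the caller-chosen threshold.

    The threshold is an assumption supplied by the caller; it is not
    a value documented for this model.
    """
    return wav_duration_seconds(path) > threshold_secs
```

One way to use it is to call `asr_model.change_attention_model(...)` as shown above only when `needs_local_attention(path, threshold_secs)` returns `True`.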

Streaming with Parakeet models

To use Parakeet models in streaming mode, run the following script:

python NeMo/main/examples/asr/asr_chunked_inference/rnnt/speech_to_text_streaming_infer_rnnt.py \
    pretrained_name="NbAiLab/nb-asr-north-saami-parakeet" \
    model_path=null \
    audio_dir="<optional path to folder of audio files>" \
    dataset_manifest="<optional path to manifest>" \
    output_filename="<optional output filename>" \
    right_context_secs=2.0 \
    chunk_secs=2 \
    left_context_secs=10.0 \
    batch_size=32 \
    clean_groundtruth_text=False
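
The `dataset_manifest` option expects a manifest file. NeMo manifests are typically JSON Lines files with `audio_filepath`, `duration`, and `text` keys. The sketch below builds one from a folder of WAV files using only the standard library. The function name `build_manifest` is hypothetical, and the `text` field is left empty because no reference transcripts are known here; whether the streaming script accepts empty references has not been verified.

```python
import json
import wave
from pathlib import Path


def build_manifest(audio_dir: str, manifest_path: str) -> int:
    """Write one JSON line per .wav file in audio_dir; return the count.

    Uses the manifest keys audio_filepath, duration and text.
    The text field is left empty (assumption: no reference transcript).
    """
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as manifest:
        for wav_path in sorted(Path(audio_dir).glob("*.wav")):
            # Duration in seconds = number of frames / sample rate.
            with wave.open(str(wav_path), "rb") as wav_file:
                duration = wav_file.getnframes() / wav_file.getframerate()
            entry = {
                "audio_filepath": str(wav_path.resolve()),
                "duration": round(duration, 3),
                "text": "",
            }
            manifest.write(json.dumps(entry, ensure_ascii=False) + "\n")
            count += 1
    return count
```

The resulting file path can then be passed as `dataset_manifest` in the command above, in place of `audio_dir`.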

License

Use of this model is covered by the CC-BY-4.0 license. By downloading the public release version of the model, you accept the terms and conditions of the CC-BY-4.0 license.
