NB ASR North Sámi Parakeet
How to Use this Model:
To train, fine-tune, or experiment with the model, you will need to install NVIDIA NeMo. We recommend installing it after you have installed the latest version of PyTorch.
pip install -U "nemo_toolkit[asr]"
The model is available for use in the NeMo toolkit [5], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
Automatically instantiate the model
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("NbAiLab/nb-asr-north-saami-parakeet")
Transcribing using Python
First, let's get a sample
wget https://huggingface.co/NbAiLab/nb-asr-north-saami-parakeet/resolve/main/00000_000003.wav
Then simply do:
output = asr_model.transcribe(['00000_000003.wav'])
print(output[0].text)
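To transcribe every file in a folder rather than a single sample, the same call can be wrapped in a small helper. The following is a minimal sketch: `transcribe_folder` is a hypothetical name, not part of NeMo, and it assumes only what is shown above, namely that `transcribe` accepts a list of file paths and returns one result per path with a `.text` attribute.

```python
from pathlib import Path


def transcribe_folder(asr_model, folder, pattern="*.wav"):
    """Transcribe all files in `folder` matching `pattern`.

    Hypothetical helper: `asr_model` is any object whose `transcribe`
    method takes a list of paths and returns results exposing `.text`.
    Returns a dict mapping each file path to its transcript.
    """
    # Sort so the output order is stable across runs.
    paths = sorted(str(p) for p in Path(folder).glob(pattern))
    if not paths:
        return {}
    outputs = asr_model.transcribe(paths)
    # Results are assumed to come back in the same order as the input paths.
    return {path: out.text for path, out in zip(paths, outputs)}
```

With the model loaded as above, `transcribe_folder(asr_model, "recordings")` would return the transcripts keyed by file path, where `recordings` is a placeholder folder name.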
Transcribing with timestamps
To transcribe with timestamps:
output = asr_model.transcribe(['00000_000003.wav'], timestamps=True)
# by default, timestamps are enabled for char, word and segment level
word_timestamps = output[0].timestamp['word'] # word level timestamps for first sample
segment_timestamps = output[0].timestamp['segment'] # segment level timestamps
char_timestamps = output[0].timestamp['char'] # char level timestamps
for stamp in segment_timestamps:
    print(f"{stamp['start']}s - {stamp['end']}s : {stamp['segment']}")
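The segment timestamps can also be turned into subtitles. The sketch below converts a list of segment entries into SubRip (SRT) text. It assumes each entry is a dict with `start` and `end` in seconds and a `segment` string, as in the loop above; `format_srt_time` and `segments_to_srt` are hypothetical helper names, not NeMo functions.

```python
def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segment_timestamps):
    """Build SRT subtitle text from segment-level timestamps.

    Each entry is assumed to carry 'start', 'end' (seconds) and 'segment'.
    """
    blocks = []
    for index, stamp in enumerate(segment_timestamps, start=1):
        start = format_srt_time(stamp["start"])
        end = format_srt_time(stamp["end"])
        # One SRT cue: counter, time range, text, then a blank separator line.
        blocks.append(f"{index}\n{start} --> {end}\n{stamp['segment']}\n")
    return "\n".join(blocks)
```

Writing `segments_to_srt(segment_timestamps)` to a `.srt` file with UTF-8 encoding keeps the Sámi characters intact.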
Transcribing long-form audio
# Switch the Fast Conformer encoder's self-attention to local attention,
# with left and right attention context sizes of 256
asr_model.change_attention_model(self_attention_model="rel_pos_local_attn", att_context_size=[256, 256])
output = asr_model.transcribe(['00000_000003.wav'])
print(output[0].text)
Streaming with Parakeet models
To use Parakeet models in streaming mode, run the following script:
python NeMo/main/examples/asr/asr_chunked_inference/rnnt/speech_to_text_streaming_infer_rnnt.py \
pretrained_name="NbAiLab/nb-asr-north-saami-parakeet" \
model_path=null \
audio_dir="<optional path to folder of audio files>" \
dataset_manifest="<optional path to manifest>" \
output_filename="<optional output filename>" \
right_context_secs=2.0 \
chunk_secs=2 \
left_context_secs=10.0 \
batch_size=32 \
clean_groundtruth_text=False
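The `dataset_manifest` argument expects a NeMo-style manifest: a JSON Lines file with one entry per audio file. The sketch below builds one from a folder of WAV files using only the standard library. `write_manifest` and `wav_duration_seconds` are hypothetical helper names; the empty `text` field is a placeholder for when no reference transcript is available, and whether the streaming script requires that field is an assumption. The `wave` module reads only uncompressed PCM WAV files.

```python
import json
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def write_manifest(audio_dir, manifest_path):
    """Write one JSON line per WAV file in `audio_dir`; return the entry count."""
    entries = []
    for wav_path in sorted(Path(audio_dir).glob("*.wav")):
        entries.append({
            "audio_filepath": str(wav_path.resolve()),
            "duration": round(wav_duration_seconds(wav_path), 3),
            "text": "",  # placeholder: fill in a reference transcript if you have one
        })
    with open(manifest_path, "w", encoding="utf-8") as handle:
        for entry in entries:
            # ensure_ascii=False keeps non-ASCII characters readable in the file.
            handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return len(entries)
```

The resulting file path can then be passed as `dataset_manifest` in place of `audio_dir`.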
License
Use of this model is covered by the CC-BY-4.0 license. By downloading the public release version of the model, you accept the terms and conditions of the CC-BY-4.0 license.
Base model: nvidia/parakeet-tdt-0.6b-v3