Cannot use the model on languages other than English

Mar 31

Hi NeMo team,
I wanted to transcribe French with your model because it lets us set the source language (by the way, the Parakeet TDT 0.6B model is accurate and fast, but its transcription sometimes switches to English).
Unfortunately, I couldn't get canary-180m-flash to work in any language other than English.
The same code works with canary-1b-flash.
Here is the command I used:
python .\speech_to_text_aed_chunked_infer.py pretrained_name="nvidia/canary-180m-flash" dataset_manifest="test_manifest.json" chunk_len_in_secs=40.0 batch_size=1 decoding.beam.beam_size=1 timestamps=False
The manifest:
{"audio_filepath": "output_mono_16k.wav", "duration": 10000.0, "taskname": "asr", "source_lang": "fr", "target_lang": "fr"}
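To help with reproducing this, here is a minimal sketch of how a manifest line like the one above could be generated with the Python standard library only. The helper names (`wav_duration_secs`, `make_entry`, `write_manifest`) are hypothetical and not part of NeMo. It assumes the audio is an uncompressed PCM WAV file, and it fills `duration` from the file itself rather than using a placeholder value. Whether the placeholder duration has any effect on the result is not known.

```python
import json
import wave


def wav_duration_secs(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def make_entry(audio_filepath, duration, lang):
    """Build one manifest entry for same-language ASR (source == target)."""
    return {
        "audio_filepath": audio_filepath,
        "duration": duration,
        "taskname": "asr",
        "source_lang": lang,
        "target_lang": lang,
    }


def write_manifest(manifest_path, entries):
    """Write entries as JSON Lines: one JSON object per line."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

For the file in this post, that would be `write_manifest("test_manifest.json", [make_entry("output_mono_16k.wav", wav_duration_secs("output_mono_16k.wav"), "fr")])`.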
Do you have any insight on why French transcription works with canary‑1b‑flash but not with canary‑180m‑flash using the same settings and script?
Thank you for your answer.
