How to use nvidia/canary-1b-flash with NeMo:
    import nemo.collections.asr as nemo_asr

    asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/canary-1b-flash")
    transcriptions = asr_model.transcribe(["file.wav"])
update compute_timestamps flag
README.md
CHANGED
    @@ -366,7 +366,7 @@ python scripts/speech_to_text_aed_chunked_infer.py \
     chunk_len_in_secs=40.0 \
     batch_size=1 \
     decoding.beam.beam_size=1 \
    -
    +timestamps=False
     ```
     
     **Note** that for longform inference with timestamps, it is recommended to use `chunk_len_in_secs` of 10 seconds.
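The interaction between the `timestamps` flag and the note about `chunk_len_in_secs` can be sketched in Python. This is a minimal illustration, not part of the commit. The helper names `chunked_infer_overrides` and `render_command` are hypothetical. Only the four overrides visible in the hunk are assembled; any other arguments the script needs fall outside the hunk and are deliberately left out. The `key=value` form is assumed to be passed through to the script unchanged.

```python
import shlex

# Script path as it appears in the hunk header.
SCRIPT = "scripts/speech_to_text_aed_chunked_infer.py"


def chunked_infer_overrides(timestamps: bool) -> list[str]:
    """Return the overrides shown in the hunk (hypothetical helper).

    Follows the README note: use 10-second chunks when timestamps are
    requested, otherwise keep the 40-second value from the example.
    """
    chunk_len = 10.0 if timestamps else 40.0
    return [
        f"chunk_len_in_secs={chunk_len}",
        "batch_size=1",
        "decoding.beam.beam_size=1",
        f"timestamps={timestamps}",
    ]


def render_command(timestamps: bool) -> str:
    """Render a shell-safe command line (hypothetical helper).

    Arguments outside the hunk (model, input, output) are not included.
    """
    parts = ["python", SCRIPT, *chunked_infer_overrides(timestamps)]
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    print(render_command(timestamps=True))
```

Deriving the chunk length from the flag keeps the two settings from drifting apart, which is the situation the note warns about.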