camenduru
/

NeMo

Model card Files Files and versions

NeMo / examples /asr /README.md

camenduru's picture

thanks to NVIDIA ❤

7934b29 almost 3 years ago

|

history blame contribute delete

1.15 kB

	# Automatic Speech Recognition

	This directory contains example scripts to train ASR models using various methods such as Connectionist Temporal Classification loss, RNN Transducer Loss.

	Speech pre-training via self supervised learning, voice activity detection and other sub-domains are also included as part of this domain's examples.

	# ASR Model inference execution overview

	The inference scripts in this directory execute in the following order. When preparing your own inference scripts, please follow this order for correct inference.

	```mermaid

	graph TD
	A[Hydra Overrides + Config Dataclass] --> B{Config}
	B --> \|Init\| C[Model]
	B --> \|Init\| D[Trainer]
	C & D --> E[Set trainer]
	E --> \|Optional\| F[Change Transducer Decoding Strategy]
	F --> H[Load Manifest]
	E --> \|Skip\| H
	H --> I["model.transcribe(...)"]
	I --> J[Write output manifest]
	K[Ground Truth Manifest]
	J & K --> \|Optional\| L[Evaluate CER/WER]

	```

	During restoration of the model, you may pass the Trainer to the restore_from / from_pretrained call, or set it after the model has been initialized by using `model.set_trainer(Trainer)`.