| Checkpoints | |
| =========== | |
| There are two main ways to load pretrained checkpoints in NeMo, as introduced in `loading ASR checkpoints <../results.html#checkpoints>`__. | |
| In speaker diarization, the diarizer loads the checkpoints specified in the config file, as the following sections show. | |
| Loading Local Checkpoints | |
| --------------------------- | |
| Load VAD models | |
| .. code-block:: bash | |
| pretrained_vad_model='/path/to/vad_multilingual_marblenet.nemo' # local .nemo or pretrained vad model name | |
| ... | |
| # pass with hydra config | |
| config.diarizer.vad.model_path=pretrained_vad_model | |
| Load speaker embedding models | |
| .. code-block:: bash | |
| pretrained_speaker_model='/path/to/titanet-l.nemo' # local .nemo or pretrained speaker embedding model name | |
| ... | |
| # pass with hydra config | |
| config.diarizer.speaker_embeddings.model_path=pretrained_speaker_model | |
| Load neural diarizer models | |
| .. code-block:: bash | |
| pretrained_neural_diarizer_model='/path/to/diarizer_msdd_telephonic.nemo' # local .nemo or pretrained neural diarizer model name | |
| ... | |
| # pass with hydra config | |
| config.diarizer.msdd_model.model_path=pretrained_neural_diarizer_model | |
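The comments in the snippets above note that each ``model_path`` value can be either a local ``.nemo`` file or a pretrained model name. The following is a minimal sketch of how a wrapper script might tell the two apart before filling in the config. The helper ``classify_model_ref`` is hypothetical and not part of NeMo, and it assumes that local checkpoints always end in ``.nemo``.

```python
from pathlib import PurePath


def classify_model_ref(value: str) -> str:
    """Return 'local' for a .nemo file path, 'pretrained' for a bare model name."""
    # Assumption: local checkpoints always carry the .nemo suffix,
    # while pretrained model names (e.g. 'titanet_large') have no suffix.
    if PurePath(value).suffix == ".nemo":
        return "local"
    return "pretrained"


# Hypothetical usage with the values shown in this section.
for ref in ("/path/to/vad_multilingual_marblenet.nemo", "titanet_large"):
    print(ref, "->", classify_model_ref(ref))
```

A check like this only inspects the string. It does not verify that the file exists or that the name is a valid pretrained model.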
| NeMo automatically saves checkpoints of a model you are training in the ``.nemo`` format. | |
| You can also manually save your models at any point using :code:`model.save_to(<checkpoint_path>.nemo)`. | |
| Inference | |
| --------- | |
| .. note:: | |
| For a detailed walkthrough, refer to ``<NeMo_git_root>/tutorials/speaker_tasks/Speaker_Diarization_Inference.ipynb``. | |
| Check out :doc:`Datasets <./datasets>` for preparing audio files and optional label files. | |
| Run and evaluate the speaker diarizer with the command below: | |
| .. code-block:: bash | |
| # See the instructions inside the script and pass the arguments you need. | |
| python <NeMo_git_root>/examples/speaker_tasks/diarization/offline_diarization.py | |
| NGC Pretrained Checkpoints | |
| -------------------------- | |
| The ASR collection has checkpoints of several models trained on various datasets for a variety of tasks. | |
| These checkpoints are available from the NGC `NeMo Automatic Speech Recognition collection <https://ngc.nvidia.com/catalog/models/nvidia:nemospeechmodels>`_. | |
| The model cards on NGC contain more information about each of the checkpoints available. | |
| In general, you can load a model by its name, using the following format: | |
| .. code-block:: python | |
| pretrained_vad_model='vad_multilingual_marblenet' | |
| pretrained_speaker_model='titanet_large' | |
| pretrained_neural_diarizer_model='diar_msdd_telephonic' | |
| ... | |
| config.diarizer.vad.model_path=pretrained_vad_model \ | |
| config.diarizer.speaker_embeddings.model_path=pretrained_speaker_model \ | |
| config.diarizer.msdd_model.model_path=pretrained_neural_diarizer_model | |
| where each model name is the value listed under "Model Name" in the tables below. | |
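As an illustration of the format above, the three assignments could be assembled programmatically. This is only a sketch: ``build_model_overrides`` is a hypothetical helper, and the ``config.diarizer`` prefix simply mirrors the keys as written in the snippet above. The prefix your script actually expects depends on how its Hydra config is rooted.

```python
def build_model_overrides(vad, speaker, msdd, prefix="config.diarizer"):
    """Build 'key=value' strings for the three diarizer model paths."""
    # The keys mirror the snippet above; the prefix is an assumption.
    keys = {
        "vad.model_path": vad,
        "speaker_embeddings.model_path": speaker,
        "msdd_model.model_path": msdd,
    }
    return [f"{prefix}.{key}={value}" for key, value in keys.items()]


overrides = build_model_overrides(
    vad="vad_multilingual_marblenet",
    speaker="titanet_large",
    msdd="diar_msdd_telephonic",
)
# Join with shell line continuations, matching the layout used above.
print(" \\\n".join(overrides))
```

Each value can also be a local ``.nemo`` path, since the same ``model_path`` keys accept either form.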
| Models for Speaker Diarization Pipeline | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| .. csv-table:: | |
| :file: data/diarization_results.csv | |
| :align: left | |
| :widths: 30, 30, 40 | |
| :header-rows: 1 | |