nvidia/low-frame-rate-speech-codec-22khz

CasanovaE committed on Dec 6, 2024

Commit 3e02d67 (verified) · Parent: b61673d

Update README.md

Files changed (1)

README.md +4 -3

@@ -52,9 +52,10 @@ For more details please check [our paper](https://arxiv.org/abs/2409.12117).
 The model is available for use in the [NVIDIA NeMo](https://github.com/NVIDIA/NeMo), and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
 ### Inference
-For inference, you can follow our [Audio Codec Inference Tutorial](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tts/Audio_Codec_Inference.ipynb) which automatically downloads the model checkpoint. Note that you will need to set the ```model_name``` parameter to "nvidia/low-frame-rate-speech-codec-22khz".
-In addition, you can use the code bellow that automatically download the checkpoint as well:
 ```
 import librosa
@@ -86,7 +87,7 @@ sf.write(path_to_output_audio, output_audio, nemo_codec_model.sample_rate)
 ```
-Alternatively, you can manually download the [checkpoint](https://huggingface.co/nvidia/low-frame-rate-speech-codec-22khz/resolve/main/low-frame-rate-speech-codec-22khz.nemo) and use the code below to make an inference on the model:
 ```
 import librosa

 The model is available for use in the [NVIDIA NeMo](https://github.com/NVIDIA/NeMo), and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
 ### Inference
+For inference, you can refer to our [Audio Codec Inference Tutorial](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tts/Audio_Codec_Inference.ipynb), which automatically downloads the model checkpoint. Ensure that you set the model_name parameter to "nvidia/low-frame-rate-speech-codec-22khz".
+Alternatively, you can use the code below, which also handles the automatic checkpoint download:
 ```
 import librosa
 ```
+If preferred, you can manually download the [checkpoint](https://huggingface.co/nvidia/low-frame-rate-speech-codec-22khz/resolve/main/low-frame-rate-speech-codec-22khz.nemo) and use the provided code to run inference on the model:
 ```
 import librosa