Jenthe/ECAPA2

Jenthe committed on Oct 16, 2023

Commit f8b0f89

1 Parent(s): 1f55acd

Update README.md

Files changed (1)

README.md CHANGED

@@ -27,6 +27,16 @@ Or with Conda:
 conda install -c conda-forge huggingface_hub
 ```
 ### Speaker Embedding Extraction
 Extracting speaker embeddings is easy and only requires a few lines of code:
@@ -34,17 +44,28 @@ Extracting speaker embeddings is easy and only requires a few lines of code:
 ```python
 import torch
 import torchaudio
-from huggingface_hub import hf_hub_download
-# automatically checks for cached file
-model_file = hf_hub_download(repo_id='Jenthe/ECAPA2', filename='model.pt')
-# change map_location to 'cuda' for GPU inference (recommended)
 ecapa2_model = torch.jit.load(model_file, map_location='cpu')
-# note: input audio should have a sample rate of 16 kHz
-audio, sr = torchaudio.load('sample.wav')
-embedding = ecapa2_model(audio)
 ```
 ### Hierarchical Feature Extraction

 conda install -c conda-forge huggingface_hub
 ```
+Download model:
+```python
+from huggingface_hub import hf_hub_download
+# automatically checks for cached file
+model_file = hf_hub_download(repo_id='Jenthe/ECAPA2', filename='model.pt')
+```
 ### Speaker Embedding Extraction
 Extracting speaker embeddings is easy and only requires a few lines of code:
 ```python
 import torch
 import torchaudio
 ecapa2_model = torch.jit.load(model_file, map_location='cpu')
+audio, sr = torchaudio.load('sample.wav') # sample rate of 16 kHz expected
+with torch.no_grad():
+  embedding = ecapa2_model(audio)
+```
+For faster, 16-bit half-precision CUDA inference (recommended):
+```python
+import torch
+import torchaudio
+ecapa2_model = torch.jit.load(model_file, map_location='cuda')
+ecapa2_model.half() # optional, but results in faster inference
+audio, sr = torchaudio.load('sample.wav') # sample rate of 16 kHz expected
+with torch.no_grad():
+  embedding = ecapa2_model(audio)
 ```
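Both snippets above note that a 16 kHz sample rate is expected. As a minimal sketch (not part of the ECAPA2 API), the rate stored in a PCM WAV header can be checked with Python's standard `wave` module before the file is passed to the model. The helper names `wav_sample_rate` and `has_expected_sample_rate` are hypothetical and chosen for illustration only:

```python
import wave

# Taken from the README comment: "sample rate of 16 kHz expected".
EXPECTED_SAMPLE_RATE = 16_000


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def has_expected_sample_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True if the WAV file at `path` is stored at `expected` Hz."""
    return wav_sample_rate(path) == expected
```

If `has_expected_sample_rate('sample.wav')` returns `False`, the audio would likely need to be resampled first, for example with `torchaudio.functional.resample(audio, sr, 16000)` after loading. Note that the `wave` module only reads uncompressed PCM WAV files, so this check does not cover other formats that `torchaudio.load` may accept.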
 ### Hierarchical Feature Extraction