Update README.md
README.md
CHANGED
@@ -85,11 +85,12 @@ feature = ecapa2_model(audio, label='gfe_1|pool|embedding')
 feature = ecapa2_model(audio, label='embedding|gfe_1|pool')
 ```
 
-The following table describes the available features
+The following table describes the available features. All features consist of the mean and variance of the frame-level encodings at the indicated layer, except for the speaker embedding.
 
 | Feature ID| Dimension | Description |
 | ----------- | ----------- | ----------- |
-| gfe_1 | | |
+| gfe_1 | 2048 | Mean and variance of frame-level features as indicated in Figure 1, extracted before the ReLU and BatchNorm layers.
+| gfe_2 | 2048 | Mean and variance of frame-level features as indicated in Figure 1, extracted before the ReLU and BatchNorm layers.
 | pool | 3072 | Pooled statistics (mean and variance) before the bottleneck speaker embedding layer, extracted before ReLU layer.
 | attention | 3072 | Same as the pooled statistics but with the attention weights applied.
 | embedding | 192 | The standard ECAPA2 speaker embedding.
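The `label` argument in the changed README joins feature IDs with `|`. A minimal sketch of how such a label string could be assembled and checked against the table above is shown here. The names `FEATURE_DIMS`, `build_label`, and `expected_dims` are hypothetical helpers for illustration and are not part of the ECAPA2 model code. The dimensions are copied from the table.

```python
# Feature IDs and dimensions copied from the table in the updated README.
# This mapping and both helpers are illustrative assumptions, not ECAPA2 API.
FEATURE_DIMS = {
    "gfe_1": 2048,
    "gfe_2": 2048,
    "pool": 3072,
    "attention": 3072,
    "embedding": 192,
}


def build_label(*feature_ids):
    """Join feature IDs into a '|'-separated label string (hypothetical helper)."""
    if not feature_ids:
        raise ValueError("at least one feature ID is required")
    unknown = [f for f in feature_ids if f not in FEATURE_DIMS]
    if unknown:
        raise ValueError(f"unknown feature ID(s): {unknown}")
    return "|".join(feature_ids)


def expected_dims(label):
    """Map each feature ID in a label string to its documented dimension."""
    return {f: FEATURE_DIMS[f] for f in label.split("|")}


label = build_label("embedding", "gfe_1", "pool")
print(label)
print(expected_dims(label))
```

The resulting string would be passed as `label=` in the `ecapa2_model(audio, label=...)` call shown in the diff. The structure of the value that call returns is not described in this change, so the sketch stops at building the label.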