Update README.md
README.md
CHANGED
|
@@ -1,3 +1,12 @@
|
|
| 1 |
---
|
| 2 |
license: cc-by-2.0
| 3 |
---
| 1 |
---
|
| 2 |
license: cc-by-2.0
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
tags:
|
| 6 |
+
- code
|
| 7 |
+
- audio
|
| 8 |
+
- voice
|
| 9 |
---
|
| 10 |
+
|
| 11 |
+
This model extracts a speaker-agnostic content representation from an audio file. It builds on AutoVC from Qian et al. [2019].
|
| 12 |
+
The AutoVC network uses an LSTM-based encoder that compresses the input audio into a compact bottleneck representation, trained to discard the original speaker identity while preserving the content. In our case, we extract a content embedding $A \in \mathbb{R}^{T \times D}$ from the AutoVC network, where $T$ is the total number of input audio frames and $D$ is the content dimension.
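
To make the shape of the content embedding concrete, here is a minimal sketch of how $T$ and the resulting $(T, D)$ shape could be estimated for a given clip. Everything numeric here is an assumption for illustration, not taken from this model card: the 16 kHz sample rate, the hop length of 256 samples, the centered framing convention (`1 + num_samples // hop_length`), and the content dimension of 80. The helper names `num_frames` and `content_embedding_shape` are hypothetical and not part of any released API.

```python
def num_frames(num_samples: int, hop_length: int = 256) -> int:
    """Frame count T under an assumed centered, STFT-style framing."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if hop_length <= 0:
        raise ValueError("hop_length must be positive")
    # One frame at the start, then one more for every full hop.
    return 1 + num_samples // hop_length


def content_embedding_shape(
    num_samples: int, content_dim: int, hop_length: int = 256
) -> tuple[int, int]:
    """Expected (T, D) shape of the content embedding A for one clip."""
    if content_dim <= 0:
        raise ValueError("content_dim must be positive")
    return (num_frames(num_samples, hop_length), content_dim)


# Hypothetical example: a 2-second clip at an assumed 16 kHz sample rate,
# with an assumed content dimension D = 80.
sample_rate = 16_000
shape = content_embedding_shape(num_samples=2 * sample_rate, content_dim=80)
print(shape)
```

Under these assumptions the embedding has one row per audio frame, so longer clips yield a proportionally larger $T$ while $D$ stays fixed. Check the actual preprocessing settings of the checkpoint before relying on these numbers.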
|