Update README.md
README.md
CHANGED
@@ -1,3 +1,6 @@
+---
+license: mit
+---
 # <a name="introduction"></a> XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech
 
 XPhoneBERT is the first pre-trained multilingual model for phoneme representations for text-to-speech (TTS). XPhoneBERT has the same model architecture as BERT-base and is trained using the RoBERTa pre-training approach on 330M phoneme-level sentences from nearly 100 languages and locales. Experimental results show that employing XPhoneBERT as an input phoneme encoder significantly boosts the performance of a strong neural TTS model in terms of naturalness and prosody, and also helps produce fairly high-quality speech with limited training data.
@@ -62,5 +65,4 @@ input_ids = tokenizer(input_phonemes, return_tensors="pt")
 
 with torch.no_grad():
     features = xphonebert(**input_ids)
-```
-
+```
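The front matter added by this change is a short block delimited by `---` lines at the top of `README.md`. As a minimal sketch for sanity-checking it, the snippet below reads that block from a README string. The helper name `read_front_matter` is hypothetical, and the parser is an assumption-laden simplification that only understands flat `key: value` pairs rather than full YAML.

```python
def read_front_matter(text):
    """Return flat key/value pairs from a leading '---' delimited block.

    Hypothetical helper for illustration only: it does not parse full YAML,
    just single-line `key: value` entries.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter reached: the block is complete.
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter was found, so treat the file as having no front matter.
    return {}


# Sample text mirroring the first lines of the README after this change.
readme = "---\nlicense: mit\n---\n# XPhoneBERT\n"
metadata = read_front_matter(readme)
print(metadata)
```

For anything beyond flat keys (lists of tags, nested fields), a real YAML parser would be the safer choice.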