cristinae committed on
Commit
569f00e
·
verified ·
1 Parent(s): a0f0166

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -88,7 +88,7 @@ Note: Although this example uses StyleTTS2, the model is compatible with other T
 
 ### Training data
 
-The model was trained on a phonemized Catalan corpus (any phonemizer can be used). The dataset includes sentences from speakers across Catalonia, Balearic Islands, and Valencia. It uses a consistent phoneme token set with boundary markers and masking tokens.
+The model was trained on a phonemized Catalan corpus (any phonemizer can be used) extracted from the [CATalog](https://huggingface.co/datasets/projecte-aina/CATalog) corpus. The dataset includes sentences from speakers across Catalonia, Balearic Islands, and Valencia. It uses a consistent phoneme token set with boundary markers and masking tokens.
 
 Tokenizer: custom (split using whitespaces)
 Phoneme masking strategy: word-level and phoneme-level masking and replacement