Question about the Russian dataset?
#2
by rraaww - opened
Currently, the model often generates И (the vowel /i/) instead of Й (the voiced palatal approximant /j/) during synthesis. This distinction matters in many cases, for example singular versus plural: герой [hero] and герои [heroes].
The same happens with the letter Ё (jo), which is synthesized as Е (jɛ). For example, Всё | vsʲˈo [everything] and Все | vsʲˈe [everyone / everybody] should be pronounced differently; otherwise the synthesized speech sounds unnatural to a native Russian speaker. Many other words also contain the letter Ё.
Was there over-normalization during the data preparation stage? Or what else could be causing this problem?
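
One possible cause worth ruling out (an assumption on my part, not something I have confirmed about this dataset): in Unicode, Й decomposes into И plus a combining breve and Ё into Е plus a combining diaeresis, so an "accent stripping" step based on NFD normalization would silently turn герой into герои and всё into все. Here is a minimal Python sketch of that failure mode, plus a quick check of whether the transcripts still contain these letters. The function names `strip_accents` and `count_marked_letters` are hypothetical and not taken from any real preprocessing pipeline.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical over-aggressive normalizer: decompose to NFD,
    then drop every combining mark (Unicode category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def count_marked_letters(transcripts):
    """Count occurrences of й and ё (case-insensitive) in an iterable of strings.
    Zero counts over a large Russian corpus would suggest the letters were normalized away."""
    counts = {"й": 0, "ё": 0}
    for line in transcripts:
        # Recompose first so that decomposed input (и + breve) is still counted.
        lowered = unicodedata.normalize("NFC", line).lower()
        for letter in counts:
            counts[letter] += lowered.count(letter)
    return counts


print(strip_accents("герой"))  # герои
print(strip_accents("всё"))    # все
print(count_marked_letters(["Всё", "герой", "герои"]))  # {'й': 1, 'ё': 1}
```

If running something like `count_marked_letters` over the training transcripts returns zero (or very few) for either letter, the normalization step would be the first place I'd look.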