Thank you but german umlauts not working - fixed

#8
by hendrik289 - opened

Thank you very much for the model. It is a real shame, though: it sounds very good in itself, but it "swallows" German umlauts, which unfortunately makes it impractical for use in German.

Supertone org

Hi @hendrik289 ,

Thank you so much for trying the model and for taking the time to share this feedback. You are right that the current umlaut behavior is a real limitation for German, and we are sorry that the experience falls short there, especially since ä/ö/ü are so important for natural-sounding German.

A bit of context on what is happening:

Supertonic 3 uses a shared character vocabulary across many languages and scripts. To support this, the current text pipeline applies Unicode NFKD normalization, which decomposes characters with diacritics into a base character plus a combining mark. For example, ä (U+00E4) becomes a followed by the combining diaeresis (U+0308), ö becomes o followed by the same mark, and so on.
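To make the decomposition concrete, here is a minimal sketch using only Python's standard `unicodedata` module. It illustrates NFKD itself and is not taken from the Supertonic pipeline code; the sample word is just an example.

```python
import unicodedata

# "schön" written with the precomposed character ö (U+00F6).
text = "sch\u00f6n"

# NFKD splits ö into the base letter o plus the combining diaeresis U+0308.
decomposed = unicodedata.normalize("NFKD", text)

# List the code points so the extra combining mark is visible.
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]

print(len(text), len(decomposed))
print(code_points)
```

The decomposed string is one code point longer than the input, because the single ö has become two code points.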

So the information is not intentionally removed, but the model has to learn that the combining mark changes the pronunciation of the preceding vowel. In practice, that signal can be much weaker than the base letters themselves, especially in a multilingual model where some languages have much more data than others. As a result, the model may sometimes pronounce the base vowel and under-realize the umlaut, which is the issue you are hearing.

We are reviewing these language-specific pronunciation issues as part of our next improvements. A more robust fix would likely involve changes to tokenization or text representation, such as preserving important language-specific characters like umlauts as single tokens rather than relying only on decomposed combining marks.
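As a rough illustration of what "preserving umlauts as single tokens" could look like at the text level, the sketch below applies NFKD and then recomposes a chosen set of characters. The function name `normalize_preserving` and the `PRESERVE` set are hypothetical, introduced only for this example; this is not the planned or actual Supertonic implementation.

```python
import unicodedata

# Hypothetical set of characters to keep as single code points: ä ö ü Ä Ö Ü.
PRESERVE = "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc"


def normalize_preserving(text: str, preserve: str = PRESERVE) -> str:
    """Apply NFKD, then recompose only the characters listed in `preserve`."""
    decomposed = unicodedata.normalize("NFKD", text)
    for ch in preserve:
        # The decomposed form of ch (base letter + combining mark) is mapped
        # back to the single precomposed code point.
        decomposed = decomposed.replace(unicodedata.normalize("NFKD", ch), ch)
    return decomposed


# ö stays a single code point, while é is still decomposed into e + U+0301.
result = normalize_preserving("sch\u00f6n caf\u00e9")
print([f"U+{ord(ch):04X}" for ch in result])
```

With this kind of representation the umlaut reaches the vocabulary as one symbol, while other diacritics keep the shared decomposed form.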

At the moment, we do not have a near-term plan to release a German-specific checkpoint or a quick German-only retraining update. But we agree this is an important issue, and if we make improvements to German pronunciation in a future release, we will share an update in the model card and discussions.

Thanks again for the kind words about the overall sound quality, and for the honest feedback. Reports like this are very helpful for deciding what to improve next.

juheon2 changed discussion status to closed
Supertone org

Hi @hendrik289 ,

A quick correction and update from our side.

We investigated this again by comparing the behavior of the Hugging Face Space and the Python package, and it turns out my earlier explanation did not identify the actual root cause in this case.

The model itself can handle German umlauts correctly. The issue was in the Python package preprocessing path: after Unicode normalization, the package was removing combining diacritic marks. As a result, words such as schön or Küche were effectively passed to the model as schon and Kuche, without the umlaut distinction, which caused the pronunciation problem in local Python usage.

We have fixed this in the Python package so that decomposed diacritics are preserved properly. The Hugging Face Space was already using a different preprocessing path, which is why the issue may not reproduce there.
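For readers who want to see the effect of this kind of bug, here is a minimal sketch of the two behaviors using Python's standard `unicodedata` module. The function names `preprocess_buggy` and `preprocess_fixed` are hypothetical and exist only for this illustration; this is not the package's actual code.

```python
import unicodedata


def preprocess_buggy(text: str) -> str:
    """Decompose with NFKD, then drop every combining mark (the faulty behavior)."""
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so this filter silently removes the diaeresis.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def preprocess_fixed(text: str) -> str:
    """Decompose with NFKD and keep the combining marks."""
    return unicodedata.normalize("NFKD", text)


word = "K\u00fcche"  # "Küche" with a precomposed ü
print(preprocess_buggy(word))
print([f"U+{ord(ch):04X}" for ch in preprocess_fixed(word)])
```

In the first variant the model only ever sees the plain base vowel, so it has no way to produce the umlaut sound; in the second, the combining diaeresis U+0308 still follows the u.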

Once the updated package is released, please try again with:

pip install -U supertonic

Sorry for the earlier inaccurate diagnosis, and thank you again for reporting this. Your feedback helped us find and fix a real SDK-side bug.

juheon2 changed discussion status to open

Wow, great, thanks for your response! I can confirm that it solved the problem. Nice work!

hendrik289 changed discussion title from Thank you but german umlauts not working to Thank you but german umlauts not working - fixed
juheon2 changed discussion status to closed
