Alignment output?
#11, opened by alistim
Really awesome work, folks! Wondering if you have any plans for character or word-level alignment outputs? Would be helpful for supporting interruptions in my AI voice agent context.
I'd love to know which letter or word was spoken when, relative to the start of speech, rather than just using linear interpolation.
Hello! Since Supertonic is not based on phoneme-level duration modeling, providing character or word-level alignments is not straightforward at this time. We appreciate the suggestion and will keep this feature in mind for future consideration.
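For reference, the linear-interpolation fallback mentioned in the question can be sketched roughly as below. This is not part of Supertonic: `WordTiming`, `estimate_word_timings`, and `words_spoken_before` are hypothetical names, and the sketch assumes speaking time is spread evenly over characters, which ignores pauses and varying speech rate. It only needs the input text and the duration of the generated audio.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    word: str
    start: float  # seconds from start of speech (estimated)
    end: float    # seconds from start of speech (estimated)


def estimate_word_timings(text: str, audio_duration: float) -> list[WordTiming]:
    """Estimate word timings by spreading the audio duration evenly over characters.

    Assumption: every character takes the same amount of time. This is a
    rough heuristic, not a real alignment.
    """
    words = text.split()
    total_chars = sum(len(w) for w in words)
    if total_chars == 0 or audio_duration <= 0:
        return []

    timings = []
    consumed = 0  # characters spoken so far
    for w in words:
        start = audio_duration * consumed / total_chars
        consumed += len(w)
        end = audio_duration * consumed / total_chars
        timings.append(WordTiming(w, start, end))
    return timings


def words_spoken_before(timings: list[WordTiming], elapsed: float) -> list[str]:
    """Return the words estimated to be fully spoken after `elapsed` seconds."""
    return [t.word for t in timings if t.end <= elapsed]
```

When the user interrupts, a voice agent could call `words_spoken_before(timings, elapsed)` with the playback position to guess how much of the reply was actually heard. The estimate will drift wherever the real speech rate is uneven, which is why true alignment output would be preferable.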