XLS-R-tamil-phoneme / README.md

This model is fine-tuned on the Tamil subset of Common Voice 16.1. The transcripts were transliterated into IPA with Epitran, using its 'tam-Taml' language code, which yields the following phoneme inventory:

  • Vowels: 'a', 'aj', 'aʋ', 'aː', 'e', 'eː', 'i', 'iː', 'o', 'oː', 'u', 'uː'
  • Consonants:
    • Nasals: 'm', 'n', 'n̪', 'ŋ', 'ɲ', 'ɳ'
    • Stops: 'p', 't̪', 'ʈ', 'k'
    • Affricates: 'd͡ʒ', 't͡ʃ'
    • Fricatives: 's', 'ʂ', 'ʃ', 'h'
    • Approximants: 'ʋ', 'j', 'ɻ', 'l', 'ɭ'
    • Tap: 'ɾ'
    • Consonant cluster: 'kʂ'
  • Special symbol: '்' (the Tamil pulli, or virama, which marks the absence of the inherent vowel)
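
The inventory above contains several multi-character symbols ('aj', 'aː', 't̪', 't͡ʃ', 'kʂ'), so an IPA string cannot simply be split character by character. The following is a minimal sketch of one way to segment a string into these phonemes, assuming greedy longest-match is acceptable. The names `PHONEMES` and `segment` are hypothetical and are not part of this model or of Epitran; the model's actual tokenizer may segment differently.

```python
import unicodedata

# Phoneme inventory copied from the list above (hypothetical constant name).
PHONEMES = [
    "a", "aj", "aʋ", "aː", "e", "eː", "i", "iː", "o", "oː", "u", "uː",
    "m", "n", "n̪", "ŋ", "ɲ", "ɳ",
    "p", "t̪", "ʈ", "k",
    "d͡ʒ", "t͡ʃ",
    "ʋ", "s", "ʂ", "ʃ", "h",
    "j", "ɻ", "ɾ", "l", "ɭ",
    "kʂ",
    "்",
]


def segment(ipa, inventory=PHONEMES):
    """Split an IPA string into inventory symbols by greedy longest match.

    Assumption: the longest symbol always wins, so "aj" is preferred over
    "a" + "j". Whitespace is skipped; unknown characters raise ValueError.
    """
    # Normalize both sides so combining marks compare consistently.
    symbols = sorted(
        {unicodedata.normalize("NFD", p) for p in inventory},
        key=len,
        reverse=True,
    )
    text = unicodedata.normalize("NFD", ipa)
    result = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for sym in symbols:
            if text.startswith(sym, i):
                result.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"no phoneme matches at position {i}: {text[i]!r}")
    return result


if __name__ == "__main__":
    print(segment("t̪amiɻ"))
```

Greedy matching cannot distinguish a true diphthong 'aj' from 'a' followed by 'j' across a syllable boundary, so treat the result as an approximation when that distinction matters.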