TinyLlama-1.1B-Phonetic-Liaison-Katakana-Generator

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 designed to predict connected phoneme sequences and rhythm-optimized Katakana. It focuses on capturing real-world auditory phenomena like liaison, reduction, and flapping.

🌟 The Concept: "Phonetic Bridge for Natural Speech"

Traditional G2P (Grapheme-to-Phoneme) converters often treat words in isolation. This model serves as a Phonetic Bridge, predicting how sounds change in continuous speech.

For Global Developers (The "Connected Phonemes" Advantage)

While the model outputs Katakana, its core intelligence lies in generating Connected Phoneme Sequences (ARPAbet).

  • TTS Frontend: Use the linked phoneme output to improve the prosody of your Text-to-Speech engines.
  • ESL Tools: Visualize for learners how "Take it" becomes /t ey1 k ih1 t/ instead of two separate words.

For Japanese Learners ("The Training Wheels")

I am a firm believer that English should ideally be learned through the ears, not through Katakana. However, beginners often face a "fear of the written word." This model provides **"Supportive Katakana"**: not a translation, but a phonetic map that mimics native rhythm, acting as training wheels for the ear.

✨ Key Features

  • Connected Phonemes (ARPAbet): Outputs the exact phonetic string including liaison (e.g., a little bit -> AH0 L IH1 D AH0 L B IH1 T).
  • Liaison & Flapping: Naturally handles T to D transformations and word-to-word connections.
  • Silent Letters: Intelligently ignores non-vocalized consonants.
  • Modern ESL Approach: Designed for high-speed inference on mobile devices (ready for GGUF/on-device PoC).
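As a rough illustration of the flapping behavior listed above, a naive rule-based pass over an ARPAbet sequence might look like the sketch below. The `flap` helper and its vowel test are hypothetical illustrations, not part of the model; the model learns these transformations from data rather than from hand-written rules.

```python
def is_vowel(p: str) -> bool:
    # ARPAbet vowels carry a stress digit (AH0, IH1, ...)
    return p[-1].isdigit()

def flap(phones: list[str]) -> list[str]:
    # American English flapping: /t/ between a vowel and a
    # following unstressed vowel is realized as a /d/-like tap.
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if (phones[i] == "T"
                and is_vowel(phones[i - 1])
                and is_vowel(phones[i + 1])
                and phones[i + 1].endswith("0")):
            out[i] = "D"
    return out

# "a little bit" -> AH0 L IH1 D AH0 L B IH1 T
print(" ".join(flap("AH0 L IH1 T AH0 L B IH1 T".split())))
```

Real connected speech is messier than this single rule (vowel reduction, resyllabification, liaison across word boundaries), which is one reason the model is trained on curated phrase pairs instead of rules.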

📊 Comparison: Beyond Dictionary Rules

| English Phrase | Dictionary Phonemes | This Model (Linked Phonemes) | Supportive Katakana |
|---|---|---|---|
| A little bit | [AH0] [L IH1 T AH0 L] [B IH1 T] | AH0 L IH1 D AH0 L B IH1 T | アリロビッ |
| Check it out | [CH EH1 K] [IH1 T] [AW1 T] | CH EH1 K IH1 T AW1 T | チェキラッ |
| Middle of the night | [M IH1 D AH0 L] [AH1 V]... | M IH1 D AH0 L AH1 V DH AH0 N AY1 T | ミドロヴザナイッ |

🚀 Prompt Format

To extract both Katakana and the connected phoneme sequence, use the following format. The Japanese instruction line asks the model to generate liaison-aware Katakana and a connected phoneme sequence from the English phrase and its word-level phonemes:

่‹ฑ่ชžใจใใฎๅ˜่ชžๅ˜ไฝใฎ้Ÿณ็ด ใ‹ใ‚‰ใ€ใƒชใ‚จใ‚พใƒณใ‚’่€ƒๆ…ฎใ—ใŸใ‚ซใ‚ฟใ‚ซใƒŠใจ็น‹ใŒใฃใŸ้Ÿณ็ด ๅˆ—ใ‚’็”Ÿๆˆใ—ใฆใใ ใ•ใ„ใ€‚

英語: take it easy
単語音素: [T EY1 K] [IH1 T] [IY1 Z IY0]
カタカナ: テイキットイージー
繋がった音素: T EY1 K IH1 T IY1 Z IY0

英語: {Your Phrase}
単語音素: {Standard G2P Output}
カタカナ: 
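In Python, the template above can be filled in and sent to the model with the `transformers` library. This is a minimal sketch: the model id matches this repo, but the generation parameters and the `generate` wrapper are illustrative assumptions, not a prescribed API.

```python
MODEL_ID = "pyon0024/tinyllama-katakana-converter"

def build_prompt(phrase: str, word_phonemes: str) -> str:
    # Mirrors the one-shot template from this card; the trailing
    # "カタカナ: " line cues the model to continue with its output.
    return (
        "英語とその単語単位の音素から、リエゾンを考慮したカタカナと"
        "繋がった音素列を生成してください。\n\n"
        "英語: take it easy\n"
        "単語音素: [T EY1 K] [IH1 T] [IY1 Z IY0]\n"
        "カタカナ: テイキットイージー\n"
        "繋がった音素: T EY1 K IH1 T IY1 Z IY0\n\n"
        f"英語: {phrase}\n"
        f"単語音素: {word_phonemes}\n"
        "カタカナ: "
    )

def generate_katakana(phrase: str, word_phonemes: str) -> str:
    # Heavyweight import kept local so the prompt builder stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(phrase, word_phonemes), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

Greedy decoding (`do_sample=False`) is a sensible default here, since the task expects one deterministic phonetic mapping per phrase.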

🛠 Technical Specs & Dataset

  • Dataset: 1,200+ hand-curated pairs of English phrases and their auditory-correct phonetic mappings.
  • Evaluation: Currently being benchmarked against the speechocean762 dataset for pronunciation scoring PoC.
  • Architecture: LoRA fine-tuning on TinyLlama 1.1B.
  • Optimization: Convertible to GGUF for ultra-lightweight mobile app integration; the on-device pronunciation-scoring PoC uses MFCC/DTW-based evaluation.
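The MFCC/DTW evaluation mentioned in the specs boils down to aligning a learner's per-frame feature sequence against a reference recording with dynamic time warping and scoring the alignment cost. A dependency-free sketch of the DTW step over toy 1-D feature sequences (real use would feed per-frame MFCC vectors, e.g. extracted with librosa):

```python
def dtw_cost(ref: list[float], test: list[float]) -> float:
    # Classic dynamic time warping: minimal cumulative distance
    # between two sequences, allowing local stretches/compressions.
    INF = float("inf")
    n, m = len(ref), len(test)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(ref[i - 1] - test[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# A time-stretched copy aligns perfectly (cost 0.0)...
print(dtw_cost([1, 2, 3], [1, 1, 2, 2, 3, 3]))  # 0.0
# ...while a genuinely different contour accumulates cost.
print(dtw_cost([1, 2, 3], [3, 2, 1]))
```

Because DTW tolerates timing differences, a learner who speaks slowly but with the right sound contour still scores well, which matches the "rhythm over spelling" goal of this model.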

โš ๏ธ Limitations & Bias

  • Model Size: 1.1B parameters. While fast, it may hallucinate on rare proper nouns.
  • Accent: Optimized for General American English (GenAm) commonly found in global pop music and media.

📜 License

Apache 2.0

