pyon0024 committed on
Commit 96dbac5 · verified · 1 Parent(s): fa35dce

Update version of model using phonemes

Files changed (1)
  1. README.md +47 -57
README.md CHANGED
@@ -3,91 +3,81 @@ license: apache-2.0
3
  base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
4
  tags:
5
  - lyrics
 
 
6
  - katakana
 
7
  - english-to-katakana
8
- - katakana-english
9
- - english2katakana
10
  - tinyllama
11
  ---
12
 
13
- # TinyLlama-1.1B-Katakana-Lyrics-Liaison
14
 
15
- This model is a fine-tuned version of `TinyLlama/TinyLlama-1.1B-Chat-v1.0` using LoRA. It is specifically designed to convert English lyrics and phrases into **Phonetic Katakana**, prioritizing real-world pronunciation, linking (liaison), and rhythm over literal dictionary spelling.
16
 
17
- ## 🌟 Concept: "The Training Wheels for English Rhythm"
18
 
19
- As the creator, I generally believe that English should be learned without Katakana. However, for children and beginners, the "fear of the written word" often stops them from speaking entirely.
20
 
21
- This model was built to provide **"Supportive Katakana"** — not just a translation, but a phonetic guide that helps learners mimic the actual rhythm and flow of native speakers, serving as temporary "training wheels" until they are ready to rely solely on their ears.
 
 
 
22
 
23
- ## Key Features
24
-
25
- * **Liaison & Linking:** Handles word connections naturally (e.g., `hold your` → `ホージョ`, `take it` → `テイキッ`).
26
- * **Silent Letters:** Trained to ignore silent consonants (e.g., `honest` → `オネス`, `hour` → `アワー`).
27
- * **Lyric-focused Reductions:** Strong support for informal contractions like `gonna`, `wanna`, and `gotta`.
28
- * **Complex Phonetics:** Specifically trained to handle difficult phonetic mappings like `Scarborough Fair` → `スカーブラフェア`.
29
 
30
- ## 📊 Comparison Examples
31
 
32
- | English Phrase | Dictionary-style (Standard) | **This Model (Phonetic)** |
33
- | --- | --- | --- |
34
- | I wanna hold your hand | アイ ウォナ ホールド ユア ハンド | **アイワナホージョハン** |
35
- | I gotta be honest with you | アイ ガッタ ビー オネスト ... | **アイガラビーオネスウィズユー** |
36
- | Scarborough Fair | スカーバラ フェア | **スカーブラフェア** |
37
- | Take it anymore | テイク イット エニモア | **テイキッエニモー** |
38
 
39
- ## 🚀 How to Use
40
 
41
- To get the best results, use the following prompt format:
42
 
43
- ```text
44
- 英語を歌いやすいように、音のつながり(リエゾン)を考慮してカタカナに変換してください。
45
 
46
- 英語: take it easy
47
- カタカナ: テイキッイージー
 
 
 
48
 
49
- 英語: I wanna hold you
50
- カタカナ: アイワナホージュー
51
 
52
- 英語: [Your English Phrase Here]
53
- カタカナ:
54
- ```
55
 
56
- ### Example Code (Python / Transformers)
57
-
58
- ```python
59
- from transformers import AutoModelForCausalLM, AutoTokenizer
60
- from peft import PeftModel
61
-
62
- base_model_path = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
63
- lora_model_path = "YOUR_USERNAME/TinyLlama-1.1B-Katakana-Lyrics-Liaison"
64
 
65
- tokenizer = AutoTokenizer.from_pretrained(base_model_path)
66
- model = AutoModelForCausalLM.from_pretrained(base_model_path)
67
- model = PeftModel.from_pretrained(model, lora_model_path)
 
68
 
69
- prompt = ("英語を歌いやすいように、音のつながり(リエゾン)を考慮してカタカナに変換してください。\n\n英語: take it easy\n"
70
-           "カタカナ: テイキッイージー\n\n英語: I wanna hold you\nカタカナ: アイワナホージュー\n\n英語: I love the way you lie\nカタカナ:")
71
- inputs = tokenizer(prompt, return_tensors="pt")
72
- outputs = model.generate(**inputs, max_new_tokens=50)
73
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
74
 
75
  ```
76
 
77
- ## 🛠 Training Details
78
 
79
- - **Dataset:** 1,200+ samples of **custom-curated phonetic pairs**.
80
- - **Methodology:** Developed using a "human-in-the-loop" approach, focusing on capturing real-world auditory experiences rather than robotic dictionary rules.
81
- - **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
82
- - **Base Model:** TinyLlama-1.1B-Chat-v1.0
83
 
84
- ## ⚠️ Limitations
85
 
86
- * **Model Size:** As a 1.1B model, it may occasionally hallucinate or misinterpret extremely long or rare technical terms.
87
- * **Dialect:** Primarily targets General American/Standard English pronunciation as heard in global pop music.
88
 
89
  ## 📜 License
90
 
91
- This model is licensed under the **Apache 2.0 License**, consistent with the base TinyLlama model.
92
 
93
- ---
 
3
  base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
4
  tags:
5
  - lyrics
6
+ - phonetics
7
+ - g2p
8
  - katakana
9
+ - english-to-phoneme
10
  - english-to-katakana
11
+ - liaison
 
12
  - tinyllama
13
  ---
14
 
15
+ # TinyLlama-1.1B-Phonetic-Liaison-Katakana-Generator
16
 
17
+ This model is a fine-tuned version of `TinyLlama/TinyLlama-1.1B-Chat-v1.0` designed to predict **connected phoneme sequences** and **rhythm-optimized Katakana**. It focuses on capturing real-world auditory phenomena like liaison, reduction, and flapping.
18
 
19
+ ## 🌟 The Concept: "Phonetic Bridge for Natural Speech"
20
 
21
+ Traditional G2P (Grapheme-to-Phoneme) converters often treat words in isolation. This model serves as a **Phonetic Bridge**, predicting how sounds change in continuous speech.
22
 
23
+ ### For Global Developers (The "Connected Phonemes" Advantage)
24
+ While the model outputs Katakana, its core intelligence lies in generating **Connected Phoneme Sequences (ARPAbet)**.
25
+ - **TTS Frontend:** Use the linked phoneme output to improve the prosody of your Text-to-Speech engines.
26
+ - **ESL Tools:** Visualize for learners how "Take it" becomes `/t ey1 k ih1 t/` instead of two separate words.
27
 
28
+ ### For Japanese Learners ("The Training Wheels")
29
+ I am a firm believer that English should ideally be learned by ear, not through Katakana. However, beginners often face a "fear of the written word."
30
+ This model provides **"Supportive Katakana"**—not a translation, but a phonetic map that mimics native rhythm, acting as training wheels for the ear.
 
 
 
31
 
32
+ ## Key Features
33
 
34
+ * **Connected Phonemes (ARPAbet):** Outputs the exact phonetic string including liaison (e.g., `a little bit` -> `AH0 L IH1 D AH0 L B IH1 T`).
35
+ * **Liaison & Flapping:** Naturally handles `T` to `D` transformations and word-to-word connections.
36
+ * **Silent Letters:** Intelligently ignores non-vocalized consonants.
37
+ * **Modern ESL Approach:** Designed for high-speed inference on mobile devices (ready for GGUF/on-device PoC).
 
 
38
 
39
+ ## 📊 Comparison: Beyond Dictionary Rules
40
 
 
41
 
 
 
42
 
43
+ | English Phrase | Dictionary Phonemes | **This Model (Linked Phonemes)** | **Supportive Katakana** |
44
+ | --- | --- | --- | --- |
45
+ | **A little bit** | `[AH0] [L IH1 T AH0 L] [B IH1 T]` | `AH0 L IH1 D AH0 L B IH1 T` | **アリロビッ** |
46
+ | **Check it out** | `[CH EH1 K] [IH1 T] [AW1 T]` | `CH EH1 K IH1 T AW1 T` | **チェキラッ** |
47
+ | **Middle of the night**| `[M IH1 D AH0 L] [AH1 V]...` | `M IH1 D AH0 L AH1 V DH AH0 N AY1 T`| **ミドロヴザナイッ** |
48
 
49
+ ## 🚀 Prompt Format
 
50
 
51
+ To extract both Katakana and the connected phoneme sequence, use the following format:
 
 
52
 
53
+ ```text
54
+ 英語とその単語単位の音素から、リエゾンを考慮したカタカナと繋がった音素列を生成してください。
 
 
 
 
 
 
55
 
56
+ 英語: take it easy
57
+ 単語音素: [T EY1 K] [IH1 T] [IY1 Z IY0]
58
+ カタカナ: テイキットイージー
59
+ 繋がった音素: T EY1 K IH1 T IY1 Z IY0
60
 
61
+ 英語: {Your Phrase}
62
+ 単語音素: {Standard G2P Output}
63
+ カタカナ:
 
 
64
 
65
  ```
66
 
67
+ ## 🛠 Technical Specs & Dataset
68
 
69
+ * **Dataset:** 1,200+ hand-curated pairs of English phrases and their auditory-correct phonetic mappings.
70
+ * **Evaluation:** Currently being benchmarked against the `speechocean762` dataset for pronunciation scoring PoC.
71
+ * **Architecture:** LoRA fine-tuning on TinyLlama 1.1B.
72
+ * **Optimization:** Highly compatible with **GGUF** for ultra-lightweight mobile app integration (MFCC/DTW-based evaluation).
73
 
74
+ ## ⚠️ Limitations & Bias
75
 
76
+ * **Model Size:** 1.1B parameters. While fast, it may hallucinate on rare proper nouns.
77
+ * **Accent:** Optimized for General American English (GenAm) commonly found in global pop music and media.
78
 
79
  ## 📜 License
80
 
81
+ Apache 2.0
82
 
83
+ ```
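The updated README drops the Python usage example, so the new prompt format is only shown as text. As a minimal sketch of how that format could be assembled and its output read back, the helpers below build a one-shot prompt and split a completion into its two parts. The names `build_prompt` and `parse_completion` are hypothetical and not part of the repository. The sketch assumes the query line uses the `単語音素:` label from the worked example, and that the model answers with the Katakana on the first line followed by a `繋がった音素:` line, as in that example; neither behaviour is confirmed by the commit.

```python
from __future__ import annotations

# Instruction line copied verbatim from the README's new prompt format.
INSTRUCTION = "英語とその単語単位の音素から、リエゾンを考慮したカタカナと繋がった音素列を生成してください。"

# One-shot example copied from the README's prompt format.
EXAMPLE = (
    "英語: take it easy\n"
    "単語音素: [T EY1 K] [IH1 T] [IY1 Z IY0]\n"
    "カタカナ: テイキットイージー\n"
    "繋がった音素: T EY1 K IH1 T IY1 Z IY0"
)

# Assumed label that introduces the connected phoneme line in the output.
LINKED_LABEL = "繋がった音素:"


def build_prompt(phrase: str, word_phonemes: list[list[str]]) -> str:
    """Assemble the prompt for one phrase.

    word_phonemes holds one ARPAbet list per word, e.g. taken from a
    standard G2P tool or pronunciation dictionary.
    """
    bracketed = " ".join("[" + " ".join(word) + "]" for word in word_phonemes)
    return (
        f"{INSTRUCTION}\n\n"
        f"{EXAMPLE}\n\n"
        f"英語: {phrase}\n"
        f"単語音素: {bracketed}\n"
        "カタカナ:"
    )


def parse_completion(completion: str) -> tuple[str, str]:
    """Split generated text into (katakana, linked_phonemes).

    Assumes the first non-empty line is the Katakana and that a later
    line starts with LINKED_LABEL; missing parts come back as "".
    """
    lines = [ln.strip() for ln in completion.strip().splitlines() if ln.strip()]
    katakana = lines[0] if lines else ""
    linked = ""
    for ln in lines[1:]:
        if ln.startswith(LINKED_LABEL):
            linked = ln[len(LINKED_LABEL):].strip()
            break
    return katakana, linked
```

The string returned by `build_prompt` can be passed to the tokenizer in the same way as the `prompt` variable in the removed Transformers example. Only the newly generated text after the final `カタカナ:` should be handed to `parse_completion`.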