mmcauliffe committed on
Commit
04a2961
·
verified ·
1 Parent(s): 9b604e0

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,160 +1,485 @@
1
- ---
2
- language:
3
- - en
4
- thumbnail: null
5
- pipeline_tag: forced-alignment
6
- tags:
7
- - montreal-forced-aligner
8
- - forced-alignment
9
- license: cc-by-4.0
10
- ---
11
- # English MFA model
12
-
13
- ## Model details
14
-
15
- - **Maintainer:** [Montreal Corpus Tools](https://huggingface.co/MontrealCorpusTools)
16
- - **Language:** [English](https://en.wikipedia.org/wiki/English_language)
17
- - **Dialect:** N/A
18
- - **Phone set:** [MFA](https://mfa-models.readthedocs.io/en/refactor/mfa_phone_set.html#english)
19
- - **Features:** `MFCC`
20
- - **Architecture:** `gmm-hmm`
21
- - **Model version:** `v3.1.0`
22
- - **Trained date:** `2024-06-12`
23
- - **Compatible MFA version:** `v3.1.0`
24
- - **License:** [CC BY 4.0](https://huggingface.co/MontrealCorpusTools/english_mfa/LICENSE)
25
- - **Citation:**
26
-
27
- ```bibtex
28
- @techreport{mfa_english_mfa_acoustic_2024,
29
- author={McAuliffe, Michael and Sonderegger, Morgan},
30
- title={English MFA acoustic model v3.1.0},
31
- address={\url{https://huggingface.co/MontrealCorpusTools/english_mfa}},
32
- year={2024},
33
- month={Jun},
34
- }
35
- ```
36
-
37
- - If you have comments or questions about this model, you can check [previous MFA model discussion posts](https://github.com/MontrealCorpusTools/mfa-models/discussions?discussions_q=English+MFA+acoustic+model+v3.1.0) or create [a new one](https://github.com/MontrealCorpusTools/mfa-models/discussions/new).
38
-
39
- ## Installation
40
-
41
- Install from the [MFA command line](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/models/index.html):
42
-
43
- ```
44
- mfa model download acoustic english_mfa
45
- ```
46
-
47
- Or download from [the release page](https://github.com/MontrealCorpusTools/mfa-models/releases/tag/acoustic-english_mfa-v3.1.0).
48
-
49
- ## Intended use
50
-
51
- This model is intended for forced alignment of [English](https://en.wikipedia.org/wiki/English_language) transcripts.
52
-
53
- This model uses the [MFA](https://mfa-models.readthedocs.io/en/refactor/mfa_phone_set.html#english) phone set for English, and was trained with the pronunciation dictionaries above. Pronunciations can be added on top of the dictionary, as long as no additional phones are introduced.
54
-
55
- ## Performance Factors
56
-
57
- As forced alignment is a relatively well-constrained problem (given accurate transcripts), this model should be applicable to a range of recording conditions and speakers. However, please note that it was trained on read speech in low-noise environments, so as your data diverges from that, you may run into alignment issues or need to [increase the beam size of MFA](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/configuration/#configuring-specific-commands) or see other recommendations in the [troubleshooting section below](#troubleshooting-issues).
58
-
59
- Please note as well that MFA does not use state-of-the-art ASR models for forced alignment. You may get better performance (especially on speech-to-text tasks) using other frameworks like [Coqui](https://coqui.ai/).
60
-
61
- ## Metrics
62
-
63
- Acoustic models are typically generated as one component of a larger ASR system where the metric is word error rate (WER). For forced alignment, there is typically not the same sort of gold standard measure for most languages.
64
-
65
- As a rough approximation of the acoustic model quality, we evaluated it against the corpus it was trained on alongside a language model trained from the same data. Key caveat here is that this is not a typical WER measure on held out data, so it should not be taken as a hard measure of how well an acoustic model will generalize to your data, but rather is more of a sanity check that the training data quality was sufficiently high.
66
-
67
- Using the pronunciation dictionaries and language models above:
68
-
69
- - **WER:** `0%`
70
- - **CER:** `0%`
71
-
72
- ## Ethical considerations
73
-
74
- Deploying any Speech-to-Text model into any production setting has ethical implications. You should consider these implications before use.
75
-
76
- ### Demographic Bias
77
-
78
- You should assume every machine learning model has demographic bias unless proven otherwise. For STT models, it is often the case that transcription accuracy is better for men than it is for women. If you are using this model in production, you should acknowledge this as a potential issue.
79
-
80
- ### Surveillance
81
-
82
- Speech-to-Text technologies may be misused to invade the privacy of others by recording and mining information from private conversations. This kind of individual privacy is protected by law in many countries. You should not assume consent to record and analyze private speech.
83
-
84
-
85
- ## Troubleshooting issues
86
-
87
- Machine learning models (like this acoustic model) perform best on data that is similar to the data on which they were trained.
88
-
89
- The primary sources of variability in forced alignment will be the applicability of the pronunciation dictionary and how similar the speech, demographics, and recording conditions are. If you encounter issues in alignment, there are couple of avenues to improve performance:
90
-
91
- 1. [Increase the beam size of MFA](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/configuration/#configuring-specific-commands)
92
-
93
- * MFA defaults to a narrow beam to ensure quick alignment and also as a way to detect potential issues in your dataset, but depending on your data, you might benefit from boosting the beam to 100 or higher.
94
-
95
- 2. Add pronunciations to the pronunciation dictionary
96
-
97
- * This model was trained a particular dialect/style, and so adding pronunciations more representative of the variety spoken in your dataset will help alignment.
98
-
99
- 3. Check the quality of your data
100
-
101
- * MFA includes a [validator utility](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/data_validation.html), which aims to detect issues in the dataset.
102
- * Use MFA's [anchor utility](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/anchor.html) to visually inspect your data as MFA sees it and correct issues in transcription or OOV items.
103
-
104
- 4. Adapt the model to your data
105
-
106
- * MFA has an [adaptation command](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/adapt_acoustic_model.html) to adapt some of the model to your data based on an initial alignment, and then run another alignment with the adapted model.
107
-
108
- ## Training data
109
-
110
- This model was trained on the following corpora:
111
-
112
- * [Common Voice English v17](https://datacollective.mozillafoundation.org/datasets/cmj8u3p1w0075nxxbe8bedl00):
113
- * **Hours:** `2322.80`
114
- * **Speakers:** `71,160`
115
- * **Utterances:** `1,625,987`
116
-
117
- * [LibriSpeech English](https://openslr.org/12/):
118
- * **Hours:** `982.10`
119
- * **Speakers:** `2,484`
120
- * **Utterances:** `292,367`
121
-
122
- * [Corpus of Regional African American Language](https://oraal.github.io/coraal):
123
- * **Hours:** `124.31`
124
- * **Speakers:** `193`
125
- * **Utterances:** `236,792`
126
-
127
- * [Google Nigerian English](https://openslr.org/70/):
128
- * **Hours:** `5.77`
129
- * **Speakers:** `31`
130
- * **Utterances:** `3,359`
131
-
132
- * [Google UK and Ireland English](https://openslr.org/83/):
133
- * **Hours:** `31.29`
134
- * **Speakers:** `120`
135
- * **Utterances:** `17,877`
136
-
137
- * [NCHLT English](https://repo.sadilar.org/items/d944b028-6a86-4edf-a7d9-7f5e21544a41):
138
- * **Hours:** `56.43`
139
- * **Speakers:** `210`
140
- * **Utterances:** `77,412`
141
-
142
- * [ARU English corpus](https://datacat.liverpool.ac.uk/681/):
143
- * **Hours:** `7.13`
144
- * **Speakers:** `12`
145
- * **Utterances:** `8,640`
146
-
147
- * [ICE-Nigeria](https://sourceforge.net/projects/ice-nigeria/):
148
- * **Hours:** `52.86`
149
- * **Speakers:** `1,276`
150
- * **Utterances:** `113,664`
151
-
152
- * [A Scripted Pakistani English Daily-use Speech Corpus](https://magichub.com/datasets/pakistani-english-scripted-speech-corpus-daily-use-sentence/):
153
- * **Hours:** `4.00`
154
- * **Speakers:** `7`
155
- * **Utterances:** `2,191`
156
-
157
- * [L2-ARCTIC](https://psi.engr.tamu.edu/l2-arctic-corpus/):
158
- * **Hours:** `27.51`
159
- * **Speakers:** `24`
160
- * **Utterances:** `27,042`
 
1
+ ---
2
+ language:
3
+ - en
4
+ pipeline_tag: forced-alignment
5
+ library_name: montreal-forced-aligner
6
+ tags:
7
+ - montreal-forced-aligner
8
+ - forced-alignment
9
+ license: cc-by-4.0
10
+ ---
11
+ # Model Card for english_mfa
12
+
13
+ <!-- Provide a quick summary of what the model is/does. -->
14
+
15
+ This MFA model is for aligning Global English across multiple dialects.
16
+
17
+ - [Model details](#model-details)
18
+ - [Uses](#uses)
19
+ - [How to Get Started with the Model](#how-to-get-started-with-the-model)
20
+ - [Dictionary Details](#dictionary-details)
21
+ - [Training Details](#training-details)
22
+ - [Evaluation](#evaluation)
23
+ - [Contact](#contact)
24
+
25
+ ## Model Details
26
+
27
+ ### Model Description
28
+
29
+ <!-- Provide a longer summary of what this model is. -->
30
+
31
+
32
+
33
+ - **Developed by:** Michael McAuliffe
34
+ - **Funded by:** N/A
35
+ - **Model type:** Montreal Forced Aligner model
36
+ - **Language(s) (NLP):** English
37
+ - **License:** cc-by-4.0
38
+
39
+ ## Uses
40
+
41
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
42
+
43
+ ### Direct Use
44
+
45
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
46
+
47
+ This model is intended for forced alignment of the English varieties it was trained on (US, UK, Indian, and Nigerian English). If alignment fails or looks wrong, see https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html for common fixes.
48
+
49
+ ### Out-of-Scope Use
50
+
51
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
52
+
53
+ This model cannot assess the quality of pronunciations or produce transcripts. It is trained to accept variation in pronunciation so that it can provide a reasonable alignment across many varieties of English.
54
+
55
+ ## Bias, Risks, and Limitations
56
+
57
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
58
+
59
+ This model will perform best on the variety of speech that it was trained on, with a bias towards US English. The speakers in the training data are all adult speakers, so child speech alignment may not be accurate.
60
+
61
+ ### Recommendations
62
+
63
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
64
+
65
+ When using this model on a variety that it was not trained on, better results can be obtained by first adapting the model to the data to be aligned. See https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/adapt_acoustic_model.html and https://github.com/mmcauliffe/mfa-adaptation for example usage and scripts.
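The adapt-then-align recommendation can be sketched as two MFA command-line calls assembled in Python. This is a minimal sketch and not part of the commit: the helper name `adapt_then_align_commands`, the folder names, the output file name `english_mfa_adapted.zip`, and the choice of `english_us_mfa` as the dictionary are assumptions for illustration. The positional argument order (corpus, dictionary, acoustic model, output) is taken from the MFA 3.x documentation for `mfa adapt` and `mfa align`; check `mfa adapt --help` on your installed version before relying on it.

```python
import shlex


def adapt_then_align_commands(corpus_dir, output_dir,
                              adapted_model="english_mfa_adapted.zip",
                              dictionary="english_us_mfa",
                              acoustic_model="english_mfa"):
    """Return the two MFA commands for an adapt-then-align workflow.

    All default names are illustrative assumptions, not values from the commit.
    """
    # Step 1: adapt the pretrained model to the corpus and save the result.
    adapt = ["mfa", "adapt", str(corpus_dir), dictionary,
             acoustic_model, str(adapted_model)]
    # Step 2: align the same corpus again, now with the adapted model.
    align = ["mfa", "align", str(corpus_dir), dictionary,
             str(adapted_model), str(output_dir)]
    return [adapt, align]


if __name__ == "__main__":
    # Only prints the commands; run them in a shell where MFA is installed.
    for cmd in adapt_then_align_commands("my_corpus", "my_corpus_aligned"):
        print(shlex.join(cmd))
```

Building the commands as lists keeps paths with spaces intact if they are later passed to `subprocess.run`.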
66
+
67
+ ## How to Get Started with the Model
68
+
69
+ Use the code below to get started with the model.
70
+
71
+ To get started, follow the instructions for [installing MFA](https://montreal-forced-aligner.readthedocs.io/en/latest/getting_started.html). To align files using this model, use the [mfa align](https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/alignment.html) command.
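As a rough illustration of the alignment step, the sketch below wraps the MFA command line in Python. It is not part of the commit: the helper names `download_commands` and `build_align_command`, the folder names, and the choice of `english_us_mfa` as the dictionary are assumptions. The commands themselves (`mfa model download`, `mfa align` with corpus, dictionary, acoustic model, and output directory, plus the optional `--beam` setting) follow the MFA documentation.

```python
import shlex
import shutil
import subprocess
from pathlib import Path


def download_commands(dictionary="english_us_mfa", acoustic_model="english_mfa"):
    """Commands that fetch the pretrained acoustic model and a dictionary."""
    return [
        ["mfa", "model", "download", "acoustic", acoustic_model],
        ["mfa", "model", "download", "dictionary", dictionary],
    ]


def build_align_command(corpus_dir, output_dir,
                        dictionary="english_us_mfa",
                        acoustic_model="english_mfa",
                        beam=None):
    """Return the argument list for `mfa align`.

    `beam` is optional; a wider beam can help on data that differs
    from the training data, at the cost of slower alignment.
    """
    cmd = ["mfa", "align", str(corpus_dir), dictionary,
           acoustic_model, str(output_dir)]
    if beam is not None:
        cmd += ["--beam", str(beam)]
    return cmd


if __name__ == "__main__":
    corpus = Path("my_corpus")            # hypothetical folder of audio + transcripts
    output = Path("my_corpus_aligned")    # hypothetical output folder for TextGrids
    commands = download_commands() + [build_align_command(corpus, output)]
    for cmd in commands:
        if shutil.which("mfa") and corpus.is_dir():
            subprocess.run(cmd, check=True)   # run for real when MFA is installed
        else:
            print(shlex.join(cmd))            # otherwise just show the command
```

Swapping `dictionary` for `english_uk_mfa`, `english_india_mfa`, or `english_nigeria_mfa` selects one of the other dictionaries listed in this model card.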
72
+
73
+ ## Dictionary Details
74
+
75
+ #### Details for english_nigeria_mfa dictionary and G2P model
76
+
77
+ - **Source:** wikipron
78
+ - **Orthography:** Latin
79
+ - **Phone set:** MFA
80
+ - **Words:** 77,407
81
+ - **Phones:** 66
82
+ - **Graphemes:** 46
83
+
84
+ ##### IPA chart
85
+
86
+ ###### Consonants
87
+
88
+ | Manner | Labial | Labiodental | Dental | Alveolar | Alveopalatal | Palatal | Velar | Glottal |
89
+ | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
90
+ | **Nasal** | m | ɱ | | n | | ɲ | ŋ | |
91
+ | Palatalized | | | | | | | | |
92
+ | **Stop** | p b | | t̪ d̪ | t d | | c ɟ | k ɡ | |
93
+ | Aspirated | | | | | | | | |
94
+ | Labialized | | | | tʷ | | cʷ ɟʷ | kʷ ɡʷ | |
95
+ | Palatalized | pʲ bʲ | | | tʲ dʲ | | | | |
96
+ | **Affricate** | | | | | tʃ dʒ | | | |
97
+ | **Sibilant** | | | | s z | ʃ | | | |
98
+ | **Fricative** | | f v | θ ð | | | ç | | h|
99
+ | Palatalized | | fʲ vʲ | | | | | | |
100
+ | **Approximant** | w | | | ɹ | | j | | |
101
+ | **Lateral** | | | | l ɫ | | ʎ | | |
102
+
103
+ ###### Vowels
104
+
105
+ | | Front | Near-Front | Central | Near-Back | Back |
106
+ | :----: | :----: | :----: | :----: | :----: | :----: |
107
+ | **Close** | i iː | | | | u uː|
108
+ | | | | | ʊ | |
109
+ | **Close-Mid** | e | | | | o|
110
+ | | | | | | |
111
+ | **Open-Mid** | ɛ ɛː | | ɜ ɜː | | ɔ|
112
+ | | | | | | |
113
+ | **Open** | | | a aː | | |
114
+
115
+ ##### Diphthongs
116
+ * aj
117
+ * ɔj
118
+ * aw
119
+
120
+
121
+ ##### Other
122
+ * kp
123
+
124
+
125
+ #### Details for english_india_mfa dictionary and G2P model
126
+
127
+ - **Source:** wikipron
128
+ - **Orthography:** Latin
129
+ - **Phone set:** MFA
130
+ - **Words:** 123,926
131
+ - **Phones:** 64
132
+ - **Graphemes:** 62
133
+
134
+ ##### IPA chart
135
+
136
+ ###### Consonants
137
+
138
+ | Manner | Labial | Labiodental | Dental | Alveolar | Alveopalatal | Retroflex | Palatal | Velar | Glottal |
139
+ | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
140
+ | **Nasal** | m | ɱ | | n | | | ɲ | ŋ | |
141
+ | Palatalized | mʲ | | | | | | | | |
142
+ | **Stop** | p b | | t̪ d̪ | | | ʈ ɖ | c ɟ | k ɡ | ʔ|
143
+ | Labialized | pʷ | | | | | ʈʷ | cʷ ɟʷ | kʷ ɡʷ | |
144
+ | Palatalized | pʲ | | | | | ʈʲ | | | |
145
+ | **Affricate** | | | | | tʃ dʒ | | | | |
146
+ | **Sibilant** | | | | s z | ʃ ʒ | | | | |
147
+ | **Fricative** | | f | | | | | ç | | h|
148
+ | Palatalized | | fʲ | | | | | | | |
149
+ | **Approximant** | | ʋ | | ɹ | | | j | | |
150
+ | **Tap** | | | | ɾ | | | | | |
151
+ | **Lateral** | | | | l | | | ʎ | | |
152
+
153
+ ###### Vowels
154
+
155
+ | | Front | Near-Front | Central | Near-Back | Back |
156
+ | :----: | :----: | :----: | :----: | :----: | :----: |
157
+ | **Close** | i iː | | ʉ ʉː | | |
158
+ | | | ɪ | | ʊ | |
159
+ | **Close-Mid** | eː | | | | oː|
160
+ | | | | ə | | |
161
+ | **Open-Mid** | ɛ ɛː | | ɜ ɜː | | |
162
+ | | | | | | |
163
+ | **Open** | | | a aː | | ɑ ɑː ɒ ɒː |
164
+
165
+ ##### Diphthongs
166
+ * aj
167
+ * ɔj
168
+ * aw
169
+
170
+
171
+ #### Details for english_uk_mfa dictionary and G2P model
172
+
173
+ - **Source:** wikipron
174
+ - **Orthography:** Latin
175
+ - **Phone set:** MFA
176
+ - **Words:** 186,963
177
+ - **Phones:** 78
178
+ - **Graphemes:** 88
179
+
180
+ ##### IPA chart
181
+
182
+ ###### Consonants
183
+
184
+ | Manner | Labial | Labiodental | Dental | Alveolar | Alveopalatal | Palatal | Velar | Glottal |
185
+ | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
186
+ | **Nasal** | m m̩ | ɱ | | n n̩ | | ɲ | ŋ | |
187
+ | Palatalized | mʲ | | | | | | | |
188
+ | **Stop** | p b | | t̪ d̪ | t d | | c ɟ | k ɡ | ʔ|
189
+ | Aspirated | pʰ | | | tʰ | | cʰ | kʰ | |
190
+ | Labialized | pʷ | | | tʷ | | cʷ ɟʷ | kʷ ɡʷ | |
191
+ | Palatalized | pʲ bʲ | | | tʲ dʲ | | | | |
192
+ | **Affricate** | | | | | tʃ dʒ | | | |
193
+ | **Sibilant** | | | | s z | ʃ ʒ | | | |
194
+ | **Fricative** | | f v | θ ð | | | ç | | h|
195
+ | Labialized | | vʷ | | | | | | |
196
+ | Palatalized | | fʲ vʲ | | | | | | |
197
+ | **Approximant** | w | | | ɹ | | j | | |
198
+ | **Lateral** | | | | l ɫ ɫ̩ | | ʎ | | |
199
+
200
+ ###### Vowels
201
+
202
+ | | Front | Near-Front | Central | Near-Back | Back |
203
+ | :----: | :----: | :----: | :----: | :----: | :----: |
204
+ | **Close** | i iː | | ʉ ʉː | | |
205
+ | | | ɪ | | ʊ | |
206
+ | **Close-Mid** | e ej | | | | |
207
+ | | | | ə | | |
208
+ | **Open-Mid** | ɛ ɛː | | ɜ ɜː | | |
209
+ | | | | ɐ | | |
210
+ | **Open** | | | a | | ɑ ɑː ɒ ɒː |
211
+
212
+ ##### Diphthongs
213
+ * ej
214
+ * əw
215
+ * ɔj
216
+ * aw
217
+ * aj
218
+
219
+
220
+ #### Details for english_us_mfa dictionary and G2P model
221
+
222
+ - **Source:** wikipron
223
+ - **Orthography:** Latin
224
+ - **Phone set:** MFA
225
+ - **Words:** 251,336
226
+ - **Phones:** 78
227
+ - **Graphemes:** 87
228
+
229
+ ##### IPA chart
230
+
231
+ ###### Consonants
232
+
233
+ | Manner | Labial | Labiodental | Dental | Alveolar | Alveopalatal | Palatal | Velar | Glottal |
234
+ | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
235
+ | **Nasal** | m m̩ | ɱ | | n n̩ | | ɲ | ŋ | |
236
+ | Palatalized | mʲ | | | | | | | |
237
+ | **Stop** | p b | | t̪ d̪ | t d | | c ɟ | k ɡ | ʔ|
238
+ | Aspirated | pʰ | | | tʰ | | cʰ | kʰ | |
239
+ | Labialized | pʷ | | | tʷ | | cʷ ɟʷ | kʷ ɡʷ | |
240
+ | Palatalized | pʲ bʲ | | | tʲ dʲ | | | | |
241
+ | **Affricate** | | | | | tʃ dʒ | | | |
242
+ | **Sibilant** | | | | s z | ʃ ʒ | | | |
243
+ | **Fricative** | | f v | θ ð | | | ç | | h|
244
+ | Palatalized | | fʲ vʲ | | | | | | |
245
+ | **Approximant** | w | | | ɹ | | j | | |
246
+ | **Tap** | | | | ɾ ɾ̃ | | | | |
247
+ | Palatalized | | | | ɾʲ | | | | |
248
+ | **Lateral** | | | | l ɫ ɫ̩ | | ʎ | | |
249
+
250
+ ###### Vowels
251
+
252
+ | | Front | Near-Front | Central | Near-Back | Back |
253
+ | :----: | :----: | :----: | :----: | :----: | :----: |
254
+ | **Close** | i iː | | ʉ ʉː | | |
255
+ | | | ɪ | | ʊ | |
256
+ | **Close-Mid** | ej | | | | ow|
257
+ | | | | ə ɚ | | |
258
+ | **Open-Mid** | ɛ | | ɝ | | |
259
+ | | æ | | ɐ | | |
260
+ | **Open** | | | | | ɑ ɑː ɒ ɒː |
261
+
262
+ ##### Diphthongs
263
+ * ej
264
+ * ɔj
265
+ * aw
266
+ * ow
267
+ * aj
268
+
269
+
270
+ ## Training Details
271
+
272
+ ### Training Data
273
+
274
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
275
+
276
+ #### ARU English corpus
277
+
278
+ - **Source:** http://datacat.liverpool.ac.uk/681/
279
+ - **License:** CC BY 3.0
280
+ - **Dialects:** uk
281
+ - **Number of hours:** 7.13
282
+ - **Number of utterances:** 8,640
283
+ - **Number of speakers:** 12
284
+ - **Female speakers:** 0
285
+ - **Male speakers:** 0
286
+ - **Unknown speakers:** 0
287
+
288
+
289
+ #### Common Voice English
290
+
291
+ - **Source:** https://voice.mozilla.org/en/datasets
292
+ - **License:** CC-0
293
+ - **Dialects:** us, uk, nigeria, india
294
+ - **Number of hours:** 2,322.80
295
+ - **Number of utterances:** 1,625,987
296
+ - **Number of speakers:** 71,160
297
+ - **Female speakers:** 3,750
298
+ - **Male speakers:** 14,586
299
+ - **Unknown speakers:** 52,824
300
+
301
+
302
+ #### Corpus of Regional African American Language
303
+
304
+ - **Source:** https://oraal.uoregon.edu/coraal
305
+ - **License:** CC BY-NC-SA 4.0
306
+ - **Dialects:** us
307
+ - **Number of hours:** 124.31
308
+ - **Number of utterances:** 236,792
309
+ - **Number of speakers:** 193
310
+ - **Female speakers:** 88
311
+ - **Male speakers:** 94
312
+ - **Unknown speakers:** 11
313
+
314
+
315
+ #### Google Nigerian English
316
+
317
+ - **Source:** https://openslr.org/70/
318
+ - **License:** CC BY-SA 4.0
319
+ - **Dialects:** nigeria
320
+ - **Number of hours:** 5.77
321
+ - **Number of utterances:** 3,359
322
+ - **Number of speakers:** 31
323
+ - **Female speakers:** 19
324
+ - **Male speakers:** 12
325
+ - **Unknown speakers:** 0
326
+
327
+
328
+ #### Google UK and Ireland English
329
+
330
+ - **Source:** https://openslr.org/83/
331
+ - **License:** CC BY-SA 4.0
332
+ - **Dialects:** uk
333
+ - **Number of hours:** 31.29
334
+ - **Number of utterances:** 17,877
335
+ - **Number of speakers:** 120
336
+ - **Female speakers:** 49
337
+ - **Male speakers:** 71
338
+ - **Unknown speakers:** 0
339
+
340
+
341
+ #### ICE-Nigeria
342
+
343
+ - **Source:** https://sourceforge.net/projects/ice-nigeria/
344
+ - **License:** CC BY-NC-SA 3.0
345
+ - **Dialects:** nigeria
346
+ - **Number of hours:** 52.86
347
+ - **Number of utterances:** 113,664
348
+ - **Number of speakers:** 1,276
349
+ - **Female speakers:** 293
350
+ - **Male speakers:** 853
351
+ - **Unknown speakers:** 130
352
+
353
+
354
+ #### L2-ARCTIC
355
+
356
+ - **Source:** https://psi.engr.tamu.edu/l2-arctic-corpus/
357
+ - **License:** CC BY-NC 4.0
358
+ - **Dialects:** N/A
359
+ - **Number of hours:** 27.51
360
+ - **Number of utterances:** 27,042
361
+ - **Number of speakers:** 24
362
+ - **Female speakers:** 12
363
+ - **Male speakers:** 12
364
+ - **Unknown speakers:** 0
365
+
366
+
367
+ #### LibriSpeech English
368
+
369
+ - **Source:** https://openslr.org/12/
370
+ - **License:** CC BY 4.0
371
+ - **Dialects:** us
372
+ - **Number of hours:** 982.10
373
+ - **Number of utterances:** 292,367
374
+ - **Number of speakers:** 2,484
375
+ - **Female speakers:** 1,283
376
+ - **Male speakers:** 1,201
377
+ - **Unknown speakers:** 0
378
+
379
+
380
+ #### NCHLT English
381
+
382
+ - **Source:** https://repo.sadilar.org/handle/20.500.12185/274
383
+ - **License:** CC BY 3.0
384
+ - **Dialects:** N/A
385
+ - **Number of hours:** 56.43
386
+ - **Number of utterances:** 77,412
387
+ - **Number of speakers:** 210
388
+ - **Female speakers:** 0
389
+ - **Male speakers:** 0
390
+ - **Unknown speakers:** 210
391
+
392
+
393
+ #### A Scripted Pakistani English Daily-use Speech Corpus
394
+
395
+ - **Source:** https://magichub.com/datasets/pakistani-english-scripted-speech-corpus-daily-use-sentence/
396
+ - **License:** CC BY-NC-ND 4.0
397
+ - **Dialects:** india
398
+ - **Number of hours:** 4.00
399
+ - **Number of utterances:** 2,191
400
+ - **Number of speakers:** 7
401
+ - **Female speakers:** 3
402
+ - **Male speakers:** 4
403
+ - **Unknown speakers:** 0
404
+
405
+
406
+ ### Training Procedure
407
+
408
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
409
+
410
+ #### Preprocessing
411
+
412
+ Preprocessing includes fixes and orthographic standardization applied to the various corpora.
413
+
414
+
415
+ #### Training Hyperparameters
416
+
417
+ - **Training regime:** [Training configuration](config.yaml)
418
+
419
+ ## Evaluation
420
+
421
+ <!-- This section describes the evaluation protocols and provides the results. -->
422
+
423
+ ### Testing Data, Factors & Metrics
424
+
425
+ #### Testing Data
426
+
427
+ <!-- This should link to a Dataset Card if possible. -->
428
+
429
+ N/A
430
+
431
+ #### Factors
432
+
433
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
434
+
435
+ N/A
436
+
437
+ #### Metrics
438
+
439
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
440
+
441
+ N/A
442
+
443
+ ### Results
444
+
445
+ N/A
446
+
447
+ #### Summary
448
+
449
+
450
+
451
+ ## Technical Specifications
452
+
453
+ ### Model Architecture and Objective
454
+
455
+ HMM-GMM model
456
+
457
+ #### Software
458
+
459
+ This model was trained via the [Montreal Forced Aligner](https://montreal-forced-aligner.readthedocs.io/).
460
+
461
+ ## Citation
462
+
463
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
464
+
465
+ **BibTeX:**
466
+
467
+ ```
468
+ @techreport{mfa_english_mfa_acoustic_2026,
469
+ author={McAuliffe, Michael and Sonderegger, Morgan},
470
+ title={Global English MFA acoustic model v3.3.0},
471
+ address={\url{https://huggingface.co/MontrealCorpusTools/english_mfa}},
472
+ year={2026},
473
+ month={Jun},
474
+ }
475
+ ```
476
+
477
+ **APA:**
478
+
479
+ ```
480
+ McAuliffe, M. & Sonderegger, M. (2026). Global English MFA acoustic model v3.3.0. Available at https://huggingface.co/MontrealCorpusTools/english_mfa.
481
+ ```
482
+
483
+ ## Contact
484
+
485
+ For questions and issues specific to this model, please open a discussion at https://huggingface.co/MontrealCorpusTools/english_mfa/discussions. For broader MFA issues, file an issue at https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues.
acoustic/final.alimdl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1780c629972e968a2fc4a7808c0076945df62865a0acb6ad09a02feed8d74edf
3
- size 50106787
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f5ef61577c9af4205263a5e56f4c7e3cfc84d153f4617939d70bfb96ff0b670d
3
+ size 50089826
acoustic/final.mdl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1d3138688bfab925e108dcc9519d9efe53584aa935b41eb130450cb8ac3d48e8
3
- size 50106787
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8ea871f37206b681b82de9bd9d2f8035273ef7e0259f073f8c94a81a34ec8349
3
+ size 50089826
acoustic/lda.mat CHANGED
Binary files a/acoustic/lda.mat and b/acoustic/lda.mat differ
 
acoustic/meta.json CHANGED
@@ -1,487 +1 @@
1
- {
2
- "phones": [
3
- "a",
4
- "aj",
5
- "aw",
6
- "aː",
7
- "b",
8
- "bʲ",
9
- "c",
10
- "cʰ",
11
- "cʷ",
12
- "d",
13
- "dʒ",
14
- "dʲ",
15
- "d̪",
16
- "e",
17
- "ej",
18
- "eː",
19
- "f",
20
- "fʲ",
21
- "fʷ",
22
- "h",
23
- "i",
24
- "iː",
25
- "j",
26
- "k",
27
- "kp",
28
- "kʰ",
29
- "kʷ",
30
- "l",
31
- "m",
32
- "mʲ",
33
- "m̩",
34
- "n",
35
- "n̩",
36
- "o",
37
- "ow",
38
- "oː",
39
- "p",
40
- "pʰ",
41
- "pʲ",
42
- "pʷ",
43
- "s",
44
- "t",
45
- "tʃ",
46
- "tʰ",
47
- "tʲ",
48
- "tʷ",
49
- "t̪",
50
- "u",
51
- "uː",
52
- "v",
53
- "vʲ",
54
- "vʷ",
55
- "w",
56
- "z",
57
- "æ",
58
- "ç",
59
- "ð",
60
- "ŋ",
61
- "ɐ",
62
- "ɑ",
63
- "ɑː",
64
- "ɒ",
65
- "ɒː",
66
- "ɔ",
67
- "ɔj",
68
- "ɖ",
69
- "ə",
70
- "əw",
71
- "ɚ",
72
- "ɛ",
73
- "ɛː",
74
- "ɜ",
75
- "ɜː",
76
- "ɝ",
77
- "ɟ",
78
- "ɟʷ",
79
- "ɡ",
80
- "ɡb",
81
- "ɡʷ",
82
- "ɪ",
83
- "ɫ",
84
- "ɫ̩",
85
- "ɱ",
86
- "ɲ",
87
- "ɹ",
88
- "ɾ",
89
- "ɾʲ",
90
- "ɾ̃",
91
- "ʃ",
92
- "ʈ",
93
- "ʈʲ",
94
- "ʈʷ",
95
- "ʉ",
96
- "ʉː",
97
- "ʊ",
98
- "ʋ",
99
- "ʎ",
100
- "ʒ",
101
- "ʔ",
102
- "θ"
103
- ],
104
- "phone_mapping": {
105
- "<eps>": 0,
106
- "sil": 1,
107
- "spn": 2,
108
- "a": 3,
109
- "aj": 4,
110
- "aw": 5,
111
- "aː": 6,
112
- "b": 7,
113
- "bʲ": 8,
114
- "c": 9,
115
- "cʰ": 10,
116
- "cʷ": 11,
117
- "d": 12,
118
- "dʒ": 13,
119
- "dʲ": 14,
120
- "d̪": 15,
121
- "e": 16,
122
- "ej": 17,
123
- "eː": 18,
124
- "f": 19,
125
- "fʲ": 20,
126
- "fʷ": 21,
127
- "h": 22,
128
- "i": 23,
129
- "iː": 24,
130
- "j": 25,
131
- "k": 26,
132
- "kp": 27,
133
- "kʰ": 28,
134
- "kʷ": 29,
135
- "l": 30,
136
- "m": 31,
137
- "mʲ": 32,
138
- "m̩": 33,
139
- "n": 34,
140
- "n̩": 35,
141
- "o": 36,
142
- "ow": 37,
143
- "oː": 38,
144
- "p": 39,
145
- "pʰ": 40,
146
- "pʲ": 41,
147
- "pʷ": 42,
148
- "s": 43,
149
- "t": 44,
150
- "tʃ": 45,
151
- "tʰ": 46,
152
- "tʲ": 47,
153
- "tʷ": 48,
154
- "t̪": 49,
155
- "u": 50,
156
- "uː": 51,
157
- "v": 52,
158
- "vʲ": 53,
159
- "vʷ": 54,
160
- "w": 55,
161
- "z": 56,
162
- "æ": 57,
163
- "ç": 58,
164
- "ð": 59,
165
- "ŋ": 60,
166
- "ɐ": 61,
167
- "ɑ": 62,
168
- "ɑː": 63,
169
- "ɒ": 64,
170
- "ɒː": 65,
171
- "ɔ": 66,
172
- "ɔj": 67,
173
- "ɖ": 68,
174
- "ə": 69,
175
- "əw": 70,
176
- "ɚ": 71,
177
- "ɛ": 72,
178
- "ɛː": 73,
179
- "ɜ": 74,
180
- "ɜː": 75,
181
- "ɝ": 76,
182
- "ɟ": 77,
183
- "ɟʷ": 78,
184
- "ɡ": 79,
185
- "ɡb": 80,
186
- "ɡʷ": 81,
187
- "ɪ": 82,
188
- "ɫ": 83,
189
- "ɫ̩": 84,
190
- "ɱ": 85,
191
- "ɲ": 86,
192
- "ɹ": 87,
193
- "ɾ": 88,
194
- "ɾʲ": 89,
195
- "ɾ̃": 90,
196
- "ʃ": 91,
197
- "ʈ": 92,
198
- "ʈʲ": 93,
199
- "ʈʷ": 94,
200
- "ʉ": 95,
201
- "ʉː": 96,
202
- "ʊ": 97,
203
- "ʋ": 98,
204
- "ʎ": 99,
205
- "ʒ": 100,
206
- "ʔ": 101,
207
- "θ": 102
208
- },
209
- "phone_groups": {
210
- "0": [
211
- "kp",
212
- "p",
213
- "pʰ",
214
- "pʲ",
215
- "pʷ"
216
- ],
217
- "1": [
218
- "b",
219
- "bʲ",
220
- "ɡb"
221
- ],
222
- "2": [
223
- "f",
224
- "fʲ",
225
- "fʷ"
226
- ],
227
- "3": [
228
- "v",
229
- "vʲ",
230
- "vʷ"
231
- ],
232
- "4": [
233
- "θ"
234
- ],
235
- "5": [
236
- "t̪"
237
- ],
238
- "6": [
239
- "ð"
240
- ],
241
- "7": [
242
- "d̪"
243
- ],
244
- "8": [
245
- "t",
246
- "tʰ",
247
- "tʲ",
248
- "tʷ",
249
- "ʈ",
250
- "ʈʲ",
251
- "ʈʷ"
252
- ],
253
- "9": [
254
- "ʔ"
255
- ],
256
- "10": [
257
- "d",
258
- "dʲ",
259
- "ɖ"
260
- ],
261
- "11": [
262
- "ɾ",
263
- "ɾʲ"
264
- ],
265
- "12": [
266
- "tʃ"
267
- ],
268
- "13": [
269
- "dʒ"
270
- ],
271
- "14": [
272
- "ʃ"
273
- ],
274
- "15": [
275
- "ʒ"
276
- ],
277
- "16": [
278
- "s"
279
- ],
280
- "17": [
281
- "z"
282
- ],
283
- "18": [
284
- "ɹ"
285
- ],
286
- "19": [
287
- "m",
288
- "m̩"
289
- ],
290
- "20": [
291
- "mʲ"
292
- ],
293
- "21": [
294
- "ɱ"
295
- ],
296
- "22": [
297
- "n",
298
- "n̩"
299
- ],
300
- "23": [
301
- "ɲ"
302
- ],
303
- "24": [
304
- "ɾ̃"
305
- ],
306
- "25": [
307
- "ŋ"
308
- ],
309
- "26": [
310
- "l"
- ],
- "27": [
- "ɫ",
- "ɫ̩"
- ],
- "28": [
- "ʎ"
- ],
- "29": [
- "ɟ",
- "ɟʷ",
- "ɡ",
- "ɡʷ"
- ],
- "30": [
- "c",
- "cʰ",
- "cʷ"
- ],
- "31": [
- "k",
- "kʰ",
- "kʷ"
- ],
- "32": [
- "ç"
- ],
- "33": [
- "h"
- ],
- "34": [
- "ɐ",
- "ə"
- ],
- "35": [
- "ɜ",
- "ɜː"
- ],
- "36": [
- "ɚ",
- "ɝ"
- ],
- "37": [
- "ʊ"
- ],
- "38": [
- "ɪ"
- ],
- "39": [
- "ɑ",
- "ɑː"
- ],
- "40": [
- "ɒ",
- "ɒː",
- "ɔ"
- ],
- "41": [
- "a",
- "aː"
- ],
- "42": [
- "æ"
- ],
- "43": [
- "aj"
- ],
- "44": [
- "aw"
- ],
- "45": [
- "i",
- "iː"
- ],
- "46": [
- "j"
- ],
- "47": [
- "ɛ",
- "ɛː"
- ],
- "48": [
- "e",
- "ej",
- "eː"
- ],
- "49": [
- "ʉ",
- "ʉː"
- ],
- "50": [
- "u",
- "uː"
- ],
- "51": [
- "w"
- ],
- "52": [
- "ʋ"
- ],
- "53": [
- "ɔj"
- ],
- "54": [
- "o",
- "ow",
- "oː",
- "əw"
- ]
- },
- "version": "3.1.0",
- "architecture": "gmm-hmm",
- "train_date": "2024-06-12 12:16:18.584033",
- "training": {
- "audio_duration": 12862940.052134357,
- "num_speakers": 75018,
- "num_utterances": 2374755,
- "num_oovs": 0,
- "average_log_likelihood": -0.08382050453507844
- },
- "dictionaries": {
- "names": [
- "default",
- "english_india_mfa",
- "english_nigeria_mfa",
- "english_uk_mfa",
- "english_us_mfa",
- "nonnative"
- ],
- "default": "default",
- "silence_word": "<eps>",
- "use_g2p": false,
- "oov_word": "<unk>",
- "bracketed_word": "[bracketed]",
- "laughter_word": "[laughter]",
- "clitic_marker": "'",
- "position_dependent_phones": false
- },
- "language": "unknown",
- "features": {
- "type": "mfcc",
- "use_energy": true,
- "frame_shift": 10,
- "frame_length": 25,
- "snip_edges": false,
- "low_frequency": 20,
- "high_frequency": 7800,
- "sample_frequency": 16000,
- "dither": 0.0001,
- "energy_floor": 1.0,
- "num_coefficients": 13,
- "num_mel_bins": 23,
- "cepstral_lifter": 22,
- "preemphasis_coefficient": 0.97,
- "uses_cmvn": true,
- "uses_deltas": true,
- "uses_voiced": false,
- "uses_splices": false,
- "uses_speaker_adaptation": true,
- "use_pitch": false,
- "use_voicing": false,
- "min_f0": 50,
- "max_f0": 800,
- "delta_pitch": 0.005,
- "penalty_factor": 0.1,
- "silence_weight": 0.0,
- "splice_left_context": 3,
- "splice_right_context": 3
- },
- "oov_phone": "spn",
- "optional_silence_phone": "sil",
- "phone_set_type": "UNKNOWN",
- "silence_probability": 0.17,
- "initial_silence_probability": 0.17,
- "final_silence_correction": 0.99,
- "final_non_silence_correction": 0.2966666666666667
- }
+ {"phones": ["a", "aj", "aw", "aː", "b", "bʲ", "c", "cʰ", "cʷ", "d", "dʒ", "dʲ", "d̪", "e", "ej", "eː", "f", "fʲ", "fʷ", "h", "i", "iː", "j", "k", "kp", "kʰ", "kʷ", "l", "m", "mʲ", "m̩", "n", "n̩", "o", "ow", "oː", "p", "pʰ", "pʲ", "pʷ", "s", "t", "tʃ", "tʰ", "tʲ", "tʷ", "t̪", "u", "uː", "v", "vʲ", "vʷ", "w", "z", "æ", "ç", "ð", "ŋ", "ɐ", "ɑ", "ɑː", "ɒ", "ɒː", "ɔ", "ɔj", "ɖ", "ə", "əw", "ɚ", "ɛ", "ɛː", "ɜ", "ɜː", "ɝ", "ɟ", "ɟʷ", "ɡ", "ɡb", "ɡʷ", "ɪ", "ɫ", "ɫ̩", "ɱ", "ɲ", "ɹ", "ɾ", "ɾʲ", "ɾ̃", "ʃ", "ʈ", "ʈʲ", "ʈʷ", "ʉ", "ʉː", "ʊ", "ʋ", "ʎ", "ʒ", "ʔ", "θ"], "phone_mapping": {"<eps>": 0, "fʷ": 21, "a": 3, "ɐ": 61, "ɑ": 62, "ɒ": 64, "aː": 6, "ɑː": 63, "ɒː": 65, "æ": 57, "aj": 4, "aw": 5, "b": 7, "bʲ": 8, "c": 9, "ç": 58, "ɔ": 66, "cʰ": 10, "ɔj": 67, "cʷ": 11, "d": 12, "d̪": 15, "ð": 59, "dʲ": 14, "dʒ": 13, "ɖ": 68, "e": 16, "ə": 69, "ɚ": 71, "eː": 18, "ej": 17, "əw": 70, "ɛ": 72, "ɜ": 74, "ɝ": 76, "ɛː": 73, "ɜː": 75, "f": 19, "fʲ": 20, "ɡ": 79, "ɡb": 80, "ɡʷ": 81, "h": 22, "i": 23, "ɪ": 82, "iː": 24, "j": 25, "ɟ": 77, "ɟʷ": 78, "k": 26, "kʰ": 28, "kp": 27, "kʷ": 29, "l": 30, "ɫ": 83, "ɫ̩": 84, "m": 31, "ɱ": 85, "m̩": 33, "mʲ": 32, "n": 34, "ɲ": 86, "n̩": 35, "ŋ": 60, "o": 36, "oː": 38, "ow": 37, "p": 39, "pʰ": 40, "pʲ": 41, "pʷ": 42, "ɹ": 87, "ɾ": 88, "ɾ̃": 90, "ɾʲ": 89, "s": 43, "ʃ": 91, "sil": 1, "spn": 2, "t": 44, "ʈ": 92, "t̪": 49, "tʰ": 46, "tʲ": 47, "ʈʲ": 93, "tʃ": 45, "tʷ": 48, "ʈʷ": 94, "u": 50, "ʉ": 95, "ʊ": 97, "uː": 51, "ʉː": 96, "v": 52, "ʋ": 98, "vʲ": 53, "vʷ": 54, "w": 55, "ʎ": 99, "z": 56, "ʒ": 100, "ʔ": 101, "θ": 102}, "phone_groups": {}, "version": "3.3.0", "architecture": "gmm-hmm", "train_date": "2026-05-12 11:52:46.825571", "training": {"audio_duration": 11980290.129802473, "num_speakers": 74978, "num_utterances": 2211211, "num_oovs": 0, "average_log_likelihood": -52.032182980375495}, "dictionaries": {"names": ["english_india_mfa", "english_nigeria_mfa", "english_uk_mfa", "english_us_mfa"], "default": "english_us_mfa", "silence_word": "<eps>", 
"oov_word": "<unk>", "bracketed_word": "[bracketed]", "laughter_word": "[laughter]", "clitic_marker": null, "position_dependent_phones": false}, "language": "unknown", "tokenization": "simple", "features": {"type": "mfcc", "use_energy": false, "frame_shift": 10, "frame_length": 25, "snip_edges": false, "low_frequency": 20, "high_frequency": 7800, "sample_frequency": 16000, "dither": 0.0, "energy_floor": 0.0, "num_coefficients": 13, "num_mel_bins": 23, "cepstral_lifter": 22, "preemphasis_coefficient": 0.97, "uses_cmvn": true, "uses_deltas": true, "uses_voiced": false, "uses_splices": false, "uses_speaker_adaptation": false, "use_pitch": false, "use_voicing": false, "min_f0": 50, "max_f0": 800, "delta_pitch": 0.005, "penalty_factor": 0.1, "silence_weight": 0.0, "splice_left_context": 3, "splice_right_context": 3}, "oov_phone": "spn", "optional_silence_phone": "sil", "phone_set_type": "UNKNOWN", "silence_probability": 0.2866666666666666, "initial_silence_probability": 0.5, "final_silence_correction": null, "final_non_silence_correction": null, "duration_information": {"a": [0.10157499840622884, 0.06526069721451937], "ɐ": [0.08419964690109855, 0.046451834288303115], "ɑ": [0.11397424938770868, 0.053800949407109736], "ɒ": [0.09418769642235683, 0.04989197811590047], "aː": [0.15759416868532616, 0.12329133599948873], "ɑː": [0.1479302146800924, 0.061662796965765314], "ɒː": [0.1308269390097794, 0.06553694029126998], "æ": [0.10972953492614249, 0.055037496058891584], "aj": [0.1605449259587612, 0.06832224689398055], "aw": [0.17068080083182305, 0.06707574657828476], "b": [0.06923437282529099, 0.033919223278294705], "bʲ": [0.06527491981799272, 0.028623866766111474], "c": [0.09471972187051056, 0.033771833083609364], "ç": [0.07605847099327952, 0.040207307334823576], "ɔ": [0.09587455806201205, 0.057413876759444926], "cʰ": [0.11311451090428523, 0.03489236221323744], "ɔj": [0.19674860926956175, 0.07664456634816674], "cʷ": [0.11167146285909513, 0.03602714346714023], "d": 
[0.06965091614674732, 0.0373797606209854], "d̪": [0.057547479218109494, 0.16896180877276346], "ð": [0.0524053282164514, 0.025117123636817736], "dʲ": [0.06104562350963658, 0.02811272466144499], "dʒ": [0.10446355279476079, 0.04957242124147662], "ɖ": [0.07430240802121195, 0.043919906831057634], "e": [0.11454031986679995, 0.07046190359117653], "ə": [0.06154001640961594, 0.040201429226667494], "ɚ": [0.09721156907846097, 0.05642884744739512], "eː": [0.13428823101374412, 0.062386132804498375], "ej": [0.1276118848789818, 0.05971441305256562], "əw": [0.12553140106952618, 0.06115567852942176], "ɛ": [0.08491697183975094, 0.04096971948583678], "ɜ": [0.11962579292869427, 0.04537435458370203], "ɝ": [0.12737584340263244, 0.05708733657624939], "ɛː": [0.14009794837754147, 0.07996128711341692], "ɜː": [0.13470095026408815, 0.054349981360241044], "f": [0.10697421624716288, 0.04418166979324304], "fʲ": [0.11284702780596867, 0.0411054049444692], "ɡ": [0.07526050445142977, 0.03444226523471279], "ɡb": [0.09751554448052314, 0.05409676711699122], "ɡʷ": [0.08809184081852436, 0.04692488250644544], "h": [0.08660672118361584, 0.040624122449439075], "i": [0.09459434856090432, 0.06015131439302873], "ɪ": [0.06830450724231624, 0.03511234010039717], "iː": [0.10143787406028586, 0.05840759309624196], "j": [0.0825459297483352, 0.05326654640445931], "ɟ": [0.0766474878501772, 0.030192791961081236], "ɟʷ": [0.08722258877511356, 0.03186477349812215], "k": [0.08933208146763405, 0.04287748206167747], "kʰ": [0.10296721379074025, 0.03476324983444999], "kp": [0.09711637636562725, 0.04301044261560203], "kʷ": [0.11531015026312741, 0.03714559101863666], "l": [0.06830065852495266, 0.034652915474955776], "ɫ": [0.0878066565093525, 0.05523077344492323], "ɫ̩": [0.08832689508076336, 0.05855201626003051], "m": [0.07980226949780918, 0.041548323512223796], "ɱ": [0.070740150968006, 0.024138439855181663], "m̩": [0.08006235701331495, 0.0670923349263587], "mʲ": [0.07335477171111972, 0.029321946324644205], "n": 
[0.07191813138139301, 0.04374751873121665], "ɲ": [0.06542148750363078, 0.03419373040175059], "n̩": [0.09225788769292897, 0.062406962859662296], "ŋ": [0.08762470355792983, 0.043849409890137026], "o": [0.10849475992278206, 0.06840596862200177], "oː": [0.13336331816723984, 0.07385015010242489], "ow": [0.12764519862481935, 0.06854029448733062], "p": [0.09071366695531337, 0.03622525982991457], "pʰ": [0.10506242983899099, 0.03773000343602282], "pʲ": [0.09607966126381594, 0.032876109461520774], "pʷ": [0.10605321510841972, 0.04248433790804683], "ɹ": [0.07086134772860174, 0.038514792414387045], "ɾ": [0.053939316512832514, 0.07740569380208917], "ɾ̃": [0.09539373628990916, 0.23970420480132448], "ɾʲ": [0.03950175780476856, 0.01660296108814163], "s": [0.11357816423507326, 0.054458969770306564], "ʃ": [0.12859709282561968, 0.041161968949410946], "t": [0.0749092220579157, 0.046336768722317706], "ʈ": [0.08623666792238285, 0.058134635933384075], "t̪": [0.10996077656380078, 0.11498626245042727], "tʰ": [0.10183123845830042, 0.03867925802672462], "tʲ": [0.07898593631839401, 0.036588163501002496], "ʈʲ": [0.07832758219293608, 0.03324351901965623], "tʃ": [0.12145799685538634, 0.04858541703805097], "tʷ": [0.12272614546458566, 0.034461749511271254], "ʈʷ": [0.10453061121019647, 0.046062961025250954], "u": [0.08646152768529218, 0.10835995528660126], "ʉ": [0.0905707652836935, 0.060791946075206084], "ʊ": [0.06686625861635767, 0.04234201970320256], "uː": [0.10373890756678743, 0.11768627806624649], "ʉː": [0.09494554586736738, 0.062031901782881045], "v": [0.07185915859841337, 0.03433136275028505], "ʋ": [0.08608389455452445, 0.05491973579307793], "vʲ": [0.0665468357060222, 0.026873985261574713], "vʷ": [0.09806724356002167, 0.02614422550551663], "w": [0.073223942847704, 0.034440441683268076], "ʎ": [0.06968119679049356, 0.03262046397465663], "z": [0.0982185644916374, 0.06412919934151423], "ʒ": [0.08995120454225623, 0.02588555134596066], "ʔ": [0.1421603767868905, 0.27385289152961306], "θ": 
[0.1000807769403581, 0.049646023502314875]}}
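
The new `duration_information` field maps each phone to a pair of numbers. A minimal sketch of reading it, assuming (the diff does not say so) that each pair is `[mean, standard deviation]` of the phone's duration in seconds. The helper name `phones_by_duration` is hypothetical and not part of MFA; the four entries are copied from the JSON above.

```python
import json

# Excerpt copied from the "duration_information" field of the new meta.json.
# Assumption: each pair is [mean, standard deviation] in seconds.
meta_excerpt = json.loads("""
{
  "duration_information": {
    "ə": [0.06154001640961594, 0.040201429226667494],
    "ɪ": [0.06830450724231624, 0.03511234010039717],
    "aj": [0.1605449259587612, 0.06832224689398055],
    "ɔj": [0.19674860926956175, 0.07664456634816674]
  }
}
""")


def phones_by_duration(meta):
    """Return (phone, first value in milliseconds) pairs, longest first."""
    durations = meta["duration_information"]
    # Sort on the first element of each pair (assumed to be the mean).
    ranked = sorted(durations.items(), key=lambda item: item[1][0], reverse=True)
    return [(phone, round(values[0] * 1000, 1)) for phone, values in ranked]


for phone, ms in phones_by_duration(meta_excerpt):
    print(f"{phone}\t{ms} ms")
```

Under that reading, the diphthongs in the excerpt come out longest and the reduced vowel `ə` shortest.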
 
acoustic/tree CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:71d2741b42fe55707ca41d908d6157bc94c6171623b156f71274b13ecb6dade7
- size 468787
+ oid sha256:c34756df29372f4d253be21f485c6d6393c3638403ef1adcec552cb458db9842
+ size 428123
dictionary/english_india_mfa.dict CHANGED
The diff for this file is too large to render. See raw diff
 
dictionary/english_nigeria_mfa.dict CHANGED
The diff for this file is too large to render. See raw diff
 
dictionary/english_uk_mfa.dict CHANGED
The diff for this file is too large to render. See raw diff
 
dictionary/english_us_mfa.dict CHANGED
The diff for this file is too large to render. See raw diff
 
dictionary/rules.yaml ADDED
@@ -0,0 +1,1382 @@
+ dialects:
+   india:
+   - following_context: $
+     non_silence_before_correction: 0.06
+     preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 2.93
+     silence_before_correction: -0.12
+   - following_context: '[tʈpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.01
+     replacement: p
+     segment: b
+     silence_after_probability: 64.0
+     silence_before_correction: -0.19
+   - following_context: '[tpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.15
+     preceding_context: ''
+     probability: 0.01
+     replacement: k
+     segment: ɡ
+     silence_after_probability: 67.0
+     silence_before_correction: -0.24
+   - following_context: '[vʋf]ʲ?'
+     non_silence_before_correction: 0.12
+     preceding_context: ''
+     probability: 0.01
+     replacement: ɱ
+     segment: m
+     silence_after_probability: 16.0
+     silence_before_correction: -0.21
+   - following_context: '[vʋf]ʲ?'
+     non_silence_before_correction: 0.16
+     preceding_context: ''
+     probability: 0.01
+     replacement: ɱ
+     segment: n
+     silence_after_probability: 37.5
+     silence_before_correction: -0.26
+   - following_context: '[tʈdɖ]$'
+     non_silence_before_correction: 0.12
+     preceding_context: m
+     probability: 0.01
+     replacement: ''
+     segment: '[pb]'
+     silence_after_probability: 6.25
+     silence_before_correction: -0.2
+   - following_context: '[sz]$'
+     non_silence_before_correction: 0.11
+     preceding_context: n
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈdɖ]'
+     silence_after_probability: 1.35
+     silence_before_correction: -0.18
+   - following_context: '[dɖʈtcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.11
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: p
+     silence_after_probability: 9.14
+     silence_before_correction: -0.2
+   - following_context: '[pbcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.03
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 25.0
+     silence_before_correction: -0.08
+   - following_context: '[dɖtʈcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.14
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: b
+     silence_after_probability: 16.67
+     silence_before_correction: -0.24
+   - following_context: '[dɖtʈpb][ʲʷ]?'
+     non_silence_before_correction: 0.12
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: k
+     silence_after_probability: 4.29
+     silence_before_correction: -0.2
+   - following_context: ([tʈpkc][ʲʷ]?)? ɹ
+     non_silence_before_correction: 0.16
+     preceding_context: ''
+     probability: 0.01
+     replacement: ʃ
+     segment: s
+     silence_after_probability: 31.0
+     silence_before_correction: -0.25
+   - following_context: ɹ
+     non_silence_before_correction: 0.11
+     preceding_context: ''
+     probability: 0.01
+     replacement: tʃ
+     segment: '[tʈ][ʲʷ]?'
+     silence_after_probability: 44.0
+     silence_before_correction: -0.19
+   - following_context: ''
+     non_silence_before_correction: 0.15
+     preceding_context: ''
+     probability: 0.01
+     replacement: tʃ
+     segment: '[tʈ][ʲʷ]? ɹ'
+     silence_after_probability: 42.0
+     silence_before_correction: -0.23
+   - following_context: ə n
+     non_silence_before_correction: 0.11
+     preceding_context: ''
+     probability: 0.01
+     replacement: ʔ
+     segment: '[tʈ]'
+     silence_after_probability: 3.58
+     silence_before_correction: -0.17
+   - following_context: $
+     non_silence_before_correction: 0.07
+     preceding_context: ɪ
+     probability: 0.01
+     replacement: n
+     segment: ŋ
+     silence_after_probability: 3.44
+     silence_before_correction: -0.14
+   - following_context: z$
+     non_silence_before_correction: 0.1
+     preceding_context: ɪ
+     probability: 0.01
+     replacement: n
+     segment: ŋ
+     silence_after_probability: 4.61
+     silence_before_correction: -0.16
+   - following_context: ''
+     non_silence_before_correction: -0.06
+     preceding_context: ''
+     probability: 0.01
+     replacement: l ə
+     segment: ə l ə
+     silence_after_probability: 8.33
+     silence_before_correction: 0.04
+   - following_context: ''
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.01
+     replacement: n ə
+     segment: ə n ə
+     silence_after_probability: 26.33
+     silence_before_correction: -0.19
+   - following_context: ''
+     non_silence_before_correction: 0.29
+     preceding_context: ''
+     probability: 0.01
+     replacement: m ə
+     segment: ə m ə
+     silence_after_probability: 49.5
+     silence_before_correction: -0.38
+   - following_context: ''
+     non_silence_before_correction: 0.15
+     preceding_context: ''
+     probability: 0.01
+     replacement: ɹ ə
+     segment: ə ɹ ə
+     silence_after_probability: 11.75
+     silence_before_correction: -0.24
+   - following_context: ''
+     non_silence_before_correction: 0.09
+     preceding_context: ''
+     probability: 0.01
+     replacement: ɾ
+     segment: ɹ
+     silence_after_probability: 7.8
+     silence_before_correction: -0.17
+   - following_context: ''
+     non_silence_before_correction: 0.06
+     preceding_context: ''
+     probability: 0.01
+     replacement: a
+     segment: ɒ
+     silence_after_probability: 7.0
+     silence_before_correction: -0.12
+   - following_context: ''
+     non_silence_before_correction: -0.1
+     preceding_context: ''
+     probability: 0.01
+     replacement: a
+     segment: ɑ
+     silence_after_probability: 10.0
+     silence_before_correction: -0.01
+   - following_context: ''
+     non_silence_before_correction: 0.09
+     preceding_context: ''
+     probability: 0.01
+     replacement: aː
+     segment: ɒː
+     silence_after_probability: 4.5
+     silence_before_correction: -0.16
+   - following_context: ''
+     non_silence_before_correction: 0.09
+     preceding_context: ''
+     probability: 0.01
+     replacement: aː
+     segment: ɑː
+     silence_after_probability: 4.43
+     silence_before_correction: -0.16
+   - following_context: ''
+     non_silence_before_correction: 0.09
+     preceding_context: ''
+     probability: 0.01
+     replacement: dʒ
+     segment: z
+     silence_after_probability: 3.7
+     silence_before_correction: -0.16
+   - following_context: ''
+     non_silence_before_correction: 0.17
+     preceding_context: ''
+     probability: 0.01
+     replacement: dʒ
+     segment: ʒ
+     silence_after_probability: 6.09
+     silence_before_correction: -0.27
+   - following_context: ''
+     non_silence_before_correction: 0.22
+     preceding_context: ''
+     probability: 0.01
+     replacement: z
+     segment: ʒ
+     silence_after_probability: 3.64
+     silence_before_correction: -0.31
+   - following_context: ''
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.01
+     replacement: ʃ
+     segment: ʒ
+     silence_after_probability: 4.82
+     silence_before_correction: -0.18
+   - following_context: .*(ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
+     non_silence_before_correction: 0.15
+     preceding_context: ^
+     probability: 0.01
+     replacement: ''
+     segment: ə
+     silence_after_probability: 5.5
+     silence_before_correction: -0.24
+   - following_context: $
+     non_silence_before_correction: 0.09
+     preceding_context: '[sʃn]'
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 1.45
+     silence_before_correction: -0.16
+   - following_context: $
+     non_silence_before_correction: 0.09
+     preceding_context: '[zʒn]'
+     probability: 0.01
+     replacement: ''
+     segment: '[dɖ]'
+     silence_after_probability: 1.93
+     silence_before_correction: -0.16
+   - following_context: ''
+     non_silence_before_correction: 0.12
+     preceding_context: n
+     probability: 0.01
+     replacement: ''
+     segment: '[dɖ]'
+     silence_after_probability: 3.0
+     silence_before_correction: -0.21
+   - following_context: ə|ɚ
+     non_silence_before_correction: 0.07
+     preceding_context: ''
+     probability: 0.01
+     replacement: j
+     segment: i
+     silence_after_probability: 11.2
+     silence_before_correction: -0.15
+   - following_context: ''
+     non_silence_before_correction: 0.11
+     preceding_context: ''
+     probability: 0.01
+     replacement: dʒ
+     segment: '[dɖ][ʲʷ]? ɹ'
+     silence_after_probability: 16.67
+     silence_before_correction: -0.18
+   - following_context: ɹ
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: dʒ
+     segment: '[dɖ][ʲʷ]?'
+     silence_after_probability: 12.0
+     silence_before_correction: -0.21
+   nigeria:
+   - following_context: $
+     non_silence_before_correction: 0.11
+     preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 0.14
+     silence_before_correction: -0.28
+   - following_context: ''
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: d̪
+     segment: ð
+     silence_after_probability: 1.0
+     silence_before_correction: -0.42
+   - following_context: ''
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.01
+     replacement: t̪
+     segment: θ
+     silence_after_probability: 1.67
+     silence_before_correction: -0.22
+   - following_context: '[tʈpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.14
+     preceding_context: ''
+     probability: 0.01
+     replacement: p
+     segment: b
+     silence_after_probability: 27.0
+     silence_before_correction: -0.22
+   - following_context: '[vʋf]ʲ?'
+     non_silence_before_correction: 0.08
+     preceding_context: ''
+     probability: 0.01
+     replacement: ɱ
+     segment: m
+     silence_after_probability: 7.75
+     silence_before_correction: -0.13
+   - following_context: '[vʋf]ʲ?'
+     non_silence_before_correction: 0.12
+     preceding_context: ''
+     probability: 0.01
+     replacement: ɱ
+     segment: n
+     silence_after_probability: 15.5
+     silence_before_correction: -0.2
+   - following_context: '[tʈdɖ]$'
+     non_silence_before_correction: 0.17
+     preceding_context: m
+     probability: 0.01
+     replacement: ''
+     segment: '[pb]'
+     silence_after_probability: 0.92
+     silence_before_correction: -0.4
+   - following_context: '[sz]$'
+     non_silence_before_correction: 0.09
+     preceding_context: n
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈdɖ]'
+     silence_after_probability: 0.55
+     silence_before_correction: -0.15
+   - following_context: '[s]$'
+     non_silence_before_correction: 0.06
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: '[sʃ] t'
+     silence_after_probability: 2.0
+     silence_before_correction: -0.12
+   - following_context: '[dɖʈtcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.07
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: p
+     silence_after_probability: 2.71
+     silence_before_correction: -0.13
+   - following_context: '[pbcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.12
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 8.0
+     silence_before_correction: -0.46
+   - following_context: '[dɖtʈcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.18
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: b
+     silence_after_probability: 7.0
+     silence_before_correction: -0.41
+   - following_context: '[dɖtʈpb][ʲʷ]?'
+     non_silence_before_correction: 0.09
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: k
+     silence_after_probability: 1.57
+     silence_before_correction: -0.15
+   - following_context: ([tʈpkc][ʲʷ]?)? ɹ
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: ʃ
+     segment: s
+     silence_after_probability: 18.5
+     silence_before_correction: -0.24
+   - following_context: ɹ
+     non_silence_before_correction: 0.11
+     preceding_context: ''
+     probability: 0.01
+     replacement: tʃ
+     segment: '[tʈ][ʲʷ]?'
+     silence_after_probability: 30.0
+     silence_before_correction: -0.2
+   - following_context: ''
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: tʃ
+     segment: '[tʈ][ʲʷ]? ɹ'
+     silence_after_probability: 21.0
+     silence_before_correction: -0.26
+   - following_context: $
+     non_silence_before_correction: 0.05
+     preceding_context: ''
+     probability: 0.01
+     replacement: t s
+     segment: d z
+     silence_after_probability: 1.6
+     silence_before_correction: -0.09
+   - following_context: $
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: k s
+     segment: ɡ z
+     silence_after_probability: 2.82
+     silence_before_correction: -0.23
+   - following_context: $
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.01
+     replacement: s
+     segment: z
+     silence_after_probability: 1.33
+     silence_before_correction: -0.18
+   - following_context: ''
+     non_silence_before_correction: 0.07
+     preceding_context: ^
+     probability: 0.01
+     replacement: ''
+     segment: ç
+     silence_after_probability: 1.45
+     silence_before_correction: -0.12
+   - following_context: ''
+     non_silence_before_correction: 0.04
+     preceding_context: ^
+     probability: 0.01
+     replacement: ''
+     segment: h
+     silence_after_probability: 1.78
+     silence_before_correction: 0.02
+   - following_context: $
+     non_silence_before_correction: 0.1
+     preceding_context: ŋ
+     probability: 0.31
+     replacement: ''
+     segment: ɡ
+     silence_after_probability: 3.0
+     silence_before_correction: -0.2
+   - following_context: $
+     non_silence_before_correction: 0.11
+     preceding_context: '[sʃn]'
+     probability: 0.02
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 0.27
+     silence_before_correction: -0.22
+   - following_context: $
+     non_silence_before_correction: 0.09
+     preceding_context: '[zʒn]'
+     probability: 0.02
+     replacement: ''
+     segment: '[dɖ]'
+     silence_after_probability: 0.53
+     silence_before_correction: -0.2
+   - following_context: ''
+     non_silence_before_correction: 0.12
+     preceding_context: n
+     probability: 0.01
+     replacement: ''
+     segment: '[dɖ]'
+     silence_after_probability: 1.45
+     silence_before_correction: -0.24
+   - following_context: ''
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: dʒ
+     segment: '[dɖ][ʲʷ]? ɹ'
+     silence_after_probability: 15.67
+     silence_before_correction: -0.22
+   - following_context: ɹ
+     non_silence_before_correction: 0.14
+     preceding_context: ''
+     probability: 0.01
+     replacement: dʒ
+     segment: '[dɖ][ʲʷ]?'
+     silence_after_probability: 10.0
+     silence_before_correction: -0.23
+   uk:
+   - following_context: $
+     non_silence_before_correction: 0.08
+     preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 0.29
+     silence_before_correction: -0.12
+   - following_context: ''
+     non_silence_before_correction: 0.05
+     preceding_context: ''
+     probability: 0.01
+     replacement: d̪
+     segment: ð
+     silence_after_probability: 4.38
+     silence_before_correction: -0.1
+   - following_context: ''
+     non_silence_before_correction: 0.06
+     preceding_context: ''
+     probability: 0.01
+     replacement: t̪
+     segment: θ
+     silence_after_probability: 2.42
+     silence_before_correction: -0.12
+   - following_context: '[tʈpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.01
+     replacement: s
+     segment: z
+     silence_after_probability: 75.0
+     silence_before_correction: -0.22
+   - following_context: '[tʈpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.15
+     preceding_context: ''
+     probability: 0.01
+     replacement: t
+     segment: d
+     silence_after_probability: 7.67
+     silence_before_correction: -0.23
+   - following_context: '[tʈpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.03
+     preceding_context: ''
+     probability: 0.01
+     replacement: p
+     segment: b
+     silence_after_probability: 29.0
+     silence_before_correction: 0.08
+   - following_context: '[tpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.12
+     preceding_context: ''
+     probability: 0.01
+     replacement: k
+     segment: ɡ
+     silence_after_probability: 67.0
+     silence_before_correction: -0.2
+   - following_context: '[tʈpkcsʃf][ʲʷ]?'
+     non_silence_before_correction: 0.25
+     preceding_context: ''
+     probability: 0.01
+     replacement: tʃ
+     segment: dʒ
+     silence_after_probability: 60.0
+     silence_before_correction: -0.37
+   - following_context: '[vʋf]ʲ?'
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.03
+     replacement: ɱ
+     segment: m
+     silence_after_probability: 9.25
+     silence_before_correction: -0.17
+   - following_context: '[vʋf]ʲ?'
+     non_silence_before_correction: 0.13
+     preceding_context: ''
+     probability: 0.02
+     replacement: ɱ
+     segment: n
+     silence_after_probability: 11.0
+     silence_before_correction: -0.24
+   - following_context: '[tʈdɖ]$'
+     non_silence_before_correction: 0.1
+     preceding_context: m
+     probability: 0.1
+     replacement: ''
+     segment: '[pb]'
+     silence_after_probability: 0.75
+     silence_before_correction: -0.23
+   - following_context: '[sz]$'
+     non_silence_before_correction: 0.12
+     preceding_context: n
+     probability: 0.09
+     replacement: ''
+     segment: '[tʈdɖ]'
+     silence_after_probability: 0.55
+     silence_before_correction: -0.22
+   - following_context: '[s]$'
+     non_silence_before_correction: 0.2
+     preceding_context: ''
+     probability: 0.01
+     replacement: k
+     segment: '[sʃ] k'
+     silence_after_probability: 4.95
+     silence_before_correction: -0.28
+   - following_context: '[s]$'
+     non_silence_before_correction: 0.1
+     preceding_context: ''
+     probability: 0.06
+     replacement: ''
+     segment: '[sʃ] t'
+     silence_after_probability: 0.94
+     silence_before_correction: 0.33
+   - following_context: '[dɖʈtcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.05
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: p
+     silence_after_probability: 3.86
+     silence_before_correction: -0.09
+   - following_context: '[pbcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.14
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: '[tʈ]'
+     silence_after_probability: 17.5
+     silence_before_correction: -0.21
+   - following_context: '[pbcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.14
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: d
+     silence_after_probability: 26.67
+     silence_before_correction: -0.24
+   - following_context: '[dɖtʈcɟɡk][ʲʷ]?'
+     non_silence_before_correction: 0.22
+     preceding_context: ''
+     probability: 0.01
+     replacement: ''
+     segment: b
+     silence_after_probability: 9.0
660
+ silence_before_correction: -0.4
661
+ - following_context: '[dɖtʈpb][ʲʷ]?'
662
+ non_silence_before_correction: 0.1
663
+ preceding_context: ''
664
+ probability: 0.01
665
+ replacement: ''
666
+ segment: k
667
+ silence_after_probability: 2.71
668
+ silence_before_correction: -0.18
669
+ - following_context: ([tʈpkc][ʲʷ]?)? ɹ
670
+ non_silence_before_correction: 0.12
671
+ preceding_context: ''
672
+ probability: 0.01
673
+ replacement: ʃ
674
+ segment: s
675
+ silence_after_probability: 18.5
676
+ silence_before_correction: -0.21
677
+ - following_context: ɹ
678
+ non_silence_before_correction: 0.12
679
+ preceding_context: ''
680
+ probability: 0.05
681
+ replacement: tʃ
682
+ segment: '[tʈ][ʲʷ]?'
683
+ silence_after_probability: 35.0
684
+ silence_before_correction: -0.13
685
+ - following_context: ''
686
+ non_silence_before_correction: 0.1
687
+ preceding_context: ''
688
+ probability: 0.01
689
+ replacement: tʃ
690
+ segment: '[tʈ][ʲʷ]? ɹ'
691
+ silence_after_probability: 22.0
692
+ silence_before_correction: -0.14
693
+ - following_context: ''
694
+ non_silence_before_correction: 0.1
695
+ preceding_context: ''
696
+ probability: 0.01
697
+ replacement: dʒ
698
+ segment: '[dɖ][ʲʷ]? ɹ'
699
+ silence_after_probability: 3.0
700
+ silence_before_correction: -0.18
701
+ - following_context: $
702
+ non_silence_before_correction: 0.08
703
+ preceding_context: ɪ
704
+ probability: 0.02
705
+ replacement: n
706
+ segment: ŋ
707
+ silence_after_probability: 2.11
708
+ silence_before_correction: -0.13
709
+ - following_context: z$
710
+ non_silence_before_correction: 0.12
711
+ preceding_context: ɪ
712
+ probability: 0.01
713
+ replacement: n
714
+ segment: ŋ
715
+ silence_after_probability: 3.11
716
+ silence_before_correction: -0.22
717
+ - following_context: ''
718
+ non_silence_before_correction: 0.09
719
+ preceding_context: ''
720
+ probability: 0.03
721
+ replacement: l ə
722
+ segment: ə l ə
723
+ silence_after_probability: 5.33
724
+ silence_before_correction: -0.08
725
+ - following_context: ''
726
+ non_silence_before_correction: 0.1
727
+ preceding_context: ''
728
+ probability: 0.03
729
+ replacement: n ə
730
+ segment: ə n ə
731
+ silence_after_probability: 5.0
732
+ silence_before_correction: -0.18
733
+ - following_context: ''
734
+ non_silence_before_correction: 0.09
735
+ preceding_context: ''
736
+ probability: 0.01
737
+ replacement: m ə
738
+ segment: ə m ə
739
+ silence_after_probability: 17.5
740
+ silence_before_correction: -0.07
741
+ - following_context: ''
742
+ non_silence_before_correction: 0.09
743
+ preceding_context: ''
744
+ probability: 0.09
745
+ replacement: ɹ ə
746
+ segment: ə ɹ ə
747
+ silence_after_probability: 5.0
748
+ silence_before_correction: -0.15
749
+ - following_context: ''
750
+ non_silence_before_correction: 0.08
751
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
752
+ probability: 0.01
753
+ replacement: ʔ
754
+ segment: t[ʲʷ]?
755
+ silence_after_probability: 2.9
756
+ silence_before_correction: -0.13
757
+ - following_context: ʉː?
758
+ non_silence_before_correction: 0.12
759
+ preceding_context: ''
760
+ probability: 0.13
761
+ replacement: tʃ
762
+ segment: tʲ
763
+ silence_after_probability: 8.5
764
+ silence_before_correction: -0.21
765
+ - following_context: ʉː?
766
+ non_silence_before_correction: 0.11
767
+ preceding_context: ''
768
+ probability: 0.11
769
+ replacement: dʒ
770
+ segment: dʲ
771
+ silence_after_probability: 4.6
772
+ silence_before_correction: -0.22
773
+ - following_context: ''
774
+ non_silence_before_correction: 0.1
775
+ preceding_context: ^
776
+ probability: 0.04
777
+ replacement: ''
778
+ segment: ç
779
+ silence_after_probability: 0.82
780
+ silence_before_correction: -0.26
781
+ - following_context: ''
782
+ non_silence_before_correction: 0.12
783
+ preceding_context: ''
784
+ probability: 0.01
785
+ replacement: ʔ n̩
786
+ segment: t ə n
787
+ silence_after_probability: 2.47
788
+ silence_before_correction: -0.2
789
+ - following_context: '[^ʊɔɝaɔɛɜeuoæɐɪəɚɑʉɒi].*'
790
+ non_silence_before_correction: 0.12
791
+ preceding_context: ''
792
+ probability: 0.01
793
+ replacement: n̩
794
+ segment: ə n
795
+ silence_after_probability: 3.12
796
+ silence_before_correction: -0.21
797
+ - following_context: $
798
+ non_silence_before_correction: 0.12
799
+ preceding_context: ''
800
+ probability: 0.01
801
+ replacement: n̩
802
+ segment: ə n
803
+ silence_after_probability: 1.78
804
+ silence_before_correction: -0.23
805
+ - following_context: $
806
+ non_silence_before_correction: 0.08
807
+ preceding_context: ''
808
+ probability: 0.14
809
+ replacement: m̩
810
+ segment: ə m
811
+ silence_after_probability: 0.71
812
+ silence_before_correction: -0.15
813
+ - following_context: '[^ʊɔɝaɔɛɜeuoæɐɪəɚɑʉɒi].*'
814
+ non_silence_before_correction: 0.12
815
+ preceding_context: ''
816
+ probability: 0.01
817
+ replacement: m̩
818
+ segment: ə m
819
+ silence_after_probability: 9.4
820
+ silence_before_correction: -0.2
821
+ - following_context: $
822
+ non_silence_before_correction: 0.09
823
+ preceding_context: ''
824
+ probability: 0.07
825
+ replacement: ɫ̩
826
+ segment: ə ɫ
827
+ silence_after_probability: 1.91
828
+ silence_before_correction: -0.15
829
+ - following_context: '[^ʊɔɝaɔɛɜeuoæɐɪəɚɑʉɒi].*'
830
+ non_silence_before_correction: 0.13
831
+ preceding_context: ''
832
+ probability: 0.04
833
+ replacement: ɫ̩
834
+ segment: ə ɫ
835
+ silence_after_probability: 4.56
836
+ silence_before_correction: -0.23
837
+ - following_context: .*(ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
838
+ non_silence_before_correction: 0.11
839
+ preceding_context: ^
840
+ probability: 0.01
841
+ replacement: ''
842
+ segment: ə
843
+ silence_after_probability: 4.5
844
+ silence_before_correction: -0.19
845
+ - following_context: $
846
+ non_silence_before_correction: 0.1
847
+ preceding_context: '[sʃn]'
848
+ probability: 0.04
849
+ replacement: ''
850
+ segment: '[tʈ]'
851
+ silence_after_probability: 0.36
852
+ silence_before_correction: -0.18
853
+ - following_context: $
854
+ non_silence_before_correction: 0.1
855
+ preceding_context: '[zʒn]'
856
+ probability: 0.07
857
+ replacement: ''
858
+ segment: '[dɖ]'
859
+ silence_after_probability: 0.2
860
+ silence_before_correction: -0.17
861
+ - following_context: ''
862
+ non_silence_before_correction: 0.13
863
+ preceding_context: n
864
+ probability: 0.01
865
+ replacement: ''
866
+ segment: '[dɖ]'
867
+ silence_after_probability: 2.64
868
+ silence_before_correction: -0.28
869
+ - following_context: ə|ɚ
870
+ non_silence_before_correction: 0.08
871
+ preceding_context: ''
872
+ probability: 0.01
873
+ replacement: j
874
+ segment: i
875
+ silence_after_probability: 6.0
876
+ silence_before_correction: -0.14
877
+ - following_context: ə|ɚ
878
+ non_silence_before_correction: 0.06
879
+ preceding_context: ''
880
+ probability: 0.01
881
+ replacement: w
882
+ segment: '[ʉu]'
883
+ silence_after_probability: 5.75
884
+ silence_before_correction: -0.07
885
+ - following_context: $
886
+ non_silence_before_correction: 0.17
887
+ preceding_context: ''
888
+ probability: 0.01
889
+ replacement: d̪
890
+ segment: ð [ɖd]
891
+ silence_after_probability: 9.0
892
+ silence_before_correction: -0.27
893
+ - following_context: ɹ
894
+ non_silence_before_correction: 0.09
895
+ preceding_context: ''
896
+ probability: 0.09
897
+ replacement: dʒ
898
+ segment: '[dɖ][ʲʷ]?'
899
+ silence_after_probability: 10.0
900
+ silence_before_correction: -0.11
901
+ - following_context: ə n
902
+ non_silence_before_correction: 0.1
903
+ preceding_context: ''
904
+ probability: 0.01
905
+ replacement: ʔ
906
+ segment: '[tʈ]'
907
+ silence_after_probability: 2.92
908
+ silence_before_correction: -0.16
909
+ us:
910
+ - following_context: $
911
+ non_silence_before_correction: 0.12
912
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
913
+ probability: 0.04
914
+ replacement: ''
915
+ segment: '[tʈ]'
916
+ silence_after_probability: 0.14
917
+ silence_before_correction: -0.22
918
+ - following_context: ''
919
+ non_silence_before_correction: 0.11
920
+ preceding_context: ''
921
+ probability: 0.01
922
+ replacement: d̪
923
+ segment: ð
924
+ silence_after_probability: 1.88
925
+ silence_before_correction: -0.26
926
+ - following_context: ''
927
+ non_silence_before_correction: 0.03
928
+ preceding_context: ''
929
+ probability: 0.01
930
+ replacement: t̪
931
+ segment: θ
932
+ silence_after_probability: 1.5
933
+ silence_before_correction: -0.06
934
+ - following_context: '[tʈpkcsʃf][ʲʷ]?'
935
+ non_silence_before_correction: 0.12
936
+ preceding_context: ''
937
+ probability: 0.02
938
+ replacement: s
939
+ segment: z
940
+ silence_after_probability: 38.0
941
+ silence_before_correction: -0.2
942
+ - following_context: '[tʈpkcsʃf][ʲʷ]?'
943
+ non_silence_before_correction: 0.12
944
+ preceding_context: ''
945
+ probability: 0.01
946
+ replacement: t
947
+ segment: d
948
+ silence_after_probability: 7.0
949
+ silence_before_correction: -0.23
950
+ - following_context: '[tʈpkcsʃf][ʲʷ]?'
951
+ non_silence_before_correction: 0.08
952
+ preceding_context: ''
953
+ probability: 0.01
954
+ replacement: p
955
+ segment: b
956
+ silence_after_probability: 48.0
957
+ silence_before_correction: -0.11
958
+ - following_context: '[tpkcsʃf][ʲʷ]?'
959
+ non_silence_before_correction: 0.06
960
+ preceding_context: ''
961
+ probability: 0.01
962
+ replacement: k
963
+ segment: ɡ
964
+ silence_after_probability: 37.0
965
+ silence_before_correction: 0.1
966
+ - following_context: '[tʈpkcsʃf][ʲʷ]?'
967
+ non_silence_before_correction: 0.19
968
+ preceding_context: ''
969
+ probability: 0.01
970
+ replacement: tʃ
971
+ segment: dʒ
972
+ silence_after_probability: 27.0
973
+ silence_before_correction: -0.39
974
+ - following_context: '[vʋf]ʲ?'
975
+ non_silence_before_correction: 0.1
976
+ preceding_context: ''
977
+ probability: 0.02
978
+ replacement: ɱ
979
+ segment: m
980
+ silence_after_probability: 8.0
981
+ silence_before_correction: -0.17
982
+ - following_context: '[vʋf]ʲ?'
983
+ non_silence_before_correction: 0.11
984
+ preceding_context: ''
985
+ probability: 0.01
986
+ replacement: ɱ
987
+ segment: n
988
+ silence_after_probability: 15.0
989
+ silence_before_correction: -0.21
990
+ - following_context: '[tʈdɖ]$'
991
+ non_silence_before_correction: 0.14
992
+ preceding_context: m
993
+ probability: 0.06
994
+ replacement: ''
995
+ segment: '[pb]'
996
+ silence_after_probability: 1.17
997
+ silence_before_correction: -0.3
998
+ - following_context: '[sz]$'
999
+ non_silence_before_correction: 0.1
1000
+ preceding_context: n
1001
+ probability: 0.05
1002
+ replacement: ''
1003
+ segment: '[tʈdɖ]'
1004
+ silence_after_probability: 0.85
1005
+ silence_before_correction: -0.2
1006
+ - following_context: '[s]$'
1007
+ non_silence_before_correction: 0.1
1008
+ preceding_context: ''
1009
+ probability: 0.03
1010
+ replacement: ''
1011
+ segment: '[sʃ] t'
1012
+ silence_after_probability: 1.56
1013
+ silence_before_correction: -0.22
1014
+ - following_context: $
1015
+ non_silence_before_correction: 0.03
1016
+ preceding_context: ''
1017
+ probability: 0.01
1018
+ replacement: k s
1019
+ segment: s k
1020
+ silence_after_probability: 11.4
1021
+ silence_before_correction: -0.1
1022
+ - following_context: '[dɖʈtcɟɡk][ʲʷ]?'
1023
+ non_silence_before_correction: 0.1
1024
+ preceding_context: ''
1025
+ probability: 0.01
1026
+ replacement: ''
1027
+ segment: p
1028
+ silence_after_probability: 2.57
1029
+ silence_before_correction: -0.18
1030
+ - following_context: '[pbcɟɡk][ʲʷ]?'
1031
+ non_silence_before_correction: 0.12
1032
+ preceding_context: ''
1033
+ probability: 0.02
1034
+ replacement: ''
1035
+ segment: '[tʈ]'
1036
+ silence_after_probability: 9.0
1037
+ silence_before_correction: -0.3
1038
+ - following_context: '[pbcɟɡk][ʲʷ]?'
1039
+ non_silence_before_correction: 0.02
1040
+ preceding_context: ''
1041
+ probability: 0.01
1042
+ replacement: ''
1043
+ segment: d
1044
+ silence_after_probability: 25.0
1045
+ silence_before_correction: -0.1
1046
+ - following_context: '[dɖtʈcɟɡk][ʲʷ]?'
1047
+ non_silence_before_correction: 0.07
1048
+ preceding_context: ''
1049
+ probability: 0.01
1050
+ replacement: ''
1051
+ segment: b
1052
+ silence_after_probability: 8.67
1053
+ silence_before_correction: -0.14
1054
+ - following_context: '[dɖtʈpb][ʲʷ]?'
1055
+ non_silence_before_correction: 0.12
1056
+ preceding_context: ''
1057
+ probability: 0.01
1058
+ replacement: ''
1059
+ segment: k
1060
+ silence_after_probability: 2.86
1061
+ silence_before_correction: -0.2
1062
+ - following_context: '[dɖtʈpb][ʲʷ]?'
1063
+ non_silence_before_correction: 0.14
1064
+ preceding_context: ''
1065
+ probability: 0.01
1066
+ replacement: ''
1067
+ segment: ɡ
1068
+ silence_after_probability: 4.2
1069
+ silence_before_correction: -0.23
1070
+ - following_context: ([tʈpkc][ʲʷ]?)? ɹ
1071
+ non_silence_before_correction: 0.14
1072
+ preceding_context: ''
1073
+ probability: 0.01
1074
+ replacement: ʃ
1075
+ segment: s
1076
+ silence_after_probability: 9.5
1077
+ silence_before_correction: -0.24
1078
+ - following_context: ɹ
1079
+ non_silence_before_correction: 0.13
1080
+ preceding_context: ''
1081
+ probability: 0.04
1082
+ replacement: tʃ
1083
+ segment: '[tʈ][ʲʷ]?'
1084
+ silence_after_probability: 33.0
1085
+ silence_before_correction: -0.25
1086
+ - following_context: ''
1087
+ non_silence_before_correction: 0.12
1088
+ preceding_context: ''
1089
+ probability: 0.01
1090
+ replacement: tʃ
1091
+ segment: '[tʈ][ʲʷ]? ɹ'
1092
+ silence_after_probability: 15.0
1093
+ silence_before_correction: -0.21
1094
+ - following_context: ''
1095
+ non_silence_before_correction: 0.11
1096
+ preceding_context: ''
1097
+ probability: 0.01
1098
+ replacement: dʒ
1099
+ segment: '[dɖ][ʲʷ]? ɹ'
1100
+ silence_after_probability: 7.33
1101
+ silence_before_correction: -0.21
1102
+ - following_context: $
1103
+ non_silence_before_correction: 0.1
1104
+ preceding_context: ɪ
1105
+ probability: 0.05
1106
+ replacement: n
1107
+ segment: ŋ
1108
+ silence_after_probability: 1.44
1109
+ silence_before_correction: -0.2
1110
+ - following_context: z$
1111
+ non_silence_before_correction: 0.12
1112
+ preceding_context: ɪ
1113
+ probability: 0.04
1114
+ replacement: n
1115
+ segment: ŋ
1116
+ silence_after_probability: 2.06
1117
+ silence_before_correction: -0.22
1118
+ - following_context: ''
1119
+ non_silence_before_correction: 0.06
1120
+ preceding_context: ''
1121
+ probability: 0.01
1122
+ replacement: l ə
1123
+ segment: ə l ə
1124
+ silence_after_probability: 4.67
1125
+ silence_before_correction: 0.02
1126
+ - following_context: ''
1127
+ non_silence_before_correction: 0.12
1128
+ preceding_context: ''
1129
+ probability: 0.02
1130
+ replacement: n ə
1131
+ segment: ə n ə
1132
+ silence_after_probability: 5.33
1133
+ silence_before_correction: -0.22
1134
+ - following_context: ''
1135
+ non_silence_before_correction: 0.1
1136
+ preceding_context: ''
1137
+ probability: 0.02
1138
+ replacement: m ə
1139
+ segment: ə m ə
1140
+ silence_after_probability: 23.5
1141
+ silence_before_correction: -0.19
1142
+ - following_context: ''
1143
+ non_silence_before_correction: 0.1
1144
+ preceding_context: ''
1145
+ probability: 0.04
1146
+ replacement: ɹ ə
1147
+ segment: ə ɹ ə
1148
+ silence_after_probability: 2.75
1149
+ silence_before_correction: -0.17
1150
+ - following_context: (ɪ|ə|ɚ|i)
1151
+ non_silence_before_correction: 0.11
1152
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
1153
+ probability: 0.07
1154
+ replacement: ɾ
1155
+ segment: '[td]'
1156
+ silence_after_probability: 3.0
1157
+ silence_before_correction: -0.22
1158
+ - following_context: (ɪ|ə|ɚ|i)
1159
+ non_silence_before_correction: 0.11
1160
+ preceding_context: ɹ
1161
+ probability: 0.15
1162
+ replacement: ɾ
1163
+ segment: '[td]'
1164
+ silence_after_probability: 4.29
1165
+ silence_before_correction: -0.18
1166
+ - following_context: (ɪ|ə|ɚ|i)
1167
+ non_silence_before_correction: 0.04
1168
+ preceding_context: ɫ
1169
+ probability: 0.01
1170
+ replacement: ɾ
1171
+ segment: '[td]'
1172
+ silence_after_probability: 2.08
1173
+ silence_before_correction: -0.1
1174
+ - following_context: (ɪ|ə|ɚ|i)
1175
+ non_silence_before_correction: 0.11
1176
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
1177
+ probability: 0.05
1178
+ replacement: ɾʲ
1179
+ segment: '[td]ʲ'
1180
+ silence_after_probability: 2.71
1181
+ silence_before_correction: -0.2
1182
+ - following_context: (ɪ|ə|ɚ|i)
1183
+ non_silence_before_correction: 0.1
1184
+ preceding_context: ɹ
1185
+ probability: 0.18
1186
+ replacement: ɾʲ
1187
+ segment: '[td]ʲ'
1188
+ silence_after_probability: 4.5
1189
+ silence_before_correction: -0.17
1190
+ - following_context: (ɪ|ə|ɚ|i)
1191
+ non_silence_before_correction: 0.09
1192
+ preceding_context: (ɫ|ɫ̩)
1193
+ probability: 0.01
1194
+ replacement: ɾʲ
1195
+ segment: '[td]ʲ'
1196
+ silence_after_probability: 3.5
1197
+ silence_before_correction: -0.18
1198
+ - following_context: (ɪ|ə|ɚ|i)
1199
+ non_silence_before_correction: 0.09
1200
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
1201
+ probability: 0.01
1202
+ replacement: ɾ̃
1203
+ segment: (ɲ|n)
1204
+ silence_after_probability: 3.6
1205
+ silence_before_correction: -0.18
1206
+ - following_context: (ɪ|ə|ɚ|i)
1207
+ non_silence_before_correction: 0.12
1208
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
1209
+ probability: 0.01
1210
+ replacement: ɾ̃
1211
+ segment: (ɲ|n) [td][ʲʷ]?
1212
+ silence_after_probability: 2.57
1213
+ silence_before_correction: -0.22
1214
+ - following_context: $
1215
+ non_silence_before_correction: 0.11
1216
+ preceding_context: ''
1217
+ probability: 0.01
1218
+ replacement: ɑː
1219
+ segment: ɒː
1220
+ silence_after_probability: 1.75
1221
+ silence_before_correction: -0.17
1222
+ - following_context: '[^ɹ]'
1223
+ non_silence_before_correction: 0.12
1224
+ preceding_context: ''
1225
+ probability: 0.02
1226
+ replacement: ɑː
1227
+ segment: ɒː
1228
+ silence_after_probability: 2.78
1229
+ silence_before_correction: -0.22
1230
+ - following_context: $
1231
+ non_silence_before_correction: 0.16
1232
+ preceding_context: ''
1233
+ probability: 0.01
1234
+ replacement: ɑ
1235
+ segment: ɒ
1236
+ silence_after_probability: 17.0
1237
+ silence_before_correction: -0.36
1238
+ - following_context: '[^ɹ]'
1239
+ non_silence_before_correction: 0.12
1240
+ preceding_context: ''
1241
+ probability: 0.02
1242
+ replacement: ɑ
1243
+ segment: ɒ
1244
+ silence_after_probability: 2.86
1245
+ silence_before_correction: -0.23
1246
+ - following_context: $
1247
+ non_silence_before_correction: 0.1
1248
+ preceding_context: (ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
1249
+ probability: 0.02
1250
+ replacement: ɾ
1251
+ segment: d
1252
+ silence_after_probability: 1.08
1253
+ silence_before_correction: -0.19
1254
+ - following_context: ''
1255
+ non_silence_before_correction: 0.1
1256
+ preceding_context: ''
1257
+ probability: 0.01
1258
+ replacement: ʔ n̩
1259
+ segment: t ə n
1260
+ silence_after_probability: 1.13
1261
+ silence_before_correction: -0.18
1262
+ - following_context: '[^ʊɔɝaɔɛɜeuoæɐɪəɚɑʉɒi].*'
1263
+ non_silence_before_correction: 0.12
1264
+ preceding_context: ''
1265
+ probability: 0.01
1266
+ replacement: n̩
1267
+ segment: ə n
1268
+ silence_after_probability: 2.88
1269
+ silence_before_correction: -0.22
1270
+ - following_context: $
1271
+ non_silence_before_correction: 0.12
1272
+ preceding_context: ''
1273
+ probability: 0.03
1274
+ replacement: n̩
1275
+ segment: ə n
1276
+ silence_after_probability: 1.33
1277
+ silence_before_correction: -0.22
1278
+ - following_context: $
1279
+ non_silence_before_correction: 0.13
1280
+ preceding_context: ''
1281
+ probability: 0.11
1282
+ replacement: m̩
1283
+ segment: ə m
1284
+ silence_after_probability: 0.57
1285
+ silence_before_correction: -0.27
1286
+ - following_context: '[^ʊɔɝaɔɛɜeuoæɐɪəɚɑʉɒi].*'
1287
+ non_silence_before_correction: 0.09
1288
+ preceding_context: ''
1289
+ probability: 0.01
1290
+ replacement: m̩
1291
+ segment: ə m
1292
+ silence_after_probability: 6.0
1293
+ silence_before_correction: -0.18
1294
+ - following_context: $
1295
+ non_silence_before_correction: 0.09
1296
+ preceding_context: ''
1297
+ probability: 0.08
1298
+ replacement: ɫ̩
1299
+ segment: ə ɫ
1300
+ silence_after_probability: 1.45
1301
+ silence_before_correction: -0.17
1302
+ - following_context: '[^ʊɔɝaɔɛɜeuoæɐɪəɚɑʉɒi].*'
1303
+ non_silence_before_correction: 0.12
1304
+ preceding_context: ''
1305
+ probability: 0.04
1306
+ replacement: ɫ̩
1307
+ segment: ə ɫ
1308
+ silence_after_probability: 4.11
1309
+ silence_before_correction: -0.23
1310
+ - following_context: .*(ʊ|ɔj|ɝ|ɛ|ej|ɜ|a|u|o|ow|æ|aw|əw|aj|ɐ|ɪ|ə|ɔ|e|ɚ|ɑ|ʉ|ɒ|i)ː?
1311
+ non_silence_before_correction: 0.12
1312
+ preceding_context: ^
1313
+ probability: 0.01
1314
+ replacement: ''
1315
+ segment: ə
1316
+ silence_after_probability: 1.88
1317
+ silence_before_correction: -0.22
1318
+ - following_context: $
1319
+ non_silence_before_correction: 0.11
1320
+ preceding_context: '[sʃn]'
1321
+ probability: 0.05
1322
+ replacement: ''
1323
+ segment: '[tʈ]'
1324
+ silence_after_probability: 0.55
1325
+ silence_before_correction: -0.21
1326
+ - following_context: $
1327
+ non_silence_before_correction: 0.11
1328
+ preceding_context: '[zʒn]'
1329
+ probability: 0.07
1330
+ replacement: ''
1331
+ segment: '[dɖ]'
1332
+ silence_after_probability: 0.4
1333
+ silence_before_correction: -0.22
1334
+ - following_context: ''
1335
+ non_silence_before_correction: 0.12
1336
+ preceding_context: n
1337
+ probability: 0.01
1338
+ replacement: ''
1339
+ segment: '[dɖ]'
1340
+ silence_after_probability: 1.82
1341
+ silence_before_correction: -0.23
1342
+ - following_context: ə|ɚ
1343
+ non_silence_before_correction: 0.11
1344
+ preceding_context: ''
1345
+ probability: 0.01
1346
+ replacement: j
1347
+ segment: i
1348
+ silence_after_probability: 6.4
1349
+ silence_before_correction: -0.19
1350
+ - following_context: ə|ɚ
1351
+ non_silence_before_correction: 0.12
1352
+ preceding_context: ''
1353
+ probability: 0.01
1354
+ replacement: w
1355
+ segment: '[ʉu]'
1356
+ silence_after_probability: 7.75
1357
+ silence_before_correction: -0.19
1358
+ - following_context: $
1359
+ non_silence_before_correction: 0.1
1360
+ preceding_context: ''
1361
+ probability: 0.01
1362
+ replacement: t̪
1363
+ segment: '[tʈ] θ'
1364
+ silence_after_probability: 4.95
1365
+ silence_before_correction: -0.19
1366
+ - following_context: ɹ
1367
+ non_silence_before_correction: 0.11
1368
+ preceding_context: ''
1369
+ probability: 0.06
1370
+ replacement: dʒ
1371
+ segment: '[dɖ][ʲʷ]?'
1372
+ silence_after_probability: 7.67
1373
+ silence_before_correction: -0.21
1374
+ - following_context: ə n
1375
+ non_silence_before_correction: 0.13
1376
+ preceding_context: ''
1377
+ probability: 0.01
1378
+ replacement: ʔ
1379
+ segment: '[tʈ]'
1380
+ silence_after_probability: 1.75
1381
+ silence_before_correction: -0.22
1382
+ rules: []
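Each entry in the rule lists above pairs a `segment` pattern with an optional `preceding_context` and `following_context` and a `replacement`. The sketch below shows one way such an entry could be read as a regular-expression rewrite over a space-separated pronunciation. It is not MFA's implementation: the function name `apply_rule`, the space-padding trick, and the treatment of `^` and `$` as word edges are assumptions made for illustration, and the probability fields are ignored.

```python
import re


def apply_rule(pron, segment, replacement, preceding_context="", following_context=""):
    """Apply one context-dependent rewrite to a space-separated phone string.

    Illustrative only: this is a hypothetical helper, not MFA's code, and it
    ignores the probability fields attached to each rule.
    """
    # Pad with spaces so every phone is delimited by a space on both sides.
    padded = f" {pron} "

    # Left context: "^" anchors to the word start, "" accepts any phone boundary.
    if preceding_context == "^":
        lead = r"(?P<lead>^ )"
    elif preceding_context:
        lead = rf"(?P<lead> (?:{preceding_context}) )"
    else:
        lead = r"(?P<lead> )"

    # Right context: a trailing "$" anchors to the word end.
    if following_context.endswith("$"):
        core = following_context[:-1]
        tail = rf"(?= (?:{core}) $)" if core else r"(?= $)"
    elif following_context:
        tail = rf"(?= (?:{following_context}) )"
    else:
        tail = r"(?= )"

    pattern = re.compile(lead + rf"(?:{segment})" + tail)
    # Keep the left context, swap the matched segment for the replacement.
    rewritten = pattern.sub(lambda m: m.group("lead") + replacement, padded)
    # Collapse the doubled spaces left behind by deletions.
    return " ".join(rewritten.split())


# Word-final [pb] deletion between "m" and a final stop, as in the rule above.
print(apply_rule("t ɛ m p t", "[pb]", "", "m", "[tʈdɖ]$"))
```

A negated single-character class such as `[^ɹ]` only matches one-character phones under this scheme, so contexts of that kind would need extra handling in a fuller version.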
g2p/english_india_mfa/graphemes.txt ADDED
@@ -0,0 +1,285 @@
1
+ <eps> 0
2
+ | 1
3
+ _ 2
4
+ '|d 3
5
+ '|e 4
6
+ m 5
7
+ ' 6
8
+ l|l 7
9
+ '|m 8
10
+ r|e 9
11
+ '|s 10
12
+ u|n 11
13
+ v|e 12
14
+ '|v 13
15
+ e 14
16
+ a 15
17
+ d 16
18
+ r 17
19
+ g 18
20
+ v 19
21
+ w|e 20
22
+ s 21
23
+ o|m 22
24
+ b 23
25
+ h 24
26
+ c|a 25
27
+ p 26
28
+ c 27
29
+ c|h 28
30
+ e|n 29
31
+ j 30
32
+ l 31
33
+ a|c 32
34
+ u 33
35
+ k 34
36
+ a|l 35
37
+ o|r 36
38
+ l|i 37
39
+ i 38
40
+ s|t 39
41
+ a|m 40
42
+ a|n 41
43
+ l|o 42
44
+ a|r 43
45
+ w|o 44
46
+ f 45
47
+ h|u 46
48
+ o|n 47
49
+ r|o 48
50
+ n|i 49
51
+ o 50
52
+ i|c 51
53
+ v|o 52
54
+ e|l 53
55
+ t|h 54
56
+ b|a 55
57
+ c|i 56
58
+ n|a 57
59
+ t|e 58
60
+ c|k 59
61
+ w 60
62
+ c|o 61
63
+ u|s 62
64
+ f|t 63
65
+ t 64
66
+ i|s 65
67
+ c|e 66
68
+ j|o 67
69
+ n|e 68
70
+ m|p 69
71
+ n|d 70
72
+ e|d 71
73
+ l|y 72
74
+ e|r 73
75
+ n|g 74
76
+ m|e 75
77
+ n|t 76
78
+ s|e 77
79
+ s|h 78
80
+ s|s 79
81
+ l|e 80
82
+ i|a 81
83
+ s|i 82
84
+ t|i 83
85
+ a|t 84
86
+ t|o 85
87
+ z|a 86
88
+ b|b 87
89
+ e|s 88
90
+ v|i 89
91
+ y 90
92
+ i|e 91
93
+ d|e 92
94
+ d|a 93
95
+ d|i 94
96
+ d|o 95
97
+ m|i 96
98
+ n 97
99
+ o|u 98
100
+ d|u 99
101
+ u|c 100
102
+ u|l 101
103
+ l|a 102
104
+ z|i 103
105
+ z 104
106
+ b|e 105
107
+ g|e 106
108
+ k|i 107
109
+ b|i 108
110
+ c|y 109
111
+ g|a 110
112
+ r|a 111
113
+ w|y 112
114
+ e|t 113
115
+ h|i 114
116
+ k|a 115
117
+ h|o 116
118
+ i|n 117
119
+ i|l 118
120
+ u|r 119
121
+ b|j 120
122
+ e|c 121
123
+ j|e 122
124
+ c|t 123
125
+ j|u 124
126
+ n|c 125
127
+ r|i 126
128
+ g|u 127
129
+ b|l 128
130
+ m|a 129
131
+ b|o 130
132
+ o|o 131
133
+ g|i 132
134
+ u|m 133
135
+ h|a 134
136
+ m|s 135
137
+ i|d 136
138
+ m|o 137
139
+ f|f 138
140
+ b|r 139
141
+ d|g 140
142
+ z|z 141
143
+ s|c 142
144
+ o|l 143
145
+ z|e 144
146
+ q|u 145
147
+ t|a 146
148
+ t|r 147
149
+ i|t 148
150
+ b|u 149
151
+ j|a 150
152
+ f|e 151
153
+ f|u 152
154
+ u|t 153
155
+ w|a 154
156
+ b|y 155
157
+ m|y 156
158
+ c|u 157
159
+ p|h 158
160
+ u|a 159
161
+ p|i 160
162
+ c|c 161
163
+ m|m 162
164
+ c|r 163
165
+ u|e 164
166
+ m|u 165
167
+ g|l 166
168
+ v|a 167
169
+ n|o 168
170
+ z|o 169
171
+ f|i 170
172
+ p|o 171
173
+ k|e 172
174
+ n|s 173
175
+ a|d 174
176
+ o|c 175
177
+ d|d 176
178
+ m|b 177
179
+ o|p 178
180
+ h|e 179
181
+ x 180
182
+ g|h 181
183
+ f|o 182
184
+ w|n 183
185
+ d|r 184
186
+ w|s 185
187
+ p|e 186
188
+ d|y 187
189
+ e|a 188
190
+ p|y 189
191
+ p|a 190
192
+ g|o 191
193
+ f|a 192
194
+ r|s 193
195
+ f|l 194
196
+ f|r 195
197
+ g|r 196
198
+ p|p 197
199
+ h|l 198
200
+ '|t 199
201
+ i|o 200
202
+ j|i 201
203
+ k|h 202
204
+ k|k 203
205
+ k|n 204
206
+ k|o 205
207
+ k|s 206
208
+ k|u 207
209
+ u|i 208
210
+ c|l 209
211
+ h|y 210
212
+ x|i 211
213
+ k|y 212
214
+ y|s 213
215
+ p|u 214
216
+ o|t 215
217
+ w|i 216
218
+ s|a 217
219
+ f|y 218
220
+ a|s 219
221
+ w|h 220
222
+ p|l 221
223
+ p|r 222
224
+ o|s 223
225
+ w|r 224
226
+ w|l 225
227
+ h|r 226
228
+ h|n 227
229
+ w|f 228
230
+ w|k 229
231
+ k|w 230
232
+ z|u 231
233
+ z|y 232
234
+ k|l 233
235
+ b|d 234
236
+ v|y 235
237
+ '|n 236
238
+ h|m 237
239
+ w|d 238
240
+ v|u 239
241
+ w|b 240
242
+ z|h 241
243
+ q|l 242
244
+ w|u 243
245
+ q|i 244
246
+ m|n 245
247
+ q 246
248
+ k|r 247
249
+ p|s 248
250
+ ô 249
251
+ x|s 250
252
+ w|m 251
253
+ '|a 252
254
+ b|t 253
255
+ x|c 254
256
+ p|t 255
257
+ v|s 256
258
+ b|s 257
259
+ c|s 258
260
+ '|r 259
261
+ x|e 260
262
+ x|h 261
263
+ x|u 262
264
+ d|s 263
265
+ '|c 264
266
+ q|a 265
267
+ k|b 266
268
+ '|l 267
269
+ m|c 268
270
+ h|t 269
271
+ z|s 270
272
+ x|o 271
273
+ q|w 272
274
+ w|t 273
275
+ '|i 274
276
+ h|b 275
277
+ h|w 276
278
+ '|o 277
279
+ x|t 278
280
+ d|m 279
281
+ f|s 280
282
+ x|a 281
283
+ x|y 282
284
+ ü|r 283
285
+ ü 284
g2p/english_india_mfa/meta.json CHANGED
@@ -1 +1 @@
1
- {"version": "2.2.6.dev1+g6874b58.d20230320", "architecture": "phonetisaurus", "train_date": "2023-05-07 11:50:08.461087", "phones": ["a", "aj", "aw", "b", "b\u02b2", "c", "c\u02b7", "d\u0292", "d\u032a", "e\u02d0", "f", "f\u02b2", "h", "i", "i\u02d0", "j", "k", "k\u02b7", "l", "m", "m\u02b2", "n", "o\u02d0", "p", "p\u02b2", "p\u02b7", "s", "t\u0283", "t\u032a", "z", "\u00e7", "\u014b", "\u0251", "\u0251\u02d0", "\u0252", "\u0252\u02d0", "\u0254j", "\u0256", "\u0259", "\u025b", "\u025b\u02d0", "\u025c", "\u025c\u02d0", "\u025f", "\u025f\u02b7", "\u0261", "\u0261\u02b7", "\u026a", "\u0272", "\u0279", "\u027e", "\u0283", "\u0288", "\u0288\u02b2", "\u0288\u02b7", "\u0289", "\u0289\u02d0", "\u028a", "\u028b", "\u028e", "\u0292"], "graphemes": ["\u00fc", "e", "v", "g", "n", "x", "f", "l", "\u00f4", "m", "w", "t", "z", "u", "p", "i", "'", "o", "b", "h", "y", "r", "c", "k", "s", "q", "d", "j", "a"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "evaluation": {"num_words": 7493, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 67432, "num_graphemes": 29, "num_phones": 61}}
 
1
+ {"version": "3.3.0", "architecture": "phonetisaurus", "train_date": "2026-05-12 11:53:05.347871", "phones": ["a", "aj", "aw", "a\u02d0", "b", "b\u02b2", "c", "c\u02b7", "d\u0292", "d\u032a", "e\u02d0", "f", "f\u02b2", "h", "i", "i\u02d0", "j", "k", "k\u02b7", "l", "m", "m\u02b2", "n", "o\u02d0", "p", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u032a", "w", "z", "\u00e7", "\u014b", "\u0251", "\u0251\u02d0", "\u0252", "\u0252\u02d0", "\u0254j", "\u0256", "\u0259", "\u025b", "\u025b\u02d0", "\u025c", "\u025c\u02d0", "\u025f", "\u025f\u02b7", "\u0261", "\u0261\u02b7", "\u026a", "\u0271", "\u0272", "\u0279", "\u027e", "\u0283", "\u0288", "\u0288\u02b2", "\u0288\u02b7", "\u0289", "\u0289\u02d0", "\u028a", "\u028b", "\u028e", "\u0292", "\u0294"], "graphemes": ["k", "j", "q", "u", "a", "z", "g", "\u00fc", "t", "x", "d", "b", "w", "r", "y", "v", "n", "e", "c", "p", "m", "i", "f", "'", "h", "\u00f4", "o", "s", "l"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "unicode_decomposition": false, "evaluation": {"num_words": 7488, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 67406, "num_graphemes": 29, "num_phones": 66}}
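The updated `meta.json` lists the phone and grapheme inventories alongside `training.num_phones` and `training.num_graphemes`. A quick consistency check along the following lines can confirm that the counts match the lists. This is a sketch: `check_meta` is a hypothetical helper, and the sample dictionary is a trimmed stand-in rather than the real file contents.

```python
import json


def check_meta(meta):
    """Return a list of mismatches between the inventories and the reported counts."""
    problems = []
    training = meta.get("training", {})
    for key, count_key in (("phones", "num_phones"), ("graphemes", "num_graphemes")):
        items = meta.get(key, [])
        # An inventory should not list the same symbol twice.
        if len(set(items)) != len(items):
            problems.append(f"{key}: duplicate entries")
        # The reported count should equal the length of the list.
        if count_key in training and training[count_key] != len(items):
            problems.append(f"{key}: {len(items)} listed, {count_key} = {training[count_key]}")
    return problems


# Trimmed, made-up sample in the same shape as meta.json (not the real data).
sample = json.loads(
    '{"phones": ["a", "aj", "aw"], "graphemes": ["k", "j"],'
    ' "training": {"num_phones": 3, "num_graphemes": 2}}'
)
print(check_meta(sample))
```

To run it against the actual file, load it with `json.load` on a file opened with `encoding="utf-8"` and pass the resulting dictionary to `check_meta`.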
g2p/english_india_mfa/model.fst CHANGED

Git LFS Details

  • SHA256: 4caf7deb712242ee3b69f1a3be85b3f29333275c8bf088017a09d137efe6bdba
  • Pointer size: 133 Bytes
  • Size of remote file: 37.6 MB

Git LFS Details

  • SHA256: b6e1e3c815e471c91fa33e4015a7824ceca90b2ddc19e6a9df99794c2147c1e4
  • Pointer size: 133 Bytes
  • Size of remote file: 37.4 MB
g2p/english_india_mfa/phones.txt ADDED
@@ -0,0 +1,62 @@
+ <eps> 0
+ ɖ 1
+ ʈ 2
+ ə 3
+ ɪ 4
+ m 5
+ ɛ 6
+ l 7
+ z 8
+ s 9
+ n 10
+ ʋ 11
+ eː 12
+ iː 13
+ ɑː 14
+ dʒ 15
+ ɒː 16
+ bʲ 17
+ tʃ 18
+ k 19
+ a 20
+ p 21
+ ʊ 22
+ c 23
+ j 24
+ b 25
+ ɡ 26
+ ʎ 27
+ ɒ 28
+ f 29
+ ɹ 30
+ h 31
+ ʉː 32
+ ɛː 33
+ ɲ 34
+ ɑ 35
+ oː 36
+ t̪ 37
+ i 38
+ aj 39
+ ɾ 40
+ ŋ 41
+ ʃ 42
+ ʒ 43
+ ʈʲ 44
+ ʈʷ 45
+ mʲ 46
+ ɟ 47
+ ʉ 48
+ ɜː 49
+ aw 50
+ ɜ 51
+ ɔj 52
+ fʲ 53
+ cʷ 54
+ d̪ 55
+ ç 56
+ ɡʷ 57
+ pʲ 58
+ ɟʷ 59
+ pʷ 60
+ kʷ 61
g2p/english_nigeria_mfa/graphemes.txt ADDED
@@ -0,0 +1,384 @@
+ <eps> 0
+ | 1
+ _ 2
+ '|d 3
+ ' 4
+ e|m 5
+ l|l 6
+ '|m 7
+ r|e 8
+ '|s 9
+ u|n 10
+ '|v 11
+ e 12
+ v|e 13
+ -|d 14
+ a 15
+ d 16
+ a|r 17
+ g 18
+ v 19
+ a|w 20
+ s|o 21
+ m|e 22
+ b 23
+ h 24
+ c|a 25
+ p 26
+ c 27
+ c|h 28
+ e|n 29
+ s 30
+ j 31
+ l 32
+ m 33
+ c|u 34
+ u 35
+ k 36
+ a|l 37
+ o|r 38
+ l|i 39
+ i 40
+ s|t 41
+ a|m 42
+ n|d 43
+ l|o 44
+ v|a 45
+ r|k 46
+ w|o 47
+ f 48
+ h|u 49
+ o|n 50
+ o 51
+ n|i 52
+ r|o 53
+ v|o 54
+ e|l 55
+ t|h 56
+ e|i 57
+ a|b 58
+ b|a 59
+ c|i 60
+ n|a 61
+ t|e 62
+ c|k 63
+ w|a 64
+ r|d 65
+ c|o 66
+ a|c 67
+ u|s 68
+ f|t 69
+ a|h 70
+ t 71
+ a|i 72
+ a|n 73
+ c|e 74
+ n|e 75
+ p|e 76
+ e|d 77
+ l|y 78
+ e|r 79
+ n|g 80
+ n|t 81
+ s|e 82
+ s|h 83
+ s|s 84
+ l|e 85
+ s|i 86
+ t|i 87
+ a|t 88
+ t|o 89
+ i|r 90
+ z|a 91
+ b|b 92
+ e|s 93
+ e|y 94
+ o|t 95
+ t|s 96
+ i|p 97
+ v|i 98
+ y 99
+ d|e 100
+ r|i 101
+ d|i 102
+ d|o 103
+ n 104
+ m|i 105
+ o|u 106
+ d|u 107
+ c|t 108
+ e|e 109
+ l|a 110
+ h|i 111
+ r 112
+ a|u 113
+ b|e 114
+ d|a 115
+ r|y 116
+ e|g 117
+ g|e 118
+ m|o 119
+ s|k 120
+ e|o 121
+ k|u 122
+ t|a 123
+ r|r 124
+ r|a 125
+ e|t 126
+ k|a 127
+ h|o 128
+ i|n 129
+ c|y 130
+ b|i 131
+ i|d 132
+ g|a 133
+ i|l 134
+ t|y 135
+ m|b 136
+ u|r 137
+ j|e 138
+ j|u 139
+ n|c 140
+ z|e 141
+ g|u 142
+ o|d 143
+ l|u 144
+ b|l 145
+ b|o 146
+ o|o 147
+ g|i 148
+ u|m 149
+ h|a 150
+ b|r 151
+ e|u 152
+ d|g 153
+ a|d 154
+ s|c 155
+ a|e 156
+ o|l 157
+ t|u 158
+ p|t 159
+ q|u 160
+ r|u 161
+ s|u 162
+ b|u 163
+ j|a 164
+ u|l 165
+ f|e 166
+ a|g 167
+ t|t 168
+ z|z 169
+ l|t 170
+ b|y 171
+ m|y 172
+ o|e 173
+ p|h 174
+ r|p 175
+ s|a 176
+ p|i 177
+ c|l 178
+ m|s 179
+ c|c 180
+ m|a 181
+ o|i 182
+ m|m 183
+ m|p 184
+ n|y 185
+ u|p 186
+ c|r 187
+ o|a 188
+ m|u 189
+ x 190
+ o|m 191
+ l|d 192
+ r|b 193
+ i|c 194
+ r|v 195
+ z|o 196
+ f|i 197
+ z|i 198
+ i|e 199
+ o|w 200
+ r|n 201
+ o|c 202
+ g|y 203
+ p|o 204
+ n|o 205
+ p|u 206
+ n|s 207
+ d|d 208
+ u|c 209
+ y|o 210
+ j|i 211
+ p|a 212
+ s|p 213
+ u|g 214
+ d|h 215
+ h|e 216
+ r|t 217
+ g|h 218
+ k|i 219
+ j|o 220
+ o|p 221
+ o|g 222
+ d|r 223
+ d|v 224
+ d|y 225
+ p|y 226
+ g|o 227
+ f|a 228
+ f|f 229
+ r|s 230
+ e|c 231
+ k|p 232
+ f|l 233
+ f|o 234
+ f|r 235
+ '|t 236
+ n|n 237
+ g|g 238
+ p|r 239
+ g|l 240
+ e|a 241
+ g|r 242
+ w|u 243
+ u|e 244
+ a|p 245
+ l|s 246
+ w 247
+ r|l 248
+ r|m 249
+ y|s 250
+ i|s 251
+ y|e 252
+ y|i 253
+ k|e 254
+ a|k 255
+ k|h 256
+ y|a 257
+ k|k 258
+ k|n 259
+ k|o 260
+ k|s 261
+ k|w 262
+ a|f 263
+ p|p 264
+ u|t 265
+ w|i 266
+ f|u 267
+ g|n 268
+ y|u 269
+ a|y 270
+ y|w 271
+ n|u 272
+ z 273
+ t|r 274
+ é|t 275
+ é|p 276
+ é 277
+ i|a 278
+ y|l 279
+ n|k 280
+ v|y 281
+ o|v 282
+ e|w 283
+ n|f 284
+ u|i 285
+ h|y 286
+ s|w 287
+ s|y 288
+ n|v 289
+ x|i 290
+ w|h 291
+ p|l 292
+ i|m 293
+ i|v 294
+ o|s 295
+ r|c 296
+ w|e 297
+ a|s 298
+ i|g 299
+ w|r 300
+ e|p 301
+ u|a 302
+ y|n 303
+ i|o 304
+ h|n 305
+ m|n 306
+ a|v 307
+ a|z 308
+ z|u 309
+ z|y 310
+ '|l 311
+ k|y 312
+ e|f 313
+ b|d 314
+ f|y 315
+ i|z 316
+ u|d 317
+ h|l 318
+ v|v 319
+ w|y 320
+ k|l 321
+ o|k 322
+ y|d 323
+ v|u 324
+ i|t 325
+ y|r 326
+ l|v 327
+ z|h 328
+ q|l 329
+ u|b 330
+ e|v 331
+ w|s 332
+ q|i 333
+ e|x 334
+ q 335
+ w|n 336
+ p|s 337
+ ô 338
+ x|s 339
+ c|s 340
+ w|m 341
+ w|t 342
+ o|b 343
+ r|g 344
+ b|s 345
+ b|j 346
+ x|e 347
+ x|h 348
+ é|e 349
+ h|m 350
+ '|c 351
+ o|f 352
+ y|c 353
+ h|w 354
+ d|w 355
+ b|t 356
+ '|n 357
+ '|a 358
+ q|a 359
+ d|s 360
+ h|r 361
+ k|r 362
+ -|k 363
+ -|f 364
+ i|b 365
+ x|u 366
+ í|a 367
+ x|y 368
+ '|r 369
+ p|n 370
+ x|o 371
+ t|l 372
+ '|i 373
+ h|b 374
+ é|g 375
+ w|f 376
+ f|s 377
+ d|n 378
+ x|a 379
+ y|p 380
+ ü 381
+ - 382
+ í 383
g2p/english_nigeria_mfa/meta.json CHANGED
@@ -1 +1 @@
- {"version": "2.2.6.dev1+g6874b58.d20230320", "architecture": "phonetisaurus", "train_date": "2023-05-07 11:33:01.252545", "phones": ["a", "aj", "aw", "a\u02d0", "b", "b\u02b2", "c", "c\u02b0", "c\u02b7", "d", "d\u0292", "d\u02b2", "e", "f", "f\u02b2", "h", "i", "i\u02d0", "j", "k", "kp", "k\u02b0", "k\u02b7", "l", "m", "m\u02b2", "n", "o", "p", "p\u02b0", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u02b0", "t\u02b2", "t\u02b7", "u", "u\u02d0", "v", "v\u02b2", "w", "z", "\u00e7", "\u00f0", "\u014b", "\u0254", "\u0254j", "\u025b", "\u025b\u02d0", "\u025c", "\u025c\u02d0", "\u025f", "\u025f\u02b7", "\u0261", "\u0261b", "\u0261\u02b7", "\u026b", "\u0272", "\u0279", "\u0283", "\u028a", "\u028e", "\u03b8"], "graphemes": ["\u00fc", "e", "v", "g", "n", "x", "f", "l", "-", "m", "w", "t", "z", "u", "p", "i", "\u00e9", "'", "o", "b", "h", "y", "r", "c", "k", "s", "q", "d", "j", "a"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "evaluation": {"num_words": 5633, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 50705, "num_graphemes": 30, "num_phones": 65}}
+ {"version": "3.3.0", "architecture": "phonetisaurus", "train_date": "2026-05-12 11:53:06.988505", "phones": ["a", "aj", "aw", "a\u02d0", "b", "b\u02b2", "c", "c\u02b0", "c\u02b7", "d", "d\u0292", "d\u02b2", "d\u032a", "e", "f", "f\u02b2", "h", "i", "i\u02d0", "j", "k", "kp", "k\u02b0", "k\u02b7", "l", "m", "m\u02b2", "n", "o", "p", "p\u02b0", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u02b0", "t\u02b2", "t\u02b7", "t\u032a", "u", "u\u02d0", "v", "v\u02b2", "w", "z", "\u00e7", "\u00f0", "\u014b", "\u0254", "\u0254j", "\u0259", "\u025b", "\u025b\u02d0", "\u025c", "\u025c\u02d0", "\u025f", "\u025f\u02b7", "\u0261", "\u0261b", "\u0261\u02b7", "\u026b", "\u0271", "\u0272", "\u0279", "\u0283", "\u028a", "\u028e", "\u0294", "\u03b8"], "graphemes": ["-", "\u00e9", "k", "j", "q", "u", "a", "z", "g", "\u00fc", "t", "x", "d", "r", "w", "b", "y", "v", "n", "e", "c", "p", "m", "i", "f", "'", "h", "\u00f4", "\u00ed", "o", "s", "l"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "unicode_decomposition": false, "evaluation": {"num_words": 5630, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 50676, "num_graphemes": 32, "num_phones": 70}}
g2p/english_nigeria_mfa/model.fst CHANGED

Git LFS Details

  • SHA256: 05815898dc747f7f62bdcd3cc06b9901230920b28e330f3770a8367809a8ff1d
  • Pointer size: 133 Bytes
  • Size of remote file: 26.1 MB

Git LFS Details

  • SHA256: 0cfec6070c42eecddf82bdd772cc6acd297756d24a64075dacf32713a43ee2be
  • Pointer size: 133 Bytes
  • Size of remote file: 26.8 MB
g2p/english_nigeria_mfa/phones.txt ADDED
@@ -0,0 +1,66 @@
+ <eps> 0
+ d 1
+ t 2
+ ɛ 3
+ m 4
+ ɫ 5
+ i 6
+ a 7
+ z 8
+ s 9
+ ɔ 10
+ n 11
+ v 12
+ spn 13
+ e 14
+ dʲ 15
+ iː 16
+ dʒ 17
+ vʲ 18
+ bʲ 19
+ tʃ 20
+ kʰ 21
+ p 22
+ pʰ 23
+ k 24
+ ʊ 25
+ cʰ 26
+ j 27
+ b 28
+ ɡ 29
+ ʎ 30
+ l 31
+ w 32
+ f 33
+ ɹ 34
+ h 35
+ uː 36
+ ɛː 37
+ u 38
+ ɲ 39
+ o 40
+ θ 41
+ pʲ 42
+ ŋ 43
+ ʃ 44
+ tʲ 45
+ tʰ 46
+ tʷ 47
+ mʲ 48
+ c 49
+ ç 50
+ aj 51
+ ɟ 52
+ aː 53
+ aw 54
+ kʷ 55
+ ɜ 56
+ ɔj 57
+ fʲ 58
+ cʷ 59
+ ɡb 60
+ ð 61
+ kp 62
+ ɟʷ 63
+ ɡʷ 64
+ pʷ 65
g2p/english_uk_mfa/graphemes.txt ADDED
@@ -0,0 +1,287 @@
+ <eps> 0
+ | 1
+ _ 2
+ '|d 3
+ '|e 4
+ m 5
+ ' 6
+ l|l 7
+ '|m 8
+ r|e 9
+ '|s 10
+ u|n 11
+ v|e 12
+ '|v 13
+ e 14
+ a 15
+ d 16
+ r 17
+ g 18
+ v 19
+ w|e 20
+ s 21
+ o|m 22
+ b 23
+ h 24
+ c|a 25
+ p 26
+ c 27
+ c|h 28
+ e|n 29
+ j 30
+ l 31
+ c|u 32
+ u 33
+ k 34
+ a|l 35
+ o|r 36
+ l|i 37
+ i 38
+ s|t 39
+ a|m 40
+ a|n 41
+ l|o 42
+ a|r 43
+ w|o 44
+ f 45
+ h|u 46
+ o|n 47
+ r|o 48
+ n|i 49
+ o 50
+ i|c 51
+ v|o 52
+ e|l 53
+ t|h 54
+ b|a 55
+ c|i 56
+ n|a 57
+ t|e 58
+ c|k 59
+ w 60
+ c|o 61
+ u|s 62
+ f|t 63
+ t 64
+ i|s 65
+ c|e 66
+ j|o 67
+ n|e 68
+ m|p 69
+ n|d 70
+ e|d 71
+ l|y 72
+ e|r 73
+ n|g 74
+ m|e 75
+ n|t 76
+ s|e 77
+ s|h 78
+ s|s 79
+ l|e 80
+ i|a 81
+ s|i 82
+ t|i 83
+ a|t 84
+ t|o 85
+ z|a 86
+ b|b 87
+ e|s 88
+ v|i 89
+ y 90
+ i|e 91
+ d|e 92
+ d|a 93
+ d|i 94
+ d|o 95
+ m|i 96
+ n 97
+ o|u 98
+ d|u 99
+ u|c 100
+ u|l 101
+ l|a 102
+ z|i 103
+ z 104
+ b|e 105
+ e|c 106
+ g|e 107
+ k|i 108
+ b|i 109
+ c|y 110
+ g|a 111
+ r|a 112
+ w|y 113
+ e|t 114
+ h|i 115
+ k|a 116
+ h|o 117
+ i|n 118
+ i|l 119
+ u|r 120
+ j|e 121
+ c|t 122
+ b|j 123
+ j|u 124
+ n|c 125
+ b|l 126
+ a|c 127
+ t|a 128
+ z|e 129
+ g|u 130
+ r|i 131
+ o|o 132
+ m|a 133
+ b|o 134
+ g|i 135
+ u|m 136
+ h|a 137
+ m|s 138
+ i|d 139
+ m|o 140
+ f|f 141
+ b|r 142
+ d|g 143
+ a|d 144
+ z|z 145
+ s|c 146
+ o|l 147
+ q|u 148
+ t|r 149
+ i|t 150
+ d|l 151
+ b|u 152
+ j|a 153
+ f|e 154
+ f|u 155
+ u|t 156
+ w|a 157
+ b|y 158
+ m|y 159
+ r|s 160
+ u|a 161
+ p|t 162
+ p|i 163
+ c|c 164
+ m|m 165
+ d|s 166
+ c|r 167
+ u|e 168
+ m|b 169
+ m|u 170
+ x 171
+ p|h 172
+ g|l 173
+ v|a 174
+ v|u 175
+ n|o 176
+ z|o 177
+ f|i 178
+ p|o 179
+ k|e 180
+ n|s 181
+ d|d 182
+ h|e 183
+ g|h 184
+ f|o 185
+ o|p 186
+ w|n 187
+ d|r 188
+ w|s 189
+ r|t 190
+ p|e 191
+ d|y 192
+ e|a 193
+ p|y 194
+ f|a 195
+ g|o 196
+ f|l 197
+ f|r 198
+ '|t 199
+ g|r 200
+ p|r 201
+ p|p 202
+ s|a 203
+ h|l 204
+ i|o 205
+ j|i 206
+ k|h 207
+ k|k 208
+ k|n 209
+ k|o 210
+ k|u 211
+ c|l 212
+ h|y 213
+ o|t 214
+ k|s 215
+ w|i 216
+ x|i 217
+ k|y 218
+ y|s 219
+ p|a 220
+ f|y 221
+ p|l 222
+ p|u 223
+ a|s 224
+ v|y 225
+ o|s 226
+ w|r 227
+ w|l 228
+ h|r 229
+ h|n 230
+ w|f 231
+ w|h 232
+ w|k 233
+ k|w 234
+ z|u 235
+ z|y 236
+ k|l 237
+ z|s 238
+ d|w 239
+ '|n 240
+ h|m 241
+ w|d 242
+ z|h 243
+ q|l 244
+ w|u 245
+ q|i 246
+ q 247
+ k|r 248
+ p|s 249
+ ô 250
+ x|s 251
+ c|s 252
+ w|m 253
+ w|t 254
+ '|a 255
+ m|n 256
+ b|t 257
+ x|c 258
+ v|l 259
+ w|b 260
+ b|s 261
+ '|r 262
+ x|h 263
+ x|u 264
+ '|c 265
+ q|a 266
+ '|p 267
+ '|l 268
+ v|s 269
+ m|c 270
+ h|t 271
+ x|y 272
+ x|o 273
+ '|i 274
+ h|b 275
+ h|w 276
+ z|w 277
+ '|o 278
+ d|m 279
+ f|s 280
+ w|p 281
+ x|a 282
+ x|e 283
+ '|k 284
+ ü|r 285
+ ü 286
g2p/english_uk_mfa/meta.json CHANGED
@@ -1 +1 @@
- {"version": "2.2.6.dev1+g6874b58.d20230320", "architecture": "phonetisaurus", "train_date": "2023-05-07 12:25:49.314352", "phones": ["a", "aj", "aw", "b", "b\u02b2", "c", "c\u02b0", "c\u02b7", "d", "d\u0292", "d\u02b2", "e", "ej", "f", "f\u02b2", "f\u02b7", "h", "i", "i\u02d0", "j", "k", "k\u02b0", "k\u02b7", "l", "m", "m\u02b2", "n", "p", "p\u02b0", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u02b0", "t\u02b2", "t\u02b7", "v", "v\u02b2", "v\u02b7", "w", "z", "\u00e7", "\u00f0", "\u014b", "\u0250", "\u0251", "\u0251\u02d0", "\u0252", "\u0252\u02d0", "\u0254j", "\u0259", "\u0259w", "\u025b", "\u025b\u02d0", "\u025c", "\u025c\u02d0", "\u025f", "\u025f\u02b7", "\u0261", "\u0261\u02b7", "\u026a", "\u026b", "\u0272", "\u0279", "\u0283", "\u0289", "\u0289\u02d0", "\u028a", "\u028e", "\u0292", "\u03b8"], "graphemes": ["\u00fc", "e", "v", "g", "n", "x", "f", "l", "\u00f4", "m", "w", "t", "z", "u", "p", "i", "'", "o", "b", "h", "y", "r", "c", "k", "s", "q", "d", "j", "a"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "evaluation": {"num_words": 7497, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 67478, "num_graphemes": 29, "num_phones": 72}}
+ {"version": "3.3.0", "architecture": "phonetisaurus", "train_date": "2026-05-12 11:53:01.135930", "phones": ["a", "aj", "aw", "b", "b\u02b2", "c", "c\u02b0", "c\u02b7", "d", "d\u0292", "d\u02b2", "d\u032a", "e", "ej", "f", "f\u02b2", "f\u02b7", "h", "i", "i\u02d0", "j", "k", "k\u02b0", "k\u02b7", "l", "m", "m\u02b2", "m\u0329", "n", "n\u0329", "p", "p\u02b0", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u02b0", "t\u02b2", "t\u02b7", "t\u032a", "v", "v\u02b2", "v\u02b7", "w", "z", "\u00e7", "\u00f0", "\u014b", "\u0250", "\u0251", "\u0251\u02d0", "\u0252", "\u0252\u02d0", "\u0254j", "\u0259", "\u0259w", "\u025b", "\u025b\u02d0", "\u025c", "\u025c\u02d0", "\u025f", "\u025f\u02b7", "\u0261", "\u0261\u02b7", "\u026a", "\u026b", "\u026b\u0329", "\u0271", "\u0272", "\u0279", "\u0283", "\u0289", "\u0289\u02d0", "\u028a", "\u028e", "\u0292", "\u0294", "\u03b8"], "graphemes": ["k", "j", "q", "u", "a", "z", "g", "\u00fc", "t", "x", "d", "r", "w", "b", "y", "v", "n", "e", "c", "p", "m", "i", "f", "'", "h", "\u00f4", "o", "s", "l"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "unicode_decomposition": false, "evaluation": {"num_words": 7498, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 67465, "num_graphemes": 29, "num_phones": 79}}
g2p/english_uk_mfa/model.fst CHANGED

Git LFS Details

  • SHA256: b80168903f21c0b23aa001e91a98cce29499b16db24a9a13d54ad751b6b5ea5f
  • Pointer size: 133 Bytes
  • Size of remote file: 39.5 MB

Git LFS Details

  • SHA256: a03541d2c2fde7d84514ff812f8d082be214fcb722d442e248d379ca016df1bb
  • Pointer size: 133 Bytes
  • Size of remote file: 38.9 MB
g2p/english_uk_mfa/phones.txt ADDED
@@ -0,0 +1,73 @@
+ <eps> 0
+ d 1
+ t 2
+ ə 3
+ m 4
+ ɛ 5
+ ɪ 6
+ ɫ 7
+ s 8
+ z 9
+ ɐ 10
+ n 11
+ v 12
+ ej 13
+ dʲ 14
+ iː 15
+ ɑː 16
+ dʒ 17
+ vʲ 18
+ ɒː 19
+ bʲ 20
+ tʃ 21
+ kʰ 22
+ a 23
+ p 24
+ pʰ 25
+ k 26
+ ʊ 27
+ cʰ 28
+ j 29
+ b 30
+ ɡ 31
+ ʎ 32
+ l 33
+ ɒ 34
+ w 35
+ f 36
+ ɹ 37
+ h 38
+ ʉː 39
+ ɛː 40
+ ɲ 41
+ ɑ 42
+ əw 43
+ θ 44
+ i 45
+ aj 46
+ ŋ 47
+ ʃ 48
+ ʒ 49
+ tʲ 50
+ tʷ 51
+ mʲ 52
+ c 53
+ tʰ 54
+ ɟ 55
+ ʉ 56
+ ɜː 57
+ aw 58
+ vʷ 59
+ kʷ 60
+ ɜ 61
+ pʲ 62
+ ɔj 63
+ fʲ 64
+ cʷ 65
+ ð 66
+ ç 67
+ ɡʷ 68
+ ɟʷ 69
+ pʷ 70
+ e 71
+ fʷ 72
g2p/english_us_mfa/graphemes.txt ADDED
@@ -0,0 +1,349 @@
+ <eps> 0
+ | 1
+ _ 2
+ '|a 3
+ '|d 4
+ ' 5
+ d 6
+ e|m 7
+ i|n 8
+ l|l 9
+ '|m 10
+ '|n 11
+ '|r 12
+ e 13
+ r|e 14
+ '|s 15
+ u|n 16
+ '|v 17
+ v|e 18
+ a 19
+ r 20
+ g 21
+ v 22
+ w 23
+ s|o 24
+ m|e 25
+ b 26
+ h 27
+ a|b 28
+ o 29
+ m 30
+ y|c 31
+ c|a 32
+ p 33
+ c 34
+ c|h 35
+ e|n 36
+ s 37
+ j 38
+ l 39
+ c|u 40
+ u 41
+ k 42
+ a|l 43
+ o|r 44
+ l|i 45
+ i 46
+ s|t 47
+ a|m 48
+ a|n 49
+ l|o 50
+ a|r 51
+ w|o 52
+ f 53
+ h|u 54
+ o|n 55
+ i|c 56
+ t|e 57
+ v|o 58
+ e|l 59
+ t|h 60
+ e|i 61
+ b|a 62
+ d|a 63
+ d|e 64
+ a|c 65
+ v|i 66
+ x|i 67
+ c|i 68
+ s|c 69
+ u|s 70
+ c|k 71
+ c|o 72
+ r|i 73
+ t|i 74
+ n 75
+ l|y 76
+ t 77
+ u|l 78
+ a|d 79
+ d|d 80
+ f|t 81
+ a|i 82
+ c|e 83
+ e|r 84
+ s|s 85
+ k|a 86
+ n|a 87
+ n|e 88
+ m|p 89
+ p|e 90
+ e|d 91
+ e|e 92
+ d|o 93
+ n|i 94
+ n|g 95
+ n|t 96
+ a|p 97
+ r|o 98
+ n|o 99
+ s|i 100
+ e|s 101
+ l|a 102
+ s|e 103
+ b|i 104
+ s|h 105
+ e|v 106
+ l|e 107
+ i|a 108
+ a|s 109
+ d|i 110
+ z|e 111
+ a|t 112
+ t|o 113
+ i|r 114
+ u|r 115
+ o|i 116
+ x 117
+ a|y 118
+ z|a 119
+ b|b 120
+ c|y 121
+ b|e 122
+ e|y 123
+ t|s 124
+ i|p 125
+ o|t 126
+ z|z 127
+ y 128
+ z 129
+ i|m 130
+ m|b 131
+ o|m 132
+ m|i 133
+ g|i 134
+ g|e 135
+ h|y 136
+ e|c 137
+ p|y 138
+ r|a 139
+ o|u 140
+ v|a 141
+ d|u 142
+ n|s 143
+ c|t 144
+ u|c 145
+ e|a 146
+ u|m 147
+ g|o 148
+ e|g 149
+ m|o 150
+ s|k 151
+ t|r 152
+ b|r 153
+ k|i 154
+ k|u 155
+ t|a 156
+ c|r 157
+ b|y 158
+ f|o 159
+ g|a 160
+ n|n 161
+ w|e 162
+ w|y 163
+ e|t 164
+ p|o 165
+ f|a 166
+ h|e 167
+ y|a 168
+ h|i 169
+ h|o 170
+ i|d 171
+ j|a 172
+ i|l 173
+ o|g 174
+ j|e 175
+ j|u 176
+ n|c 177
+ k|h 178
+ z|i 179
+ b|l 180
+ p|h 181
+ p|s 182
+ g|u 183
+ o|o 184
+ o|w 185
+ l|u 186
+ g|r 187
+ m|a 188
+ o|b 189
+ c|c 190
+ a|u 191
+ b|o 192
+ i|z 193
+ s|p 194
+ i|g 195
+ g|h 196
+ n|d 197
+ o|v 198
+ i|o 199
+ h|a 200
+ z|o 201
+ d|g 202
+ o|a 203
+ p|t 204
+ k|e 205
+ i|s 206
+ o|l 207
+ q|u 208
+ i|t 209
+ b|u 210
+ f|e 211
+ a|g 212
+ u|t 213
+ l|t 214
+ w|a 215
+ y|e 216
+ j|o 217
+ o|k 218
+ o|p 219
+ i|b 220
+ p|i 221
+ c|l 222
+ d|s 223
+ l|d 224
+ g|l 225
+ y|n 226
+ v|u 227
+ f|i 228
+ y|l 229
+ i|e 230
+ y|m 231
+ p|u 232
+ e|p 233
+ d|h 234
+ l|s 235
+ d|r 236
+ d|y 237
+ y|p 238
+ f|f 239
+ f|l 240
+ f|r 241
+ '|t 242
+ w|h 243
+ f|y 244
+ p|l 245
+ p|r 246
+ p|p 247
+ k|o 248
+ s|l 249
+ y|o 250
+ j|i 251
+ k|s 252
+ a|k 253
+ k|k 254
+ k|n 255
+ w|i 256
+ x|e 257
+ n|k 258
+ p|a 259
+ y|s 260
+ n|u 261
+ s|a 262
+ v|y 263
+ y|t 264
+ i|v 265
+ o|s 266
+ y|g 267
+ w|s 268
+ y|i 269
+ f|u 270
+ q 271
+ q|a 272
+ o|c 273
+ w|r 274
+ o|d 275
+ r|s 276
+ w|l 277
+ s|u 278
+ y|r 279
+ h|w 280
+ h|n 281
+ a|v 282
+ -|g 283
+ w|k 284
+ w|n 285
+ y|u 286
+ z|u 287
+ z|y 288
+ '|l 289
+ '|y 290
+ h|r 291
+ k|y 292
+ k|l 293
+ h|l 294
+ n|f 295
+ j|y 296
+ i|f 297
+ x|s 298
+ y|d 299
+ v|v 300
+ z|h 301
+ q|l 302
+ r|t 303
+ d|l 304
+ w|u 305
+ q|i 306
+ e|f 307
+ k|r 308
+ ô 309
+ d|n 310
+ w|m 311
+ c|s 312
+ '|o 313
+ w|d 314
+ v|l 315
+ b|s 316
+ x|x 317
+ w|b 318
+ e|x 319
+ x|c 320
+ x|h 321
+ x|t 322
+ '|c 323
+ k|w 324
+ w|f 325
+ '|e 326
+ '|p 327
+ x|u 328
+ y|w 329
+ h|m 330
+ -|p 331
+ -|j 332
+ -|f 333
+ v|r 334
+ -|m 335
+ w|t 336
+ x|y 337
+ x|a 338
+ '|i 339
+ h|b 340
+ d|w 341
+ -|h 342
+ x|o 343
+ '|k 344
+ -|z 345
+ ü|r 346
+ - 347
+ ü 348
g2p/english_us_mfa/meta.json CHANGED
@@ -1 +1 @@
- {"version": "2.2.6.dev1+g6874b58.d20230320", "architecture": "phonetisaurus", "train_date": "2023-05-07 12:08:58.048522", "phones": ["aj", "aw", "b", "b\u02b2", "c", "c\u02b0", "c\u02b7", "d", "d\u0292", "d\u02b2", "ej", "f", "f\u02b2", "h", "i", "i\u02d0", "j", "k", "k\u02b0", "k\u02b7", "l", "m", "m\u02b2", "n", "ow", "p", "p\u02b0", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u02b0", "t\u02b2", "t\u02b7", "v", "v\u02b2", "w", "z", "\u00e6", "\u00e7", "\u00f0", "\u014b", "\u0250", "\u0251", "\u0251\u02d0", "\u0252", "\u0252\u02d0", "\u0254j", "\u0259", "\u025a", "\u025b", "\u025d", "\u025f", "\u025f\u02b7", "\u0261", "\u0261\u02b7", "\u026a", "\u026b", "\u0272", "\u0279", "\u0283", "\u0289", "\u0289\u02d0", "\u028a", "\u028e", "\u0292", "\u0294", "\u03b8"], "graphemes": ["\u00fc", "e", "v", "g", "n", "x", "f", "l", "\u00f4", "-", "m", "w", "t", "z", "u", "p", "i", "'", "o", "b", "h", "y", "r", "c", "k", "s", "q", "d", "j", "a"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "evaluation": {"num_words": 7904, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 71153, "num_graphemes": 30, "num_phones": 69}}
+ {"version": "3.3.0", "architecture": "phonetisaurus", "train_date": "2026-05-12 11:53:03.341229", "phones": ["aj", "aw", "b", "b\u02b2", "c", "c\u02b0", "c\u02b7", "d", "d\u0292", "d\u02b2", "d\u032a", "ej", "f", "f\u02b2", "h", "i", "i\u02d0", "j", "k", "k\u02b0", "k\u02b7", "l", "m", "m\u02b2", "m\u0329", "n", "n\u0329", "ow", "p", "p\u02b0", "p\u02b2", "p\u02b7", "s", "t", "t\u0283", "t\u02b0", "t\u02b2", "t\u02b7", "t\u032a", "v", "v\u02b2", "w", "z", "\u00e6", "\u00e7", "\u00f0", "\u014b", "\u0250", "\u0251", "\u0251\u02d0", "\u0252", "\u0252\u02d0", "\u0254j", "\u0259", "\u025a", "\u025b", "\u025d", "\u025f", "\u025f\u02b7", "\u0261", "\u0261\u02b7", "\u026a", "\u026b", "\u026b\u0329", "\u0271", "\u0272", "\u0279", "\u027e", "\u027e\u02b2", "\u027e\u0303", "\u0283", "\u0289", "\u0289\u02d0", "\u028a", "\u028e", "\u0292", "\u0294", "\u03b8"], "graphemes": ["-", "k", "j", "q", "u", "a", "z", "g", "\u00fc", "t", "x", "d", "r", "w", "b", "y", "v", "n", "e", "c", "p", "m", "i", "f", "'", "h", "\u00f4", "o", "s", "l"], "grapheme_order": 2, "phone_order": 2, "sequence_separator": "|", "unicode_decomposition": false, "evaluation": {"num_words": 8068, "word_error_rate": null, "phone_error_rate": null}, "training": {"num_words": 72625, "num_graphemes": 30, "num_phones": 78}}
g2p/english_us_mfa/model.fst CHANGED

Git LFS Details

  • SHA256: 0da3ed70916a4b541d00cfc7e5dbc2d21086df2f6ace4b84be7d0bda2557ed06
  • Pointer size: 133 Bytes
  • Size of remote file: 37.7 MB

Git LFS Details

  • SHA256: 001bc21fe1934f3b1e86ae006aca9673c6be60de33698101396bec8545b3f7f3
  • Pointer size: 133 Bytes
  • Size of remote file: 38.9 MB
g2p/english_us_mfa/phones.txt ADDED
@@ -0,0 +1,71 @@
+ <eps> 0
+ ə 1
+ ɪ 2
+ d 3
+ t 4
+ m 5
+ ɛ 6
+ n 7
+ ɫ 8
+ ɹ 9
+ ɚ 10
+ ʔ 11
+ z 12
+ s 13
+ ɐ 14
+ v 15
+ ɑː 16
+ ej 17
+ dʲ 18
+ iː 19
+ ɑ 20
+ dʒ 21
+ vʲ 22
+ ɒː 23
+ bʲ 24
+ tʃ 25
+ æ 26
+ b 27
+ ow 28
+ aj 29
+ cʰ 30
+ p 31
+ pʰ 32
+ k 33
+ kʰ 34
+ ʊ 35
+ j 36
+ ɒ 37
+ ɡ 38
+ ʎ 39
+ i 40
+ l 41
+ w 42
+ f 43
+ h 44
+ ʉ 45
+ ɲ 46
+ θ 47
+ ʉː 48
+ tʲ 49
+ ʃ 50
+ c 51
+ tʰ 52
+ pʲ 53
+ ŋ 54
+ ʒ 55
+ tʷ 56
+ mʲ 57
+ ç 58
+ ɝ 59
+ ɔj 60
+ aw 61
+ ɟ 62
+ kʷ 63
+ fʲ 64
+ cʷ 65
+ ð 66
+ ɟʷ 67
+ ɡʷ 68
+ pʷ 69
+ m̩ 70