diff --git a/README.md b/README.md index 25ab99438e5368aabd19fc7a4d6cba143334782a..bd3ffb591ca583ce362b3bb3d988db14cfbfc8a2 100644 --- a/README.md +++ b/README.md @@ -23,14 +23,14 @@ dataset_info: metrics: - name: best_compression_ratio type: compression - value: 3.456 + value: 3.497 - name: best_isotropy type: isotropy - value: 0.0028 + value: 0.0023 - name: vocabulary_size type: vocab - value: 1659 -generated: 2025-12-28 + value: 0 +generated: 2026-01-03 --- # CHY - Wikilangs Models @@ -44,12 +44,13 @@ We analyze tokenizers, n-gram models, Markov chains, vocabulary statistics, and ### Models & Assets - Tokenizers (8k, 16k, 32k, 64k) -- N-gram models (2, 3, 4-gram) -- Markov chains (context of 1, 2, 3 and 4) +- N-gram models (2, 3, 4, 5-gram) +- Markov chains (context of 1, 2, 3, 4 and 5) - Subword N-gram and Markov chains -- Embeddings in various sizes and dimensions +- Embeddings in various sizes and dimensions (aligned and unaligned) - Language Vocabulary - Language Statistics + ![Performance Dashboard](visualizations/performance_dashboard.png) ### Analysis and Evaluation @@ -59,7 +60,8 @@ We analyze tokenizers, n-gram models, Markov chains, vocabulary statistics, and - [3. Markov Chain Evaluation](#3-markov-chain-evaluation) - [4. Vocabulary Analysis](#4-vocabulary-analysis) - [5. Word Embeddings Evaluation](#5-word-embeddings-evaluation) -- [6. Summary & Recommendations](#6-summary--recommendations) +- [6. Morphological Analysis (Experimental)](#6-morphological-analysis) +- [7. 
Summary & Recommendations](#7-summary--recommendations) - [Metrics Glossary](#appendix-metrics-glossary--interpretation-guide) - [Visualizations Index](#visualizations-index) @@ -68,50 +70,45 @@ We analyze tokenizers, n-gram models, Markov chains, vocabulary statistics, and ![Tokenizer Compression](visualizations/tokenizer_compression.png) +![Tokenizer Fertility](visualizations/tokenizer_fertility.png) + +![Tokenizer OOV](visualizations/tokenizer_oov.png) + +![Total Tokens](visualizations/tokenizer_total_tokens.png) + ### Results | Vocab Size | Compression | Avg Token Len | UNK Rate | Total Tokens | |------------|-------------|---------------|----------|--------------| -| **8k** | 3.426x | 3.37 | 0.0811% | 33,276 | -| **16k** | 3.456x 🏆 | 3.40 | 0.0819% | 32,987 | +| **8k** | 3.497x 🏆 | 3.52 | 0.0960% | 19,785 | ### Tokenization Examples Below are sample sentences tokenized with each vocabulary size: -**Sample 1:** `Môxéhéó'o (vé'ho'énêstsestôtse: broom, "sweeping [thing]") Pl: môxéheonôtse. - - -C...` +**Sample 1:** `Vášêtaëno, Amâho'hestôtse (Pl: amâho'héstotôtse) Ama'éno'hamémôxe'êstoo'o Ama'én...` | Vocab | Tokens | Count | |-------|--------|-------| -| 8k | `▁môxéhéó ' o ▁( vé ' ho ' énêstsestôtse : ... (+18 more)` | 28 | -| 16k | `▁môxéhéó ' o ▁( vé ' ho ' énêstsestôtse : ... (+17 more)` | 27 | +| 8k | `▁vášêtaëno , ▁amâho ' hestôtse ▁( pl : ▁amâho ' ... (+16 more)` | 26 | -**Sample 2:** `Brazil, na'éstse ho'e-éve, Amérika. - -Category:Brazil` +**Sample 2:** `Ma'xeamóvôhtó'hestôtse, Éam-óvôhtó'heo'o. thumb|right thumb|right thumb|Daimler-...` | Vocab | Tokens | Count | |-------|--------|-------| -| 8k | `▁brazil , ▁na ' éstse ▁ho ' e - éve ... (+6 more)` | 16 | -| 16k | `▁brazil , ▁na ' éstse ▁ho ' e - éve ... (+6 more)` | 16 | - -**Sample 3:** `Boise, na'éstse manâhéno, Idaho. +| 8k | `▁ma ' xeamóvôhtó ' hestôtse , ▁éam - óvôhtó ' ... 
(+16 more)` | 26 | -Category:Mâhoestôtse` +**Sample 3:** `Mâhpémo'éhe (Alces alces) máto héva popóhpoévêsémo'éhe váótséva-éve.` | Vocab | Tokens | Count | |-------|--------|-------| -| 8k | `▁boise , ▁na ' éstse ▁manâhéno , ▁idaho . ▁category ... (+2 more)` | 12 | -| 16k | `▁boise , ▁na ' éstse ▁manâhéno , ▁idaho . ▁category ... (+2 more)` | 12 | +| 8k | `▁mâhpémo ' éhe ▁( alces ▁alces ) ▁máto ▁héva ▁popóhpoévêsémo ... (+6 more)` | 16 | ### Key Findings -- **Best Compression:** 16k achieves 3.456x compression -- **Lowest UNK Rate:** 8k with 0.0811% unknown tokens +- **Best Compression:** 8k achieves 3.497x compression +- **Lowest UNK Rate:** 8k with 0.0960% unknown tokens - **Trade-off:** Larger vocabularies improve compression but increase model size - **Recommendation:** 32k vocabulary provides optimal balance for production use @@ -120,57 +117,89 @@ Category:Mâhoestôtse` ![N-gram Perplexity](visualizations/ngram_perplexity.png) +![N-gram Unique](visualizations/ngram_unique.png) + ![N-gram Coverage](visualizations/ngram_coverage.png) ### Results -| N-gram | Perplexity | Entropy | Unique N-grams | Top-100 Coverage | Top-1000 Coverage | -|--------|------------|---------|----------------|------------------|-------------------| -| **2-gram** | 237 🏆 | 7.89 | 654 | 65.5% | 100.0% | -| **2-gram** | 360 🏆 | 8.49 | 1,127 | 58.7% | 99.4% | -| **3-gram** | 533 | 9.06 | 1,211 | 49.6% | 95.0% | -| **3-gram** | 1,561 | 10.61 | 4,876 | 32.9% | 74.3% | -| **4-gram** | 1,077 | 10.07 | 2,302 | 37.1% | 77.9% | -| **4-gram** | 3,419 | 11.74 | 11,151 | 25.8% | 57.5% | +| N-gram | Variant | Perplexity | Entropy | Unique N-grams | Top-100 Coverage | Top-1000 Coverage | +|--------|---------|------------|---------|----------------|------------------|-------------------| +| **2-gram** | Word | 102 🏆 | 6.68 | 159 | 86.3% | 100.0% | +| **2-gram** | Subword | 330 | 8.37 | 871 | 59.3% | 100.0% | +| **3-gram** | Word | 156 | 7.28 | 245 | 72.6% | 100.0% | +| **3-gram** | Subword | 1,700 | 
10.73 | 3,811 | 27.1% | 73.0% | +| **4-gram** | Word | 310 | 8.27 | 449 | 52.1% | 100.0% | +| **4-gram** | Subword | 4,072 | 11.99 | 8,559 | 18.3% | 52.6% | ### Top 5 N-grams by Size -**2-grams:** +**2-grams (Word):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `na éstse` | 161 | +| 2 | `vé ho` | 119 | +| 3 | `ho énêstsestôtse` | 72 | +| 4 | `republic of` | 67 | +| 5 | `ho e` | 57 | + +**3-grams (Word):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `vé ho énêstsestôtse` | 72 | +| 2 | `na éstse manâhéno` | 56 | +| 3 | `ho e éve` | 49 | +| 4 | `na éstse ho` | 48 | +| 5 | `éstse ho e` | 48 | + +**4-grams (Word):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `éstse ho e éve` | 48 | +| 2 | `na éstse ho e` | 48 | +| 3 | `ma kaetaévôxe êstoo o` | 25 | +| 4 | `toháano éve ho etse` | 23 | +| 5 | `na éstse manâhéno ho` | 22 | + +**2-grams (Subword):** | Rank | N-gram | Count | |------|--------|-------| -| 1 | `category :` | 973 | -| 2 | `' e` | 663 | -| 3 | `ho '` | 511 | -| 4 | `' o` | 391 | -| 5 | `. category` | 332 | +| 1 | `e _` | 1,534 | +| 2 | `s e` | 1,395 | +| 3 | `s t` | 1,310 | +| 4 | `t s` | 1,310 | +| 5 | `h e` | 1,012 | -**3-grams:** +**3-grams (Subword):** | Rank | N-gram | Count | |------|--------|-------| -| 1 | `. 
category :` | 331 | -| 2 | `na ' éstse` | 288 | -| 3 | `' ho '` | 225 | -| 4 | `| thumb |` | 204 | -| 5 | `| right |` | 201 | +| 1 | `t s e` | 1,002 | +| 2 | `s e _` | 580 | +| 3 | `e s t` | 468 | +| 4 | `s t s` | 459 | +| 5 | `h o '` | 443 | -**4-grams:** +**4-grams (Subword):** | Rank | N-gram | Count | |------|--------|-------| -| 1 | `, na ' éstse` | 199 | -| 2 | `| thumb | right` | 167 | -| 3 | `thumb | right |` | 155 | -| 4 | `vé ' ho '` | 131 | -| 5 | `300px | thumb |` | 128 | +| 1 | `t s e _` | 456 | +| 2 | `s t s e` | 436 | +| 3 | `ô t s e` | 287 | +| 4 | `t ô t s` | 208 | +| 5 | `e s t ô` | 198 | ### Key Findings -- **Best Perplexity:** 2-gram with 237 +- **Best Perplexity:** 2-gram (word) with 102 - **Entropy Trend:** Decreases with larger n-grams (more predictable) -- **Coverage:** Top-1000 patterns cover ~58% of corpus +- **Coverage:** Top-1000 patterns cover ~53% of corpus - **Recommendation:** 4-gram or 5-gram for best predictive performance --- @@ -178,55 +207,86 @@ Category:Mâhoestôtse` ![Markov Entropy](visualizations/markov_entropy.png) +![Markov Contexts](visualizations/markov_contexts.png) + ![Markov Branching](visualizations/markov_branching.png) ### Results -| Context | Avg Entropy | Perplexity | Branching Factor | Unique Contexts | Predictability | -|---------|-------------|------------|------------------|-----------------|----------------| -| **1** | 0.3489 | 1.274 | 2.42 | 4,255 | 65.1% | -| **1** | 1.3997 | 2.638 | 10.97 | 189 | 0.0% | -| **2** | 0.1560 | 1.114 | 1.36 | 10,197 | 84.4% | -| **2** | 1.2345 | 2.353 | 5.27 | 2,073 | 0.0% | -| **3** | 0.0936 | 1.067 | 1.18 | 13,745 | 90.6% | -| **3** | 0.6401 | 1.558 | 2.32 | 10,919 | 36.0% | -| **4** | 0.0555 🏆 | 1.039 | 1.10 | 16,004 | 94.5% | -| **4** | 0.2796 🏆 | 1.214 | 1.44 | 25,260 | 72.0% | +| Context | Variant | Avg Entropy | Perplexity | Branching Factor | Unique Contexts | Predictability | 
+|---------|---------|-------------|------------|------------------|-----------------|----------------| +| **1** | Word | 0.4118 | 1.330 | 2.00 | 3,383 | 58.8% | +| **1** | Subword | 1.3726 | 2.589 | 9.72 | 175 | 0.0% | +| **2** | Word | 0.1118 | 1.081 | 1.20 | 6,516 | 88.8% | +| **2** | Subword | 1.2032 | 2.303 | 5.05 | 1,699 | 0.0% | +| **3** | Word | 0.0474 | 1.033 | 1.08 | 7,515 | 95.3% | +| **3** | Subword | 0.6524 | 1.572 | 2.34 | 8,541 | 34.8% | +| **4** | Word | 0.0269 🏆 | 1.019 | 1.04 | 7,792 | 97.3% | +| **4** | Subword | 0.2844 | 1.218 | 1.44 | 19,944 | 71.6% | + +### Generated Text Samples (Word-based) + +Below are text samples generated from each word-based Markov chain model: + +**Context Size 1:** + +1. `e he óonéma enóne éohkê héška ó he hohamháa há continentan naa nêhéóhe násáahéne enomóvóhe tsé` +2. `ho honáemanėstóoseo o united states manâhestôtse 1 188 lwanda 100px mogadishu somali shilling swahil...` +3. `o vé keehoohtsêstse vó kaehevôtse vé ho énêstsestôtse billingscheyenne english dictionary chief dull...` + +**Context Size 2:** + +1. `na éstse ho e éve asia center thumb handelskade in willemstad curaçao` +2. `vé ho énêstsestôtse black hills ho honáéšé e missouri ó he e pónoeo hé e na éstse` +3. `ho énêstsestôtse cimarron river bull river forgan heévȧhetanéno` + +**Context Size 3:** + +1. `vé ho énêstsestôtse bay horse variant tsé vó névóvâtse` +2. `na éstse manâhéno ho honáéšé e united states óoetaneo o óoetanéno tsé amo eétâhéstove vé ho énêstses...` +3. `ho e éve hóxovê hooma center frameless upright 1 5` + +**Context Size 4:** + +1. `éstse ho e éve meško` +2. `na éstse ho e éve vietnam dong hoi airport` +3. `ma kaetaévôxe êstoo o sango toháano éve ho etse 622 984 4 216 666 1 198 chad republic of` -### Generated Text Samples -Below are text samples generated from each Markov chain model: +### Generated Text Samples (Subword-based) + +Below are text samples generated from each subword-based Markov chain model: **Context Size 1:** -1. 
`' konénėhesó - éve . manâhestôtse 7 heše ' tavö ' éhoo ' he tsénėxhésemé '` -2. `, na ' he tsénėxhésemé ' evo ' éno ' e 1904 , na ' otsenáhkohe` -3. `: turkey , na ' o tsétsêhéstâhese - éve . gus . curitiba - éve .` +1. `e_a_29150002)_te` +2. `_rkul_mâxpoeése.` +3. `a_xema'ėh-évese:` **Context Size 2:** -1. `category : ó ' he ( vé ' ho ' énestse 71 , 740 6 , 418` -2. `' e 300px | thumb | amâho ' hestôtse amêške tsémo ' ôhtávoome amâho ' hestôtse category` -3. `ho ' xó ' mâhoéve ' ho ' énêstsestôtse : purgatoire river , picketwire river ) -` +1. `e_(vé'še'tó'neadd` +2. `seotó'o_poestôtse` +3. `ts_wymnetugual_ju` **Context Size 3:** -1. `. category : mâhoestôtse category : california` -2. `na ' éstse ho ' e - éve hóxovê - hooma , asia ) . *` -3. `' ho ' e - éve , meško . category : mâhoestôtse category : ho ' honáeo '` +1. `tse._vé'ho'hé'e_bo` +2. `se_rokese_mâhestȯt` +3. `estôtse_manâhá'e_(` **Context Size 4:** -1. `, na ' éstse ho ' e - éve , amérika . *` -2. `| thumb | right | hóxeeséeto ' hamestôtse 300px | thumb | right | méstaa ' êhéhe category :` -3. `thumb | right | hotóhkeo ' o tsénésôhtôxese 300px | thumb | right | mámaa ' e mámaa '` +1. `tse_hotómá'e_12_évȯ` +2. `stseévenomo_hovanan` +3. 
`ôtse:_ten_sage";_ar` ### Key Findings -- **Best Predictability:** Context-4 with 94.5% predictability +- **Best Predictability:** Context-4 (word) with 97.3% predictability - **Branching Factor:** Decreases with context size (more deterministic) -- **Memory Trade-off:** Larger contexts require more storage (25,260 contexts) +- **Memory Trade-off:** Larger contexts require more storage (19,944 contexts) - **Recommendation:** Context-3 or Context-4 for text generation --- @@ -242,64 +302,64 @@ Below are text samples generated from each Markov chain model: | Metric | Value | |--------|-------| -| Vocabulary Size | 1,659 | -| Total Tokens | 14,360 | -| Mean Frequency | 8.66 | +| Vocabulary Size | 1,237 | +| Total Tokens | 8,401 | +| Mean Frequency | 6.79 | | Median Frequency | 3 | -| Frequency Std Dev | 38.60 | +| Frequency Std Dev | 21.86 | ### Most Common Words | Rank | Word | Frequency | |------|------|-----------| -| 1 | category | 974 | -| 2 | e | 690 | -| 3 | ho | 531 | -| 4 | o | 414 | -| 5 | na | 293 | -| 6 | éstse | 288 | -| 7 | right | 260 | -| 8 | éve | 259 | -| 9 | thumb | 226 | -| 10 | vé | 180 | +| 1 | e | 431 | +| 2 | ho | 377 | +| 3 | o | 237 | +| 4 | na | 165 | +| 5 | vé | 161 | +| 6 | éstse | 161 | +| 7 | éve | 154 | +| 8 | of | 118 | +| 9 | he | 108 | +| 10 | naa | 108 | ### Least Common Words (from vocabulary) | Rank | Word | Frequency | |------|------|-----------| -| 1 | evenóse | 2 | -| 2 | mountain | 2 | -| 3 | cal | 2 | -| 4 | poly | 2 | -| 5 | mustangs | 2 | -| 6 | sevonévo | 2 | -| 7 | ėstovátamevéotse | 2 | -| 8 | ėstova | 2 | -| 9 | nėstse | 2 | -| 10 | 2025 | 2 | +| 1 | mustangs | 2 | +| 2 | sevonévo | 2 | +| 3 | ėstovátamevéotse | 2 | +| 4 | ėstova | 2 | +| 5 | nėstse | 2 | +| 6 | kūnas | 2 | +| 7 | epsteins | 2 | +| 8 | ir | 2 | +| 9 | felon | 2 | +| 10 | immigrants | 2 | ### Zipf's Law Analysis | Metric | Value | |--------|-------| -| Zipf Coefficient | 0.8829 | -| R² (Goodness of Fit) | 0.980523 | +| Zipf Coefficient | 0.8151 | +| R² 
(Goodness of Fit) | 0.976018 | | Adherence Quality | **excellent** | ### Coverage Analysis | Top N Words | Coverage | |-------------|----------| -| Top 100 | 57.7% | -| Top 1,000 | 90.8% | +| Top 100 | 54.8% | +| Top 1,000 | 94.4% | | Top 5,000 | 0.0% | | Top 10,000 | 0.0% | ### Key Findings -- **Zipf Compliance:** R²=0.9805 indicates excellent adherence to Zipf's law -- **High Frequency Dominance:** Top 100 words cover 57.7% of corpus -- **Long Tail:** -8,341 words needed for remaining 100.0% coverage +- **Zipf Compliance:** R²=0.9760 indicates excellent adherence to Zipf's law +- **High Frequency Dominance:** Top 100 words cover 54.8% of corpus +- **Long Tail:** -8,763 words needed for remaining 100.0% coverage --- ## 5. Word Embeddings Evaluation @@ -312,24 +372,109 @@ Below are text samples generated from each Markov chain model: ![t-SNE Sentences](visualizations/tsne_sentences.png) -### Model Comparison -| Model | Vocab Size | Dimension | Avg Norm | Std Norm | Isotropy | -|-------|------------|-----------|----------|----------|----------| -| **mono_32d** | 223 | 32 | 1.577 | 0.877 | 0.0028 🏆 | -| **mono_64d** | 223 | 64 | 1.556 | 0.897 | 0.0009 | -| **mono_128d** | 223 | 128 | 1.593 | 0.888 | 0.0002 | -| **embeddings_enhanced** | 0 | 0 | 0.000 | 0.000 | 0.0000 | +### 5.1 Cross-Lingual Alignment + +> *Note: Multilingual alignment visualization not available for this language.* + + +### 5.2 Model Comparison + +| Model | Dimension | Isotropy | Semantic Density | Alignment R@1 | Alignment R@10 | +|-------|-----------|----------|------------------|---------------|----------------| +| **mono_32d** | 32 | 0.0023 🏆 | 0.8533 | N/A | N/A | +| **mono_64d** | 64 | 0.0008 | 0.9264 | N/A | N/A | +| **mono_128d** | 128 | 0.0002 | 0.9821 | N/A | N/A | ### Key Findings -- **Best Isotropy:** mono_32d with 0.0028 (more uniform distribution) -- **Dimension Trade-off:** Higher dimensions capture more semantics but reduce isotropy -- **Vocabulary Coverage:** All models cover 223 
words -- **Recommendation:** 100d for balanced semantic capture and efficiency +- **Best Isotropy:** mono_32d with 0.0023 (more uniform distribution) +- **Semantic Density:** Average pairwise similarity of 0.9206. Lower values indicate better semantic separation. +- **Alignment Quality:** No aligned models evaluated in this run. +- **Recommendation:** 128d aligned for best cross-lingual performance + +--- +## 6. Morphological Analysis (Experimental) + +> ⚠️ **Warning:** This language shows low morphological productivity. The statistical signals used for this analysis may be noisy or less reliable than for morphologically rich languages. + +This section presents an automated morphological analysis derived from the statistical divergence between word-level and subword-level models. By analyzing where subword predictability spikes and where word-level coverage fails, we can infer linguistic structures without supervised data. + +### 6.1 Productivity & Complexity + +| Metric | Value | Interpretation | Recommendation | +|--------|-------|----------------|----------------| +| Productivity Index | **0.000** | Low morphological productivity | ⚠️ Likely unreliable | +| Idiomaticity Gap | **-1.000** | Low formulaic content | - | + +### 6.2 Affix Inventory (Productive Units) + +These are the most productive prefixes and suffixes identified by sampling the vocabulary for global substitutability patterns. A unit is considered an affix if stripping it leaves a valid stem that appears in other contexts. 
+ +#### Productive Prefixes +| Prefix | Examples | +|--------|----------| +| `-ho` | horse, hotoa, hoohëö | + +#### Productive Suffixes +| Suffix | Examples | +|--------|----------| +| `-e` | néstovátamevéotse, kôhtse, where | +| `-se` | néstovátamevéotse, kôhtse, kaehevotôtse | +| `-tse` | néstovátamevéotse, kôhtse, kaehevotôtse | +| `-ne` | mâhoestôtsene, kane, mâheóne | +| `-ôtse` | kaehevotôtse, oestôtse, xemenôtse | +| `-ia` | anastacia, abkhazia, shepherdia | +| `-ve` | êstonêstove, native, hestoháatamaahéstove | + +### 6.3 Bound Stems (Lexical Roots) + +Bound stems are high-frequency subword units that are semantically cohesive but rarely appear as standalone words. These often correspond to the 'core' of a word that requires inflection or derivation to be valid. + +*No significant bound stems detected.* + + +### 6.4 Affix Compatibility (Co-occurrence) + +This table shows which prefixes and suffixes most frequently co-occur on the same stems, revealing the 'stacking' rules of the language's morphology. + +| Prefix | Suffix | Frequency | Examples | +|--------|--------|-----------|----------| +| `-ho` | `-e` | 5 words | horse, hováhne | +| `-ho` | `-ne` | 2 words | hováhne, hovahne | +| `-ho` | `-se` | 1 words | horse, hotse | +| `-ho` | `-tse` | 1 words | hotse, hohpâhtsenámenôtse | +| `-ho` | `-ôtse` | 1 words | hohpâhtsenámenôtse | + +### 6.5 Recursive Morpheme Segmentation + +Using **Recursive Hierarchical Substitutability**, we decompose complex words into their constituent morphemes. This approach handles nested affixes (e.g., `prefix-prefix-root-suffix`). 
+ +| Word | Suggested Split | Confidence | Stem | +|------|-----------------|------------|------| +| mâhoestôtsene | **`mâhoest-ôtse-ne`** | 3.0 | `mâhoest` | +| sevoneóneve | **`sevoneó-ne-ve`** | 3.0 | `sevoneó` | +| náhkȯhehetanetse | **`náhkȯheheta-ne-tse`** | 3.0 | `náhkȯheheta` | +| enóseoneve | **`enóseo-ne-ve`** | 3.0 | `enóseo` | +| éestsėstóseoneve | **`éestsėstóseo-ne-ve`** | 3.0 | `éestsėstóseo` | +| kaehevotôtse | **`kaehevot-ôtse`** | 1.5 | `kaehevot` | +| anastacia | **`anastac-ia`** | 1.5 | `anastac` | +| névóvâtse | **`névóvâ-tse`** | 1.5 | `névóvâ` | +| shepherdia | **`shepherd-ia`** | 1.5 | `shepherd` | +| êstonêstove | **`êstonêsto-ve`** | 1.5 | `êstonêsto` | +| yellowstone | **`yellowsto-ne`** | 1.5 | `yellowsto` | +| hoohtseto | **`ho-ohtseto`** | 1.5 | `ohtseto` | +| xemenôtse | **`xemen-ôtse`** | 1.5 | `xemen` | +| véhonevoemėstse | **`véhonevoemės-tse`** | 1.5 | `véhonevoemės` | +| manestôtse | **`manest-ôtse`** | 1.5 | `manest` | + +### 6.6 Linguistic Interpretation + +> **Automated Insight:** +The language CHY appears to be more isolating or has a highly fixed vocabulary. Word-level models perform nearly as well as subword models, indicating fewer productive morphological processes. --- -## 6. Summary & Recommendations +## 7. 
Summary & Recommendations ![Performance Dashboard](visualizations/performance_dashboard.png) @@ -337,11 +482,12 @@ Below are text samples generated from each Markov chain model: | Component | Recommended | Rationale | |-----------|-------------|-----------| -| Tokenizer | **32k BPE** | Best compression (3.46x) with low UNK rate | -| N-gram | **5-gram** | Lowest perplexity (237) | -| Markov | **Context-4** | Highest predictability (94.5%) | +| Tokenizer | **8k BPE** | Best compression (3.50x) | +| N-gram | **2-gram** | Lowest perplexity (102) | +| Markov | **Context-4** | Highest predictability (97.3%) | | Embeddings | **100d** | Balanced semantic capture and isotropy | + --- ## Appendix: Metrics Glossary & Interpretation Guide @@ -531,7 +677,8 @@ If you use these models in your research, please cite: author = {Kamali, Omar}, title = {Wikilangs: Open NLP Models for Wikipedia Languages}, year = {2025}, - publisher = {HuggingFace}, + doi = {10.5281/zenodo.18073153}, + publisher = {Zenodo}, url = {https://huggingface.co/wikilangs} institution = {Omneity Labs} } @@ -547,7 +694,8 @@ MIT License - Free for academic and commercial use. 
- 🤗 Models: [huggingface.co/wikilangs](https://huggingface.co/wikilangs) - 📊 Data: [wikipedia-monthly](https://huggingface.co/datasets/omarkamali/wikipedia-monthly) - 👤 Author: [Omar Kamali](https://huggingface.co/omarkamali) +- 🤝 Sponsor: [Featherless AI](https://featherless.ai) --- *Generated by Wikilangs Models Pipeline* -*Report Date: 2025-12-28 22:42:59* +*Report Date: 2026-01-03 10:13:21* diff --git a/models/embeddings/monolingual/chy_128d.bin b/models/embeddings/monolingual/chy_128d.bin index 7cc0d1a73826ae004ae0bbdfe1f6ecddaa38e567..cbbc1fbaa3af948312fcbc5028d7213731c479fd 100644 --- a/models/embeddings/monolingual/chy_128d.bin +++ b/models/embeddings/monolingual/chy_128d.bin @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:9abb00efdaee95eb235694ba4738708972eef813303beafecc2c4852c0a11360 -size 1024233321 +oid sha256:fe82c87b0cea7042a6a16f85cf1875011ca32f13b8c941e25d5c1d57849b2eed +size 1024170112 diff --git a/models/embeddings/monolingual/chy_128d_metadata.json b/models/embeddings/monolingual/chy_128d_metadata.json index e6d7548ad73151b59072f478fccb96cd56bde54f..7cb412f2a05bf300795a1bd9c0deb3cc791e4e3a 100644 --- a/models/embeddings/monolingual/chy_128d_metadata.json +++ b/models/embeddings/monolingual/chy_128d_metadata.json @@ -3,11 +3,13 @@ "dimension": 128, "version": "monolingual", "training_params": { - "dim": 128, + "algorithm": "skipgram", "min_count": 5, "window": 5, "negative": 5, - "epochs": 5 + "epochs": 5, + "encoding_method": "rope", + "dim": 128 }, - "vocab_size": 223 + "vocab_size": 163 } \ No newline at end of file diff --git a/models/embeddings/monolingual/chy_32d.bin b/models/embeddings/monolingual/chy_32d.bin index faa1a2545909a3901b8e11b0d5062a490ed4c3b9..53109a70fa560ac8e5056453088c57c9e63824b2 100644 --- a/models/embeddings/monolingual/chy_32d.bin +++ b/models/embeddings/monolingual/chy_32d.bin @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid 
sha256:8c8e4f25fd101e1917ce06bf1768de9245362d868e23a5ebbbc943b0e4650a60 -size 256062057 +oid sha256:70c8741bb1d28e9983cdf32850bd345db541b195a40f92c1ebc8ba49374864be +size 256044928 diff --git a/models/embeddings/monolingual/chy_32d_metadata.json b/models/embeddings/monolingual/chy_32d_metadata.json index bf5c61f66a810a6453b2a58d2dc60838365927c8..8703755715898037e70b71b4a446ec1888736501 100644 --- a/models/embeddings/monolingual/chy_32d_metadata.json +++ b/models/embeddings/monolingual/chy_32d_metadata.json @@ -3,11 +3,13 @@ "dimension": 32, "version": "monolingual", "training_params": { - "dim": 32, + "algorithm": "skipgram", "min_count": 5, "window": 5, "negative": 5, - "epochs": 5 + "epochs": 5, + "encoding_method": "rope", + "dim": 32 }, - "vocab_size": 223 + "vocab_size": 163 } \ No newline at end of file diff --git a/models/embeddings/monolingual/chy_64d.bin b/models/embeddings/monolingual/chy_64d.bin index 8b835d503ef3904b198aa963ca8aa6ccc76c6426..6323ef4dce20317572c8cdad66ae99dc2f9b2570 100644 --- a/models/embeddings/monolingual/chy_64d.bin +++ b/models/embeddings/monolingual/chy_64d.bin @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:4c920935ac87a1441f862361919b9313ce17a00634fb4b6dc1a1bb483d79a2f5 -size 512119145 +oid sha256:235dc5785bb5f5c2123f5d894c3c7266a0f239882928a7be48b4ddbb112ff63d +size 512086656 diff --git a/models/embeddings/monolingual/chy_64d_metadata.json b/models/embeddings/monolingual/chy_64d_metadata.json index 28e834650be51bdc4e5b8fac779e22cad0c921ca..a5bb0711d4a69c2409647310e3406b462d88b0c6 100644 --- a/models/embeddings/monolingual/chy_64d_metadata.json +++ b/models/embeddings/monolingual/chy_64d_metadata.json @@ -3,11 +3,13 @@ "dimension": 64, "version": "monolingual", "training_params": { - "dim": 64, + "algorithm": "skipgram", "min_count": 5, "window": 5, "negative": 5, - "epochs": 5 + "epochs": 5, + "encoding_method": "rope", + "dim": 64 }, - "vocab_size": 223 + "vocab_size": 163 } \ No newline at end of file 
diff --git a/models/subword_markov/chy_markov_ctx1_subword.parquet b/models/subword_markov/chy_markov_ctx1_subword.parquet index 48bfbd9ca9423981a7fd71ed73dcdbb643a68ddf..b65525baa7d3f7cd71d46cb27241c8622e7e0004 100644 --- a/models/subword_markov/chy_markov_ctx1_subword.parquet +++ b/models/subword_markov/chy_markov_ctx1_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:6ea83b117f3a486981315ed6c06f66c8b179b52afa64753b10adcb037d01a299 -size 19111 +oid sha256:d18af851c234d84f1c8c493282bad10c2ce3cf38e5f884b0def3b69a8cc46a72 +size 16739 diff --git a/models/subword_markov/chy_markov_ctx1_subword_metadata.json b/models/subword_markov/chy_markov_ctx1_subword_metadata.json index 555cbdabfc245856d836baf2a585059de4b524f6..61c636a7b2c4bed78761d65a9eb60dcffce0edaa 100644 --- a/models/subword_markov/chy_markov_ctx1_subword_metadata.json +++ b/models/subword_markov/chy_markov_ctx1_subword_metadata.json @@ -2,6 +2,6 @@ "context_size": 1, "variant": "subword", "language": "chy", - "unique_contexts": 189, - "total_transitions": 113177 + "unique_contexts": 175, + "total_transitions": 68722 } \ No newline at end of file diff --git a/models/subword_markov/chy_markov_ctx2_subword.parquet b/models/subword_markov/chy_markov_ctx2_subword.parquet index 8af29c7d8c63dd216b83b54b4ae3cf708354f1f6..b51219dc720f154bc0467c8881e036051db8889d 100644 --- a/models/subword_markov/chy_markov_ctx2_subword.parquet +++ b/models/subword_markov/chy_markov_ctx2_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:cc4389e4a77f6c71ab2c352c280e4fc1281ab64c25a40abbcf7af73411115bda -size 71565 +oid sha256:8ecd88390cc78af12a57e86676cd4ac3d721a2a3b6105dedf56498e4f9198ba6 +size 55410 diff --git a/models/subword_markov/chy_markov_ctx2_subword_metadata.json b/models/subword_markov/chy_markov_ctx2_subword_metadata.json index 2e128b8c6e873f4c04c8ac8aa70c5939469d8f3f..5abcbba2adee7b69c55772e0e2126f9d51f9b219 100644 --- 
a/models/subword_markov/chy_markov_ctx2_subword_metadata.json +++ b/models/subword_markov/chy_markov_ctx2_subword_metadata.json @@ -2,6 +2,6 @@ "context_size": 2, "variant": "subword", "language": "chy", - "unique_contexts": 2073, - "total_transitions": 112352 + "unique_contexts": 1699, + "total_transitions": 68263 } \ No newline at end of file diff --git a/models/subword_markov/chy_markov_ctx3_subword.parquet b/models/subword_markov/chy_markov_ctx3_subword.parquet index 549a659dc4ac3f5025ab2f4fe797894755e9a950..cc95a10d7c95dacb2a56d9e697939efcd6c07628 100644 --- a/models/subword_markov/chy_markov_ctx3_subword.parquet +++ b/models/subword_markov/chy_markov_ctx3_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:e7f1d3cabcf310e4f6579f8d79db5b6ca9e6cb8c218d840ca5d728e17f8a9d72 -size 189294 +oid sha256:fd7301d331a8f6e5cf3904b5bd4befc8f3edf1b29659b35c4c4df3b64b73ed4d +size 150466 diff --git a/models/subword_markov/chy_markov_ctx3_subword_metadata.json b/models/subword_markov/chy_markov_ctx3_subword_metadata.json index 12bfcd9a3b0c47c5248808bf557431e91050a788..c24141efc8c80b4b285006966f092c5e10f45581 100644 --- a/models/subword_markov/chy_markov_ctx3_subword_metadata.json +++ b/models/subword_markov/chy_markov_ctx3_subword_metadata.json @@ -2,6 +2,6 @@ "context_size": 3, "variant": "subword", "language": "chy", - "unique_contexts": 10919, - "total_transitions": 111527 + "unique_contexts": 8541, + "total_transitions": 67804 } \ No newline at end of file diff --git a/models/subword_markov/chy_markov_ctx4_subword.parquet b/models/subword_markov/chy_markov_ctx4_subword.parquet index 05da4d063f6bbd0151f2c299b5e930a02f52835a..dc790e6a616230ca8a03aa4bb3c1f5ed316a9f83 100644 --- a/models/subword_markov/chy_markov_ctx4_subword.parquet +++ b/models/subword_markov/chy_markov_ctx4_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:6f537764e3ba64873ce0be3e727efce7efa3b3ee9a512520fdff8fe7b9f23660 -size 357786 
+oid sha256:3e44005f2ed1b58ece31011bfe69d86d11229eda4839c36faadb0b8ac3af5479 +size 281701 diff --git a/models/subword_markov/chy_markov_ctx4_subword_metadata.json b/models/subword_markov/chy_markov_ctx4_subword_metadata.json index e5dcbb27e6019594508ac0e8b6c7489daea460b3..c58d3718779bc58a7413fb367c8b4c8ef76e37a3 100644 --- a/models/subword_markov/chy_markov_ctx4_subword_metadata.json +++ b/models/subword_markov/chy_markov_ctx4_subword_metadata.json @@ -2,6 +2,6 @@ "context_size": 4, "variant": "subword", "language": "chy", - "unique_contexts": 25260, - "total_transitions": 110702 + "unique_contexts": 19944, + "total_transitions": 67345 } \ No newline at end of file diff --git a/models/subword_ngram/chy_2gram_subword.parquet b/models/subword_ngram/chy_2gram_subword.parquet index f7fe4054bb2b00d6e459d9f5f58e96b2c8fc82ad..eeb1a3ba644fcd9c5a6fe84b4530a5066de6cb7a 100644 --- a/models/subword_ngram/chy_2gram_subword.parquet +++ b/models/subword_ngram/chy_2gram_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:2ed544f13314cfc2324a3b1462dfcc47e71f1236aeb6319b89b7d6e18ac3920a -size 14546 +oid sha256:0228df27f63ed413c576b4cbe4ab6f76c9785023f717763c1927ee318cba7061 +size 11557 diff --git a/models/subword_ngram/chy_2gram_subword_metadata.json b/models/subword_ngram/chy_2gram_subword_metadata.json index 3dcd391380620333d51a4816e09bc1fc27ac3d70..7d60949de5fe8c91d4b002d324ae6d41b1adc0bd 100644 --- a/models/subword_ngram/chy_2gram_subword_metadata.json +++ b/models/subword_ngram/chy_2gram_subword_metadata.json @@ -2,6 +2,6 @@ "n": 2, "variant": "subword", "language": "chy", - "unique_ngrams": 1127, - "total_ngrams": 113177 + "unique_ngrams": 871, + "total_ngrams": 68722 } \ No newline at end of file diff --git a/models/subword_ngram/chy_3gram_subword.parquet b/models/subword_ngram/chy_3gram_subword.parquet index c86eaa2a4b34e13973086ad076b8d89ccda56941..9f30cbda3681577623a233a22ba3ab9e73d419ca 100644 --- 
a/models/subword_ngram/chy_3gram_subword.parquet +++ b/models/subword_ngram/chy_3gram_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:4b077e5de241ab6ccaa7e1f45221e1202536459be7a01c5b4b63e9fd20ba730d -size 54182 +oid sha256:c6c8121df7c4fff9c78c7deca418c3eaecb7394cf750b4681239ca320d9977e8 +size 41181 diff --git a/models/subword_ngram/chy_3gram_subword_metadata.json b/models/subword_ngram/chy_3gram_subword_metadata.json index 555548174e8b8190acc337c36d67838ac1a9344b..d41fc983f291e48b27620d6b8b3577b107cf57a5 100644 --- a/models/subword_ngram/chy_3gram_subword_metadata.json +++ b/models/subword_ngram/chy_3gram_subword_metadata.json @@ -2,6 +2,6 @@ "n": 3, "variant": "subword", "language": "chy", - "unique_ngrams": 4876, - "total_ngrams": 112352 + "unique_ngrams": 3811, + "total_ngrams": 68263 } \ No newline at end of file diff --git a/models/subword_ngram/chy_4gram_subword.parquet b/models/subword_ngram/chy_4gram_subword.parquet index e539b7ada332f216735381cb95f42c91973d7500..98131800dc88baaa7e492a9a2e23e06e5cf8b1dc 100644 --- a/models/subword_ngram/chy_4gram_subword.parquet +++ b/models/subword_ngram/chy_4gram_subword.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:9b1c577153cd04ac40d0b50f9a07ec2270c218aff4d156a1ece01fd0edfec031 -size 130151 +oid sha256:4bfb26d6de423f9bad18bfb5f3a57a90fada9cbd23138dc9ac7e2fcdbe0c7831 +size 100088 diff --git a/models/subword_ngram/chy_4gram_subword_metadata.json b/models/subword_ngram/chy_4gram_subword_metadata.json index b4d5facb8d4bf125a5e1a04832f4cd389ea0b219..cb5dfa6b588bb1d733a482630adbddc954f31b32 100644 --- a/models/subword_ngram/chy_4gram_subword_metadata.json +++ b/models/subword_ngram/chy_4gram_subword_metadata.json @@ -2,6 +2,6 @@ "n": 4, "variant": "subword", "language": "chy", - "unique_ngrams": 11151, - "total_ngrams": 111527 + "unique_ngrams": 8559, + "total_ngrams": 67804 } \ No newline at end of file diff --git 
a/models/tokenizer/chy_tokenizer_8k.model b/models/tokenizer/chy_tokenizer_8k.model index 3c2bd8bdb30b7d0f487e18950af3dd12076f3693..2ef754ccf84ca9730755e4045c39b9a3d235e006 100644 --- a/models/tokenizer/chy_tokenizer_8k.model +++ b/models/tokenizer/chy_tokenizer_8k.model @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:17ad6a6d3558277d6925eb68f3855cb005e2fe3557d8485bf68a6f7274296dab -size 375317 +oid sha256:7224c406ce7f0ceae9ae28bfb815c3ebc81aafabd231728ea1823e5477c23e0d +size 374664 diff --git a/models/tokenizer/chy_tokenizer_8k.vocab b/models/tokenizer/chy_tokenizer_8k.vocab index 696dd1ad2040390fcaac724f8d8dab9d8ce0a04b..080cb7183827b809db0c9bc73806dc658602c9ab 100644 --- a/models/tokenizer/chy_tokenizer_8k.vocab +++ b/models/tokenizer/chy_tokenizer_8k.vocab @@ -4,7997 +4,7997 @@ 0 se -0 st -1 -ho -2 -he -3 -▁c -4 -at -5 -an -6 -or -7 -ate -8 -ory -9 -ateg -10 -ategory -11 -▁category -12 -tse -13 -▁m -14 -ne -15 -ve -16 -ht -17 -▁n -18 -stse -19 -hé -20 -ôtse -21 -me -22 -no -23 -ri -24 -▁( -25 -▁ho -26 -▁t -27 -on -28 -▁na -29 -sé -30 -stôtse -31 -ma -32 -éve -33 -in -34 -éstse -35 -px -36 -vé -37 -▁a -38 -va -39 -ha -40 -th -41 -rig -42 -right -43 -mb -44 -mâ -45 -en -46 -ta -47 -▁p -48 -hk -49 -umb -50 -êstse -51 -hó -52 -sto -53 -▁he -54 -▁o -55 -ane -56 +he -2 +ho -3 +an -4 +▁m -5 +tse -6 +▁n -7 +▁t -8 +stse -9 +ve -10 +on -11 +▁( -12 +hé -13 +en -14 +▁ho -15 +ôtse -16 +▁na -17 +in -18 +▁c -19 +én -20 +am -21 +▁v -22 +te -23 +ne -24 +sé -25 +ar -26 +▁o -27 +éve -28 +▁p -29 +stôtse -30 +ht -31 +▁h -32 +▁s -33 +om -34 +hk -35 +▁he -36 +êstse -37 +éstse -38 +▁é -39 +sto -40 +no -41 +re -42 +ta -43 +vé -44 +▁a -45 +er -46 +to -47 +ic -48 +va -49 +ia -50 +em -51 +▁b -52 +ha -53 +▁k -54 +le -55 +sta -56 tó -57 -▁s -58 -thumb -59 -re -60 -vó -61 -ia -62 -šk -63 -to -64 -▁é -65 -vo -66 -htá -67 -le -68 -estôtse -69 -er -70 -▁man -71 -xe -72 -mâho -73 -én -74 -ic -75 -há -76 -ná -77 -hp -78 -us -79 -▁hé -80 -né -81 -šé -82 -▁k -83 -it -84 -▁b 
-85 -eo -86 -mo -87 -un -88 -má -89 -▁ma -90 -vá -91 -âhé -92 -ar -93 -▁d -94 -heo -95 -honá -96 -mâhoestôtse -97 -sê -98 -še -99 -▁tsé -100 -âhe -101 -êse -102 -▁mo -103 -and -104 -al -105 -htáme -106 -▁manâhé -107 -ol -108 -om -109 -éno -110 -▁naa -111 -tane -112 -ed -113 -▁manâhéno -114 -énêstse -115 -nó -116 -honáé -117 -il -118 -is -119 -tsé -120 -es -121 -as -122 -év -123 -▁vé -124 -hoo -125 -honáéšé -126 -ing -127 -▁hó -128 -), -129 -ško -130 -mé -131 -ion -132 -tsê -133 -ra -134 -▁* -135 -na -136 -ro -137 -▁st -138 -so -139 -▁má -140 -ȯtse -141 -tan -142 -▁of -143 -▁vó -144 -énêstsestôtse -145 -sta -146 -hi -147 -▁l -148 -stȯtse -149 -▁g -150 -▁w -151 -vê -152 +▁d -58 +ane -59 +un -60 +us -61 +ma -62 +la -63 +▁man -64 +hp -65 +▁hé -66 +▁ma -67 +ri -68 +óne -69 +énêstse -70 +and -71 +âhe -72 +eo -73 +um -74 +▁naa -75 +ra -76 +âhé -77 +▁tsé -78 +vo -79 +es -80 +ti -81 +éno -82 +šé -83 +ná -84 +há -85 +xe -86 +htá -87 +▁hó -88 +is -89 +ts -90 +▁mo -91 +▁vé -92 +ame -93 +énêstsestôtse -94 +še -95 +hum -96 +▁* -97 +ing -98 +▁of -99 +ke -100 +humb -101 +), -102 +ol -103 +▁sta -104 +ite -105 +▁g -106 +▁re -107 +▁state -108 +al -109 +âho -110 +▁w -111 +▁un -112 +▁vó -113 +▁states -114 +▁manâhé -115 +me -116 +ro -117 +▁e -118 +one -119 +êse -120 +óv -121 +êhe -122 +▁ch -123 +▁f -124 +tan -125 +honá -126 +né -127 +▁the -128 +▁thumb -129 +▁manâhéno -130 +or -131 +vá -132 +▁má -133 +hó -134 +ited -135 +▁united -136 +▁am -137 +▁ó -138 +est -139 +ian -140 +lo -141 +án -142 +ȯtse -143 +it -144 +ov -145 +htse -146 +honáé -147 +év -148 +oma -149 +ter -150 +as -151 +▁ts -152 hko -153 -▁re -154 -▁un -155 -hést -156 -ited -157 -▁hést -158 -▁state -159 -▁f -160 -▁states -161 -ke -162 -óne -163 -▁e -164 -kêse -165 -kêseho -166 -só -167 -hpe -168 -htse -169 -▁the -170 -ac -171 -nė -172 -xo -173 -▁ó -174 -hke -175 -▁hotó -176 -ka -177 -▁me -178 -mene -179 -▁united -180 -ad -181 -ge -182 -ésto -183 -ur -184 -ánó -185 -▁máhtáme -186 -pu -187 -ôxe -188 -ation -189 -hne -190 -stó -191 
-▁mâ -192 -ánóva -193 -han -194 -hetane -195 -lo -196 -ce -197 -ëö -198 -▁há -199 -▁ne -200 -ano -201 -hno -202 -tanéno -203 -▁héstánóva -204 -ww -205 -ot -206 -▁j -207 -sia -208 -▁hotómá -209 -ko -210 -seo -211 -ške -212 -▁ame -213 -onôtse -214 -máhtáme -215 -ina -216 -▁ch -217 -vóne -218 -co -219 -id -220 -ant -221 -men -222 -éhe -223 -pa -224 -▁ve -225 -hotó -226 -la -227 -xa -228 -ric -229 -âhese -230 -sóvóne -231 -// -232 -tp -233 -▁" -234 -:// -235 -sus -236 -http -237 -▁hova -238 -bl -239 -vi -240 -ôh -241 -êsto -242 -nestȯtse -243 -ap -244 -sa -245 -ȧhe -246 -némene -247 -némenestôtse -248 -ham -249 -héó -250 -ȯhó -251 +honáéšé -154 +il -155 +šk -156 +ina -157 +el -158 +éhe -159 +estôtse -160 +pu -161 +at -162 +ot -163 +▁" -164 +▁j -165 +má -166 +vó -167 +▁tse -168 +▁hotó -169 +ie -170 +ur -171 +▁ne -172 +tane -173 +ésto -174 +stȯtse -175 +sia -176 +otse -177 +▁hést -178 +ae -179 +ôxe -180 +tion -181 +htáme -182 +ad -183 +▁l -184 +ham -185 +ôhk -186 +▁pl -187 +xêse -188 +▁máhtáme -189 +ge -190 +ir -191 +ko -192 +oe -193 +eno -194 +xem -195 +▁china -196 +ch -197 +êhé -198 +ee -199 +na -200 +ôh -201 +▁i -202 +êst -203 +óva -204 +▁an -205 +ck -206 +ka -207 +hpe -208 +ėstse -209 +bl -210 +qu -211 +so -212 +her -213 +xov -214 +éma -215 +▁ar -216 +tanéno -217 +if -218 +mo -219 +ul -220 +ëö -221 +ver -222 +land -223 +ce -224 +han -225 +arian -226 +enter -227 +mé -228 +▁se -229 +▁ta -230 +ánóva -231 +enôtse -232 +hn -233 +sa -234 +êš -235 +hey -236 +sóv -237 +éne -238 +ëno -239 +▁mé -240 +enne -241 +publ -242 +êsto -243 +ariant -244 +sóvóne -245 +heyenne -246 +co -247 +th -248 +▁au -249 +▁in -250 +estó -251 htsé -252 -▁tse -253 -tsêhést -254 -tsêhéstâhese -255 -et -256 -if -257 -ôhe -258 -ȧhé -259 -▁an -260 -▁pl -261 -xêse -262 -meško -263 -▁meško -264 -ȯhónestȯtse -265 -em -266 -▁h -267 -ame -268 -oné -269 -the -270 -ëno -271 -sêstse -272 -hoestôtse -273 -go -274 -po -275 -qu -276 -▁in -277 -land -278 -tahe -279 -ôhtá -280 -ul -281 -cen -282 -esé -283 -otse -284 
-xovê -285 -évȧhe -286 -▁china -287 -šê -288 -ata -289 -man -290 -▁mé -291 -blic -292 -hooma -293 -▁mâhe -294 -hotóma -295 -menôtse -296 -▁mâheóne -297 -). -298 -ee -299 -ft -300 -ver -301 -ania -302 -hova -303 -êheo -304 -public -305 -bu -306 -el -307 -kê -308 -kô -309 -oo -310 -▁i -311 -ans -312 -ian -313 -▁vo -314 -oland -315 -hovahne -316 -évȧhetanéno -317 -▁mâhoestôtse -318 -oe -319 -ti -320 -▁au -321 -▁ta -322 -vari -323 -ȧhého -324 -ch -325 -ir -326 -nê -327 -ou -328 -ter -329 -https -330 -mâhéó -331 -▁asia -332 -variant -333 -▁hóxovê -334 -hetaneho -335 -ay -336 -les -337 -ôse -338 -▁co -339 -môxe -340 -éhno -341 -ôhné -342 -▁chi -343 -▁col -344 -▁con -345 -▁éve -346 -htsêstse -347 -▁republic -348 -ehe -349 -hey -350 -hpo -351 -éso -352 -enne -353 -estó -354 -ôhném -355 -anêheo -356 -heyenne -357 -hoohtsêstse -358 -ôhnéménêstse -359 -ig -360 -vâ -361 -chi -362 -ome -363 -▁ar -364 -▁év -365 -ings -366 -rica -367 -▁mâhoestôtsene -368 -eb -369 -pp -370 -sė -371 -▁- -372 -háa -373 -éma -374 -óho -375 -▁ha -376 -▁oó -377 -▁th -378 -▁to -379 -pain -380 -▁hóestó -381 -▁manȧhého -382 -am -383 -▁– -384 -ana -385 -ild -386 -onê -387 -ort -388 -www -389 -▁ri -390 -▁se -391 -séeo -392 -tame -393 -ther -394 -taneo -395 -énano -396 -ceania -397 -mâhoéve -398 -mó -399 -xê -400 -▁r -401 -tal -402 -▁de -403 -▁ná -404 -hése -405 -▁hoo -406 -nestse -407 -▁hesto -408 -▁oxêse -409 -▁variant -410 -héstanêheo -411 -▁évȯhónestȯtse -412 -de -413 -hä -414 -te -415 -tr -416 -anė -417 -est -418 -ish -419 -▁mȧ -420 -▁né -421 -▁nė -422 -▁po -423 -ôhke -424 -šeno -425 -build -426 -heová -427 -hkohe -428 -▁hoham -429 -▁tséhe -430 -oceania -431 -buildings -432 -heováéstse -433 -heévȧhetanéno -434 -▁tsétsêhéstâhese -435 -gl -436 -pe -437 -ty -438 -tá -439 -ces -440 -eno -441 -hóo -442 -ohe -443 -anad -444 -häme -445 -vone -446 -vého -447 -xeme -448 -▁not -449 -▁poland -450 -hestôtse -451 -") -452 -ck -453 -gu -454 -ib -455 -evo -456 -ine -457 -ple -458 -óve -459 -▁al -460 -▁gr -461 -▁mi 
-462 -aneo -463 -name -464 -census -465 -▁river -466 -ab -467 -im -468 -ut -469 -▁[ -470 -ati -471 -com -472 -ene -473 -hkê -474 -nam -475 -pul -476 -êst -477 -óhk -478 -▁(" -479 -▁ka -480 -▁sa -481 -▁vá -482 -hame -483 -viet -484 -▁ind -485 -eople -486 -êstoo -487 -heséeo -488 -▁thumb -489 -hanôtse -490 -vietnam -491 -véhoname -492 -âhestôtse -493 -ki -494 -um -495 -ary -496 -fer -497 -hta -498 -hëö -499 -oná -500 -sëö -501 -évo -502 -hone -503 -šêta -504 -▁cen -505 -▁tsê -506 -feren -507 -véhoo -508 -vôtse -509 -▁auto -510 -anėheo -511 -éstove -512 -▁grand -513 -▁theft -514 -šêtaëno -515 -nėsóvóne -516 -▁tséhetó -517 -kêhanôtse -518 -tsêhetane -519 -tsêhetaneonôtse -520 -be -521 -ef -522 -ph -523 -éh -524 -óe -525 -ese -526 -kee -527 -one -528 -org -529 -▁ca -530 -▁pa -531 -▁tó -532 -apan -533 -haná -534 -port -535 -staa -536 -▁nor -537 -▁son -538 -hóomo -539 -mahpe -540 -mâheo -541 -▁hánê -542 -htseto -543 -▁referen -544 -▁cheyenne -545 -▁hoohtseto -546 -▁hánêsóvóne -547 -tsétsêhéstâhese -548 -ae -549 -mö -550 -ry -551 -ya -552 -za -553 -ehá -554 -hae -555 -hto -556 -ial -557 -ire -558 -ity -559 -nia -560 -nif -561 -oma -562 -son -563 -ôho -564 -ôhé -565 -▁km -566 -▁mó -567 -▁šé -568 -hamé -569 -hest -570 -nomá -571 -unty -572 -▁heó -573 -stone -574 -vátame -575 -▁chile -576 -▁héstanėheo -577 -▁references -578 -ba -579 -mê -580 -ôx -581 -ack -582 -ila -583 -nėx -584 -yom -585 -▁bo -586 -▁da -587 -▁oo -588 -▁xa -589 -nėse -590 -rika -591 -▁and -592 -▁bri -593 -▁éše -594 -trans -595 -▁môxe -596 -hésemé -597 -people -598 -sevone -599 -yoming -600 -éhnėse -601 -▁spain -602 -óonôtse -603 -▁tsénėx -604 -hóomoehá -605 -▁america -606 -portation -607 -▁hánėsóvóne -608 -évȯhónestȯtse -609 -▁tsénėxhésemé -610 -transportation -611 -'- -612 -ea -613 -ga -614 -kó -615 -sá -616 -uk -617 -ws -618 -▁y -619 -"), -620 -all -621 -col -622 -htó -623 -noo -624 -per -625 -ran -626 -▁ro -627 -▁sé -628 -▁tá -629 -inus -630 -left -631 -ohke -632 -tion -633 -▁oóo -634 -anada -635 -estse 
-636 -hóhtá -637 -lands -638 -tsêhé -639 -tôtse -640 -évâhe -641 -óhomö -642 -ėstse -643 -htsená -644 -staane -645 -éstova -646 -▁sonic -647 -énestse -648 -▁census -649 -▁oóoxêse -650 -šenovôtse -651 -▁nėstaane -652 -ca -653 -eé -654 -hö -655 -ik -656 -kâ -657 -ss -658 -ts -659 -xâ -660 -▁v -661 -ado -662 -ahe -663 -aho -664 -eso -665 -ica -666 -ill -667 -ral -668 -raz -669 -sis -670 -too -671 -ull -672 -äma -673 -▁gu -674 -▁is -675 -▁no -676 -asia -677 -nife -678 -stsé -679 -véto -680 -âhpe -681 -▁cro -682 -▁dic -683 -▁san -684 -▁sou -685 -anehe -686 -anese -687 -eotse -688 -hnoma -689 -ôhtsé -690 -▁héva -691 -▁kóhk -692 -▁máto -693 -mérika -694 -nêstse -695 -nėstse -696 -véotse -697 -▁knife -698 -▁nésto -699 -▁county -700 -▁notäma -701 -pulation -702 -▁heévâhe -703 -anestôtse -704 -vátamevéotse -705 -ao -706 -dp -707 -io -708 -ng -709 -▁z -710 -amo -711 -ast -712 -dge -713 -enó -714 -eve -715 -gra -716 -hoe -717 -ies -718 -ind -719 -mar -720 -omé -721 -ong -722 -oom -723 -otá -724 -tor -725 -êše -726 -êšk -727 -ëso -728 -ëva -729 -▁af -730 -▁bl -731 -▁ko -732 -engl -733 -hkoo -734 -hpév -735 -ital -736 -kota -737 -ótsé -738 -▁ama -739 -atama -740 -color -741 -graph -742 -onéma -743 -tsémâ -744 -ussia -745 -škéso -746 -šêške -747 -▁dull -748 -onêške -749 -ótséva -750 -▁colle -751 -▁tsêhe -752 -english -753 -tionary -754 -tsémâhpév -755 -▁dictionary -756 -▁heévâhetanéno -757 -%) -758 -." 
-759 -br -760 -ci -761 -ld -762 -nâ -763 -uc -764 -ue -765 -wa -766 -wn -767 -éo -768 -′′ -769 -▁· -770 -art -771 -evé -772 -fic -773 -gen -774 -key -775 -mon -776 -ond -777 -oro -778 -ree -779 -ton -780 -web -781 -wor -782 -ého -783 -émâ -784 -éše -785 -óno -786 -▁bu -787 -▁en -788 -▁lo -789 -▁tä -790 -néhe -791 -tain -792 -vôse -793 -xêšé -794 -éhpe -795 -▁cap -796 -▁mat -797 -ansas -798 -anéve -799 -ative -800 -chile -801 -hkôhe -802 -hésto -803 -spain -804 -stotó -805 -▁aust -806 -▁haná -807 -▁hešk -808 -▁hová -809 -▁xama -810 -center -811 -hanehe -812 -honáeo -813 -htôtse -814 -poland -815 -tóvéto -816 -êstove -817 -▁south -818 -▁tâhpe -819 -▁dakota -820 -▁nêstse -821 -▁amérika -822 -▁college -823 -▁vietnam -824 -therlands -825 -▁hohamháa -826 -▁mȧhóomoehá -827 -▁manâhestôtse -828 -▁nêstsestôtse -829 -ah -830 -da -831 -do -832 -li -833 -mô -834 -ni -835 -pd -836 -pr -837 -ub -838 -xė -839 -▁á -840 -aen -841 -ang -842 -are -843 -car -844 -eho -845 -eme -846 -eta -847 -low -848 -ono -849 -ria -850 -ser -851 -uth -852 -êha -853 -êhó -854 -óné -855 -▁hi -856 -▁pi -857 -▁sq -858 -alif -859 -news -860 -stat -861 -vêsé -862 -óevó -863 -ôheo -864 -▁com -865 -▁gre -866 -▁har -867 -▁tur -868 -▁óoe -869 -hohpe -870 -ornia -871 -razil -872 -taesé -873 -thern -874 -škese -875 -▁vóhp -876 -united -877 -ôhtávo -878 -▁aésto -879 -▁chief -880 -▁japan -881 -honáéva -882 -méškéso -883 -tsêhésê -884 -▁americ -885 -▁indian -886 -▁vóhkoo -887 -cheyenne -888 -colorado -889 -onêškeho -890 -émâhtáme -891 -ôhkemôxe -892 -▁kóhkonê -893 -▁šéstotó -894 -alifornia -895 -▁vášêtaëno -896 -▁heévȧhetanéno -897 -__ -898 -ak -899 -ds -900 -ex -901 -gy -902 -kȯ -903 -mi -904 -nt -905 -ok -906 +xovê -253 +▁con -254 +▁asia -255 +public -256 +▁hotómá -257 +ig -258 +les -259 +ome -260 +▁ka -261 +▁ri -262 +▁sa -263 +ther -264 +ôhtá -265 +hooma -266 +stâhe -267 +▁hóestó -268 +▁republic -269 +▁héstánóva -270 +ft -271 +id -272 +▁– -273 +ica -274 +man -275 +ném -276 +ohe -277 +▁oó -278 +tsêhé -279 
+âhoestôtse -280 +ap -281 +ed -282 +▁co -283 +tahe -284 +▁hóxovê -285 +oo -286 +px -287 +ȯh -288 +eng -289 +heo -290 +ėse -291 +▁pa -292 +▁vo -293 +▁év -294 +hkon -295 +▁heév -296 +▁mâhe -297 +center -298 +▁oxêse -299 +stâhese -300 +") -301 +be -302 +hoo -303 +háa -304 +▁ve -305 +oland -306 +ȯhóne -307 +▁amer -308 +▁hova -309 +▁mâheóne -310 +▁variant -311 +ȯhónestȯtse -312 +▁mâhoestôtse -313 +ip -314 +kê -315 +▁- -316 +▁r -317 +ana -318 +hke -319 +stó -320 +ôhé -321 +▁(" -322 +▁mâ -323 +hése -324 +▁éve -325 +ėsóvóne -326 +▁évȯhónestȯtse -327 +▁[ -328 +▁x -329 +ces -330 +esé -331 +hne -332 +ôse -333 +ȧhe -334 +▁gr -335 +▁mȧ -336 +tame -337 +êške -338 +▁hán -339 +▁not -340 +óhkon -341 +ôhném -342 +âhoéve -343 +▁river -344 +ôhnéménêstse -345 +ay -346 +go -347 +ib -348 +im -349 +po -350 +sev -351 +son -352 +êho -353 +▁la -354 +▁vá -355 +▁auto -356 +▁grand -357 +▁hesto -358 +▁theft -359 +▁tséhe -360 +tsêhéstâhese -361 +ea -362 +ty -363 +ôx -364 +ary -365 +ema -366 +fer -367 +ish -368 +seo -369 +êšk -370 +êšé -371 +óho -372 +ôhe -373 +▁há -374 +▁le -375 +▁col -376 +eotse -377 +mâhoéve -378 +▁poland -379 +▁cheyenne -380 +pa -381 +tsé -382 +âha -383 +évo -384 +▁da -385 +▁ná -386 +▁nė -387 +▁son -388 +feren -389 +hkohe -390 +▁coun -391 +vátame -392 +▁hoham -393 +âhestôtse -394 +▁mâhoestôtsene -395 +▁tsétsêhéstâhese -396 +ba -397 +ou -398 +pe -399 +ém -400 +ahe -401 +ano -402 +ent -403 +ile -404 +ire -405 +nif -406 +ȧhé -407 +▁ko -408 +▁mi -409 +▁ro -410 +hóom -411 +▁ame -412 +▁nor -413 +êstoo -414 +▁tsén -415 +▁america -416 +▁referen -417 +eé -418 +io -419 +▁u -420 +"), -421 +enó -422 +hno -423 +hto -424 +ine -425 +rea -426 +sus -427 +tal -428 +tor -429 +▁de -430 +▁me -431 +▁mó -432 +▁po -433 +kêse -434 +oehá -435 +▁can -436 +▁dic -437 +▁hoo -438 +▁oóo -439 +estse -440 +kêseho -441 +staane -442 +sêstse -443 +éstova -444 +▁sonic -445 +▁tsêhe -446 +énestse -447 +hestôtse -448 +hóomoehá -449 +▁heévȧhe -450 +▁oóoxêse -451 +▁tséhetó -452 +▁nėstaane -453 +▁references -454 +ca 
-455 +fr -456 +ik -457 +ph -458 +ėx -459 +šê -460 +ang -461 +art -462 +hon -463 +ial -464 +ief -465 +ity -466 +ong -467 +ull -468 +éhn -469 +▁km -470 +▁né -471 +▁pó -472 +▁to -473 +▁šé -474 +nife -475 +▁and -476 +▁éše -477 +lands -478 +âhtse -479 +ôhkem -480 +▁vóhp -481 +hetane -482 +hésemé -483 +véotse -484 +éhnėse -485 +▁knife -486 +▁nésto -487 +hanôtse -488 +▁county -489 +▁kóhkon -490 +▁tsénėx -491 +therlands -492 +▁hánėsóvóne -493 +vátamevéotse -494 +▁tsénėxhésemé -495 +). -496 +ax -497 +nó -498 +sk -499 +ud -500 +▁y -501 +ama -502 +ber -503 +evo -504 +hta -505 +kee -506 +ono -507 +per -508 +rig -509 +âhá -510 +äma -511 +êha -512 +▁be -513 +engl -514 +oneo -515 +staa -516 +šêta -517 +▁car -518 +▁hes -519 +▁tur -520 +thumb -521 +ôhtse -522 +▁dull -523 +▁heóv -524 +▁héva -525 +▁máto -526 +onôtse -527 +▁canad -528 +▁colle -529 +english -530 +tionary -531 +šêtaëno -532 +▁héstan -533 +kêhanôtse -534 +▁vášêtaëno -535 +êstsestôtse -536 +▁dictionary -537 +▁heévȧhetanéno -538 +%) -539 +li -540 +ss -541 +xo -542 +éo -543 +êh -544 +▁· -545 +all -546 +ara -547 +are -548 +hal -549 +ist -550 +kon -551 +ura -552 +êše -553 +▁bo -554 +▁ca -555 +▁lo -556 +▁mô -557 +▁ob -558 +▁on -559 +▁su -560 +▁sé -561 +▁tó -562 +hame -563 +kota -564 +tain -565 +vôse -566 +âhéó -567 +▁ama -568 +▁har -569 +▁óoe -570 +tsêhe -571 +véhon -572 +êston -573 +šêške -574 +htseto -575 +êstove -576 +menôtse -577 +▁notäma -578 +cheyenne -579 +htsêstse -580 +véhoname -581 +▁college -582 +▁hoohtseto -583 +▁mȧhóomoehá -584 +▁manâhestôtse -585 +da -586 +do -587 +fa -588 +ga -589 +hi -590 +ry -591 +ug -592 +wa -593 +aho -594 +amo -595 +car -596 +dge -597 +ené -598 +fic -599 +ira -600 +lan -601 +low -602 +nia -603 +noo -604 +ond -605 +ree -606 +ser -607 +sëö -608 +uck -609 +óve -610 +ôho -611 +šen -612 +▁te -613 +aehe -614 +amae -615 +aneo -616 +onal -617 +tive -618 +tová -619 +âheo -620 +êheo -621 +▁bri -622 +▁cro -623 +▁san -624 +▁tai -625 +stotó -626 +êstsé -627 +▁môxe -628 +nėstse -629 +▁chief -630 +▁canada 
-631 +▁dakota -632 +âhetanéno -633 +▁hohamháa -634 +hoohtsêstse -635 +▁netherlands -636 +▁nêstsestôtse -637 +." -638 +.) -639 +ac -640 +ct -641 +kâ -642 +mö -643 +ok -644 +up -645 +ut -646 +▁á -647 +▁ł -648 +cep -649 +dom -650 +emâ -651 +ene -652 +eve -653 +evó -654 +iet -655 +ila -656 +mar -657 +omo -658 +ral -659 +éso -660 +▁al -661 +▁fr -662 +▁no -663 +▁ok -664 +▁sq -665 +▁tä -666 +ille -667 +lack -668 +ohke -669 +otsé -670 +séeo -671 +êhëö -672 +ėheo -673 +▁bra -674 +▁com -675 +▁ehó -676 +▁for -677 +▁ind -678 +▁tsė -679 +anéve -680 +stótó -681 +thern -682 +šenov -683 +ȧhého -684 +▁city -685 +hkoohe -686 +ingdom -687 +▁black -688 +▁chile -689 +▁xamae -690 +estȯtse -691 +sevoneo -692 +▁ehóhtá -693 +emenôtse -694 +hetaneho -695 +▁šéstotó -696 +šenovôtse -697 +▁kóhkonêhëö -698 +▁heévâhetanéno -699 +"; -700 +bi -701 +ds -702 +hu -703 +hö -704 +jo -705 +mó -706 +ni -707 +pp -708 +pó -709 +si -710 +uc -711 +ya -712 +áa -713 +ôs -714 +▁į -715 +amá -716 +ané -717 +api -718 +des -719 +ero -720 +eto -721 +gra -722 +ill -723 +int -724 +iss -725 +key -726 +oná -727 +onó -728 +rse -729 +sem -730 +stá -731 +ted -732 +éri -733 +▁ad -734 +▁is -735 +▁tr -736 +▁tâ -737 +enëö -738 +eéve -739 +inus -740 +reat -741 +reek -742 +ėsto -743 +▁cen -744 +▁mar -745 +▁rus -746 +▁sou -747 +eséve -748 +hesto -749 +onald -750 +oneve -751 +tôtse -752 +érika -753 +óhomö -754 +óneve -755 +▁braz -756 +▁sena -757 +ficial -758 +htôtse -759 +nêstse -760 +ôhtávo -761 +▁great -762 +▁netse -763 +êsóvóne -764 +šeeséve -765 +▁census -766 +▁center -767 +▁contor -768 +▁tšêške -769 +ôhkemôxe -770 +▁amérika -771 +▁kingdom -772 +▁tsėhése -773 +anestôtse -774 +enestôtse -775 +▁héstanėheo -776 +ao -777 +de -778 +kó -779 +mê -780 +pr -781 +só -782 +tš -783 +wn -784 +ze -785 +ash -786 +bia -787 +era -788 +ery -789 +eta -790 +etó -791 +eét -792 +hpé -793 +htó -794 +ier -795 +ion -796 +kâs -797 +lep -798 +nam -799 +ona -800 +pul -801 +rid -802 +tap -803 +tin -804 +van -805 +via -806 +way -807 +xêš -808 +ést -809 
+éše -810 +êsé -811 +óhe -812 +óné -813 +óon -814 +óxe -815 +ôhó -816 +▁ba -817 +▁gu -818 +▁oa -819 +▁oo -820 +▁qu -821 +▁ra -822 +▁sw -823 +▁wy -824 +erus -825 +ings -826 +less -827 +skin -828 +tana -829 +váše -830 +xeme -831 +xêhe -832 +éhno -833 +ótsé -834 +▁bel -835 +▁bir -836 +▁eng -837 +▁new -838 +▁ota -839 +anese -840 +carpa -841 +evoem -842 +frame -843 +graph -844 +hohpe -845 +kâsén -846 +right -847 +stahe -848 +stove -849 +ursus -850 +ôhtsé -851 +▁aust -852 +▁capi -853 +▁comm -854 +▁heva -855 +▁hoto -856 +▁hová -857 +▁náhk -858 +▁viet -859 +emahpe -860 +hpotsé -861 +notová -862 +sevone -863 +tapâha -864 +tional -865 +vášeta -866 +xemeno -867 +ótséva -868 +▁amâho -869 +▁creek -870 +▁mâhpé -871 +▁póevó -872 +seoneve -873 +uckskin -874 +variant -875 +▁americ -876 +▁donald -877 +▁hotóhk -878 +▁russia -879 +▁britain -880 +▁náhkohe -881 +▁vietnam -882 +frameless -883 +hnestôtse -884 +▁contorta -885 +▁manȧhého -886 +▁northern -887 +▁official -888 +▁vétapâha -889 +evoemêstse -890 +▁tsevášeta -891 +▁óoetanéno -892 +hpotséhohpe -893 +sevoneóneve -894 +▁hánêsóvóne -895 +nėstsestȯtse -896 +aa -897 +ah -898 +bo -899 +eb -900 +ev -901 +eš -902 +gy -903 +kh -904 +ly -905 +mâ -906 pi -907 pl -908 -si -909 -sw -910 -▁' -911 -▁ł -912 -ase -913 -dom -914 -erc -915 -eto -916 -gov -917 -hal -918 -ite -919 -ith -920 -lep -921 -neo -922 -onó -923 -rib -924 -toa -925 -tur -926 -uri -927 -way -928 -âhá -929 -évé -930 -êho -931 -óhé -932 -▁le -933 -▁lu -934 -▁ob -935 -▁sp -936 -▁sw -937 -aska -938 -enëö -939 -eéve -940 -hese -941 -htáv -942 -ille -943 -kâsé -944 -késo -945 -kôsá -946 -sono -947 -tová -948 -vose -949 -óhtá -950 -ôhkê -951 -ėsto -952 -▁arc -953 -▁new -954 -▁tsė -955 -▁war -956 -chive -957 -heóve -958 -háano -959 -ralia -960 -story -961 -stótó -962 -ursus -963 -ôhtse -964 -▁city -965 -▁hesó -966 -▁king -967 -ficial -968 -kansas -969 -kȯhtáv -970 -émâhéó -971 -ôhtávê -972 -ational -973 -háatama -974 -náhkohe -975 -sevoneo -976 -▁canada -977 -▁ehóhtá -978 -stotôtse -979 
-▁kingdom -980 -▁náhkohe -981 -hoestȯtse -982 -vášêtaëno -983 -hamestôtse -984 -population -985 -▁kóhkonêhëö -986 -▁manestôtse -987 -▁netherlands -988 -"; -989 -fi -990 -km -991 -mȧ -992 -sk -993 -vô -994 -▁x -995 -amb -996 -anó -997 -ash -998 -ave -999 -cep -1000 -con -1001 -des -1002 -dis -1003 -ero -1004 -geo -1005 -gin -1006 -hné -1007 -kon -1008 -ook -1009 -ors -1010 -pdf -1011 -red -1012 -rse -1013 -uro -1014 -use -1015 -éme -1016 -óoe -1017 -óxe -1018 -ôsé -1019 -▁ab -1020 -▁ad -1021 -▁at -1022 -▁cl -1023 -▁fr -1024 -▁jo -1025 -▁on -1026 -▁vi -1027 -eden -1028 -hemê -1029 -heve -1030 -hóno -1031 -móht -1032 -noma -1033 -reek -1034 -ress -1035 -tavo -1036 -tâhé -1037 -tóhé -1038 -ukra -1039 -vovó -1040 -xévo -1041 -éome -1042 -šeta -1043 -▁car -1044 -▁ire -1045 -ambig -1046 -anóse -1047 -eséve -1048 -hesto -1049 -honoo -1050 -japan -1051 -maheo -1052 -onáhe -1053 -pinus -1054 -tohke -1055 -vâtse -1056 -évôxe -1057 -óneve -1058 -▁sena -1059 -▁vóhk -1060 -estone -1061 -hpotsé -1062 -seotse -1063 -xekôsá -1064 -xemeno -1065 -êstone -1066 -óonéma -1067 -▁black -1068 -▁great -1069 -graphic -1070 -ukraine -1071 -wyoming -1072 -šeeséve -1073 -▁africa -1074 -▁otaesé -1075 -▁russia -1076 -▁tšêške -1077 -disambig -1078 -nestôtse -1079 -óoetaneo -1080 -▁ireland -1081 -▁tsėhése -1082 -▁wyoming -1083 -▁northern -1084 -▁official -1085 -estonemaheo -1086 -hpotséhohpe -1087 -haméstotôtse -1088 -.) 
-1089 -ax -1090 -bo -1091 -hu -1092 -kh -1093 -ud -1094 -vö -1095 -xó -1096 -yy -1097 -ze -1098 -ää -1099 -ëë -1100 -ón -1101 -▁, -1102 -▁ô -1103 -age -1104 -ané -1105 -ara -1106 -axa -1107 -ber -1108 -bri -1109 -che -1110 -ers -1111 -her -1112 -hog -1113 -hpâ -1114 -hpé -1115 -ile -1116 -ini -1117 -net -1118 -ora -1119 -ril -1120 -stá -1121 -ura -1122 -urg -1123 -uta -1124 -weo -1125 -wpp -1126 -zon -1127 -éne -1128 -óhe -1129 -▁kâ -1130 -▁la -1131 -▁nê -1132 -▁nó -1133 -▁oe -1134 -aehe -1135 -atia -1136 -cksk -1137 -etan -1138 -hkeo -1139 -hóme -1140 -kane -1141 -mber -1142 -mbia -1143 -moxe -1144 -mâhö -1145 -nehe -1146 -neve -1147 -névó -1148 -omëë -1149 -onal -1150 -taly -1151 -tana -1152 -tern -1153 -tóno -1154 -xemé -1155 -êstó -1156 -▁bel -1157 -▁for -1158 -▁mon -1159 -▁mén -1160 -▁tai -1161 -▁éšk -1162 -carpa -1163 -evoem -1164 -hoham -1165 -htseo -1166 -idaho -1167 -iland -1168 -ohtsé -1169 -rance -1170 -south -1171 -stahe -1172 -torta -1173 -tsêhe -1174 -world -1175 -éheme -1176 -óhkon -1177 -óhomo -1178 -▁hetó -1179 -▁heva -1180 -braska -1181 -ckskin -1182 -dgehog -1183 -kôhóme -1184 -notová -1185 -vášeta -1186 -xêšéne -1187 -êškóne -1188 -▁amâho -1189 -▁creek -1190 -▁heóvê -1191 -▁hohpâ -1192 -▁italy -1193 -▁mâhpé -1194 -▁netse -1195 -▁poeso -1196 -▁péhpe -1197 -▁póevó -1198 -▁vóhpo -1199 -seoneve -1200 -šéstótó -1201 -staahtsé -1202 -váótséva -1203 -▁britain -1204 -▁capital -1205 -▁náhkôhe -1206 -▁tâhpeno -1207 -hnestôtse -1208 -névóvâtse -1209 -xekôsáeho -1210 -▁contorta -1211 -▁hedgehog -1212 -▁manâhého -1213 -evoemêstse -1214 -▁australia -1215 -▁tsevášeta -1216 -▁óoetanéno -1217 -sevoneóneve -1218 -▁california -1219 -▁hohamháahp -1220 -nėstsestȯtse -1221 -". 
-1222 -): -1223 -fa -1224 -hn -1225 -ie -1226 -ká -1227 -ls -1228 -lv -1229 -ly -1230 -ov -1231 -pt -1232 -pâ -1233 -pó -1234 -sp -1235 -uz -1236 -we -1237 -wi -1238 -▁â -1239 -▁š -1240 -▁— -1241 -ace -1242 -amp -1243 -anâ -1244 -arc -1245 -ard -1246 -ass -1247 -bud -1248 -cro -1249 -ená -1250 -eše -1251 -ger -1252 -ges -1253 -haa -1254 -heó -1255 -hle -1256 -hnó -1257 -hox -1258 -ice -1259 -ili -1260 -ism -1261 -ket -1262 -lan -1263 -lig -1264 -mit -1265 -oli -1266 -pes -1267 -que -1268 -rat -1269 -rid -1270 -tem -1271 -tos -1272 -tov -1273 -tsė -1274 -tôx -1275 -uni -1276 -ust -1277 -von -1278 -xov -1279 -ylv -1280 -éšk -1281 -óvé -1282 -ȯxe -1283 -▁ac -1284 -▁hu -1285 -▁it -1286 -▁qu -1287 -▁ra -1288 -▁so -1289 -▁sá -1290 -▁te -1291 -▁ur -1292 -aenê -1293 -eoxa -1294 -esta -1295 -esto -1296 -hase -1297 -hevé -1298 -homa -1299 -icle -1300 -idae -1301 -ides -1302 -ilai -1303 -inal -1304 -kêsé -1305 -meno -1306 -máno -1307 -náne -1308 -ohko -1309 -ortu -1310 -pano -1311 -quah -1312 -tapâ -1313 -tine -1314 -tish -1315 -tôxá -1316 -utch -1317 -vian -1318 -vánó -1319 -vávo -1320 -vâhá -1321 -xeto -1322 -xêhe -1323 -âhev -1324 -óhko -1325 -▁anê -1326 -▁anė -1327 -▁bar -1328 -▁bir -1329 -▁den -1330 -▁est -1331 -▁geo -1332 -▁mas -1333 -▁ome -1334 -▁ová -1335 -▁par -1336 -▁pro -1337 -▁red -1338 -▁sea -1339 -▁slo -1340 -▁squ -1341 -▁too -1342 -▁óno -1343 -aehes -1344 -ensis -1345 -erosa -1346 -hahtá -1347 -hasëö -1348 -heóne -1349 -hkêho -1350 -honáe -1351 -icago -1352 -kaneo -1353 -mâhae -1354 -nêhpo -1355 -omêše -1356 -oured -1357 -perus -1358 -sebud -1359 -stove -1360 -stséa -1361 -thing -1362 -tsévó -1363 -urope -1364 -évohk -1365 -óhévâ -1366 -ónétó -1367 -ôhévo -1368 -▁ange -1369 -▁boer -1370 -▁crow -1371 -▁hesé -1372 -▁honó -1373 -▁kéme -1374 -▁mana -1375 -▁moun -1376 -▁nemâ -1377 -▁pond -1378 -▁site -1379 -▁tase -1380 -▁tsis -1381 -anâhke -1382 -canada -1383 -eneško -1384 -enôtse -1385 -hemêšé -1386 -hóxovê -1387 -inland -1388 -kâsénâ -1389 -lahoma -1390 -ligion 
-1391 -manâhé -1392 -xovôho -1393 -▁canad -1394 -▁heškó -1395 -▁hotse -1396 -▁hoóxe -1397 -▁mahpe -1398 -▁miles -1399 -▁mésta -1400 -▁portu -1401 -▁tahle -1402 -▁táase -1403 -▁vóhka -1404 -ameotse -1405 -archive -1406 -emanese -1407 -hailand -1408 -ométane -1409 -seohtsé -1410 -tapâhae -1411 -ôhtsévó -1412 -škovávo -1413 -▁brazil -1414 -▁france -1415 -▁hetane -1416 -▁sweden -1417 -▁tsesto -1418 -▁tséana -1419 -▁turkey -1420 -hásêstse -1421 -keemahpe -1422 -manȧhého -1423 -nebraska -1424 -notováhe -1425 -onêstove -1426 -tsénêhpo -1427 -uniperus -1428 -xâhtsená -1429 -xėseotse -1430 -êhóóhévâ -1431 -▁heškóve -1432 -▁hováhne -1433 -▁háhnoma -1434 -▁oeškese -1435 -▁ukraine -1436 -héstánóva -1437 -évohkôtse -1438 -▁buckskin -1439 -▁coloured -1440 -▁national -1441 -▁váótséva -1442 -▁vóhpeoxa -1443 -hahtátaneo -1444 -háatamaahe -1445 -véhestôtse -1446 -xetoeneško -1447 -▁anėsóvóne -1448 -▁ponderosa -1449 -▁tahlequah -1450 -▁vétapâhae -1451 -êsenotováhe -1452 -êstonemâheo -1453 -▁canadensis -1454 -mâhaemenôtse -1455 -▁mȧhoestȯtse -1456 -aenêhoestôtse -1457 -êhóóhévâhtseo -1458 -▁vóhkoohetane -1459 -keehoohtsêstse -1460 -kâsénâhnestôtse -1461 -ôhtávêhahtátaneo -1462 -bi -1463 -cl -1464 -ct -1465 -dź -1466 -eá -1467 -gs -1468 -hã -1469 -ji -1470 -jo -1471 -ju -1472 -ky -1473 -ns -1474 -op -1475 -oó -1476 -rt -1477 -vf -1478 -xé -1479 -yd -1480 -▁) -1481 -▁+ -1482 -▁. 
-1483 -▁: -1484 -act -1485 -ala -1486 -bez -1487 -can -1488 -clo -1489 -dle -1490 -edg -1491 -ent -1492 -era -1493 -evá -1494 -gal -1495 -gas -1496 -gdp -1497 -gel -1498 -har -1499 -hkė -1500 -hov -1501 -hve -1502 -ick -1503 -imb -1504 -int -1505 -iro -1506 -ise -1507 -itu -1508 -kie -1509 -maa -1510 -mat -1511 -mes -1512 -mus -1513 -nev -1514 -nez -1515 -nii -1516 -ola -1517 -olf -1518 -oly -1519 -ová -1520 -par -1521 -pet -1522 -pic -1523 -ris -1524 -sel -1525 -ssa -1526 -séó -1527 -sóe -1528 -tho -1529 -tom -1530 -táa -1531 -tšê -1532 -uby -1533 -uit -1534 -ulo -1535 -vak -1536 -wan -1537 -âsé -1538 -éna -1539 -ódź -1540 -óto -1541 -▁'' -1542 -▁), -1543 -▁__ -1544 -▁am -1545 -▁ex -1546 -▁kȧ -1547 -▁nē -1548 -▁ok -1549 -▁or -1550 -▁pu -1551 -▁sȯ -1552 -▁tr -1553 -▁va -1554 -▁wi -1555 -▁ło -1556 -aceb -1557 -acer -1558 -ance -1559 -asio -1560 -ated -1561 -ater -1562 -bean -1563 -coun -1564 -ench -1565 -eohé -1566 -erre -1567 -gent -1568 -glas -1569 -gypt -1570 -hahe -1571 -havo -1572 -háse -1573 -icat -1574 -icus -1575 -iver -1576 -kome -1577 -kone -1578 -kévé -1579 -kôhé -1580 -lery -1581 -ment -1582 -menó -1583 -mons -1584 -nóne -1585 -omée -1586 -orea -1587 -póno -1588 -rahã -1589 -rgin -1590 -ring -1591 -riti -1592 -sena -1593 -sity -1594 -sone -1595 -stan -1596 -tamó -1597 -tove -1598 -tury -1599 -tâhá -1600 -uela -1601 -unus -1602 -vfik -1603 -voto -1604 -xôse -1605 -zart -1606 -zech -1607 -áhto -1608 -éotó -1609 -êstá -1610 -êške -1611 -óseo -1612 -ôhëö -1613 -ôtáa -1614 -šemé -1615 -ȯhkė -1616 -▁amo -1617 -▁ant -1618 -▁dia -1619 -▁han -1620 -▁hoi -1621 -▁jim -1622 -▁lib -1623 -▁los -1624 -▁mic -1625 -▁mis -1626 -▁môx -1627 -▁mȧx -1628 -▁okô -1629 -▁oné -1630 -▁otá -1631 -▁per -1632 -▁ter -1633 -▁ton -1634 -▁vée -1635 -▁web -1636 -▁wor -1637 -▁yel -1638 -▁éte -1639 -▁ôxa -1640 -aehno -1641 -archt -1642 -edgar -1643 -enohe -1644 -ercus -1645 -estsé -1646 -etane -1647 -files -1648 -guage -1649 -hohtó -1650 -horse -1651 -hotoa -1652 -hpenó -1653 -imate -1654 
-italy -1655 -ition -1656 -kaehe -1657 -keéno -1658 -mâhöö -1659 -netsé -1660 -nésta -1661 -ombia -1662 -omôho -1663 -onald -1664 -ondon -1665 -onávo -1666 -picea -1667 -poeso -1668 -pulus -1669 -péhpe -1670 -rizon -1671 -theid -1672 -tsêha -1673 -ubykh -1674 -vóono -1675 -xeesé -1676 -âséha -1677 -énohe -1678 -ôhené -1679 -öhtse -1680 -škëso -1681 -▁demo -1682 -▁dong -1683 -▁from -1684 -▁heap -1685 -▁heše -1686 -▁héne -1687 -▁héve -1688 -▁koro -1689 -▁mato -1690 -▁náhk -1691 -▁sage -1692 -▁sint -1693 -▁táxe -1694 -▁tôho -1695 -▁wolf -1696 -▁łódź -1697 -aenôhe -1698 -ameehe -1699 -ashing -1700 -brazil -1701 -ehóhtá -1702 -estôhe -1703 -gelman -1704 -graphy -1705 -hamëso -1706 -hkeehe -1707 -hovane -1708 -illion -1709 -kóhkon -1710 -ohketo -1711 -onéame -1712 -ouglas -1713 -russia -1714 -sėstse -1715 -ternal -1716 -ternet -1717 -tinent -1718 -tšêške -1719 -xemené -1720 -êhaseo -1721 -êhasëö -1722 -ôhtáva -1723 -ôseesé -1724 -▁birds -1725 -▁congo -1726 -▁czech -1727 -▁hestó -1728 -▁heóve -1729 -▁hésto -1730 -▁india -1731 -▁north -1732 -▁onéma -1733 -▁pinus -1734 -▁sȯsóe -1735 -▁vóhpe -1736 -▁áháse -1737 -▁éohke -1738 -▁łobez -1739 -acebook -1740 -etanóto -1741 +sp -909 +ue -910 +vö -911 +ón -912 +▁' -913 +▁z -914 +▁š -915 +aan -916 +aná -917 +con -918 +cto -919 +ehe -920 +ena -921 +ená -922 +gar -923 +hka -924 +hle -925 +hog -926 +hou -927 +ino -928 +isi -929 +ism -930 +iti -931 +kem -932 +ket -933 +mus -934 +olf -935 +omé -936 +ple -937 +tar -938 +tho -939 +ton -940 +ump -941 +óhé -942 +óvé -943 +ėhe -944 +ȯxe -945 +▁ab -946 +▁ed -947 +▁kâ -948 +▁wa -949 +ance -950 +ania -951 +ants -952 +bean -953 +eehe -954 +eove -955 +esta -956 +esto -957 +etse -958 +haná -959 +háat -960 +hést -961 +inal -962 +kane -963 +kêsé -964 +mate -965 +náne -966 +pula -967 +quah -968 +star -969 +tóvé -970 +unip -971 +xêšé -972 +xėse -973 +óhtá -974 +▁cal -975 +▁den -976 +▁est -977 +▁hoi -978 +▁hon -979 +▁ire -980 +▁lit -981 +▁min -982 +▁mon -983 +▁nan -984 +▁ome -985 +▁par -986 +▁pro -987 
+▁red -988 +▁sea -989 +▁slo -990 +▁sto -991 +▁tha -992 +▁tra -993 +anehe -994 +erman -995 +erosa -996 +hnoma -997 +hpeno -998 +hésto -999 +mâheo -1000 +mâhéó -1001 +oming -1002 +thing -1003 +évohk -1004 +óhomo -1005 +ôhtáv -1006 +▁crow -1007 +▁heap -1008 +▁hešk -1009 +▁mešk -1010 +▁miss -1011 +▁moun -1012 +▁nemâ -1013 +▁pond -1014 +▁sint -1015 +▁site -1016 +dgehog -1017 +ibbean -1018 +kâsénâ -1019 +manâhé -1020 +nestse -1021 +tóvéto -1022 +xêšéne -1023 +âhonoo -1024 +êstséa -1025 +óonéma -1026 +ôhkêhe -1027 +ôhtáva -1028 +▁aésto -1029 +▁birds -1030 +▁hoóxe -1031 +▁môxem -1032 +▁south -1033 +▁tahle -1034 +▁tasem -1035 +▁vóhpo -1036 +emanese -1037 +háatama -1038 +âhtsená -1039 +▁brazil -1040 +▁otaesé -1041 +▁tsesto -1042 +▁tséana -1043 +keemahpe -1044 +notováhe -1045 +pulation -1046 +staahtsé -1047 +uniperus -1048 +xėseotse -1049 +êšéstótó -1050 +▁capital -1051 +▁ireland -1052 +▁tâhpeno -1053 +▁wyoming -1054 +évohkôtse -1055 +▁buckskin -1056 +▁hedgehog -1057 +▁national -1058 +▁vóhkoohe -1059 +hamestôtse -1060 +▁anėsóvóne -1061 +▁ponderosa -1062 +▁tahlequah -1063 +âhaemenôtse -1064 +êsenotováhe -1065 +▁hohamháahp -1066 +▁manestôtse -1067 +véhpotséhohpe -1068 +kâsénâhnestôtse -1069 +): -1070 +av -1071 +aó -1072 +dź -1073 +eá -1074 +gu -1075 +hã -1076 +iv -1077 +iz -1078 +ld -1079 +nt -1080 +op -1081 +tá -1082 +tâ -1083 +xi -1084 +ää -1085 +óo -1086 +▁) -1087 +▁+ -1088 +▁. 
-1089 +aen -1090 +bez -1091 +bud -1092 +cca -1093 +cem -1094 +che -1095 +cia -1096 +col -1097 +com -1098 +cus -1099 +eet -1100 +ers -1101 +ese -1102 +eso -1103 +eše -1104 +gas -1105 +gen -1106 +ges -1107 +gin -1108 +hen -1109 +hné -1110 +hov -1111 +hpa -1112 +hpo -1113 +hve -1114 +höö -1115 +ice -1116 +ies -1117 +ind -1118 +ith -1119 +kév -1120 +las -1121 +lig -1122 +nas -1123 +net -1124 +nii -1125 +ohk -1126 +ony -1127 +que -1128 +ria -1129 +ril -1130 +ron -1131 +rus -1132 +sed -1133 +sel -1134 +stâ -1135 +tem -1136 +tia -1137 +tic -1138 +too -1139 +tri -1140 +ulo -1141 +uly -1142 +vak -1143 +âhk -1144 +çao -1145 +éha -1146 +ése -1147 +êhá -1148 +êhó -1149 +ëva -1150 +ódź -1151 +óse -1152 +ôhö -1153 +ôsá -1154 +ôxa -1155 +ėho -1156 +ėst -1157 +▁ac -1158 +▁ap -1159 +▁bu -1160 +▁en -1161 +▁eu -1162 +▁jo -1163 +▁nē -1164 +▁sk -1165 +▁so -1166 +▁sp -1167 +▁sá -1168 +▁th -1169 +▁tá -1170 +▁ón -1171 +▁ôh -1172 +▁ło -1173 +ameo -1174 +axen -1175 +eden -1176 +eohé -1177 +hahe -1178 +hase -1179 +hoht -1180 +homa -1181 +háse -1182 +iber -1183 +ides -1184 +ilai -1185 +iten -1186 +kese -1187 +kôhó -1188 +pano -1189 +ress -1190 +sono -1191 +taly -1192 +tamá -1193 +unus -1194 +voto -1195 +vóon -1196 +xévo -1197 +zart -1198 +zech -1199 +áhto -1200 +âtse -1201 +éotó -1202 +ôheo -1203 +ôhéó -1204 +▁air -1205 +▁anê -1206 +▁ben -1207 +▁cli -1208 +▁des -1209 +▁fin -1210 +▁geo -1211 +▁gla -1212 +▁hae -1213 +▁kok -1214 +▁las -1215 +▁mas -1216 +▁mȧx -1217 +▁ová -1218 +▁pat -1219 +▁peo -1220 +▁spa -1221 +▁squ -1222 +▁ter -1223 +▁too -1224 +▁vir -1225 +▁web -1226 +▁wik -1227 +▁wor -1228 +▁yel -1229 +▁éšk -1230 +aehes -1231 +aehno -1232 +arten -1233 +elman -1234 +halus -1235 +hamém -1236 +horse -1237 +iland -1238 +irahã -1239 +keéno -1240 +lowst -1241 +mahpe -1242 +omêše -1243 +ondon -1244 +orone -1245 +pulus -1246 +ralia -1247 +sebud -1248 +taneo -1249 +tséno -1250 +éhavo -1251 +énohe -1252 +énéhe -1253 +êškév -1254 +ôhévo -1255 +ôhöne -1256 +öhtse -1257 +▁ange -1258 +▁cura -1259 +▁dong 
-1260 +▁hesó -1261 +▁hetó -1262 +▁heše -1263 +▁hohp -1264 +▁héve -1265 +▁mato -1266 +▁sage -1267 +▁tséx -1268 +▁voax -1269 +▁wolf -1270 +▁ôhmo -1271 +▁łódź -1272 +ashing -1273 +eétâhé -1274 +ginian -1275 +hovane -1276 +htsená -1277 +kêséhe -1278 +kôhóme -1279 +lahoma -1280 +ligion -1281 +onesto -1282 +onêheo -1283 +onêške -1284 +poland -1285 +staneo -1286 +sėstse -1287 +tamáno -1288 +tinent -1289 +xêšéhe -1290 +émâhéó -1291 +éstove -1292 +êhaseo -1293 +êhasëö -1294 +êškóne -1295 +ôhkêho -1296 +▁becca -1297 +▁congo -1298 +▁czech -1299 +▁hotse -1300 +▁italy -1301 +▁korea -1302 +▁meško -1303 +▁miles -1304 +▁north -1305 +▁pinus -1306 +▁vóhka -1307 +▁áháse -1308 +▁łobez -1309 +ameotse -1310 +hestohe -1311 +náhkohe -1312 +seohtsé -1313 +taheéve -1314 +énėstse -1315 +êheséeo -1316 +êstonem -1317 +▁harper -1318 +▁hetane -1319 +▁háeohé -1320 +▁háesto -1321 +▁mozart -1322 +▁people -1323 +▁sweden -1324 +▁turkey -1325 +cephalus -1326 +elmannii -1327 +enáhkohe -1328 +hamémôxe -1329 +lowstone -1330 +tsénooná -1331 +vóonotse -1332 +▁climate -1333 +▁curaçao -1334 +▁finland -1335 +▁hotohke -1336 +▁maarten -1337 +▁onéhavo -1338 +▁rosebud -1339 +ashington -1340 +axenôhöne -1341 +emestȯtse -1342 +juniperus -1343 +▁geograph -1344 +▁manâhého -1345 +▁mountain -1346 +▁móxêšéhe -1347 +▁okôhkêho -1348 +▁religion -1349 +▁thailand -1350 +▁tsetsêhe -1351 +▁váótséva -1352 +▁vóhpoomé -1353 +enóseoneve -1354 +háatamaahe -1355 +véhestôtse -1356 +▁australia -1357 +▁póevónáne -1358 +▁virginian -1359 +eétâhéstove -1360 +ôhtávaestse -1361 +▁héstanêheo -1362 +▁tsehestohe -1363 +polandnestse -1364 +êsenotováheé -1365 +▁engelmannii -1366 +▁héstánóvaan -1367 +▁vétapâhaeto -1368 +▁yellowstone -1369 +▁vóhkoohetane -1370 +keehoohtsêstse -1371 +▁héstánóvaanéve -1372 +▁tsetsêhestâhese -1373 +", -1374 +". 
-1375 +)- -1376 +ak -1377 +by -1378 +gs -1379 +hä -1380 +ix -1381 +ja -1382 +ki -1383 +kė -1384 +ns -1385 +pt -1386 +tr -1387 +ua -1388 +ui -1389 +xa -1390 +xé -1391 +yd -1392 +yl -1393 +äö -1394 +′′ -1395 +▁$ -1396 +▁: -1397 +.") -1398 +ade -1399 +aka -1400 +ami -1401 +ans -1402 +ate -1403 +axa -1404 +bel -1405 +boo -1406 +bra -1407 +chi -1408 +cro -1409 +dia -1410 +eho -1411 +ein -1412 +ght -1413 +haa -1414 +haz -1415 +hem -1416 +hir -1417 +hla -1418 +hán -1419 +ich -1420 +ick -1421 +ide -1422 +ima -1423 +ins -1424 +ise -1425 +itu -1426 +ivo -1427 +jug -1428 +kom -1429 +kra -1430 +lat -1431 +lay -1432 +nov -1433 +néó -1434 +oan -1435 +ohé -1436 +ola -1437 +oli -1438 +omá -1439 +omó -1440 +oug -1441 +pom -1442 +rib -1443 +sen -1444 +sóe -1445 +the -1446 +tiv -1447 +toa -1448 +tov -1449 +tsw -1450 +tôx -1451 +uba -1452 +uco -1453 +uel -1454 +ure -1455 +urr -1456 +uru -1457 +ved -1458 +vem -1459 +wan -1460 +xeo -1461 +yat -1462 +zon -1463 +ánó -1464 +âht -1465 +évé -1466 +êna -1467 +êxo -1468 +ëso -1469 +óma -1470 +óvó -1471 +óvô -1472 +ôht -1473 +ôto -1474 +ėjo -1475 +ėše -1476 +▁), -1477 +▁af -1478 +▁at -1479 +▁ea -1480 +▁es -1481 +▁fa -1482 +▁fo -1483 +▁ha -1484 +▁it -1485 +▁ke -1486 +▁ku -1487 +▁kö -1488 +▁kȧ -1489 +▁pe -1490 +▁ph -1491 +▁pé -1492 +▁ru -1493 +▁sc -1494 +▁sȯ -1495 +▁va -1496 +▁vi -1497 +▁še -1498 +apan -1499 +asia -1500 +asus -1501 +bies -1502 +book -1503 +bykh -1504 +cepc -1505 +ctos -1506 +eave -1507 +enen -1508 +eneo -1509 +etan -1510 +eved -1511 +fast -1512 +fire -1513 +gent -1514 +gypt -1515 +hael -1516 +hahk -1517 +here -1518 +heóv -1519 +hone -1520 +hová -1521 +héne -1522 +ibik -1523 +idae -1524 +ifor -1525 +igra -1526 +ills -1527 +inst -1528 +ires -1529 +irne -1530 +khaz -1531 +king -1532 +kome -1533 +késo -1534 +kêsa -1535 +lans -1536 +lect -1537 +lope -1538 +mami -1539 +mark -1540 +meno -1541 +môxe -1542 +nehe -1543 +nâha -1544 +néhe -1545 +néta -1546 +nóse -1547 +ohko -1548 +olia -1549 +pher -1550 +pora -1551 +pper -1552 +póno -1553 
+ries -1554 +road -1555 +rope -1556 +seph -1557 +sian -1558 +skie -1559 +stsé -1560 +sêhe -1561 +tohe -1562 +town -1563 +tâhá -1564 +tôxá -1565 +vada -1566 +vian -1567 +vovó -1568 +véhe -1569 +vêsé -1570 +vóhp -1571 +xemé -1572 +xove -1573 +xéve -1574 +xôse -1575 +áahe -1576 +éest -1577 +éstó -1578 +êšev -1579 +óhév -1580 +óseo -1581 +ôhke -1582 +ėstó -1583 +šeno -1584 +škee -1585 +▁ahk -1586 +▁amo -1587 +▁ant -1588 +▁bat -1589 +▁cam -1590 +▁cla -1591 +▁hal -1592 +▁hel -1593 +▁hla -1594 +▁hóx -1595 +▁jim -1596 +▁los -1597 +▁mic -1598 +▁mén -1599 +▁môx -1600 +▁oeš -1601 +▁otá -1602 +▁pra -1603 +▁pre -1604 +▁rho -1605 +▁sco -1606 +▁sil -1607 +▁sás -1608 +▁val -1609 +▁xaó -1610 +▁xäö -1611 +▁ôxa -1612 +▁šan -1613 +arete -1614 +chief -1615 +dagas -1616 +desia -1617 +eetus -1618 +emâho -1619 +ercus -1620 +etane -1621 +etset -1622 +honáe -1623 +house -1624 +háano -1625 +illet -1626 +istan -1627 +kaehe -1628 +máhta -1629 +mêsta -1630 +onebi -1631 +onévo -1632 +river -1633 +selle -1634 +sisto -1635 +ssing -1636 +stati -1637 +stein -1638 +stôxe -1639 +sésto -1640 +tiago -1641 +vakia -1642 +venia -1643 +xésta -1644 +évôxe -1645 +óvemá -1646 +óvéta -1647 +ôhéve -1648 +ôtove -1649 +▁apie -1650 +▁aést -1651 +▁cauc -1652 +▁fire -1653 +▁from -1654 +▁haná -1655 +▁hese -1656 +▁héne -1657 +▁kôsá -1658 +▁left -1659 +▁loca -1660 +▁nēhi -1661 +▁nėse -1662 +▁orig -1663 +▁oéve -1664 +▁rock -1665 +▁sand -1666 +▁tsis -1667 +▁tséh -1668 +▁tôhé -1669 +▁ukra -1670 +▁ural -1671 +▁west -1672 +▁éohk -1673 +▁éveé -1674 +allery -1675 +aneonó -1676 +aénohe -1677 +cebook -1678 +cember -1679 +chived -1680 +eeheso -1681 +eestse -1682 +estone -1683 +etóxeo -1684 +hanehe -1685 +hemêšé -1686 +heséeo -1687 +hkoohé -1688 +hováve -1689 +huania -1690 +hóxovê -1691 +kóhkon -1692 +kôhtse -1693 +mêstaa -1694 +person -1695 +prunus -1696 +stâhem -1697 +telope -1698 +ternet -1699 +tivist -1700 +továto -1701 +tswana -1702 +tšêške -1703 +urasia -1704 +vation -1705 +veotse -1706 +xemâho -1707 +xepóno -1708 +âheone 
-1709 +âhevan -1710 +âhtseo -1711 +êhésto -1712 +óvôhtó -1713 +▁aruba -1714 +▁edgar -1715 +▁hestá -1716 +▁hesén -1717 +▁horse -1718 +▁hésto -1719 +▁hóxov -1720 +▁india -1721 +▁japan -1722 +▁lasio -1723 +▁leuco -1724 +▁liber -1725 +▁meave -1726 +▁mésta -1727 +▁nigra -1728 +▁obvia -1729 +▁pages -1730 +▁péhpe -1731 +▁retri -1732 +▁right -1733 +▁sibik -1734 +▁spain -1735 +▁sȯsóe -1736 +▁there -1737 +▁trump -1738 +▁tsehe -1739 +▁tónov -1740 +▁ubykh -1741 gentina -1742 -hestohe -1743 -icators -1744 -ligions -1745 -nezuela -1746 -oeškese -1747 -rginian -1748 -ribbean -1749 -searcht -1750 -statssa -1751 -séeotsé -1752 -taheéve -1753 -toháano -1754 -tsêhest -1755 -tâháéno -1756 -vonêheo -1757 -énanóse -1758 -êsevêsé -1759 -▁arctos -1760 -▁hoohëö -1761 -▁háeohé -1762 -▁háesto -1763 -▁london -1764 -▁mozart -1765 -▁móxêšé -1766 -▁norway -1767 -▁people -1768 -▁square -1769 -▁yellow -1770 -enáhkohe -1771 -hamémôxe -1772 -hestȯtse -1773 -republic -1774 -vahtôtse -1775 -vóonotse -1776 -xeeséeto -1777 -xestôtse -1778 -▁angeles -1779 -▁british -1780 -▁century -1781 -▁climate -1782 -▁croatia -1783 -▁finland -1784 -▁history -1785 -▁hotohke -1786 -▁onéhavo -1787 -▁rosebud -1788 -▁toháano -1789 -▁tôhohko -1790 -ashington -1791 -emestȯtse -1792 -gelmannii -1793 -heónemôxe -1794 -hémâhoéve -1795 -juniperus -1796 -konôhtávo -1797 -tôxámâhéó -1798 -▁american -1799 -▁colombia -1800 -▁hohpâháa -1801 -▁mountain -1802 -▁okôhkêho -1803 -▁thailand -1804 -enóseoneve -1805 -indicators -1806 -xôseonéame -1807 -âséhahnoma -1808 -▁heséeotsé -1809 -▁póevónáne -1810 -▁virginian -1811 -▁vótâháéno -1812 -▁éškôseesé -1813 -aenôheéstse -1814 -hemêšéonávo -1815 -▁héstanêheo -1816 -▁móxêšéhevé -1817 -▁population -1818 -▁tsehestohe -1819 -▁tsetsêhest -1820 -ameehestôtse -1821 -polandnestse -1822 -êsenotováheé -1823 -êsevêséhotoa -1824 -▁engelmannii -1825 -▁héstánóvaan -1826 -▁yellowstone -1827 -móhtâhestôtse -1828 -ohketoetanóto -1829 -▁héstánóvaanéve -1830 -▁tsetsêhestâhese -1831 -", -1832 -)- -1833 -.: -1834 
-bc -1835 -cu -1836 -cy -1837 -dc -1838 -hā -1839 -hō -1840 -ii -1841 -iv -1842 -iz -1843 -ja -1844 -ks -1845 -ké -1846 -kė -1847 -lu -1848 -ml -1849 -nu -1850 -nô -1851 -of -1852 -oh -1853 -su -1854 -sô -1855 -tâ -1856 -tä -1857 -ua -1858 -vė -1859 -xt -1860 -ym -1861 -áa -1862 -án -1863 -äö -1864 -éé -1865 -óo -1866 -óé -1867 -šė -1868 -▁$ -1869 -▁/ -1870 -%), -1871 -.") -1872 -ach -1873 -ahé -1874 -aka -1875 -app -1876 -ath -1877 -ato -1878 -bia -1879 -bir -1880 -bli -1881 -bou -1882 -bra -1883 -bre -1884 -cel -1885 -cer -1886 -cho -1887 -cib -1888 -cip -1889 -cit -1890 -cta -1891 -dal -1892 -dia -1893 -duc -1894 -ear -1895 -eet -1896 -ell -1897 -elo -1898 -ena -1899 -ené -1900 -esy -1901 -eva -1902 -evê -1903 -haz -1904 -hen -1905 -hil -1906 -hla -1907 -hpa -1908 -htâ -1909 -htö -1910 -ich -1911 -ida -1912 -ima -1913 -imf -1914 -ino -1915 -iza -1916 -kom -1917 -lyn -1918 -mbo -1919 -med -1920 -mâx -1921 -nas -1922 -nes -1923 -ney -1924 -nom -1925 -nor -1926 -néó -1927 -oan -1928 -oem -1929 -ols -1930 -ove -1931 -ppp -1932 -pro -1933 -pén -1934 -res -1935 -rio -1936 -sex -1937 -stâ -1938 -tri -1939 -uba -1940 -uly -1941 -und -1942 -ung -1943 -ure -1944 -uru -1945 -uva -1946 -vec -1947 -voo -1948 -véo -1949 -zer -1950 -áta -1951 -çao -1952 -éoe -1953 -évâ -1954 -êhé -1955 -êna -1956 -êsé -1957 -êxo -1958 -ëhe -1959 -ëse -1960 -óma -1961 -ôsá -1962 -ėst -1963 -▁do -1964 -▁es -1965 -▁fi -1966 -▁ht -1967 -▁ki -1968 -▁ku -1969 -▁kö -1970 -▁oc -1971 -▁pe -1972 -▁ru -1973 -▁sk -1974 -▁ti -1975 -▁ut -1976 -▁wa -1977 -▁ôh -1978 -abwe -1979 -ames -1980 -anat -1981 -ants -1982 -asus -1983 -axaa -1984 -axáa -1985 -cerv -1986 -char -1987 -code -1988 -curi -1989 -down -1990 -edir -1991 -enns -1992 -eolo -1993 -eove -1994 -eral -1995 -esen -1996 -está -1997 -esëö -1998 -eved -1999 -fast -2000 -fire -2001 -gins -2002 -hael -2003 -hahk -2004 -hama -2005 -hamȧ -2006 -hene -2007 -here -2008 -hesó -2009 -hetó -2010 -heše -2011 -hová -2012 -hoxo -2013 -html -2014 -htoo -2015 -héve 
-2016 -höva -2017 -hāme -2018 -iago -2019 -ibik -2020 -ical -2021 -ific -2022 -ills -2023 -inst -2024 -ires -2025 -isco -2026 -ithu -2027 -khaz -2028 -komê -2029 -koné -2030 -kêsa -2031 -llow -2032 -mala -2033 -mami -2034 -many -2035 -mare -2036 -mark -2037 -mate -2038 -mené -2039 -môse -2040 -nahe -2041 -nene -2042 -nese -2043 -nâha -2044 -néta -2045 -nóse -2046 -olia -2047 -otsw -2048 -page -2049 -peru -2050 -pher -2051 -pper -2052 -ratt -2053 -road -2054 -sane -2055 -sean -2056 -seph -2057 -sian -2058 -star -2059 -tavö -2060 -tiba -2061 -ties -2062 -tivi -2063 -tohe -2064 -táhe -2065 -täso -2066 -tómô -2067 -tóxe -2068 -urne -2069 -utah -2070 -vada -2071 -vata -2072 -voem -2073 -vove -2074 -vâho -2075 -xove -2076 -áahe -2077 -âheo -2078 -âhtö -2079 -éseo -2080 -éstó -2081 -ësta -2082 -óóhe -2083 -ôhta -2084 -ôhéó -2085 -škee -2086 -ȯhnó -2087 -▁air -2088 -▁ara -2089 -▁atl -2090 -▁bay -2091 -▁ben -2092 -▁bit -2093 -▁cal -2094 -▁day -2095 -▁del -2096 -▁din -2097 -▁gla -2098 -▁hea -2099 -▁loc -2100 -▁mel -2101 -▁mex -2102 -▁mos -2103 -▁nan -2104 -▁nat -2105 -▁neg -2106 -▁pac -2107 -▁pra -2108 -▁pre -2109 -▁rho -2110 -▁sac -2111 -▁sco -2112 -▁sho -2113 -▁sto -2114 -▁tra -2115 -▁tre -2116 -▁van -2117 -▁wal -2118 -▁xaó -2119 -▁xäö -2120 -▁šan -2121 -advec -2122 -anohe -2123 -arten -2124 -atone -2125 -bania -2126 -brief -2127 -chief -2128 -dagas -2129 -desia -2130 -elope -2131 -guese -2132 -halus -2133 -hamâx -2134 -hketa -2135 -hkéso -2136 -hotse -2137 -house -2138 -htâhé -2139 -htóht -2140 -illet -2141 -kaeta -2142 -keaho -2143 -miten -2144 -máhta -2145 -néheo -2146 -nôtse -2147 -oemdc -2148 -póevó -2149 -river -2150 -selle -2151 -sisto -2152 -ssing -2153 -stâhe -2154 -tanév -2155 -tooxo -2156 -tšêšk -2157 -vakia -2158 -venia -2159 -vetoo -2160 -ville -2161 -vóhpo -2162 -vôhtó -2163 -xésta -2164 -émene -2165 -énéhe -2166 -évôse -2167 -óvéta -2168 -ôxhoo -2169 -ȧhéno -2170 -▁alge -2171 -▁anna -2172 -▁anta -2173 -▁apar -2174 -▁camp -2175 -▁cauc -2176 -▁cura -2177 
-▁dece -2178 -▁desy -2179 -▁fire -2180 -▁hehp -2181 -▁hese -2182 -▁john -2183 -▁kosa -2184 -▁kôsá -2185 -▁mene -2186 -▁mâhö -2187 -▁mótó -2188 -▁néta -2189 -▁nėse -2190 -▁obvi -2191 -▁orig -2192 -▁oéve -2193 -▁page -2194 -▁poly -2195 -▁rock -2196 -▁sand -2197 -▁tono -2198 -▁tséh -2199 -▁tséx -2200 -▁tóno -2201 -▁tôhé -2202 -▁ural -2203 -▁vose -2204 -▁wars -2205 -▁west -2206 -▁wiki -2207 -▁zimb -2208 -▁éveé -2209 -▁ôhmo -2210 -aneonó -2211 -aénohe -2212 -chived -2213 -datama -2214 -edirne -2215 -eestse -2216 -eoestá -2217 -eétâhé -2218 -haeolo -2219 -hanáhe -2220 -hkoohe -2221 -hotóao -2222 -hováve -2223 -htsemo -2224 -hótame -2225 -kemene -2226 -kôhtse -2227 -kôhévé -2228 -marete -2229 -mêstaa -2230 -môtove -2231 -nétâhé -2232 -onesia -2233 -person -2234 -prunus -2235 -rizona -2236 -sósone -2237 -tevfik -2238 -tivist -2239 -turkey -2240 -uation -2241 -urasia -2242 -vation -2243 -veotse -2244 -xemenó -2245 -xemâho -2246 -xepóno -2247 -xêhest -2248 -éestse -2249 -éseohé -2250 -êstóne -2251 -óhtáhe -2252 -ėstane -2253 -šeméhe -2254 -ȯhkėha -2255 -▁aruba -2256 -▁botsw -2257 -▁coast -2258 -▁dutch -2259 -▁hestá -2260 -▁horse -2261 -▁hotóa -2262 -▁hésta -2263 -▁idaho -2264 -▁korea -2265 -▁lasio -2266 -▁lithu -2267 -▁meave -2268 -▁pages -2269 -▁penns -2270 -▁press -2271 -▁retri -2272 -▁sibik -2273 -▁there -2274 -▁tsehe -2275 -▁ubykh -2276 -▁xamae -2277 -antiago -2278 -article -2279 -atonebi -2280 -estséat -2281 -hestôxe -2282 -heóvemá -2283 -hkêsono -2284 -hohtóva -2285 -honáeka -2286 -háhnoma -2287 -hótsêha -2288 -kêsaéve -2289 -mamione -2290 -matôtse -2291 -montana -2292 -noonáhe -2293 -ohtôtse -2294 -panâhke -2295 -populus -2296 -sistots -2297 -sonants -2298 -tsévovó -2299 -tsêheta -2300 -tómôhéó -2301 -vetanév -2302 -vêstséa -2303 -vóneške -2304 -xeméhne -2305 -xâhonoo -2306 -xémâhéó -2307 -átamáno -2308 -âhtötse -2309 -éhestat -2310 -éotóaho -2311 -éstónéó -2312 -óhéhéve -2313 -▁ahkôhe -2314 -▁donald -2315 -▁europe -2316 -▁hoésto -2317 -▁hóvôse -2318 -▁kóhkon -2319 
-▁kȧhamȧ -2320 -▁manaan -2321 -▁mariti -2322 -▁matana -2323 -▁ménôhe -2324 -▁mónahe -2325 -▁nanóse -2326 -▁native -2327 -▁nevada -2328 -▁néstse -2329 -▁pirahã -2330 -▁reôhke -2331 -▁tóhtoo -2332 -▁univer -2333 -▁éestse -2334 -▁êstove -2335 -aseohtsé -2336 -cephalus -2337 -etanetse -2338 -external -2339 -hahtsená -2340 -hamâxéve -2341 -háestôhe -2342 -hóxovôho -2343 -keahonoo -2344 -kâsénoma -2345 -kėhanáhe -2346 -language -2347 -manâhéno -2348 -móxêšéne -2349 -peruvian -2350 -tšêškévâ -2351 -vovetäso -2352 -weoworld -2353 -éoeškëso -2354 -êškóneho -2355 -ôhketóxe -2356 -šestôtse -2357 -▁aéstome -2358 -▁chicago -2359 -▁curaçao -2360 -▁denmark -2361 -▁douglas -2362 -▁hahpenó -2363 -▁heévȧhe -2364 -▁heóvâhá -2365 -▁hohamma -2366 -▁islands -2367 -▁maarten -2368 -▁madagas -2369 -▁menôtse -2370 -▁million -2371 -▁mâhpémo -2372 -▁sanders -2373 -▁skillet -2374 -▁tseohke -2375 -▁xamaevo -2376 -anėsóvóne -2377 -enestôtse -2378 -eotsestsé -2379 -hamémâhéó -2380 -netsénóne -2381 -nêškovávo -2382 -ohtsévôse -2383 -xemâhoévé -2384 -ȧhestȯtse -2385 -▁activist -2386 -▁archived -2387 -▁botswana -2388 -▁caucasus -2389 -▁crossing -2390 -▁december -2391 -▁hohtsemo -2392 -▁hotóhkeo -2393 -▁héstooma -2394 -▁manȧhéno -2395 -▁oklahoma -2396 -▁original -2397 -▁pennsylv -2398 -▁rhodesia -2399 -▁tsisinst -2400 -▁tsêhésto -2401 -▁zimbabwe -2402 -california -2403 -datamapper -2404 -hkoestséat -2405 -hánėsóvóne -2406 -hóxeeséeto -2407 -kaetaévôxe -2408 -kévénėstse -2409 -kôhévénéhe -2410 -nêstsevôse -2411 -sêhestôtse -2412 -tsêhéstahe -2413 -xėseotsean -2414 -▁apartheid -2415 -▁argentina -2416 -▁caribbean -2417 -▁geography -2418 -▁hoéstónéó -2419 -▁lithuania -2420 -▁monêškeho -2421 -▁obviative -2422 -▁religions -2423 -▁retrieved -2424 -▁sibikeove -2425 -▁tasemiten -2426 -enenestôtse -2427 -eétâhéstove -2428 -héstánóvaan -2429 -komêšéstótó -2430 -mȧhoestȯtse -2431 -êstonêstove -2432 -ôhkêheóvemá -2433 -ôhtávaestse -2434 -▁consonants -2435 -▁heóvonêheo -2436 -▁hóvôsenâha -2437 -▁hóxâhtsená -2438 
-▁koronestse -2439 -▁kȧhamȧxévo -2440 -▁lasiocarpa -2441 -▁mâhóomoehá -2442 -▁môxemarete -2443 -▁nėsesėstse -2444 -▁portuguese -2445 -▁tootómôhéó -2446 -▁tsêhestôxe -2447 -▁university -2448 -▁véhestôtse -2449 -▁véhestȯtse -2450 -▁washington -2451 -eestsestȯtse -2452 -staahtsémeno -2453 -vátameveotse -2454 -▁ahkôheöhtse -2455 -▁desyatonebi -2456 -▁háeohémahpe -2457 -▁nemâmamione -2458 -▁vétapâhaeto -2459 -▁éškôseeséma -2460 -mâhoestôtsene -2461 -vetanévȯhkėha -2462 -xêhestâhtötse -2463 -êstonemâheone -2464 -šenonetsénóne -2465 -▁heóvêháhnoma -2466 -▁reôhkemôtove -2467 -aehesanestôtse -2468 -disambiguation -2469 -hamâxéveóhtáhe -2470 -héstoeotsestsé -2471 -kemâhaemenôtse -2472 -▁otaesémenôtse -2473 -▁héstoomaestôtse -2474 -▁héstánóvaannéta -2475 -▁tsisinstsistots -2476 -▁tsêhéstoestôtse -2477 -▁vóhkoohémâhoéve -2478 -af -2479 -ag -2480 -cf -2481 -cr -2482 -ec -2483 -fe -2484 -fo -2485 -fr -2486 -ip -2487 -jp -2488 -pė -2489 -sz -2490 -yo -2491 -ão -2492 -aga -2493 -anl -2494 -aná -2495 -ban -2496 -bat -2497 -big -2498 -blo -2499 -); -2500 -ai -2501 -bn -2502 -eg -2503 -eó -2504 -gc -2505 -ix -2506 -kȧ -2507 -lm -2508 -my -2509 -nd -2510 -oq -2511 -rh -2512 -up -2513 -wh -2514 -ép -2515 -anz -2516 -cfm -2517 -cot -2518 -cre -2519 -day -2520 -dif -2521 -dil -2522 -din -2523 -don -2524 -dor -2525 -ean -2526 -eat -2527 -esó -2528 -ext -2529 -ffa -2530 -for -2531 -gne -2532 -gta -2533 -hin -2534 -hké -2535 -hna -2536 -hou -2537 -hoé -2538 -htä -2539 -htȧ -2540 -hum -2541 -hôh -2542 -ibe -2543 -ibi -2544 -ics -2545 -ide -2546 -ied -2547 -ils -2548 -inh -2549 -iov -2550 -irc -2551 -iry -2552 -ita -2553 -iti -2554 -jap -2555 -jpg -2556 -kes -2557 -kyo -2558 -lag -2559 -lay -2560 -lsx -2561 -mau -2562 -meo -2563 -mib -2564 -mot -2565 -nee -2566 -odo -2567 -ork -2568 -orn -2569 -out -2570 -ped -2571 -poe -2572 -ppi -2573 -pua -2574 -ron -2575 -sci -2576 -sha -2577 -slo -2578 -sse -2579 -ste -2580 -tah -2581 -táo -2582 -uco -2583 -ump -2584 -une -2585 -usp -2586 -vel -2587 -wad 
-2588 -yan -2589 -ype -2590 -zco -2591 -äse -2592 -épó -2593 -éšé -2594 -êhá -2595 -êma -2596 -êšé -2597 -óvâ -2598 -ôhö -2599 -.. -2600 - -7943 +~ -7944 +à -7945 +ñ -7946 +ú -7947 +û -7948 +ń -7949 ō -7950 -ź -7951 -ʼ -7952 -р -7953 -< -7954 -@ -7955 -` -7956 -~ -7957 -č -7958 -ʃ -7959 -и -7960 -с -7961 -> -7962 -{ -7963 -à -7964 -ñ -7965 -ø -7966 -ú -7967 -û -7968 -ę -7969 -ń -7970 -ǐ -7971 -ʔ -7972 -н -7973 -п -7974 -ы -7975 -ә -7976 -ર -7977 -ા -7978 -ᐃ -7979 -民 -7980 -è -7981 -î -7982 -ò -7983 -ć -7984 -ī -7985 -ŋ -7986 -ś -7987 -ž -7988 -ǧ -7989 -ɛ -7990 -ɪ -7991 -б -7992 -е -7993 -л -7994 -о -7995 +ǐ -7951 +ʃ -7952 +н -7953 +п -7954 +ы -7955 +ә -7956 +ર -7957 +ા -7958 +ᐃ -7959 +民 -7960 +& -7961 +è -7962 +î -7963 +ò -7964 +ć -7965 +ī -7966 +ŋ -7967 +ś -7968 +ų -7969 +ǧ -7970 +ɛ -7971 +ɪ -7972 +ʔ -7973 +б -7974 +е -7975 +л -7976 +о -7977 +х -7978 +ј -7979 +қ -7980 +ҟ -7981 +ҭ -7982 +ҳ -7983 +ԥ -7984 +ખ -7985 +ફ -7986 +બ -7987 +લ -7988 +સ -7989 +ો -7990 +્ -7991 +တ -7992 +း -7993 +ႆ -7994 +Ꭹ -7995 diff --git a/models/vocabulary/chy_vocabulary.parquet b/models/vocabulary/chy_vocabulary.parquet index d3250908dac1000b045a1613c86c8979658ce153..5f3009ad677a72a74348a5993bf8c804d65599ae 100644 --- a/models/vocabulary/chy_vocabulary.parquet +++ b/models/vocabulary/chy_vocabulary.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:470592a0b112b9a3a65f69af1557219cccb407f27f3a4236e60e71c7c2f40313 -size 29068 +oid sha256:557659dcccab98922c4fc45c22d401a01f92a33330e2d06cbdbafd9e3d8f37d2 +size 22392 diff --git a/models/vocabulary/chy_vocabulary_metadata.json b/models/vocabulary/chy_vocabulary_metadata.json index 33d754655c8c37b5913c6babe30cef985f13ed0f..929cdaba95adcb0396ddbbe9a9cb203f3f0535eb 100644 --- a/models/vocabulary/chy_vocabulary_metadata.json +++ b/models/vocabulary/chy_vocabulary_metadata.json @@ -1,14 +1,15 @@ { "language": "chy", - "vocabulary_size": 1659, + "vocabulary_size": 1237, + "variant": "full", "statistics": { - 
"type_token_ratio": 0.24974895150333748, + "type_token_ratio": 0.32707120045087357, "coverage": { - "top_100": 0.4891015417331207, - "top_1000": 0.7703939984641739 + "top_100": 0.4324628968626714, + "top_1000": 0.7445989103888785 }, - "hapax_count": 2569, - "hapax_ratio": 0.6076158940397351, - "total_documents": 825 + "hapax_count": 2245, + "hapax_ratio": 0.644744399770247, + "total_documents": 459 } } \ No newline at end of file diff --git a/models/word_markov/chy_markov_ctx1_word.parquet b/models/word_markov/chy_markov_ctx1_word.parquet index 08a4a7d9d7f03e6128afff8c7c6eaa2ee082effd..8acf149959f72ea8c4e48203f71cfd6c798bdfce 100644 --- a/models/word_markov/chy_markov_ctx1_word.parquet +++ b/models/word_markov/chy_markov_ctx1_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:6fb77d185c917a1f8ece3debd54279262bbc2e1a01799fcb691731a7561bee31 -size 113562 +oid sha256:386ae58f4271bb09f2a7300c1ffb43cb66268f9329c798213791212b7375c544 +size 86902 diff --git a/models/word_markov/chy_markov_ctx1_word_metadata.json b/models/word_markov/chy_markov_ctx1_word_metadata.json index 5cdc3d6fc43a79066182c32df25b830eabb01291..79f793a05930f1a4c24f54487e1287ece9e416e5 100644 --- a/models/word_markov/chy_markov_ctx1_word_metadata.json +++ b/models/word_markov/chy_markov_ctx1_word_metadata.json @@ -2,6 +2,6 @@ "context_size": 1, "variant": "word", "language": "chy", - "unique_contexts": 4255, - "total_transitions": 27559 + "unique_contexts": 3383, + "total_transitions": 10187 } \ No newline at end of file diff --git a/models/word_markov/chy_markov_ctx2_word.parquet b/models/word_markov/chy_markov_ctx2_word.parquet index 9a3f7765734ed1f2b0f0b58803c294c5d3576f01..0d99ee534a93e60b5ada7345e8e5d7bc54302e94 100644 --- a/models/word_markov/chy_markov_ctx2_word.parquet +++ b/models/word_markov/chy_markov_ctx2_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:c1602a760151707012bb6aef66d181795727fa9d8cb5780233a0b6c1c90a2312 -size 
189750 +oid sha256:25b29a5fd2a035a5761b624f9763e3b2f1bd7218a31608a0a6f94197490478db +size 131833 diff --git a/models/word_markov/chy_markov_ctx2_word_metadata.json b/models/word_markov/chy_markov_ctx2_word_metadata.json index 02ef0b577464a41bc9fd806995cf2e32c51c5a62..61d1fac9fca5fa6d7d5f94e1fba5de4e62a9781c 100644 --- a/models/word_markov/chy_markov_ctx2_word_metadata.json +++ b/models/word_markov/chy_markov_ctx2_word_metadata.json @@ -2,6 +2,6 @@ "context_size": 2, "variant": "word", "language": "chy", - "unique_contexts": 10197, - "total_transitions": 26734 + "unique_contexts": 6516, + "total_transitions": 9728 } \ No newline at end of file diff --git a/models/word_markov/chy_markov_ctx3_word.parquet b/models/word_markov/chy_markov_ctx3_word.parquet index 991dca945f547e533edfa586401fa51db3f6932e..a266a81efaaa552b218d093b93e2289099aa546f 100644 --- a/models/word_markov/chy_markov_ctx3_word.parquet +++ b/models/word_markov/chy_markov_ctx3_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:7a91e6dbb80dac8f5da14abcdfc3ba5c7122ed0a171bc88e3d506ea5a62e4790 -size 251639 +oid sha256:b4b7d0cf8b279e25f0c1dd5d12c4331f6647670846d67a4e626aa888e8687564 +size 158524 diff --git a/models/word_markov/chy_markov_ctx3_word_metadata.json b/models/word_markov/chy_markov_ctx3_word_metadata.json index e33000a870c5215c6bf21af4585b3cd1d9bc4b9a..e0b57ac5abfa0f78354819cea4c154faba32fc60 100644 --- a/models/word_markov/chy_markov_ctx3_word_metadata.json +++ b/models/word_markov/chy_markov_ctx3_word_metadata.json @@ -2,6 +2,6 @@ "context_size": 3, "variant": "word", "language": "chy", - "unique_contexts": 13745, - "total_transitions": 25909 + "unique_contexts": 7515, + "total_transitions": 9269 } \ No newline at end of file diff --git a/models/word_markov/chy_markov_ctx4_word.parquet b/models/word_markov/chy_markov_ctx4_word.parquet index 98b1da880fc3131ffd354e69756ca26873ae63d4..1a23dcf88786cfa6078cf32710ccb96faa83b1bf 100644 --- 
a/models/word_markov/chy_markov_ctx4_word.parquet +++ b/models/word_markov/chy_markov_ctx4_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:b827d8d8deb6ee8ee5a1c2f7e9164c6009220d08c889406bed208c143c67ef6f -size 299200 +oid sha256:46f60b22a8fcac07a7de898ad66d40c570966ec70737f00b5cf2d7b22bf2c6a2 +size 174153 diff --git a/models/word_markov/chy_markov_ctx4_word_metadata.json b/models/word_markov/chy_markov_ctx4_word_metadata.json index 28dde75ed594460a55b7a0d16cdabc0acb057b70..a7c139ec9a85a098b20d1b2df766e602e4ae888e 100644 --- a/models/word_markov/chy_markov_ctx4_word_metadata.json +++ b/models/word_markov/chy_markov_ctx4_word_metadata.json @@ -2,6 +2,6 @@ "context_size": 4, "variant": "word", "language": "chy", - "unique_contexts": 16004, - "total_transitions": 25085 + "unique_contexts": 7792, + "total_transitions": 8810 } \ No newline at end of file diff --git a/models/word_ngram/chy_2gram_word.parquet b/models/word_ngram/chy_2gram_word.parquet index b8aed7294befca87168337c3a35da6dbd879b91b..632b74c24a0b92b9071d12c3b4dc53a2e3c6d9b8 100644 --- a/models/word_ngram/chy_2gram_word.parquet +++ b/models/word_ngram/chy_2gram_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:0dc44fada7846034391ea2690c56b864a908054940a91ca581f91cf63ba43b79 -size 11396 +oid sha256:f3cf90b3734f097b7881f2bd6766a12f1c1a78a49ef2c9af43b8bb1c04f684de +size 5216 diff --git a/models/word_ngram/chy_2gram_word_metadata.json b/models/word_ngram/chy_2gram_word_metadata.json index 37fdf67c8b77cc258a8e46fc087faa60585c5ebc..56d2bc6055684beeb1fb7fd1ca2f684fb65b1b5c 100644 --- a/models/word_ngram/chy_2gram_word_metadata.json +++ b/models/word_ngram/chy_2gram_word_metadata.json @@ -2,6 +2,6 @@ "n": 2, "variant": "word", "language": "chy", - "unique_ngrams": 654, - "total_ngrams": 27559 + "unique_ngrams": 159, + "total_ngrams": 10187 } \ No newline at end of file diff --git a/models/word_ngram/chy_3gram_word.parquet 
b/models/word_ngram/chy_3gram_word.parquet index 427d66c3b9c2768d68c5cb44bffe42f9b2af4ada..bbfea6bb6328da419a09d61d4c91f24d9c72a818 100644 --- a/models/word_ngram/chy_3gram_word.parquet +++ b/models/word_ngram/chy_3gram_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:ac58b70ff5ea478cd859a06e2722ba50d36f1d0a7d83c8e35282a144a67b1369 -size 20739 +oid sha256:18e0cd0236f9672b7cdc3748740203bb789cab4d98a9bba4f58941ef3779da01 +size 7252 diff --git a/models/word_ngram/chy_3gram_word_metadata.json b/models/word_ngram/chy_3gram_word_metadata.json index dfd0ad0bad2ac0681a282fda23378e9accd99d9a..c29c0216e28a64b2c83dea0c9724f900b99f5a8f 100644 --- a/models/word_ngram/chy_3gram_word_metadata.json +++ b/models/word_ngram/chy_3gram_word_metadata.json @@ -2,6 +2,6 @@ "n": 3, "variant": "word", "language": "chy", - "unique_ngrams": 1211, - "total_ngrams": 26734 + "unique_ngrams": 245, + "total_ngrams": 9728 } \ No newline at end of file diff --git a/models/word_ngram/chy_4gram_word.parquet b/models/word_ngram/chy_4gram_word.parquet index f1204693a668d69729fa7acfe4695067f2d2a2cd..4cc9d0cc656f3444c80f96679bbd8d91bbd68c82 100644 --- a/models/word_ngram/chy_4gram_word.parquet +++ b/models/word_ngram/chy_4gram_word.parquet @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:8fb9ac65e984d6fb53483e3e4ddaf089e90baeef0b877dc6f9e5067d9f4d0a9d -size 39062 +oid sha256:be5660989ec57f68e126189b60da16af9da01d4a13b7e53d4bc22f9849e36215 +size 11340 diff --git a/models/word_ngram/chy_4gram_word_metadata.json b/models/word_ngram/chy_4gram_word_metadata.json index e380a75a3a33c8bbddab4dd39985a8938c807eb6..61d6abe8a60c9188890608ed9978cf38bc5625ff 100644 --- a/models/word_ngram/chy_4gram_word_metadata.json +++ b/models/word_ngram/chy_4gram_word_metadata.json @@ -2,6 +2,6 @@ "n": 4, "variant": "word", "language": "chy", - "unique_ngrams": 2302, - "total_ngrams": 25909 + "unique_ngrams": 449, + "total_ngrams": 9269 } \ No newline at end of file diff 
--git a/visualizations/embedding_isotropy.png b/visualizations/embedding_isotropy.png
index f9c83093dbbacc8c96f6b8d0b13d80452f73e23a..ea9dc9f6e87929e7366aa2c06604dc6e6a6dda09 100644
Binary files a/visualizations/embedding_isotropy.png and b/visualizations/embedding_isotropy.png differ
diff --git a/visualizations/embedding_norms.png b/visualizations/embedding_norms.png
index 04157894bd6abb623cf7728919fa8ba8ab278c0c..6bac4c354cf4551a0d05ebcc6c5f5a89f0b7965e 100644
Binary files a/visualizations/embedding_norms.png and b/visualizations/embedding_norms.png differ
diff --git a/visualizations/embedding_similarity.png b/visualizations/embedding_similarity.png
index 34569d570e461d877e73a044c150aff77884730b..cdbdab745d9a5b59fa842a8ccef21a3a76fa59db 100644
--- a/visualizations/embedding_similarity.png
+++ b/visualizations/embedding_similarity.png
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:125496fee51a0eff7033a820a5c76329e49094e721ffbc2269e6e45e220f6eb8
-size 174316
+oid sha256:64b3cbb726cc52039aef89526c5df1602ede7ff01d5aafa01666629a9683d23c
+size 126808
diff --git a/visualizations/markov_branching.png b/visualizations/markov_branching.png
index 3d1b9a76d64faa807ecc710b35463c5d2c4d997b..82188252c06cc849a115b2120f3298eda3266ce7 100644
Binary files a/visualizations/markov_branching.png and b/visualizations/markov_branching.png differ
diff --git a/visualizations/markov_contexts.png b/visualizations/markov_contexts.png
index 09142004f613dcae9d743372b8ba19fe850d83bc..878da0bfeccb8cd12e9bbeb4e155ccde60be4c40 100644
Binary files a/visualizations/markov_contexts.png and b/visualizations/markov_contexts.png differ
diff --git a/visualizations/markov_entropy.png b/visualizations/markov_entropy.png
index f69b6daa31e8d4c7ecf7d9dbe4e721ecdc422feb..2cb2a33aa3f32f2c9020ff2e7b4ded85951fd36f 100644
Binary files a/visualizations/markov_entropy.png and b/visualizations/markov_entropy.png differ
diff --git a/visualizations/model_sizes.png b/visualizations/model_sizes.png
index cd57618649af251bf29a959169de322207f5db26..885a4e8611f5767abc6f1341fffea6aa3a6c29a6 100644
Binary files a/visualizations/model_sizes.png and b/visualizations/model_sizes.png differ
diff --git a/visualizations/nearest_neighbors.png b/visualizations/nearest_neighbors.png
index cb3c0895a189df84db3032c4a9fcfdd130bc23f0..6a761effd37f6e9eb5c3adf08b9a70d938681ddc 100644
Binary files a/visualizations/nearest_neighbors.png and b/visualizations/nearest_neighbors.png differ
diff --git a/visualizations/ngram_coverage.png b/visualizations/ngram_coverage.png
index 80e8fea86f70b95ea35d4ec0079ad47388993a03..b45518093f24f6ee5cccd3194d70808e25706ae6 100644
Binary files a/visualizations/ngram_coverage.png and b/visualizations/ngram_coverage.png differ
diff --git a/visualizations/ngram_entropy.png b/visualizations/ngram_entropy.png
index 535ae32da5aed6ed238bf11d9b43f8d475bb0f75..70447cd4707daafceccd6d9f29ba189f50ad7490 100644
Binary files a/visualizations/ngram_entropy.png and b/visualizations/ngram_entropy.png differ
diff --git a/visualizations/ngram_perplexity.png b/visualizations/ngram_perplexity.png
index 1c541e6bfb131e4fc777744a43d97b7fcc1604d1..00be5b866f4bbaa1cfe56d6e0019c6985cc23ef7 100644
Binary files a/visualizations/ngram_perplexity.png and b/visualizations/ngram_perplexity.png differ
diff --git a/visualizations/ngram_unique.png b/visualizations/ngram_unique.png
index dc7fa8a01b2392ca7cea2fa35c147f0aade4ece5..ced4bb414a2ef4c8c2411799c46387aa89c72610 100644
Binary files a/visualizations/ngram_unique.png and b/visualizations/ngram_unique.png differ
diff --git a/visualizations/performance_dashboard.png b/visualizations/performance_dashboard.png
index 6369f638fbcc3ce442ac6b78ff1dcb99bbeaebb3..876dbe7d9c7ec97ccfa5a5a7eae8adb280a308ae 100644
--- a/visualizations/performance_dashboard.png
+++ b/visualizations/performance_dashboard.png
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b861d72edf7bb6c8cf345d9be70d2d6d0a0a216d6e17021fbe8fa2794b212d3e
-size 288321
+oid sha256:0f899824cb6c4547d62da3e7f6e6bf75020586ecf2b7d2b2c5976916c155732c
+size 270991
diff --git a/visualizations/position_encoding_comparison.png b/visualizations/position_encoding_comparison.png
index a1aeba54650b4a60c936d88216bb44bf1206a6b1..cd2f4289ff4cba7547c156aa481424b7c0b1ff51 100644
Binary files a/visualizations/position_encoding_comparison.png and b/visualizations/position_encoding_comparison.png differ
diff --git a/visualizations/tokenizer_compression.png b/visualizations/tokenizer_compression.png
index 79cdac463e699214ccf59bd636466018ccbfd553..144fa9f2d16397930b23016f3cc061343d451a2c 100644
Binary files a/visualizations/tokenizer_compression.png and b/visualizations/tokenizer_compression.png differ
diff --git a/visualizations/tokenizer_fertility.png b/visualizations/tokenizer_fertility.png
index edd4301152cfef7bcea4a6b3a297bcb391f72732..b78937f46dcba6e1ffa2522e1317cb2014b141a4 100644
Binary files a/visualizations/tokenizer_fertility.png and b/visualizations/tokenizer_fertility.png differ
diff --git a/visualizations/tokenizer_oov.png b/visualizations/tokenizer_oov.png
index 6e8f23f88633158ec088f9d93ef4d98814108b9b..8425f47754399510422fc6cf50fedf5c73262db9 100644
Binary files a/visualizations/tokenizer_oov.png and b/visualizations/tokenizer_oov.png differ
diff --git a/visualizations/tokenizer_total_tokens.png b/visualizations/tokenizer_total_tokens.png
index 77d89afaadf72acb713ff3350d887d04cd3e2fc5..df04deecf80674d5a3b55a254e9339eeb9a9f2c5 100644
Binary files a/visualizations/tokenizer_total_tokens.png and b/visualizations/tokenizer_total_tokens.png differ
diff --git a/visualizations/top20_words.png b/visualizations/top20_words.png
index 25960f06e9a43427fcacaeb3ed7a56263f2f6cf2..3b0b881b0807cc6580ac8f34bc3fc94115e3845a 100644
Binary files a/visualizations/top20_words.png and b/visualizations/top20_words.png differ
diff --git a/visualizations/tsne_sentences.png b/visualizations/tsne_sentences.png
index b7e069ce54a7a8b67690e2490df8f0b47ff58d9d..2577ac4fae041121b7464551bdf94f635cbba1f4 100644
--- a/visualizations/tsne_sentences.png
+++ b/visualizations/tsne_sentences.png
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0dc88c5899ee297f88854b53b1a881cde76421efd331639ce36eee79f8707972
-size 259359
+oid sha256:5738fc98a5806c2cf9c2b9af8ce05b1bea2476236ad9f4f96f9de41339a47881
+size 266026
diff --git a/visualizations/tsne_words.png b/visualizations/tsne_words.png
index 5e526f8acf5033493c7d3d88654282c5a2a56920..55b5263831180e9d37c2d7dace265571784c20ce 100644
--- a/visualizations/tsne_words.png
+++ b/visualizations/tsne_words.png
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d0724431f0ba5d0023e807f46bde04ec4c5c61dc35cdd7462ba385d4404afd40
-size 264480
+oid sha256:c594af3b95d4f54484fa6f1b4e358b98359869c1d60cd48af9c2f6e7bf4ff9e0
+size 201396
diff --git a/visualizations/vocab_coverage.png b/visualizations/vocab_coverage.png
index d4ea22517aa6d105e7c8c881ed6b6accd2798001..fd9bcde8fede172a65f14222b4ac99abef6a3972 100644
Binary files a/visualizations/vocab_coverage.png and b/visualizations/vocab_coverage.png differ
diff --git a/visualizations/vocab_freq_dist.png b/visualizations/vocab_freq_dist.png
index 530c48f0bae84f719dc56899bacf4248734a1401..d489b4f79d52d6a56c9af52b3e8627fa69cf3047 100644
Binary files a/visualizations/vocab_freq_dist.png and b/visualizations/vocab_freq_dist.png differ
diff --git a/visualizations/zipf_law.png b/visualizations/zipf_law.png
index 71e4fb16adb7814e9ec304069fc503ad6f0e5b25..bf799f018ae3ad16148e03c5cd4e4b199a86e3ee 100644
--- a/visualizations/zipf_law.png
+++ b/visualizations/zipf_law.png
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cce6ed8b0dec7d5d68b967ff424b0d29f7404f5c5d77ffdb1d6a44392fea5877
-size 102444
+oid sha256:99ca3b312b85657822250a843bbdcbf32e8d8a66d8904f88265e7b1f4f25df66
+size 99198