diff --git a/.gitattributes b/.gitattributes index a6344aac8c09253b3b630fb776ae94478aa0275b..9800be970571345a39f1f48b1c34335a62f77c48 100644 --- a/.gitattributes +++ b/.gitattributes @@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text *.zip filter=lfs diff=lfs merge=lfs -text *.zst filter=lfs diff=lfs merge=lfs -text *tfevents* filter=lfs diff=lfs merge=lfs -text +visualizations/embedding_similarity.png filter=lfs diff=lfs merge=lfs -text +visualizations/performance_dashboard.png filter=lfs diff=lfs merge=lfs -text +visualizations/tsne_sentences.png filter=lfs diff=lfs merge=lfs -text +visualizations/tsne_words.png filter=lfs diff=lfs merge=lfs -text +visualizations/zipf_law.png filter=lfs diff=lfs merge=lfs -text diff --git a/README.md b/README.md new file mode 100644 index 0000000000000000000000000000000000000000..d22267343baf366ffb0f0efa9654743a18f77284 --- /dev/null +++ b/README.md @@ -0,0 +1,709 @@ +--- +language: bdr +language_name: BDR +language_family: austronesian_other +tags: + - wikilangs + - nlp + - tokenizer + - embeddings + - n-gram + - markov + - wikipedia + - monolingual + - family-austronesian_other +license: mit +library_name: wikilangs +pipeline_tag: feature-extraction +datasets: + - omarkamali/wikipedia-monthly +dataset_info: + name: wikipedia-monthly + description: Monthly snapshots of Wikipedia articles across 300+ languages +metrics: + - name: best_compression_ratio + type: compression + value: 4.792 + - name: best_isotropy + type: isotropy + value: 0.0482 + - name: vocabulary_size + type: vocab + value: 0 +generated: 2026-01-03 +--- + +# BDR - Wikilangs Models +## Comprehensive Research Report & Full Ablation Study + +This repository contains NLP models trained and evaluated by Wikilangs, specifically on **BDR** Wikipedia data. +We analyze tokenizers, n-gram models, Markov chains, vocabulary statistics, and word embeddings. 
+ +## 📋 Repository Contents + +### Models & Assets + +- Tokenizers (8k, 16k, 32k, 64k) +- N-gram models (2, 3, 4, 5-gram) +- Markov chains (context of 1, 2, 3, 4 and 5) +- Subword N-gram and Markov chains +- Embeddings in various sizes and dimensions (aligned and unaligned) +- Language Vocabulary +- Language Statistics + +![Performance Dashboard](visualizations/performance_dashboard.png) + +### Analysis and Evaluation + +- [1. Tokenizer Evaluation](#1-tokenizer-evaluation) +- [2. N-gram Model Evaluation](#2-n-gram-model-evaluation) +- [3. Markov Chain Evaluation](#3-markov-chain-evaluation) +- [4. Vocabulary Analysis](#4-vocabulary-analysis) +- [5. Word Embeddings Evaluation](#5-word-embeddings-evaluation) +- [6. Morphological Analysis (Experimental)](#6-morphological-analysis-experimental) +- [7. Summary & Recommendations](#7-summary--recommendations) +- [Metrics Glossary](#appendix-metrics-glossary--interpretation-guide) +- [Visualizations Index](#visualizations-index) + +--- +## 1. Tokenizer Evaluation + +![Tokenizer Compression](visualizations/tokenizer_compression.png) + +![Tokenizer Fertility](visualizations/tokenizer_fertility.png) + +![Tokenizer OOV](visualizations/tokenizer_oov.png) + +![Total Tokens](visualizations/tokenizer_total_tokens.png) + +### Results + +| Vocab Size | Compression | Avg Token Len | UNK Rate | Total Tokens | +|------------|-------------|---------------|----------|--------------| +| **8k** | 4.792x 🏆 | 4.81 | 0.1661% | 33,107 | + +### Tokenization Examples + +Below are sample sentences tokenized with each vocabulary size: + +**Sample 1:** `Nimbug iyono indu' manuk nuut ngentelo ta' keteraan manuk lain.` + +| Vocab | Tokens | Count | +|-------|--------|-------| +| 8k | `▁nimbug ▁iyono ▁indu ' ▁manuk ▁nuut ▁ngentelo ▁ta ' ▁keteraan ... 
(+3 more)` | 13 | + +**Sample 2:** `Raja iyo no' dangan jomo kuleh kuasa diom pemerintah dikau kerajaan.Endo rojo pi...` + +| Vocab | Tokens | Count | +|-------|--------|-------| +| 8k | `▁raja ▁iyo ▁no ' ▁dangan ▁jomo ▁kuleh ▁kuasa ▁diom ▁pemerintah ... (+20 more)` | 30 | + +**Sample 3:** `Para-para iyo no tempat ngena segala barang enjata rak` + +| Vocab | Tokens | Count | +|-------|--------|-------| +| 8k | `▁para - para ▁iyo ▁no ▁tempat ▁ngena ▁segala ▁barang ▁enjata ... (+1 more)` | 11 | + + +### Key Findings + +- **Best Compression:** 8k achieves 4.792x compression +- **Lowest UNK Rate:** 8k with 0.1661% unknown tokens +- **Trade-off:** Larger vocabularies generally improve compression but increase model size +- **Recommendation:** 8k is the only vocabulary size evaluated in this run, so it is the default choice; larger vocabularies would need their own evaluation + +--- +## 2. N-gram Model Evaluation + +![N-gram Perplexity](visualizations/ngram_perplexity.png) + +![N-gram Unique](visualizations/ngram_unique.png) + +![N-gram Coverage](visualizations/ngram_coverage.png) + +### Results + +| N-gram | Variant | Perplexity | Entropy | Unique N-grams | Top-100 Coverage | Top-1000 Coverage | +|--------|---------|------------|---------|----------------|------------------|-------------------| +| **2-gram** | Word | 287 | 8.16 | 401 | 53.3% | 100.0% | +| **2-gram** | Subword | 181 🏆 | 7.50 | 597 | 77.1% | 100.0% | +| **3-gram** | Word | 221 | 7.79 | 271 | 59.6% | 100.0% | +| **3-gram** | Subword | 1,140 | 10.15 | 3,421 | 32.8% | 85.1% | +| **4-gram** | Word | 273 | 8.09 | 346 | 51.1% | 100.0% | +| **4-gram** | Subword | 4,426 | 12.11 | 11,413 | 17.0% | 52.6% | + +### Top 5 N-grams by Size + +**2-grams (Word):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `tungan metelak` | 162 | +| 2 | `iyo no` | 138 | +| 3 | `iyo noh` | 69 | +| 4 | `iyo tu` | 68 | +| 5 | `bioso ni` | 45 | + +**3-grams (Word):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `ma na ni` | 40 | +| 2 | `dewan undangan negeri` | 26 | +| 3 | `undangan 
negeri sabah` | 25 | +| 4 | `iyo tu dangan` | 19 | +| 5 | `tungan metelak dendo` | 18 | + +**4-grams (Word):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `dewan undangan negeri sabah` | 25 | +| 2 | `tungan metelak dendo malaysia` | 18 | +| 3 | `sama ma na ni` | 14 | +| 4 | `iyo no endangan jomo` | 12 | +| 5 | `no endangan jomo politik` | 12 | + +**2-grams (Subword):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `a n` | 5,437 | +| 2 | `n _` | 3,734 | +| 3 | `n g` | 3,473 | +| 4 | `i _` | 3,019 | +| 5 | `_ t` | 2,998 | + +**3-grams (Subword):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `a n _` | 2,443 | +| 2 | `a n g` | 1,577 | +| 3 | `n g _` | 1,357 | +| 4 | `_ t a` | 1,076 | +| 5 | `_ n i` | 987 | + +**4-grams (Subword):** + +| Rank | N-gram | Count | +|------|--------|-------| +| 1 | `a n g _` | 910 | +| 2 | `_ n i _` | 650 | +| 3 | `_ i y o` | 643 | +| 4 | `n g a n` | 619 | +| 5 | `g a n _` | 579 | + + +### Key Findings + +- **Best Perplexity:** 2-gram (subword) with 181 +- **Entropy Trend:** Subword entropy rises with n (7.50, 10.15, 12.11 bits); word-level entropy stays near 8 bits (8.16, 7.79, 8.09) +- **Coverage:** Top-1000 patterns cover 100% for all word n-grams and for subword 2-grams, 85.1% for subword 3-grams, and 52.6% for subword 4-grams +- **Recommendation:** 2-gram (subword) has the lowest perplexity in this run; no 5-gram results are reported + +--- +## 3. 
Markov Chain Evaluation + +![Markov Entropy](visualizations/markov_entropy.png) + +![Markov Contexts](visualizations/markov_contexts.png) + +![Markov Branching](visualizations/markov_branching.png) + +### Results + +| Context | Variant | Avg Entropy | Perplexity | Branching Factor | Unique Contexts | Predictability | +|---------|---------|-------------|------------|------------------|-----------------|----------------| +| **1** | Word | 0.8053 | 1.747 | 3.60 | 5,241 | 19.5% | +| **1** | Subword | 1.4652 | 2.761 | 11.14 | 104 | 0.0% | +| **2** | Word | 0.1666 | 1.122 | 1.26 | 18,592 | 83.3% | +| **2** | Subword | 1.1951 | 2.290 | 5.74 | 1,154 | 0.0% | +| **3** | Word | 0.0377 | 1.026 | 1.05 | 22,996 | 96.2% | +| **3** | Subword | 0.7985 | 1.739 | 3.15 | 6,603 | 20.2% | +| **4** | Word | 0.0104 🏆 | 1.007 | 1.01 | 23,590 | 99.0% | +| **4** | Subword | 0.5443 | 1.458 | 2.09 | 20,699 | 45.6% | + +### Generated Text Samples (Word-based) + +Below are text samples generated from each word-based Markov chain model: + +**Context Size 1:** + +1. `ni ta lok kuah engko tangsi selegubdi tu terhasil moko dangan pelego dendo malaysia beliau tu` +2. `tu tungan ni un duo ni mediam tepung buas tak sekul tena tana amun pinapi enggo` +3. `iyo boi nilego oleg ni jomo yang bok ni pan akan buan raya kota belud tu` + +**Context Size 2:** + +1. `iyo no nyaun preskripsi toos bineli ta farmasi atau mediam kadai yang nyaun sebarang halangan engko ...` +2. `iyo noh kui tradisional jomo mitu sabah kui tu bentuk ni dokon indung jari engko binuat lua` +3. `iyo tu boi ni urus le ni gua a masi un sampai betiru terutama ni sembiang pardu` + +**Context Size 3:** + +1. `ma na ni teko ta tampat tungan setemu tapi jomo tenemuan ai no lumaan` +2. `dewan undangan negeri sabah ta kewasan tempasuk lua tungan metelak politik malaysia di pertua laat a...` +3. `undangan negeri sabah betiru` + +**Context Size 4:** + +1. 
`dewan undangan negeri sabah dun lua september tu anggota pertubuhan kebangsaan melayu bersatu malays...` +2. `sama ma na ni ai ngemban matai` +3. `ahli dewan undangan negeri sabah dewan undangan negeri sabah dewan undangan negeri sabah ta kewasan ...` + + +### Generated Text Samples (Subword-based) + +Below are text samples generated from each subword-based Markov chain model: + +**Context Size 1:** + +1. `_njano-bo_cseria` +2. `anegal_8_t_bu"_b` +3. `ngim_nd_bo_isaup` + +**Context Size 2:** + +1. `an_kain_tamuḥamal` +2. `n_jom_no_turi_mud` +3. `ngko_ta_tang_boi_` + +**Context Size 3:** + +1. `an_ni_ana'_nakasal` +2. `ang_jomo_untuan_ta` +3. `ng_teali_pulo_ko'_` + +**Context Size 4:** + +1. `ang_sefalopod_lua'_` +2. `_ni_denga_septembag` +3. `_iyo_no_telia_punya` + + +### Key Findings + +- **Best Predictability:** Context-4 (word) with 99.0% predictability +- **Branching Factor:** Decreases with context size (more deterministic) +- **Memory Trade-off:** Larger contexts require more storage (23,590 word and 20,699 subword contexts at Context-4) +- **Recommendation:** Context-3 or Context-4 for text generation + +--- +## 4. 
Vocabulary Analysis + +![Zipf's Law](visualizations/zipf_law.png) + +![Top Words](visualizations/top20_words.png) + +![Coverage Curve](visualizations/vocab_coverage.png) + +### Statistics + +| Metric | Value | +|--------|-------| +| Vocabulary Size | 2,342 | +| Total Tokens | 23,366 | +| Mean Frequency | 9.98 | +| Median Frequency | 3 | +| Frequency Std Dev | 33.27 | + +### Most Common Words + +| Rank | Word | Frequency | +|------|------|-----------| +| 1 | ni | 760 | +| 2 | tu | 584 | +| 3 | iyo | 549 | +| 4 | ta | 455 | +| 5 | yang | 382 | +| 6 | boi | 354 | +| 7 | pan | 303 | +| 8 | kok | 280 | +| 9 | jomo | 275 | +| 10 | tungan | 250 | + +### Least Common Words (from vocabulary) + +| Rank | Word | Frequency | +|------|------|-----------| +| 1 | pelikat | 2 | +| 2 | avi | 2 | +| 3 | me | 2 | +| 4 | jewatan | 2 | +| 5 | michael | 2 | +| 6 | joseph | 2 | +| 7 | ho | 2 | +| 8 | ny | 2 | +| 9 | pembunuh | 2 | +| 10 | mundu | 2 | + +### Zipf's Law Analysis + +| Metric | Value | +|--------|-------| +| Zipf Coefficient | 0.9532 | +| R² (Goodness of Fit) | 0.984280 | +| Adherence Quality | **excellent** | + +### Coverage Analysis + +| Top N Words | Coverage | +|-------------|----------| +| Top 100 | 45.5% | +| Top 1,000 | 85.6% | +| Top 5,000 | N/A (vocabulary has 2,342 words) | +| Top 10,000 | N/A (vocabulary has 2,342 words) | + +### Key Findings + +- **Zipf Compliance:** R²=0.9843 indicates excellent adherence to Zipf's law +- **High Frequency Dominance:** Top 100 words cover 45.5% of corpus +- **Long Tail:** Top 1,000 words cover 85.6% of corpus; the vocabulary has only 2,342 words, so the 5,000 and 10,000 cutoffs do not apply + +--- +## 5. 
Word Embeddings Evaluation + +![Embedding Isotropy](visualizations/embedding_isotropy.png) + +![Similarity Matrix](visualizations/embedding_similarity.png) + +![t-SNE Words](visualizations/tsne_words.png) + +![t-SNE Sentences](visualizations/tsne_sentences.png) + + +### 5.1 Cross-Lingual Alignment + +> *Note: Multilingual alignment visualization not available for this language.* + + +### 5.2 Model Comparison + +| Model | Dimension | Isotropy | Semantic Density | Alignment R@1 | Alignment R@10 | +|-------|-----------|----------|------------------|---------------|----------------| +| **mono_32d** | 32 | 0.0482 🏆 | 0.8825 | N/A | N/A | +| **mono_64d** | 64 | 0.0132 | 0.9050 | N/A | N/A | +| **mono_128d** | 128 | 0.0053 | 0.9273 | N/A | N/A | + +### Key Findings + +- **Best Isotropy:** mono_32d with 0.0482 (more uniform distribution) +- **Semantic Density:** Average pairwise similarity of 0.9049. Lower values indicate better semantic separation. +- **Alignment Quality:** No aligned models evaluated in this run. +- **Recommendation:** No cross-lingual recommendation is possible without aligned models; among the monolingual models, mono_32d has the highest isotropy (0.0482) and the lowest semantic density (0.8825) + +--- +## 6. Morphological Analysis (Experimental) + +> ⚠️ **Warning:** This language shows low morphological productivity. The statistical signals used for this analysis may be noisy or less reliable than for morphologically rich languages. + +This section presents an automated morphological analysis derived from the statistical divergence between word-level and subword-level models. By analyzing where subword predictability spikes and where word-level coverage fails, we can infer linguistic structures without supervised data. 
+ +### 6.1 Productivity & Complexity + +| Metric | Value | Interpretation | Recommendation | +|--------|-------|----------------|----------------| +| Productivity Index | **0.000** | Low morphological productivity | ⚠️ Likely unreliable | +| Idiomaticity Gap | **-1.000** | Low formulaic content | - | + +### 6.2 Affix Inventory (Productive Units) + +These are the most productive prefixes and suffixes identified by sampling the vocabulary for global substitutability patterns. A unit is considered an affix if stripping it leaves a valid stem that appears in other contexts. + +#### Productive Prefixes +| Prefix | Examples | +|--------|----------| +| `pe-` | petaling, pekakas, peketa | +| `se-` | sejak, seniram, sejati | +| `ke-` | kenangan, kerita, keratas | +| `te-` | tehe, tempoh, tetiak | +| `me-` | meruma, menurut, melioro | +| `be-` | berukuran, berfikir, benua | + +#### Productive Suffixes +| Suffix | Examples | +|--------|----------| +| `-n` | kumpulan, regisin, haiwan | +| `-an` | kumpulan, haiwan, berukuran | +| `-ng` | petaling, kantung, ngulang | +| `-ang` | ngulang, manang, sayang | +| `-ah` | tah, fatimah, umrah | + +### 6.3 Bound Stems (Lexical Roots) + +Bound stems are high-frequency subword units that are semantically cohesive but rarely appear as standalone words. These often correspond to the 'core' of a word that requires inflection or derivation to be valid. + +*No significant bound stems detected.* + + +### 6.4 Affix Compatibility (Co-occurrence) + +This table shows which prefixes and suffixes most frequently co-occur on the same stems, revealing the 'stacking' rules of the language's morphology. 
+ +| Prefix | Suffix | Frequency | Examples | +|--------|--------|-----------|----------| +| `pe-` | `-n` | 55 words | pelan, pentaran | +| `pe-` | `-an` | 49 words | pelan, pentaran | +| `ke-` | `-n` | 42 words | kenangan, keteraan | +| `ke-` | `-an` | 36 words | kenangan, keteraan | +| `se-` | `-n` | 11 words | sebahagian, selain | +| `te-` | `-n` | 9 words | temban, tenomon | +| `se-` | `-ng` | 9 words | sedong, sepanjang | +| `me-` | `-n` | 9 words | mesimpon, meluman | +| `pe-` | `-ng` | 7 words | petaling, pelancong | +| `be-` | `-n` | 7 words | berukuran, been | + +### 6.5 Recursive Morpheme Segmentation + +Using **Recursive Hierarchical Substitutability**, we decompose complex words into their constituent morphemes. This approach handles nested affixes (e.g., `prefix-prefix-root-suffix`). + +| Word | Suggested Split | Confidence | Stem | +|------|-----------------|------------|------| +| kebenyakan | **`ke-be-nyak-an`** | 7.5 | `nyak` | +| kebangsaan | **`ke-bangsa-an`** | 6.0 | `bangsa` | +| kelebihan | **`ke-lebih-an`** | 6.0 | `lebih` | +| kelahiran | **`ke-lahir-an`** | 6.0 | `lahir` | +| keramaian | **`ke-ramai-an`** | 6.0 | `ramai` | +| kepulauan | **`ke-pulau-an`** | 6.0 | `pulau` | +| kebudayaan | **`ke-budaya-an`** | 6.0 | `budaya` | +| keputeraan | **`ke-putera-an`** | 6.0 | `putera` | +| sedembila | **`se-dembila`** | 4.5 | `dembila` | +| perpisahan | **`pe-rpis-ah-an`** | 4.5 | `rpis` | +| keselamatan | **`ke-se-lamat-an`** | 4.5 | `lamat` | +| pernikahan | **`pe-rnik-ah-an`** | 4.5 | `rnik` | +| perjuangan | **`pe-rjua-ng-an`** | 4.5 | `rjua` | +| kemerdekaan | **`ke-me-rdeka-an`** | 4.5 | `rdeka` | +| kepelbagaian | **`ke-pe-lbagai-an`** | 4.5 | `lbagai` | + +### 6.6 Linguistic Interpretation + +> **Automated Insight:** +The language BDR appears to be more isolating or has a highly fixed vocabulary. Word-level models perform nearly as well as subword models, indicating fewer productive morphological processes. + +--- +## 7. 
Summary & Recommendations + +![Performance Dashboard](visualizations/performance_dashboard.png) + +### Production Recommendations + +| Component | Recommended | Rationale | +|-----------|-------------|-----------| +| Tokenizer | **8k BPE** | Best compression (4.79x) | +| N-gram | **2-gram (subword)** | Lowest perplexity (181) | +| Markov | **Context-4** | Highest predictability (99.0%) | +| Embeddings | **32d** | Highest isotropy (0.0482) among the evaluated models | + + +--- +## Appendix: Metrics Glossary & Interpretation Guide + +This section provides definitions, intuitions, and guidance for interpreting the metrics used throughout this report. + +### Tokenizer Metrics + +**Compression Ratio** +> *Definition:* The ratio of characters to tokens (chars/token). Measures how efficiently the tokenizer represents text. +> +> *Intuition:* Higher compression means fewer tokens needed to represent the same text, reducing sequence lengths for downstream models. A 3x compression means ~3 characters per token on average. +> +> *What to seek:* Higher is generally better for efficiency, but extremely high compression may indicate overly aggressive merging that loses morphological information. + +**Average Token Length (Fertility)** +> *Definition:* Mean number of characters per token produced by the tokenizer. +> +> *Intuition:* Reflects the granularity of tokenization. Longer tokens capture more context but may struggle with rare words; shorter tokens are more flexible but increase sequence length. +> +> *What to seek:* Balance between 2-5 characters for most languages. Arabic/morphologically-rich languages may benefit from slightly longer tokens. + +**Unknown Token Rate (OOV Rate)** +> *Definition:* Percentage of tokens that map to the unknown/UNK token, indicating words the tokenizer cannot represent. +> +> *Intuition:* Lower OOV means better vocabulary coverage. High OOV indicates the tokenizer encounters many unseen character sequences. 
+> +> *What to seek:* Below 1% is excellent; below 5% is acceptable. BPE tokenizers typically achieve very low OOV due to subword fallback. + +### N-gram Model Metrics + +**Perplexity** +> *Definition:* Measures how "surprised" the model is by test data. Mathematically: 2^(cross-entropy). Lower values indicate better prediction. +> +> *Intuition:* If perplexity is 100, the model is as uncertain as if choosing uniformly among 100 options at each step. A perplexity of 10 means effectively choosing among 10 equally likely options. +> +> *What to seek:* Lower is better. Perplexity decreases with larger n-grams (more context). Values vary widely by language and corpus size. + +**Entropy** +> *Definition:* Average information content (in bits) needed to encode the next token given the context. Related to perplexity: perplexity = 2^entropy. +> +> *Intuition:* High entropy means high uncertainty/randomness; low entropy means predictable patterns. Natural language typically has entropy between 1-4 bits per character. +> +> *What to seek:* Lower entropy indicates more predictable text patterns. Entropy should decrease as n-gram size increases. + +**Coverage (Top-K)** +> *Definition:* Percentage of corpus occurrences explained by the top K most frequent n-grams. +> +> *Intuition:* High coverage with few patterns indicates repetitive/formulaic text; low coverage suggests diverse vocabulary usage. +> +> *What to seek:* Depends on use case. For language modeling, moderate coverage (40-60% with top-1000) is typical for natural text. + +### Markov Chain Metrics + +**Average Entropy** +> *Definition:* Mean entropy across all contexts, measuring average uncertainty in next-word prediction. +> +> *Intuition:* Lower entropy means the model is more confident about what comes next. Context-1 has high entropy (many possible next words); Context-4 has low entropy (few likely continuations). +> +> *What to seek:* Decreasing entropy with larger context sizes. 
Very low entropy (<0.1) indicates highly deterministic transitions. + +**Branching Factor** +> *Definition:* Average number of unique next tokens observed for each context. +> +> *Intuition:* High branching = many possible continuations (flexible but uncertain); low branching = few options (predictable but potentially repetitive). +> +> *What to seek:* Branching factor should decrease with context size. Values near 1.0 indicate nearly deterministic chains. + +**Predictability** +> *Definition:* Derived metric: (1 - normalized_entropy) × 100%. Indicates how deterministic the model's predictions are. +> +> *Intuition:* 100% predictability means the next word is always certain; 0% means completely random. Real text falls between these extremes. +> +> *What to seek:* Higher predictability for text generation quality, but too high (>98%) may produce repetitive output. + +### Vocabulary & Zipf's Law Metrics + +**Zipf's Coefficient** +> *Definition:* The slope of the log-log plot of word frequency vs. rank. Zipf's law predicts this should be approximately -1. +> +> *Intuition:* A coefficient near -1 indicates the corpus follows natural language patterns where a few words are very common and most words are rare. +> +> *What to seek:* Values between -0.8 and -1.2 indicate healthy natural language distribution. Deviations may suggest domain-specific or artificial text. + +**R² (Coefficient of Determination)** +> *Definition:* Measures how well the linear fit explains the frequency-rank relationship. Ranges from 0 to 1. +> +> *Intuition:* R² near 1.0 means the data closely follows Zipf's law; lower values indicate deviation from expected word frequency patterns. +> +> *What to seek:* R² > 0.95 is excellent; > 0.99 indicates near-perfect Zipf adherence typical of large natural corpora. + +**Vocabulary Coverage** +> *Definition:* Cumulative percentage of corpus tokens accounted for by the top N words. +> +> *Intuition:* Shows how concentrated word usage is. 
If top-100 words cover 50% of text, the corpus relies heavily on common words. +> +> *What to seek:* Top-100 covering 30-50% is typical. Higher coverage indicates more repetitive text; lower suggests richer vocabulary. + +### Word Embedding Metrics + +**Isotropy** +> *Definition:* Measures how uniformly distributed vectors are in the embedding space. Computed as the ratio of minimum to maximum singular values. +> +> *Intuition:* High isotropy (near 1.0) means vectors spread evenly in all directions; low isotropy means vectors cluster in certain directions, reducing expressiveness. +> +> *What to seek:* Higher isotropy generally indicates better-quality embeddings. Values > 0.1 are reasonable; > 0.3 is good. Lower-dimensional embeddings tend to have higher isotropy. + +**Average Norm** +> *Definition:* Mean magnitude (L2 norm) of word vectors in the embedding space. +> +> *Intuition:* Indicates the typical "length" of vectors. Consistent norms suggest stable training; high variance may indicate some words are undertrained. +> +> *What to seek:* Relatively consistent norms across models. The absolute value matters less than consistency (low std deviation). + +**Cosine Similarity** +> *Definition:* Measures angular similarity between vectors, ranging from -1 (opposite) to 1 (identical direction). +> +> *Intuition:* Words with similar meanings should have high cosine similarity. This is the standard metric for semantic relatedness in embeddings. +> +> *What to seek:* Semantically related words should score > 0.5; unrelated words should be near 0. Synonyms often score > 0.7. + +**t-SNE Visualization** +> *Definition:* t-Distributed Stochastic Neighbor Embedding - a dimensionality reduction technique that preserves local structure for visualization. +> +> *Intuition:* Clusters in t-SNE plots indicate groups of semantically related words. Spread indicates vocabulary diversity; tight clusters suggest semantic coherence. 
+> +> *What to seek:* Meaningful clusters (e.g., numbers together, verbs together). Avoid over-interpreting distances - t-SNE preserves local, not global, structure. + +### General Interpretation Guidelines + +1. **Compare within model families:** Metrics are most meaningful when comparing models of the same type (e.g., 8k vs 64k tokenizer). +2. **Consider trade-offs:** Better performance on one metric often comes at the cost of another (e.g., compression vs. OOV rate). +3. **Context matters:** Optimal values depend on downstream tasks. Text generation may prioritize different metrics than classification. +4. **Corpus influence:** All metrics are influenced by corpus characteristics. Wikipedia text differs from social media or literature. +5. **Language-specific patterns:** Morphologically rich languages (like Arabic) may show different optimal ranges than analytic languages. + + +### Visualizations Index + +| Visualization | Description | +|---------------|-------------| +| Tokenizer Compression | Compression ratios by vocabulary size | +| Tokenizer Fertility | Average token length by vocabulary | +| Tokenizer OOV | Unknown token rates | +| Tokenizer Total Tokens | Total tokens by vocabulary | +| N-gram Perplexity | Perplexity by n-gram size | +| N-gram Entropy | Entropy by n-gram size | +| N-gram Coverage | Top pattern coverage | +| N-gram Unique | Unique n-gram counts | +| Markov Entropy | Entropy by context size | +| Markov Branching | Branching factor by context | +| Markov Contexts | Unique context counts | +| Zipf's Law | Frequency-rank distribution with fit | +| Vocab Frequency | Word frequency distribution | +| Top 20 Words | Most frequent words | +| Vocab Coverage | Cumulative coverage curve | +| Embedding Isotropy | Vector space uniformity | +| Embedding Norms | Vector magnitude distribution | +| Embedding Similarity | Word similarity heatmap | +| Nearest Neighbors | Similar words for key terms | +| t-SNE Words | 2D word embedding visualization | +| 
t-SNE Sentences | 2D sentence embedding visualization | +| Position Encoding | Encoding method comparison | +| Model Sizes | Storage requirements | +| Performance Dashboard | Comprehensive performance overview | + +--- +## About This Project + +### Data Source + +Models trained on [wikipedia-monthly](https://huggingface.co/datasets/omarkamali/wikipedia-monthly) - a monthly snapshot of Wikipedia articles across 300+ languages. + +### Project + +A project by **[Wikilangs](https://wikilangs.org)** - Open-source NLP models for every Wikipedia language. + +### Maintainer + +[Omar Kamali](https://omarkamali.com) - [Omneity Labs](https://omneitylabs.com) + +### Citation + +If you use these models in your research, please cite: + +```bibtex +@misc{wikilangs2025, + author = {Kamali, Omar}, + title = {Wikilangs: Open NLP Models for Wikipedia Languages}, + year = {2025}, + doi = {10.5281/zenodo.18073153}, + publisher = {Zenodo}, + url = {https://huggingface.co/wikilangs}, + institution = {Omneity Labs} +} +``` + +### License + +MIT License - Free for academic and commercial use. 
+ +### Links + +- 🌐 Website: [wikilangs.org](https://wikilangs.org) +- 🤗 Models: [huggingface.co/wikilangs](https://huggingface.co/wikilangs) +- 📊 Data: [wikipedia-monthly](https://huggingface.co/datasets/omarkamali/wikipedia-monthly) +- 👤 Author: [Omar Kamali](https://huggingface.co/omarkamali) +- 🤝 Sponsor: [Featherless AI](https://featherless.ai) +--- +*Generated by Wikilangs Models Pipeline* + +*Report Date: 2026-01-03 06:44:23* diff --git a/models/embeddings/monolingual/bdr_128d.bin b/models/embeddings/monolingual/bdr_128d.bin new file mode 100644 index 0000000000000000000000000000000000000000..a067f57b9bbc8015cff18d6660d2e70287e942a1 --- /dev/null +++ b/models/embeddings/monolingual/bdr_128d.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14901cc49f6d41fceff30e160e6f74a2d3e4c31128383c0eb0c581df8f55667d +size 1024837728 diff --git a/models/embeddings/monolingual/bdr_128d.meta.json b/models/embeddings/monolingual/bdr_128d.meta.json new file mode 100644 index 0000000000000000000000000000000000000000..d07c2affbea26a58b59d6f820d155a7e31cd46e7 --- /dev/null +++ b/models/embeddings/monolingual/bdr_128d.meta.json @@ -0,0 +1 @@ +{"lang": "bdr", "dim": 128, "max_seq_len": 512, "is_aligned": false} \ No newline at end of file diff --git a/models/embeddings/monolingual/bdr_128d_metadata.json b/models/embeddings/monolingual/bdr_128d_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..c00ceda1995308467f2b1858423be9045b22ceda --- /dev/null +++ b/models/embeddings/monolingual/bdr_128d_metadata.json @@ -0,0 +1,15 @@ +{ + "language": "bdr", + "dimension": 128, + "version": "monolingual", + "training_params": { + "algorithm": "skipgram", + "min_count": 5, + "window": 5, + "negative": 5, + "epochs": 5, + "encoding_method": "rope", + "dim": 128 + }, + "vocab_size": 806 +} \ No newline at end of file diff --git a/models/embeddings/monolingual/bdr_32d.bin b/models/embeddings/monolingual/bdr_32d.bin new file mode 100644 
index 0000000000000000000000000000000000000000..8441b56ed48f5929fee1dbcb1c271d09445c6428 --- /dev/null +++ b/models/embeddings/monolingual/bdr_32d.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa045fafcb5ac078ce6dc442fa1420ea920b282300bd6a2013469a40763dfb5e +size 256218720 diff --git a/models/embeddings/monolingual/bdr_32d.meta.json b/models/embeddings/monolingual/bdr_32d.meta.json new file mode 100644 index 0000000000000000000000000000000000000000..f18eae626e6014198816c5c38e49f6286dd62f3c --- /dev/null +++ b/models/embeddings/monolingual/bdr_32d.meta.json @@ -0,0 +1 @@ +{"lang": "bdr", "dim": 32, "max_seq_len": 512, "is_aligned": false} \ No newline at end of file diff --git a/models/embeddings/monolingual/bdr_32d_metadata.json b/models/embeddings/monolingual/bdr_32d_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..eeaca0393941f7b400919d2d59cd042615641d0d --- /dev/null +++ b/models/embeddings/monolingual/bdr_32d_metadata.json @@ -0,0 +1,15 @@ +{ + "language": "bdr", + "dimension": 32, + "version": "monolingual", + "training_params": { + "algorithm": "skipgram", + "min_count": 5, + "window": 5, + "negative": 5, + "epochs": 5, + "encoding_method": "rope", + "dim": 32 + }, + "vocab_size": 806 +} \ No newline at end of file diff --git a/models/embeddings/monolingual/bdr_64d.bin b/models/embeddings/monolingual/bdr_64d.bin new file mode 100644 index 0000000000000000000000000000000000000000..b060ddda9eb1cce0a638db073f80f13d1891d73b --- /dev/null +++ b/models/embeddings/monolingual/bdr_64d.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0f57494e6d97c810c174b928ec3c054ba096b896edd93f5bb2951c51ea1469fb +size 512425056 diff --git a/models/embeddings/monolingual/bdr_64d.meta.json b/models/embeddings/monolingual/bdr_64d.meta.json new file mode 100644 index 0000000000000000000000000000000000000000..d85da76924d960fb09b0c0c306238c3fd592b829 --- /dev/null +++ 
b/models/embeddings/monolingual/bdr_64d.meta.json @@ -0,0 +1 @@ +{"lang": "bdr", "dim": 64, "max_seq_len": 512, "is_aligned": false} \ No newline at end of file diff --git a/models/embeddings/monolingual/bdr_64d_metadata.json b/models/embeddings/monolingual/bdr_64d_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..1f2cab58e729b4b0f878f4117afd1507728475f5 --- /dev/null +++ b/models/embeddings/monolingual/bdr_64d_metadata.json @@ -0,0 +1,15 @@ +{ + "language": "bdr", + "dimension": 64, + "version": "monolingual", + "training_params": { + "algorithm": "skipgram", + "min_count": 5, + "window": 5, + "negative": 5, + "epochs": 5, + "encoding_method": "rope", + "dim": 64 + }, + "vocab_size": 806 +} \ No newline at end of file diff --git a/models/subword_markov/bdr_markov_ctx1_subword.parquet b/models/subword_markov/bdr_markov_ctx1_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..635daf9da99b623a96581318cee3871de1de3a7c --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx1_subword.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:90dd99baf68b20403d569b8e5f8d37366e4ddf2960b74590e45c40e1931d3cf1 +size 12465 diff --git a/models/subword_markov/bdr_markov_ctx1_subword_metadata.json b/models/subword_markov/bdr_markov_ctx1_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..716a10e5f54108bc94be7a8644798556527d09b8 --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx1_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 1, + "variant": "subword", + "language": "bdr", + "unique_contexts": 104, + "total_transitions": 158102 +} \ No newline at end of file diff --git a/models/subword_markov/bdr_markov_ctx2_subword.parquet b/models/subword_markov/bdr_markov_ctx2_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..e4b9806fc36b2342431c094dd1b1ec9d6f19eb46 --- /dev/null +++ 
b/models/subword_markov/bdr_markov_ctx2_subword.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:de0458067f06b19cd92069895f072cb47f6cdac7b1aec47edc3f54439b9dc8fb +size 51058 diff --git a/models/subword_markov/bdr_markov_ctx2_subword_metadata.json b/models/subword_markov/bdr_markov_ctx2_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..0b4572df2217fa6421884a0cdb2667aa66aaac80 --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx2_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 2, + "variant": "subword", + "language": "bdr", + "unique_contexts": 1154, + "total_transitions": 157564 +} \ No newline at end of file diff --git a/models/subword_markov/bdr_markov_ctx3_subword.parquet b/models/subword_markov/bdr_markov_ctx3_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..606ea343cda1895f600cbd4d2c81a8742f6c8a31 --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx3_subword.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8a603938c76585d0058fba20159b63f48cc8de1a081d60c9e812d5ef4cc856c2 +size 147977 diff --git a/models/subword_markov/bdr_markov_ctx3_subword_metadata.json b/models/subword_markov/bdr_markov_ctx3_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..da9a0bf59698181c576d4d12f0fc79fc5a20e82c --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx3_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 3, + "variant": "subword", + "language": "bdr", + "unique_contexts": 6603, + "total_transitions": 157026 +} \ No newline at end of file diff --git a/models/subword_markov/bdr_markov_ctx4_subword.parquet b/models/subword_markov/bdr_markov_ctx4_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..af9a42cb710e70e9dc74b6058dd0c0acebd3da60 --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx4_subword.parquet @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:40b65aa668a1cff884dd7d1981f09619017109f7d40edc49b7916cb9d523100b +size 355053 diff --git a/models/subword_markov/bdr_markov_ctx4_subword_metadata.json b/models/subword_markov/bdr_markov_ctx4_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..e5d2cdd312ebf47df3e9451f650bdf97390abb8c --- /dev/null +++ b/models/subword_markov/bdr_markov_ctx4_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 4, + "variant": "subword", + "language": "bdr", + "unique_contexts": 20699, + "total_transitions": 156488 +} \ No newline at end of file diff --git a/models/subword_ngram/bdr_2gram_subword.parquet b/models/subword_ngram/bdr_2gram_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..aa8d6168bc7cb65cf65a3691fe40aa5a3000c3fa --- /dev/null +++ b/models/subword_ngram/bdr_2gram_subword.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f89d4cd1bab1bda41e8fb6e1bcbdcfb402307503bc759166623f23f6684cb7d8 +size 9761 diff --git a/models/subword_ngram/bdr_2gram_subword_metadata.json b/models/subword_ngram/bdr_2gram_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..e9898e7d2e6f5950c4c038a41d763818407ba9b2 --- /dev/null +++ b/models/subword_ngram/bdr_2gram_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "n": 2, + "variant": "subword", + "language": "bdr", + "unique_ngrams": 597, + "total_ngrams": 158102 +} \ No newline at end of file diff --git a/models/subword_ngram/bdr_3gram_subword.parquet b/models/subword_ngram/bdr_3gram_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..f49e682c809aa9ea4ec220030c1206b2d5af6b6f --- /dev/null +++ b/models/subword_ngram/bdr_3gram_subword.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:82c353ab6740086eface298a08842c1494816358218a9e1924c16d5cc9298c3d +size 37547 diff --git 
a/models/subword_ngram/bdr_3gram_subword_metadata.json b/models/subword_ngram/bdr_3gram_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..f2099a0d0aa6874ccaa37322bba889deb33ce8cb --- /dev/null +++ b/models/subword_ngram/bdr_3gram_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "n": 3, + "variant": "subword", + "language": "bdr", + "unique_ngrams": 3421, + "total_ngrams": 157564 +} \ No newline at end of file diff --git a/models/subword_ngram/bdr_4gram_subword.parquet b/models/subword_ngram/bdr_4gram_subword.parquet new file mode 100644 index 0000000000000000000000000000000000000000..3e43bc39d5a77cebebb275d2532f964ac18b6a9c --- /dev/null +++ b/models/subword_ngram/bdr_4gram_subword.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:00c9461926041d02b2401848dda26b82b5b87974371b657a4eeb9a268d495f8c +size 128743 diff --git a/models/subword_ngram/bdr_4gram_subword_metadata.json b/models/subword_ngram/bdr_4gram_subword_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..8ae6ee4cfb841a4a15edcbbaa02bcf650d02fffc --- /dev/null +++ b/models/subword_ngram/bdr_4gram_subword_metadata.json @@ -0,0 +1,7 @@ +{ + "n": 4, + "variant": "subword", + "language": "bdr", + "unique_ngrams": 11413, + "total_ngrams": 157026 +} \ No newline at end of file diff --git a/models/tokenizer/bdr_tokenizer_8k.model b/models/tokenizer/bdr_tokenizer_8k.model new file mode 100644 index 0000000000000000000000000000000000000000..340407fcbde40e4fabc699d83e77780560408a94 --- /dev/null +++ b/models/tokenizer/bdr_tokenizer_8k.model @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:60470ba72369b5d6feb2a6ddd8122b4eb757065597901f8d33f1f47db9d61152 +size 373260 diff --git a/models/tokenizer/bdr_tokenizer_8k.vocab b/models/tokenizer/bdr_tokenizer_8k.vocab new file mode 100644 index 0000000000000000000000000000000000000000..f0a3fb673f95009e3375e7e035fcede6a8628835 --- /dev/null +++ 
b/models/tokenizer/bdr_tokenizer_8k.vocab @@ -0,0 +1,8000 @@ + 0 + 0 + 0 + 0 +an -0 +▁t -1 +▁b -2 +▁m -3 +▁n -4 +▁p -5 +▁s -6 +▁k -7 +la -8 +en -9 +un -10 +in -11 +▁d -12 +ang -13 +er -14 +at -15 +ak -16 +ar -17 +am -18 +▁i -19 +▁ni -20 +on -21 +ah -22 +as -23 +▁j -24 +▁l -25 +om -26 +▁se -27 +▁ta -28 +yo -29 +▁tu -30 +▁di -31 +au -32 +ai -33 +ok -34 +ung -35 +di -36 +ek -37 +al -38 +▁iyo -39 +et -40 +si -41 +uk -42 +▁r -43 +el -44 +em -45 +ela -46 +ala -47 +eng -48 +ul -49 +ing -50 +yang -51 +ab -52 +ur -53 +▁bo -54 +ut -55 +um -56 +▁yang -57 +▁pan -58 +▁no -59 +▁un -60 +▁la -61 +ik -62 +it -63 +▁boi -64 +ag -65 +is -66 +▁g -67 +ko -68 +ua -69 +omo -70 +▁ng -71 +sia -72 +ay -73 +ad -74 +angan -75 +▁kok -76 +aw -77 +eg -78 +il -79 +▁o -80 +▁jomo -81 +▁ko -82 +ungan -83 +alay -84 +ap -85 +akan -86 +us -87 +▁malay -88 +▁eng -89 +▁tungan -90 +end -91 +▁malaysia -92 +▁bin -93 +▁per -94 +elak -95 +es -96 +adi -97 +im -98 +▁an -99 +▁h -100 +▁bu -101 +ara -102 +▁dik -103 +aj -104 +eb -105 +os -106 +ir -107 +▁pin -108 +▁diom -109 +emb -110 +▁pen -111 +tuk -112 +▁engko -113 +▁met -114 +▁am -115 +▁mu -116 +▁sam -117 +▁untuk -118 +▁ter -119 +▁tak -120 +▁metelak -121 +▁lua -122 +▁bi -123 +▁tek -124 +or -125 +ono -126 +▁at -127 +▁( -128 +edi -129 +▁ke -130 +▁ker -131 +ari -132 +and -133 +▁seb -134 +eri -135 +ti -136 +aan -137 +▁mas -138 +yono -139 +ana -140 +uat -141 +▁mak -142 +▁dikau -143 +▁ber -144 +▁jadi -145 +endo -146 +▁a -147 +▁sem -148 +▁iyono -149 +ri -150 +▁atau -151 +ud -152 +▁e -153 +ula -154 +▁taun -155 +▁sab -156 +ol -157 +ent -158 +▁bet -159 +▁w -160 +▁c -161 +ilo -162 +▁f -163 +▁sabah -164 +ia -165 +▁neg -166 +ind -167 +▁mo -168 +ep -169 +▁bul -170 +▁amun -171 +▁as -172 +▁binuat -173 +ong -174 +▁lag -175 +ta -176 +▁ben -177 +▁ant -178 +▁tekilo -179 +eh -180 +aya -181 +▁ku -182 +oon -183 +ya -184 +▁pel -185 +ant -186 +▁ol -187 +▁medi -188 +to -189 +▁ma -190 +na -191 +▁dendo -192 +id -193 +▁man -194 +pi -195 +up -196 +ian -197 +▁bag -198 +▁iko -199 +ub -200 +▁or 
-201 +uan -202 +▁al -203 +▁noh -204 +angk -205 +ama -206 +emp -207 +▁lek -208 +▁sela -209 +▁jo -210 +▁kul -211 +▁buli -212 +awa -213 +▁lu -214 +▁ti -215 +▁men -216 +▁sama -217 +▁is -218 +agai -219 +enakan -220 +jo -221 +▁du -222 +▁bar -223 +ih -224 +li -225 +oso -226 +ila -227 +go -228 +amb -229 +ata -230 +▁peng -231 +▁dangan -232 +▁bel -233 +ed -234 +▁kin -235 +▁tah -236 +▁mediam -237 +▁negeri -238 +▁dan -239 +▁diam -240 +▁, -241 +▁oron -242 +iru -243 +lah -244 +ada -245 +▁ai -246 +api -247 +ego -248 +tan -249 +ump -250 +enis -251 +▁betiru -252 +▁sebagai -253 +aran -254 +▁ala -255 +▁dun -256 +▁kel -257 +▁pal -258 +▁lagu -259 +lu -260 +uno -261 +▁bad -262 +▁masa -263 +▁lekat -264 +▁pinak -265 +asan -266 +▁dat -267 +▁sin -268 +bi -269 +iti -270 +▁dok -271 +▁tin -272 +aji -273 +▁ab -274 +atan -275 +▁samp -276 +▁bioso -277 +gi -278 +▁sel -279 +▁antawa -280 +amp -281 +alah -282 +▁mela -283 +▁dokon -284 +▁makai -285 +uh -286 +▁ten -287 +▁bana -288 +▁moto -289 +▁oleh -290 +ew -291 +umb -292 +▁en -293 +▁th -294 +onom -295 +▁ind -296 +▁par -297 +▁laat -298 +▁. -299 +▁ro -300 +▁um -301 +▁pol -302 +▁teb -303 +▁buas -304 +'. 
-305 +▁in -306 +uang -307 +yu -308 +asa -309 +atu -310 +angg -311 +▁bah -312 +▁baj -313 +▁anak -314 +▁pelego -315 +uran -316 +▁leb -317 +▁mes -318 +▁sej -319 +▁tel -320 +▁tapi -321 +▁pinakai -322 +anj -323 +ati -324 +lam -325 +erah -326 +osok -327 +▁eyo -328 +▁lagi -329 +ej -330 +▁nu -331 +▁gula -332 +▁negara -333 +▁paling -334 +iyo -335 +per -336 +inis -337 +▁sek -338 +▁melayu -339 +ro -340 +kan -341 +▁le -342 +anan -343 +▁mok -344 +▁sembi -345 +▁barang -346 +ena -347 +dang -348 +imon -349 +▁ser -350 +▁tim -351 +▁buek -352 +▁poon -353 +elakun -354 +▁sembiang -355 +ug -356 +ita -357 +ku -358 +▁yo -359 +▁bua -360 +akanan -361 +▁bajau -362 +▁islam -363 +unan -364 +▁pem -365 +▁suk -366 +atang -367 +ember -368 +▁asal -369 +▁ment -370 +▁mula -371 +▁makay -372 +▁makanan -373 +ura -374 +itik -375 +▁mek -376 +▁sep -377 +▁selalu -378 +op -379 +so -380 +esi -381 +eso -382 +eta -383 +▁ut -384 +aman -385 +▁mel -386 +▁mer -387 +▁nya -388 +▁alap -389 +▁diki -390 +▁kota -391 +▁pada -392 +▁dunia -393 +▁jenis -394 +▁bangan -395 +ra -396 +ina -397 +yar -398 +▁ae -399 +▁ah -400 +▁bue -401 +▁kui -402 +▁lim -403 +▁niak -404 +▁pert -405 +anding -406 +▁menteri -407 +ip -408 +tu -409 +yi -410 +▁az -411 +abut -412 +▁gai -413 +▁rum -414 +▁akan -415 +▁nger -416 +▁raya -417 +gu -418 +ni -419 +oi -420 +▁" -421 +▁y -422 +▁z -423 +ail -424 +elau -425 +▁ing -426 +▁kaw -427 +▁put -428 +▁tep -429 +▁muat -430 +▁selang -431 +▁politik -432 +ac -433 +pas -434 +▁ak -435 +▁da -436 +▁und -437 +▁bung -438 +▁kerj -439 +▁rojo -440 +▁kulek -441 +▁pinau -442 +ali -443 +▁aw -444 +▁os -445 +▁gin -446 +▁kem -447 +▁nuut -448 +▁tana -449 +▁jinis -450 +▁manang -451 +▁ngabut -452 +▁sampai -453 +oo -454 +uo -455 +ika -456 +▁ib -457 +▁tr -458 +elum -459 +omon -460 +▁duo -461 +▁ket -462 +▁pas -463 +▁sen -464 +emban -465 +▁guno -466 +▁pelakun -467 +ea -468 +jid -469 +zik -470 +▁si -471 +akat -472 +▁abd -473 +▁dew -474 +▁mat -475 +▁nia -476 +▁pek -477 +▁tan -478 +empat -479 +▁lain -480 +ayu -481 +iau -482 +ran -483 +▁tar 
-484 +▁ahli -485 +▁beng -486 +▁demb -487 +▁kiti -488 +▁luak -489 +▁langk -490 +▁lebih -491 +io -492 +era -493 +man -494 +uman -495 +▁badu -496 +▁dari -497 +▁osom -498 +▁pend -499 +▁abdul -500 +▁dewan -501 +▁suang -502 +▁antara -503 +endangan -504 +', -505 +ga -506 +od -507 +ot -508 +▁ap -509 +jata -510 +uari -511 +▁keb -512 +▁pes -513 +▁dedi -514 +▁dela -515 +▁teng -516 +▁umur -517 +▁bagal -518 +▁masam -519 +▁muzik -520 +▁sekul -521 +oh -522 +▁v -523 +asi -524 +gan -525 +jar -526 +juk -527 +lis -528 +oro -529 +▁ar -530 +ersi -531 +▁sar -532 +▁sul -533 +▁bela -534 +▁haji -535 +▁teko -536 +▁badan -537 +▁binti -538 +▁limau -539 +▁enjata -540 +▁tebeta -541 +▁undangan -542 +az -543 +ch -544 +mi -545 +ber -546 +lim -547 +one -548 +pen -549 +▁im -550 +epas -551 +ungk -552 +▁dau -553 +▁ent -554 +▁ked -555 +▁kep -556 +▁koh -557 +▁moh -558 +▁anug -559 +▁asia -560 +▁bers -561 +▁pand -562 +▁bagas -563 +▁barat -564 +▁belud -565 +▁bungo -566 +▁lapas -567 +▁nyaun -568 +▁salah -569 +▁sebab -570 +▁sebelum -571 +sa -572 +▁ag -573 +ulau -574 +▁beg -575 +▁dis -576 +▁nut -577 +adisi -578 +▁buan -579 +▁kayu -580 +▁reso -581 +▁sing -582 +▁somo -583 +onesia -584 +▁mokok -585 +▁posok -586 +▁terua -587 +andingan -588 +▁langkau -589 +▁tradisi -590 +▁penenakan -591 +ei -592 +iv -593 +wa -594 +▁- -595 +dik -596 +sud -597 +uar -598 +▁sh -599 +ahan -600 +▁kes -601 +▁kur -602 +▁lau -603 +▁lem -604 +▁muh -605 +▁bigi -606 +▁endo -607 +▁hari -608 +ailand -609 +anakan -610 +▁enggo -611 +▁manuk -612 +▁beliau -613 +▁bidang -614 +▁dediri -615 +▁kinakan -616 +▁sejenis -617 +▁anugerah -618 +▁motoelau -619 +▁thailand -620 +▁indonesia -621 +iz -622 +aja -623 +itu -624 +ikat -625 +yiak -626 +▁dar -627 +▁nab -628 +▁oyo -629 +▁rak -630 +▁set -631 +▁tet -632 +▁awal -633 +▁bule -634 +▁laut -635 +▁ling -636 +▁nyiak -637 +▁binatang -638 +▁pertandingan -639 +ij -640 +mu -641 +ain -642 +duk -643 +emu -644 +hir -645 +uta -646 +▁io -647 +onal -648 +ulan -649 +▁tun -650 +aging -651 +engan -652 +▁diri -653 +▁musi -654 
+▁nind -655 +▁pela -656 +▁pemb -657 +▁peny -658 +▁univ -659 +▁merup -660 +▁pulau -661 +▁daerah -662 +▁universi -663 +▁u -664 +agi -665 +est -666 +kat -667 +mad -668 +mat -669 +nya -670 +rah -671 +▁ea -672 +▁er -673 +▁pr -674 +jadi -675 +▁dek -676 +▁maj -677 +▁nur -678 +agian -679 +arang -680 +▁ding -681 +▁lump -682 +▁ngel -683 +▁syar -684 +▁kulit -685 +▁lepas -686 +▁mangan -687 +bu -688 +bil -689 +har -690 +ust -691 +▁ch -692 +▁to -693 +anga -694 +▁mus -695 +▁pak -696 +ingko -697 +▁jari -698 +▁perd -699 +▁satu -700 +▁sedi -701 +▁betis -702 +▁ingin -703 +▁kuala -704 +▁tanah -705 +▁ginuno -706 +▁lumpur -707 +penenakan -708 +▁universiti -709 +). -710 +alu -711 +anc -712 +pan -713 +▁ad -714 +▁br -715 +▁ur -716 +ajib -717 +arna -718 +diri -719 +ingo -720 +▁kik -721 +▁mam -722 +▁nil -723 +▁ole -724 +▁pet -725 +▁ses -726 +antik -727 +apura -728 +jaram -729 +▁beta -730 +▁buah -731 +▁moko -732 +▁pent -733 +▁datuk -734 +▁kerna -735 +▁mimon -736 +▁ngent -737 +▁tahun -738 +▁utara -739 +▁bahasa -740 +▁mesjid -741 +▁sultan -742 +▁selatan -743 +▁semimon -744 +▁disember -745 +▁merupakan -746 +▁pinenakan -747 +▁singapura -748 +da -749 +ec -750 +eo -751 +ani -752 +boi -753 +eli -754 +iro -755 +uka -756 +▁dr -757 +▁ek -758 +▁li -759 +▁og -760 +▁ok -761 +▁su -762 +anak -763 +ayat -764 +egun -765 +usia -766 +▁din -767 +▁kek -768 +▁kew -769 +▁ras -770 +▁sub -771 +anang -772 +andar -773 +erika -774 +▁dang -775 +▁gela -776 +▁kamp -777 +▁mapi -778 +▁pant -779 +▁ruma -780 +▁seri -781 +▁dikaw -782 +▁kikon -783 +▁ngena -784 +▁timur -785 +▁bentuk -786 +▁pinang -787 +▁temban -788 +▁gelaran -789 +▁perdana -790 +pa -791 +za -792 +din -793 +hor -794 +ril -795 +▁ju -796 +pert -797 +ulis -798 +usan -799 +▁kak -800 +▁kau -801 +▁pro -802 +▁tik -803 +ungai -804 +▁bang -805 +▁meng -806 +▁olos -807 +▁pers -808 +▁rena -809 +▁allah -810 +▁biduk -811 +▁gulay -812 +▁rumah -813 +▁dengan -814 +▁tempat -815 +▁tepung -816 +▁kawasan -817 +▁terdiri -818 +▁penanakan -819 +▁tradisional -820 +), -821 +st -822 +th 
-823 +een -824 +ija -825 +ish -826 +ogo -827 +yat -828 +▁pu -829 +▁us -830 +arah -831 +ilau -832 +tang -833 +ujuk -834 +▁bau -835 +▁bor -836 +▁has -837 +▁jan -838 +▁ken -839 +▁mar -840 +▁nir -841 +angga -842 +dikan -843 +▁dato -844 +▁kump -845 +▁mija -846 +▁nabi -847 +▁numb -848 +▁sepi -849 +▁duang -850 +▁kawin -851 +▁kerjo -852 +▁mesti -853 +▁musim -854 +▁semio -855 +▁wajib -856 +▁bengso -857 +▁maksud -858 +▁amerika -859 +▁dinakan -860 +▁januari -861 +▁kampung -862 +▁tenomon -863 +▁endangan -864 +▁kumpulan -865 +". -866 +ib -867 +aat -868 +ans -869 +apa -870 +bir -871 +ebr -872 +ist -873 +ram -874 +tar -875 +yah -876 +▁il -877 +▁sy -878 +▁tv -879 +onto -880 +uhan -881 +▁kad -882 +▁kar -883 +▁kom -884 +▁lum -885 +▁mac -886 +▁pah -887 +▁pop -888 +▁rep -889 +▁sup -890 +acara -891 +ajaan -892 +ambut -893 +antan -894 +ayaan -895 +elaka -896 +▁engg -897 +▁kata -898 +▁kule -899 +▁mitu -900 +▁siti -901 +▁tali -902 +▁tand -903 +▁terb -904 +▁april -905 +▁azhar -906 +▁eyang -907 +▁keduo -908 +▁kilau -909 +▁menik -910 +▁nindo -911 +▁parai -912 +▁pendi -913 +▁tikok -914 +▁utama -915 +▁melaka -916 +▁ningko -917 +▁numbur -918 +▁pantai -919 +▁bintang -920 +▁kerjaya -921 +▁kewasan -922 +▁ngentan -923 +▁tenonom -924 +▁syarikat -925 +ma -926 +tm -927 +aga -928 +air -929 +ait -930 +bum -931 +dil -932 +ida -933 +int -934 +iri -935 +olo -936 +raf -937 +san -938 +sis -939 +▁au -940 +alan -941 +antu -942 +awak -943 +awan -944 +cara -945 +eter -946 +onon -947 +ukan -948 +ulai -949 +▁bes -950 +▁jam -951 +▁lak -952 +▁mah -953 +▁nov -954 +agang -955 +awang -956 +engah -957 +ontoh -958 +tober -959 +ustri -960 +▁aziz -961 +▁batu -962 +▁buat -963 +▁iono -964 +▁main -965 +▁muan -966 +▁shah -967 +▁sumb -968 +▁temb -969 +▁wila -970 +▁antau -971 +▁gulai -972 +▁kakal -973 +▁ketua -974 +▁kinab -975 +▁moham -976 +▁samah -977 +▁sukan -978 +▁batang -979 +▁iyonoh -980 +▁saging -981 +▁oktober -982 +▁sepisis -983 +▁wilayah -984 +▁kerajaan -985 +▁pendidikan -986 +du -987 +fr -988 +den -989 +neo -990 +uri 
-991 +yan -992 +▁gu -993 +▁ij -994 +▁ph -995 +▁ul -996 +adan -997 +alia -998 +dung -999 +ikan -1000 +ilem -1001 +lang -1002 +▁bal -1003 +▁bil -1004 +▁buk -1005 +▁emb -1006 +▁jaw -1007 +▁kab -1008 +▁kil -1009 +▁pay -1010 +▁ram -1011 +▁sik -1012 +▁sri -1013 +ebagi -1014 +rahim -1015 +uring -1016 +▁febr -1017 +▁imam -1018 +▁luah -1019 +▁ogos -1020 +▁temp -1021 +ambung -1022 +anjung -1023 +embaan -1024 +▁album -1025 +▁johor -1026 +▁panas -1027 +edagang -1028 +▁bagian -1029 +▁bengen -1030 +▁benuat -1031 +▁borneo -1032 +▁dangai -1033 +▁keluar -1034 +▁masjid -1035 +▁muslim -1036 +▁perlis -1037 +▁sambil -1038 +▁selain -1039 +▁sumber -1040 +▁tangan -1041 +▁ibrahim -1042 +▁manusia -1043 +▁mekanan -1044 +▁menjadi -1045 +▁ngemban -1046 +▁tinonom -1047 +▁bahagian -1048 +▁februari -1049 +▁industri -1050 +▁kinabalu -1051 +▁nilantik -1052 +▁perembaan -1053 +ev -1054 +kh -1055 +mp -1056 +no -1057 +▁/ -1058 +ejo -1059 +imp -1060 +ito -1061 +ord -1062 +uap -1063 +usa -1064 +wan -1065 +yak -1066 +▁el -1067 +▁id -1068 +enda -1069 +iang -1070 +inda -1071 +juan -1072 +onem -1073 +saan -1074 +utur -1075 +▁ism -1076 +▁jum -1077 +▁mei -1078 +▁mem -1079 +▁mul -1080 +▁ner -1081 +▁reb -1082 +▁taw -1083 +▁ung -1084 +▁wan -1085 +antau -1086 +estin -1087 +▁bong -1088 +▁guuk -1089 +▁ilmu -1090 +▁rump -1091 +▁sala -1092 +▁semp -1093 +▁seni -1094 +▁seti -1095 +anjang -1096 +▁agama -1097 +▁empat -1098 +▁fonem -1099 +▁juara -1100 +▁kegun -1101 +▁mamis -1102 +▁melua -1103 +▁ngogo -1104 +▁pelua -1105 +▁tarus -1106 +▁teluk -1107 +▁warna -1108 +yarakat -1109 +▁belagu -1110 +▁debagi -1111 +▁kebang -1112 +▁kerita -1113 +▁majlis -1114 +▁pahang -1115 +▁pandan -1116 +▁pinapi -1117 +▁sanang -1118 +▁sungai -1119 +▁supaya -1120 +▁tudung -1121 +▁tujuan -1122 +▁dembila -1123 +▁kerjoon -1124 +▁lakunan -1125 +▁sarawak -1126 +▁sejinis -1127 +▁teposok -1128 +▁keluarga -1129 +▁november -1130 +▁pinjaram -1131 +▁dendangan -1132 +▁masyarakat -1133 +ic -1134 +le -1135 +mb -1136 +agu -1137 +aik -1138 +del -1139 +ehe -1140 
+sir -1141 +yen -1142 +▁ri -1143 +abas -1144 +aian -1145 +angi -1146 +ekut -1147 +engk -1148 +ingg -1149 +opah -1150 +undi -1151 +▁afr -1152 +▁ali -1153 +▁amp -1154 +▁anu -1155 +▁bat -1156 +▁bok -1157 +▁boo -1158 +▁bum -1159 +▁eko -1160 +▁gab -1161 +▁gen -1162 +▁har -1163 +▁min -1164 +▁mor -1165 +▁nip -1166 +▁ped -1167 +▁sal -1168 +▁seg -1169 +▁tab -1170 +▁the -1171 +▁wak -1172 +adbir -1173 +angat -1174 +asing -1175 +ejadi -1176 +jaran -1177 +▁akad -1178 +▁arab -1179 +▁been -1180 +▁bert -1181 +▁brit -1182 +▁duun -1183 +▁indi -1184 +▁kela -1185 +▁kupi -1186 +▁luuk -1187 +▁mend -1188 +▁mohd -1189 +▁perb -1190 +▁sapi -1191 +▁sept -1192 +▁sigu -1193 +▁toos -1194 +▁tung -1195 +ungkus -1196 +▁diing -1197 +▁julai -1198 +▁langa -1199 +▁model -1200 +▁parti -1201 +▁pasal -1202 +▁semen -1203 +▁tekul -1204 +▁afrika -1205 +▁bandar -1206 +▁eropah -1207 +▁ismail -1208 +▁mediom -1209 +▁poland -1210 +▁belakun -1211 +▁jawatan -1212 +▁keluman -1213 +▁nironon -1214 +▁peranak -1215 +▁pinakay -1216 +▁pelajaran -1217 +▁pengacara -1218 +▁september -1219 +▁semenanjung -1220 +'- -1221 +dh -1222 +se -1223 +bit -1224 +dah -1225 +kok -1226 +lat -1227 +let -1228 +lik -1229 +odo -1230 +ond -1231 +oyo -1232 +rat -1233 +ris -1234 +sel -1235 +sul -1236 +ter -1237 +tiv -1238 +▁kh -1239 +▁of -1240 +▁st -1241 +adio -1242 +ajah -1243 +akit -1244 +evis -1245 +gara -1246 +gota -1247 +ikir -1248 +inta -1249 +isin -1250 +pada -1251 +▁and -1252 +▁ang -1253 +▁api -1254 +▁but -1255 +▁far -1256 +▁hai -1257 +▁isi -1258 +▁itu -1259 +▁kal -1260 +▁kap -1261 +▁kut -1262 +▁mur -1263 +▁rej -1264 +▁tok -1265 +▁tut -1266 +ammad -1267 +inggo -1268 +lahir -1269 +malay -1270 +orong -1271 +uatan -1272 +uddin -1273 +urian -1274 +utama -1275 +▁alam -1276 +▁alat -1277 +▁atai -1278 +▁bamb -1279 +▁cara -1280 +▁dala -1281 +▁duta -1282 +▁elau -1283 +▁inda -1284 +▁kiro -1285 +▁laan -1286 +▁lang -1287 +▁lodo -1288 +▁perm -1289 +▁pute -1290 +▁song -1291 +▁soro -1292 +▁terk -1293 +▁tert -1294 +▁tomo -1295 +elakon -1296 +ometer -1297 
+▁agong -1298 +▁balik -1299 +▁filem -1300 +▁jaman -1301 +▁luman -1302 +▁ngela -1303 +▁nutup -1304 +▁peser -1305 +▁radio -1306 +▁sungk -1307 +▁tanda -1308 +▁tonom -1309 +▁urang -1310 +adbiran -1311 +ekutuan -1312 +▁bawang -1313 +▁contoh -1314 +▁dembua -1315 +▁kadang -1316 +▁meluar -1317 +▁ngendo -1318 +▁rantau -1319 +▁terati -1320 +▁terian -1321 +▁anggota -1322 +▁beluang -1323 +▁betutur -1324 +▁british -1325 +▁abdullah -1326 +▁daripada -1327 +▁kegunoon -1328 +▁menengah -1329 +▁palestin -1330 +▁tenggara -1331 +▁kilometer -1332 +▁kebangsaan -1333 +af -1334 +▁' -1335 +▁q -1336 +amu -1337 +ban -1338 +dek -1339 +edo -1340 +elu -1341 +isa -1342 +pol -1343 +tif -1344 +uah -1345 +vid -1346 +▁fr -1347 +▁ir -1348 +ajar -1349 +akas -1350 +amin -1351 +anyi -1352 +ayan -1353 +eket -1354 +ihan -1355 +okal -1356 +tian -1357 +uduk -1358 +unei -1359 +utan -1360 +▁era -1361 +▁faz -1362 +▁hal -1363 +▁hid -1364 +▁ini -1365 +▁jun -1366 +▁kam -1367 +▁kun -1368 +▁peg -1369 +▁pon -1370 +▁pus -1371 +▁reg -1372 +▁rtm -1373 +▁sed -1374 +▁tip -1375 +▁uun -1376 +balan -1377 +entuk -1378 +eraan -1379 +erian -1380 +gundi -1381 +ingga -1382 +inggi -1383 +limen -1384 +▁amat -1385 +▁badi -1386 +▁bagi -1387 +▁baja -1388 +▁banj -1389 +▁kesi -1390 +▁lawa -1391 +▁luar -1392 +▁luas -1393 +▁mesi -1394 +▁pasu -1395 +▁raja -1396 +▁suka -1397 +▁suku -1398 +▁telu -1399 +anggar -1400 +ingkat -1401 +kataan -1402 +▁asara -1403 +▁berla -1404 +▁butul -1405 +▁cinta -1406 +▁datai -1407 +▁datin -1408 +▁frasa -1409 +▁ialah -1410 +▁keros -1411 +▁nonom -1412 +▁penak -1413 +▁perak -1414 +▁perlu -1415 +▁repub -1416 +▁tarum -1417 +▁vokal -1418 +evisyen -1419 +pertuan -1420 +▁akadem -1421 +▁brunei -1422 +▁daging -1423 +▁pendok -1424 +▁popula -1425 +▁rakyat -1426 +▁santan -1427 +▁setiap -1428 +▁lemiang -1429 +▁pelakon -1430 +▁peserta -1431 +▁banjaran -1432 +▁muhammad -1433 +▁parlimen -1434 +▁penyanyi -1435 +▁terutama -1436 +▁timbalan -1437 +▁sepanjang -1438 +▁televisyen -1439 +▁pentadbiran -1440 +▁persekutuan -1441 +ba -1442 
+ck -1443 +ji -1444 +ng -1445 +▁[ -1446 +aiz -1447 +awi -1448 +bah -1449 +dau -1450 +fem -1451 +ger -1452 +ion -1453 +jek -1454 +kah -1455 +kar -1456 +ker -1457 +mah -1458 +mar -1459 +nik -1460 +pus -1461 +sin -1462 +tak -1463 +tro -1464 +uda -1465 +udi -1466 +uns -1467 +usi -1468 +wat -1469 +▁em -1470 +▁ik -1471 +▁kr -1472 +▁om -1473 +▁pi -1474 +▁ru -1475 +▁sp -1476 +abah -1477 +abau -1478 +adak -1479 +amat -1480 +apan -1481 +apat -1482 +asar -1483 +asia -1484 +asil -1485 +awal -1486 +azak -1487 +diom -1488 +egul -1489 +elam -1490 +erna -1491 +gram -1492 +inan -1493 +inum -1494 +lain -1495 +olom -1496 +urus -1497 +▁beb -1498 +▁jen -1499 +▁lat -1500 +▁lok -1501 +▁moo -1502 +▁pil -1503 +▁pun -1504 +▁res -1505 +▁saw -1506 +▁sun -1507 +▁sur -1508 +▁sus -1509 +▁tio -1510 +aliza -1511 +angka -1512 +ayang -1513 +bagai -1514 +esoyo -1515 +itoon -1516 +ubang -1517 +▁amin -1518 +▁asli -1519 +▁besi -1520 +▁bila -1521 +▁bumi -1522 +▁fant -1523 +▁gamb -1524 +▁imon -1525 +▁maku -1526 +▁ngad -1527 +▁niya -1528 +▁omar -1529 +▁pedi -1530 +▁perj -1531 +▁pesi -1532 +▁poso -1533 +▁ratu -1534 +▁sain -1535 +▁sair -1536 +▁sara -1537 +▁tamp -1538 +▁uran -1539 +ambaan -1540 +barang -1541 +esiden -1542 +indung -1543 +pertua -1544 +urunan -1545 +▁ahmad -1546 +▁akhir -1547 +▁aurat -1548 +▁bahan -1549 +▁benda -1550 +▁benoo -1551 +▁daras -1552 +▁hasil -1553 +▁intan -1554 +▁kabun -1555 +▁razak -1556 +▁rujuk -1557 +▁semek -1558 +▁sikot -1559 +▁sukup -1560 +▁tajuk -1561 +▁using -1562 +▁werna -1563 +▁biabas -1564 +▁binagi -1565 +▁enggai -1566 +▁entedo -1567 +▁fazura -1568 +▁jumlah -1569 +▁kemuap -1570 +▁kerejo -1571 +▁lumaan -1572 +▁pertub -1573 +▁puteri -1574 +▁rebung -1575 +▁serita -1576 +▁songom -1577 +▁tengku -1578 +▁wanita -1579 +malaysia -1580 +▁akademi -1581 +▁baginda -1582 +▁denakan -1583 +▁duangan -1584 +▁majalah -1585 +▁mamalia -1586 +▁mediang -1587 +▁ngerati -1588 +▁penakay -1589 +▁popular -1590 +▁program -1591 +▁rakaman -1592 +▁terbaik -1593 +▁tuturan -1594 +▁azharina -1595 +▁enggomon 
-1596 +▁ginuring -1597 +▁ngerujuk -1598 +▁penduduk -1599 +▁penjaram -1600 +▁penyakit -1601 +▁republik -1602 +▁selangor -1603 +▁peringkat -1604 +", -1605 +bo -1606 +ca -1607 +ci -1608 +ka -1609 +ki -1610 +ty -1611 +▁+ -1612 +aib -1613 +ais -1614 +apu -1615 +cal -1616 +daw -1617 +dha -1618 +eae -1619 +ean -1620 +eet -1621 +elo -1622 +hta -1623 +hum -1624 +iko -1625 +oic -1626 +par -1627 +pos -1628 +str -1629 +uli -1630 +utu -1631 +▁ny -1632 +▁on -1633 +agal -1634 +alap -1635 +aleh -1636 +anti -1637 +arak -1638 +asih -1639 +atas -1640 +egar -1641 +elah -1642 +enal -1643 +graf -1644 +idae -1645 +ipun -1646 +isan -1647 +itar -1648 +kiro -1649 +land -1650 +omok -1651 +tehe -1652 +uali -1653 +ukar -1654 +▁ada -1655 +▁ais -1656 +▁dod -1657 +▁fes -1658 +▁hab -1659 +▁ham -1660 +▁lan -1661 +▁loo -1662 +▁mea -1663 +▁nak -1664 +▁nar -1665 +▁nin -1666 +▁nor -1667 +▁pat -1668 +▁pep -1669 +▁sim -1670 +▁suh -1671 +▁suu -1672 +▁toh -1673 +▁yun -1674 +abang -1675 +aceae -1676 +ampai -1677 +antin -1678 +erang -1679 +grafi -1680 +ihoro -1681 +iling -1682 +ingat -1683 +ologi -1684 +onomi -1685 +perti -1686 +saung -1687 +udung -1688 +▁ayat -1689 +▁berk -1690 +▁enam -1691 +▁hadi -1692 +▁juta -1693 +▁kadi -1694 +▁kaum -1695 +▁kuih -1696 +▁kuya -1697 +▁labu -1698 +▁lebi -1699 +▁lema -1700 +▁liud -1701 +▁masi -1702 +▁memp -1703 +▁mura -1704 +▁ngai -1705 +▁nurh -1706 +▁pila -1707 +▁rend -1708 +▁rock -1709 +▁sali -1710 +▁siam -1711 +▁sini -1712 +▁siri -1713 +▁tena -1714 +▁tipo -1715 +▁turi -1716 +▁unit -1717 +andung -1718 +anggap -1719 +angkat -1720 +▁angan -1721 +▁awang -1722 +▁bakat -1723 +▁begea -1724 +▁binoo -1725 +▁dalam -1726 +▁darag -1727 +▁david -1728 +▁gegar -1729 +▁india -1730 +▁ingko -1731 +▁klang -1732 +▁lahir -1733 +▁lengk -1734 +▁lolom -1735 +▁matai -1736 +▁media -1737 +▁meter -1738 +▁ninda -1739 +▁organ -1740 +▁payau -1741 +▁pinen -1742 +▁ruang -1743 +▁sampa -1744 +▁tawar -1745 +▁tomok -1746 +▁unduk -1747 +▁waktu -1748 +ahagian -1749 +▁adalah -1750 +▁azizah -1751 +▁basing -1752 
+▁berita -1753 +▁betong -1754 +▁binara -1755 +▁haiwan -1756 +▁kelong -1757 +▁kurang -1758 +▁malaya -1759 +▁melaat -1760 +▁menduo -1761 +▁morfem -1762 +▁ngadau -1763 +▁ngajar -1764 +▁pemain -1765 +▁peranc -1766 +▁perisa -1767 +▁pindah -1768 +▁rakaat -1769 +▁sebeta -1770 +▁secara -1771 +▁sinsin -1772 +▁timung -1773 +▁ataupan -1774 +▁beladaw -1775 +▁berlaku -1776 +▁bersatu -1777 +▁keratas -1778 +▁mohamed -1779 +▁penulis -1780 +▁pertama -1781 +▁rejukan -1782 +▁rujukan -1783 +▁sejarah -1784 +▁seperti -1785 +▁tonomon -1786 +▁fantasia -1787 +▁perayaan -1788 +▁presiden -1789 +▁keturunan -1790 +▁nurhaliza -1791 +▁pengantin -1792 +▁perbuatan -1793 +▁saligundi -1794 +cm -1795 +ez -1796 +fa -1797 +hi -1798 +ja -1799 +kr -1800 +og -1801 +abi -1802 +aka -1803 +ank -1804 +asy -1805 +bat -1806 +bur -1807 +cak -1808 +car -1809 +col -1810 +dio -1811 +ema -1812 +eti -1813 +eyo -1814 +gal -1815 +gol -1816 +odi -1817 +omp -1818 +ons -1819 +ont -1820 +ori -1821 +pon -1822 +rtm -1823 +sur -1824 +til -1825 +tis -1826 +tol -1827 +tor -1828 +udu -1829 +uma -1830 +und -1831 +unu -1832 +uru -1833 +▁aj -1834 +▁gi -1835 +▁ka -1836 +▁sm -1837 +▁so -1838 +▁sw -1839 +▁ug -1840 +agut -1841 +aitu -1842 +ampu -1843 +anji -1844 +atak -1845 +awar -1846 +buan -1847 +dila -1848 +entu -1849 +esar -1850 +ganu -1851 +guno -1852 +hari -1853 +iman -1854 +indu -1855 +ipta -1856 +itri -1857 +lagu -1858 +lemb -1859 +oses -1860 +oton -1861 +paun -1862 +umin -1863 +urut -1864 +▁adi -1865 +▁air -1866 +▁ann -1867 +▁apo -1868 +▁bak -1869 +▁bed -1870 +▁big -1871 +▁bud -1872 +▁bur -1873 +▁geo -1874 +▁hus -1875 +▁isk -1876 +▁jac -1877 +▁kej -1878 +▁kim -1879 +▁kit -1880 +▁kol -1881 +▁mad -1882 +▁mai -1883 +▁pad -1884 +▁pik -1885 +▁pul -1886 +▁rec -1887 +▁rem -1888 +▁sap -1889 +▁too -1890 +▁tro -1891 +▁ule -1892 +▁ulu -1893 +▁war -1894 +adung -1895 +ahaya -1896 +asara -1897 +buyat -1898 +ejoon -1899 +ekati -1900 +empak -1901 +geris -1902 +impik -1903 +istik -1904 +letak -1905 +ongko -1906 +orang -1907 +poser -1908 +tival 
-1909 +ubung -1910 +udian -1911 +ukaan -1912 +▁amal -1913 +▁asil -1914 +▁berm -1915 +▁dent -1916 +▁embo -1917 +▁engk -1918 +▁hati -1919 +▁ibni -1920 +▁ijau -1921 +▁inst -1922 +▁jala -1923 +▁jata -1924 +▁limb -1925 +▁lond -1926 +▁lumb -1927 +▁mala -1928 +▁meny -1929 +▁muda -1930 +▁nagu -1931 +▁naib -1932 +▁ngam -1933 +▁nina -1934 +▁noor -1935 +▁nyak -1936 +▁ogok -1937 +▁peno -1938 +▁pens -1939 +▁pida -1940 +▁putu -1941 +▁samb -1942 +▁seli -1943 +▁sian -1944 +▁telo -1945 +▁tent -1946 +▁terd -1947 +▁thai -1948 +▁tinu -1949 +▁tumb -1950 +▁umat -1951 +▁umum -1952 +▁voic -1953 +antara -1954 +antung -1955 +elamat -1956 +empoon -1957 +marhum -1958 +sultan -1959 +undang -1960 +▁astro -1961 +▁bangk -1962 +▁begeh -1963 +▁benua -1964 +▁bersi -1965 +▁biula -1966 +▁bokok -1967 +▁dasar -1968 +▁drama -1969 +▁engai -1970 +▁ensel -1971 +▁eyono -1972 +▁fardu -1973 +▁hamid -1974 +▁iaitu -1975 +▁irama -1976 +▁jakat -1977 +▁jamal -1978 +▁kadai -1979 +▁keput -1980 +▁keroi -1981 +▁kotol -1982 +▁kutak -1983 +▁lemah -1984 +▁manis -1985 +▁mingo -1986 +▁murak -1987 +▁nicol -1988 +▁paray -1989 +▁pener -1990 +▁perub -1991 +▁pilam -1992 +▁rasmi -1993 +▁resmi -1994 +▁sebuo -1995 +▁selat -1996 +▁subur -1997 +▁suhai -1998 +▁thumb -1999 +▁timus -2000 +▁tinut -2001 +▁tunku -2002 +▁watak -2003 +angunan -2004 +diladha -2005 +engganu -2006 +stralia -2007 +▁adunan -2008 +▁andang -2009 +▁batung -2010 +▁bongso -2011 +▁guring -2012 +▁kerusi -2013 +▁kinilo -2014 +▁kuning -2015 +▁lengan -2016 +▁lubang -2017 +▁mangat -2018 +▁meksud -2019 +▁nabila -2020 +▁nambut -2021 +▁ngasil -2022 +▁nginum -2023 +▁pandai -2024 +▁pelanc -2025 +▁penapi -2026 +▁pengel -2027 +▁perang -2028 +▁pingat -2029 +▁proses -2030 +▁record -2031 +▁rumpun -2032 +▁rumput -2033 +▁sebuah -2034 +▁semula -2035 +▁setemu -2036 +▁tampat -2037 +▁tekito -2038 +▁tinemu -2039 +▁undang -2040 +elamatan -2041 +▁ambuyat -2042 +▁andaman -2043 +▁binanan -2044 +▁kesuali -2045 +▁kineket -2046 +▁langkaw -2047 +▁olimpik -2048 +▁pekakas -2049 +▁penting -2050 
+▁pinosok -2051 +▁ramadan -2052 +▁seorang -2053 +▁serisir -2054 +▁sungkud -2055 +▁tebadak -2056 +▁tekulek -2057 +▁terusan -2058 +perkataan -2059 +▁almarhum -2060 +▁festival -2061 +▁geografi -2062 +▁inggeris -2063 +▁kelantan -2064 +▁kelekati -2065 +▁kemudian -2066 +▁kerejoon -2067 +▁komposer -2068 +▁mekitoon -2069 +▁perancis -2070 +▁serangga -2071 +▁sinambut -2072 +▁talipaun -2073 +▁tempatan -2074 +▁terkenal -2075 +▁terletak -2076 +▁aidiladha -2077 +▁bambangan -2078 +▁perambaan -2079 +▁pinedagang -2080 +▁sebahagian -2081 +▁keselamatan -2082 +bb -2083 +co -2084 +ff -2085 +hl -2086 +ie -2087 +ks -2088 +ym -2089 +▁) -2090 +"," -2091 +ast -2092 +aud -2093 +auz -2094 +bua -2095 +buh -2096 +dun -2097 +egi -2098 +ero -2099 +fah -2100 +gam -2101 +han -2102 +iah -2103 +imo -2104 +ini -2105 +ins -2106 +iza -2107 +kad -2108 +kak -2109 +kel -2110 +kot -2111 +lak -2112 +lan -2113 +leh -2114 +lus -2115 +mas -2116 +orm -2117 +ort -2118 +rip -2119 +riz -2120 +rus -2121 +sen -2122 +sik -2123 +ton -2124 +tun -2125 +uai -2126 +uba -2127 +ubu -2128 +uul -2129 +wah -2130 +wal -2131 +yai -2132 +▁co -2133 +▁do -2134 +▁gr -2135 +▁ho -2136 +▁kl -2137 +▁mi -2138 +▁pl -2139 +▁up -2140 +akin -2141 +amah -2142 +ampa -2143 +apar -2144 +aruh -2145 +asal -2146 +azah -2147 +bung -2148 +deka -2149 +diki -2150 +dilf -2151 +ebut -2152 +edia -2153 +edio -2154 +enek -2155 +enga -2156 +esan -2157 +faiz -2158 +hani -2159 +hluk -2160 +ilan -2161 +imew -2162 +ingk -2163 +iput -2164 +irin -2165 +jana -2166 +jang -2167 +jomo -2168 +kota -2169 +ling -2170 +odun -2171 +okan -2172 +rael -2173 +raja -2174 +raya -2175 +rojo -2176 +sari -2177 +taya -2178 +ubah -2179 +ukur -2180 +unai -2181 +upan -2182 +▁abu -2183 +▁anj -2184 +▁bab -2185 +▁bek -2186 +▁dem -2187 +▁der -2188 +▁emp -2189 +▁fat -2190 +▁gua -2191 +▁han -2192 +▁hub -2193 +▁ibu -2194 +▁int -2195 +▁ist -2196 +▁kat -2197 +▁kir -2198 +▁kis -2199 +▁kla -2200 +▁laa -2201 +▁leg -2202 +▁les -2203 +▁lew -2204 +▁muk -2205 +▁mun -2206 +▁oki -2207 +▁pau -2208 +▁pit 
-2209 +▁puk -2210 +▁ran -2211 +▁rat -2212 +▁saj -2213 +▁seh -2214 +▁sku -2215 +▁stu -2216 +▁swa -2217 +▁swt -2218 +▁tap -2219 +▁ula -2220 +▁ums -2221 +▁yah -2222 +abila -2223 +abung -2224 +adaan -2225 +agaan -2226 +akala -2227 +aling -2228 +anyak -2229 +ayung -2230 +bioso -2231 +buaan -2232 +buhan -2233 +dikau -2234 +dikit -2235 +emaah -2236 +endok -2237 +erdek -2238 +erita -2239 +etaan -2240 +hatan -2241 +ingan -2242 +intah -2243 +kolah -2244 +mingo -2245 +mpung -2246 +sabah -2247 +tilia -2248 +tocar -2249 +uanan -2250 +uasan -2251 +unjuk -2252 +unsay -2253 +yakan -2254 +▁abis -2255 +▁aina -2256 +▁angg -2257 +▁atas -2258 +▁atay -2259 +▁bala -2260 +▁bani -2261 +▁beri -2262 +▁berp -2263 +▁daud -2264 +▁daya -2265 +▁doyo -2266 +▁duai -2267 +▁erin -2268 +▁erti -2269 +▁fauz -2270 +▁gaya -2271 +▁giuk -2272 +▁huda -2273 +▁jamb -2274 +▁juur -2275 +▁kamb -2276 +▁kand -2277 +▁kari -2278 +▁keen -2279 +▁kend -2280 +▁khas -2281 +▁kraf -2282 +▁kulu -2283 +▁mara -2284 +▁memb -2285 +▁muka -2286 +▁ngew -2287 +▁ngko -2288 +▁niat -2289 +▁pala -2290 +▁pitu -2291 +▁prof -2292 +▁rang -2293 +▁rans -2294 +▁rata -2295 +▁rego -2296 +▁ruun -2297 +▁sawa -2298 +▁semb -2299 +▁star -2300 +▁tamb -2301 +▁tari -2302 +▁tehe -2303 +▁terp -2304 +▁tiap -2305 +▁tuan -2306 +▁tupi -2307 +▁umno -2308 +▁zain -2309 +amaian -2310 +andang -2311 +angkan -2312 +angkin -2313 +contoh -2314 +elioro -2315 +endoon -2316 +enomon -2317 +golong -2318 +kadang -2319 +terian -2320 +uangan -2321 +udahan -2322 +ulauan -2323 +▁acara -2324 +▁ampun -2325 +▁atlet -2326 +▁bangi -2327 +▁besar -2328 +▁china -2329 +▁darak -2330 +▁dekau -2331 +▁dekit -2332 +▁diken -2333 +▁dodol -2334 +▁gajah -2335 +▁gipun -2336 +▁hadis -2337 +▁haram -2338 +▁humin -2339 +▁idris -2340 +▁impon -2341 +▁inter -2342 +▁kapal -2343 +▁karya -2344 +▁kebud -2345 +▁kedah -2346 +▁kedua -2347 +▁kelas -2348 +▁keper -2349 +▁kolej -2350 +▁ladin -2351 +▁leboh -2352 +▁lumbo -2353 +▁mampu -2354 +▁narus -2355 +▁niman -2356 +▁nipis -2357 +▁nukar -2358 +▁nurul -2359 
+▁orang -2360 +▁pakai -2361 +▁papar -2362 +▁payaw -2363 +▁pedih -2364 +▁pekan -2365 +▁pemer -2366 +▁pikap -2367 +▁pusat -2368 +▁rampa -2369 +▁rasun -2370 +▁ringg -2371 +▁sains -2372 +▁sedio -2373 +▁sejak -2374 +▁serup -2375 +▁shari -2376 +▁spesi -2377 +▁tubuh -2378 +▁ugama -2379 +▁ungku -2380 +▁urung -2381 +▁voice -2382 +▁aidilf -2383 +▁aminah -2384 +▁ampuan -2385 +▁belego -2386 +▁berdik -2387 +▁besina -2388 +▁billah -2389 +▁binuka -2390 +▁bongon -2391 +▁budaya -2392 +▁dayang -2393 +▁durian -2394 +▁engkok -2395 +▁entemu -2396 +▁eyonoh -2397 +▁gadung -2398 +▁gempak -2399 +▁harian -2400 +▁hingga -2401 +▁ijazah -2402 +▁jemaah -2403 +▁jumaat -2404 +▁kerana -2405 +▁ketelu -2406 +▁kurban -2407 +▁labuan -2408 +▁langga -2409 +▁limbai -2410 +▁london -2411 +▁mekilo -2412 +▁minggo -2413 +▁nakung -2414 +▁paduka -2415 +▁penari -2416 +▁penger -2417 +▁remaja -2418 +▁sangat -2419 +▁sarung -2420 +▁segala -2421 +▁siniar -2422 +▁skuasy -2423 +▁studio -2424 +▁sulung -2425 +▁tengah -2426 +▁tetapi -2427 +▁torong -2428 +▁tuaran -2429 +erdekaan -2430 +tocarpus -2431 +▁apabila -2432 +▁belajar -2433 +▁dempoon -2434 +▁ekonomi -2435 +▁keadaan -2436 +▁kekuran -2437 +▁kerabau -2438 +▁makhluk -2439 +▁mensari -2440 +▁merdeka -2441 +▁minuman -2442 +▁nambung -2443 +▁ngejadi -2444 +▁pattaya -2445 +▁persegi -2446 +▁pilihan -2447 +▁regisin -2448 +▁sarjana -2449 +▁sedikit -2450 +▁sekolah -2451 +▁seramin -2452 +▁suhaimi -2453 +▁tebagal -2454 +▁telingo -2455 +▁tembaga -2456 +▁temboro -2457 +▁teturan -2458 +▁gabungan -2459 +▁hubungan -2460 +▁kalangan -2461 +▁keluasan -2462 +▁manakala -2463 +▁ngentelo -2464 +▁pelbagai -2465 +▁pemanang -2466 +▁reptilia -2467 +▁sampuran -2468 +▁separang -2469 +▁serudung -2470 +▁terdapat -2471 +▁tertentu -2472 +▁australia -2473 +▁huminodun -2474 +▁kandungan -2475 +▁kemudahan -2476 +▁keramaian -2477 +▁kesihatan -2478 +▁ngeliling -2479 +▁pelabuhan -2480 +▁pelancong -2481 +▁pengasara -2482 +▁permukaan -2483 +▁setanding -2484 +▁sinambung -2485 +▁tergolong -2486 +▁tertinggi -2487 
+▁aidilfitri -2488 +▁kebudayaan -2489 +▁ngelindung -2490 +▁perjuangan -2491 +▁pertubuhan -2492 +▁terengganu -2493 +▁kemerdekaan -2494 +▁pembangunan -2495 +." -2496 +., -2497 +ao -2498 +cy -2499 +ey -2500 +fo -2501 +gh -2502 +ha -2503 +hy -2504 +ld -2505 +lo -2506 +px -2507 +qu -2508 +sm -2509 +su -2510 +▁x -2511 +▁– -2512 +adz -2513 +ahu -2514 +aku -2515 +amm -2516 +ard -2517 +bau -2518 +bok -2519 +bol -2520 +bon -2521 +buk -2522 +cap -2523 +cho -2524 +cil -2525 +cip -2526 +cli -2527 +dan -2528 +dis -2529 +dpa -2530 +dut -2531 +eja -2532 +eko -2533 +elm -2534 +eph -2535 +erv -2536 +gas -2537 +gel -2538 +gon -2539 +hab -2540 +haj -2541 +iha -2542 +ine -2543 +ith -2544 +jer -2545 +joh -2546 +jud -2547 +lao -2548 +lau -2549 +ler -2550 +lua -2551 +mak -2552 +min -2553 +mun -2554 +mus -2555 +nab -2556 +neg -2557 +nie -2558 +nil -2559 +oma -2560 +omb -2561 +oni -2562 +pah -2563 +sed -2564 +sem -2565 +ubi -2566 +uga -2567 +ulh -2568 +uso -2569 +usu -2570 +uti -2571 +wak -2572 +zmi -2573 +▁ay -2574 +▁dp -2575 +▁dw -2576 +▁ed -2577 +▁et -2578 +▁hj -2579 +▁ip -2580 +▁it -2581 +▁km -2582 +▁me -2583 +▁mk -2584 +▁my -2585 +▁ra -2586 +▁re -2587 +▁rm -2588 +▁sa -2589 +▁ud -2590 +adah -2591 +adat -2592 +afas -2593 +agar -2594 +ahun -2595 +ajid -2596 +akri -2597 +alis -2598 +alun -2599 +amel -2600 +amuk -2601 +amun -2602 +anda -2603 +anit -2604 +apak -2605 +arab -2606 +arai -2607 +arat -2608 +arch -2609 +arta -2610 +aruk -2611 +arum -2612 +asai -2613 +asut -2614 +atau -2615 +atik -2616 +awah -2617 +ayap -2618 +bapa -2619 +diam -2620 +eben -2621 +ebih -2622 +egah -2623 +elip -2624 +elup -2625 +emia -2626 +enor -2627 +epon -2628 +esor -2629 +etah -2630 +etak -2631 +etik -2632 +gula -2633 +guul -2634 +ibat -2635 +iduk -2636 +ikip -2637 +inae -2638 +ings -2639 +ipit -2640 +iram -2641 +irik -2642 +isha -2643 +itan -2644 +itat -2645 +iton -2646 +itut -2647 +kang -2648 +kata -2649 +khaw -2650 +kson -2651 +laat -2652 +lima -2653 +onan -2654 +onok -2655 +ranc -2656 +sama -2657 +sedi -2658 
+sela -2659 +somo -2660 +tafa -2661 +tuut -2662 +umat -2663 +umba -2664 +umno -2665 +unda -2666 +ungg -2667 +unsi -2668 +urga -2669 +urik -2670 +usek -2671 +usul -2672 +usun -2673 +usus -2674 +▁abc -2675 +▁aku -2676 +▁amn -2677 +▁bos -2678 +▁bun -2679 +▁dia -2680 +▁dim -2681 +▁edi -2682 +▁fin -2683 +▁gar -2684 +▁gib -2685 +▁gim -2686 +▁gol -2687 +▁gun -2688 +▁hel -2689 +▁hij -2690 +▁huk -2691 +▁jud -2692 +▁kah -2693 +▁kan -2694 +▁lap -2695 +▁lio -2696 +▁mal -2697 +▁mau -2698 +▁med -2699 +▁mew -2700 +▁mim -2701 +▁nik -2702 +▁pej -2703 +▁rah -2704 +▁sci -2705 +▁sil -2706 +▁soo -2707 +▁tat -2708 +▁teh -2709 +▁ump -2710 +▁ust -2711 +▁wah -2712 +▁yay -2713 +aitul -2714 +akwah -2715 +amani -2716 +anjur -2717 +ariah -2718 +athir -2719 +ation -2720 +bilan -2721 +bungo -2722 +eguul -2723 +ejogo -2724 +elaan -2725 +embah -2726 +engot -2727 +entak -2728 +estor -2729 +haram -2730 +hiran -2731 +ilang -2732 +imewa -2733 +indah -2734 +inton -2735 +jidil -2736 +lapas -2737 +masam -2738 +munah -2739 +oloji -2740 +ompok -2741 +ongso -2742 +order -2743 +panas -2744 +pical -2745 +pulau -2746 +sikot -2747 +thumb -2748 +tifik -2749 +unani -2750 +ungun -2751 +unyai -2752 +urban -2753 +▁alus -2754 +▁asar -2755 +▁asas -2756 +▁ayer -2757 +▁bank -2758 +▁bayu -2759 +▁bend -2760 +▁berb -2761 +▁berf -2762 +▁bino -2763 +▁buau -2764 +▁camp -2765 +▁cerv -2766 +▁chao -2767 +▁ciri -2768 +▁ella -2769 +▁emma -2770 +▁gaba -2771 +▁heli -2772 +▁ikon -2773 +▁isti -2774 +▁iyan -2775 +▁jaya -2776 +▁jono -2777 +▁kasa -2778 +▁kepo -2779 +▁kima -2780 +▁kina -2781 +▁kito -2782 +▁kuta -2783 +▁laak -2784 +▁liau -2785 +▁looh -2786 +▁malu -2787 +▁mana -2788 +▁mata -2789 +▁meli -2790 +▁moso -2791 +▁musa -2792 +▁nasi -2793 +▁nela -2794 +▁ngeb -2795 +▁nged -2796 +▁ngej -2797 +▁nila -2798 +▁nora -2799 +▁pang -2800 +▁pata -2801 +▁pian -2802 +▁pien -2803 +▁poyo -2804 +▁pres -2805 +▁raga -2806 +▁rasu -2807 +▁rela -2808 +▁remb -2809 +▁riau -2810 +▁ribu -2811 +▁saan -2812 +▁sais -2813 +▁sasa -2814 +▁segi -2815 +▁sepu -2816 
+▁subu -2817 +▁susi -2818 +▁susu -2819 +▁taan -2820 +▁tamu -2821 +▁tebu -2822 +▁tiba -2823 +▁usia -2824 +▁wala -2825 +▁ydpa -2826 +▁yuna -2827 +▁zeti -2828 +abaran -2829 +aisuri -2830 +ajikan -2831 +akasan -2832 +angkau -2833 +apatan -2834 +arizmi -2835 +berdik -2836 +dillah -2837 +ebihan -2838 +egundi -2839 +elipan -2840 +embali -2841 +enting -2842 +erakan -2843 +faizah -2844 +inggir -2845 +itamin -2846 +jadian -2847 +kurian -2848 +lakuan -2849 +masing -2850 +ongkob -2851 +pahang -2852 +poland -2853 +sedong -2854 +selain -2855 +umbung -2856 +▁aktiv -2857 +▁artis -2858 +▁aznil -2859 +▁bakas -2860 +▁bambu -2861 +▁banan -2862 +▁beriu -2863 +▁berkh -2864 +▁bibit -2865 +▁bilik -2866 +▁bubur -2867 +▁bukit -2868 +▁cipta -2869 +▁dalas -2870 +▁disir -2871 +▁dugal -2872 +▁genus -2873 +▁gitar -2874 +▁habis -2875 +▁hajah -2876 +▁hanim -2877 +▁harus -2878 +▁hutan -2879 +▁jarak -2880 +▁kakan -2881 +▁kalah -2882 +▁kamar -2883 +▁kapur -2884 +▁karan -2885 +▁karna -2886 +▁kasih -2887 +▁kasut -2888 +▁kejoh -2889 +▁kekal -2890 +▁kerab -2891 +▁kerja -2892 +▁kisap -2893 +▁kitab -2894 +▁kuasa -2895 +▁latin -2896 +▁latok -2897 +▁lebar -2898 +▁lesam -2899 +▁lirik -2900 +▁makin -2901 +▁manan -2902 +▁mandi -2903 +▁marak -2904 +▁mesir -2905 +▁misha -2906 +▁mulia -2907 +▁musik -2908 +▁najib -2909 +▁nedio -2910 +▁ngaji -2911 +▁ngala -2912 +▁ngeng -2913 +▁ngera -2914 +▁ninak -2915 +▁nipah -2916 +▁nipit -2917 +▁nobat -2918 +▁nulis -2919 +▁pagut -2920 +▁pakar -2921 +▁pangg -2922 +▁papan -2923 +▁pardu -2924 +▁pasar -2925 +▁pekat -2926 +▁pelep -2927 +▁pepen -2928 +▁perni -2929 +▁peten -2930 +▁petua -2931 +▁piala -2932 +▁pinek -2933 +▁pines -2934 +▁pingo -2935 +▁pulut -2936 +▁putek -2937 +▁putra -2938 +▁raang -2939 +▁rasul -2940 +▁rebus -2941 +▁sajuk -2942 +▁sapek -2943 +▁sarah -2944 +▁sawan -2945 +▁segul -2946 +▁seiko -2947 +▁sekel -2948 +▁selan -2949 +▁serta -2950 +▁sesok -2951 +▁setar -2952 +▁sikap -2953 +▁singa -2954 +▁sumat -2955 +▁suria -2956 +▁susah -2957 +▁tabir -2958 +▁tadau -2959 +▁talam 
-2960 +▁tawap -2961 +▁tenga -2962 +▁terus -2963 +▁tetap -2964 +▁tigam -2965 +▁tokoh -2966 +▁ungus -2967 +▁ustad -2968 +▁wajid -2969 +▁wakil -2970 +▁wangi -2971 +▁zakat -2972 +▁zakri -2973 +▁zaman -2974 +▁zikir -2975 +agangan -2976 +anganza -2977 +anggang -2978 +empadan -2979 +enangan -2980 +ingkoon -2981 +iputera -2982 +nikahan -2983 +▁abitat -2984 +▁adibah -2985 +▁ajaran -2986 +▁baitul -2987 +▁banggi -2988 +▁bangso -2989 +▁bayang -2990 +▁begian -2991 +▁bilion -2992 +▁cahaya -2993 +▁dekero -2994 +▁derian -2995 +▁dungun -2996 +▁embooi -2997 +▁embuli -2998 +▁farriz -2999 +▁gayung -3000 +▁gibang -3001 +▁hasmah -3002 +▁hassan -3003 +▁iskand -3004 +▁israel -3005 +▁jenama -3006 +▁kambul -3007 +▁kaulah -3008 +▁keenam -3009 +▁keirin -3010 +▁keneet -3011 +▁kepada -3012 +▁klausa -3013 +▁kuanan -3014 +▁kurian -3015 +▁kurung -3016 +▁langau -3017 +▁langit -3018 +▁lemari -3019 +▁mainan -3020 +▁makkah -3021 +▁marudu -3022 +▁mayang -3023 +▁mediak -3024 +▁mekong -3025 +▁menari -3026 +▁metela -3027 +▁nangka -3028 +▁nerimo -3029 +▁ngelag -3030 +▁ngerak -3031 +▁ngukur -3032 +▁ngurus -3033 +▁nirait -3034 +▁nyiduk -3035 +▁parang -3036 +▁patang -3037 +▁pemalu -3038 +▁pesoyo -3039 +▁phraya -3040 +▁pinene -3041 +▁projek -3042 +▁puncak -3043 +▁rambut -3044 +▁rendah -3045 +▁rengot -3046 +▁rentak -3047 +▁sambal -3048 +▁sampak -3049 +▁search -3050 +▁sebana -3051 +▁sedong -3052 +▁seguul -3053 +▁sekali -3054 +▁serupo -3055 +▁sesapu -3056 +▁sesuai -3057 +▁simpan -3058 +▁syamel -3059 +▁syarat -3060 +▁syawal -3061 +▁tabang -3062 +▁tabung -3063 +▁takhta -3064 +▁tekule -3065 +▁tempoh -3066 +▁tinuut -3067 +▁tukang -3068 +▁yunani -3069 +ikipedia -3070 +▁angguta -3071 +▁anjuran -3072 +▁bandung -3073 +▁begiang -3074 +▁belasan -3075 +▁beranti -3076 +▁berasal -3077 +▁bersiri -3078 +▁bungkar -3079 +▁dataran -3080 +▁enselan -3081 +▁entenga -3082 +▁fauziah -3083 +▁gabenor -3084 +▁gentian -3085 +▁jackson -3086 +▁kahanga -3087 +▁kenakan -3088 +▁komersi -3089 +▁lampung -3090 +▁lelipan -3091 +▁lengkap -3092 
+▁masalah -3093 +▁menurut -3094 +▁mesakan -3095 +▁minangk -3096 +▁mohamad -3097 +▁ngejogo -3098 +▁ngendok -3099 +▁nianjur -3100 +▁pakatan -3101 +▁pangkat -3102 +▁panjang -3103 +▁pengent -3104 +▁records -3105 +▁ringgit -3106 +▁samping -3107 +▁sekitar -3108 +▁selawat -3109 +▁sempena -3110 +▁seniram -3111 +▁separuh -3112 +▁sesikot -3113 +▁sinelup -3114 +▁teniman -3115 +▁tentera -3116 +▁teralap -3117 +▁terbuka -3118 +▁tinunai -3119 +▁tinutup -3120 +▁tumbuan -3121 +▁yayasan -3122 +anggarkan -3123 +ayatuddin -3124 +▁beserita -3125 +▁binuanan -3126 +▁campuran -3127 +▁dentahun -3128 +▁dikenali -3129 +▁entengah -3130 +▁hidangan -3131 +▁institut -3132 +▁istimewa -3133 +▁kejadian -3134 +▁kejayaan -3135 +▁kelompok -3136 +▁mahathir -3137 +▁maimunah -3138 +▁medagang -3139 +▁ngewarna -3140 +▁panglima -3141 +▁pedelaan -3142 +▁peketaan -3143 +▁pendawan -3144 +▁pendayan -3145 +▁penerbit -3146 +▁pengarah -3147 +▁pentaran -3148 +▁profesor -3149 +▁sebanyak -3150 +▁sehingga -3151 +▁sembilan -3152 +▁sempadan -3153 +▁sensaung -3154 +▁sinsaung -3155 +▁sumatera -3156 +▁terbesar -3157 +▁vanganza -3158 +artocarpus -3159 +khawarizmi -3160 +▁binungkus -3161 +▁disirikan -3162 +▁kebajikan -3163 +▁kejohanan -3164 +▁kelahiran -3165 +▁kepulauan -3166 +▁mempunyai -3167 +▁nurfaizah -3168 +▁pedendoon -3169 +▁pekakasan -3170 +▁pembentuk -3171 +▁pemungkus -3172 +▁perkataan -3173 +▁perlakuan -3174 +▁pertubuan -3175 +▁perubatan -3176 +▁pinelioro -3177 +▁ransangan -3178 +▁saintifik -3179 +▁selegundi -3180 +▁sepanggar -3181 +▁sinelamat -3182 +▁bumiputera -3183 +▁keputeraan -3184 +▁pendapatan -3185 +▁permaisuri -3186 +▁perniagaan -3187 +▁bersempadan -3188 +▁dianggarkan -3189 +▁iskandariah -3190 +") -3191 +bn -3192 +ct -3193 +cu -3194 +dr -3195 +dy -3196 +ft -3197 +ge -3198 +gy -3199 +if -3200 +ju -3201 +ne -3202 +ob -3203 +ou -3204 +rk -3205 +rt -3206 +te -3207 +uy -3208 +vi -3209 +ان -3210 +ين -3211 +▁: -3212 +ace -3213 +ael -3214 +age -3215 +aih -3216 +ale -3217 +anw -3218 +any -3219 +apt -3220 +ass -3221 
+aun -3222 +aut -3223 +aza -3224 +baj -3225 +baw -3226 +bet -3227 +bin -3228 +bub -3229 +cti -3230 +der -3231 +die -3232 +don -3233 +eha -3234 +eim -3235 +eki -3236 +ele -3237 +ell -3238 +enc -3239 +ern -3240 +eth -3241 +etr -3242 +eza -3243 +fil -3244 +gai -3245 +gar -3246 +gen -3247 +gul -3248 +had -3249 +hib -3250 +hid -3251 +hyl -3252 +iak -3253 +ias -3254 +iat -3255 +iki -3256 +isi -3257 +isk -3258 +jah -3259 +jaw -3260 +jin -3261 +jok -3262 +jon -3263 +kem -3264 +ket -3265 +lar -3266 +las -3267 +lav -3268 +lie -3269 +maq -3270 +nah -3271 +nam -3272 +oda -3273 +oga -3274 +oko -3275 +osi -3276 +oto -3277 +out -3278 +pet -3279 +pir -3280 +pis -3281 +rum -3282 +sah -3283 +sai -3284 +sak -3285 +sar -3286 +ses -3287 +sif -3288 +ska -3289 +ssa -3290 +taz -3291 +tel -3292 +tik -3293 +tuh -3294 +tut -3295 +uak -3296 +uam -3297 +ugh -3298 +ukt -3299 +ene -3300 +ulu -3301 +umi -3302 +ums -3303 +yaf -3304 +ych -3305 +yit -3306 +zec -3307 +zim -3308 +▁ba -3309 +▁de -3310 +▁eb -3311 +▁fi -3312 +▁hu -3313 +▁ih -3314 +▁ky -3315 +▁lp -3316 +▁na -3317 +▁ne -3318 +▁nt -3319 +▁ob -3320 +▁pa -3321 +▁te -3322 +▁vi -3323 +▁yu -3324 +▁za -3325 +abar -3326 +adar -3327 +adaz -3328 +adun -3329 +agam -3330 +agan -3331 +agia -3332 +ajur -3333 +akil -3334 +akwa -3335 +alal -3336 +alir -3337 +alor -3338 +alsa -3339 +anao -3340 +anau -3341 +andu -3342 +anon -3343 +anug -3344 +anun -3345 +apai -3346 +apeh -3347 +apuk -3348 +apun -3349 +arik -3350 +aris -3351 +asik -3352 +asta -3353 +asun -3354 +atig -3355 +ator -3356 +atri -3357 +atus -3358 +audi -3359 +awin -3360 +ayad -3361 +ayag -3362 +ayau -3363 +bang -3364 +bila -3365 +bisi -3366 +eati -3367 +ebab -3368 +ebar -3369 +ebua -3370 +ecil -3371 +edah -3372 +edik -3373 +ekat -3374 +elap -3375 +elia -3376 +elis -3377 +emba -3378 +engo -3379 +ered -3380 +ergi -3381 +erio -3382 +form -3383 +hamp -3384 +ikas -3385 +imah -3386 +imam -3387 +imat -3388 +inet -3389 +isah -3390 +iyan -3391 +joon -3392 +joto -3393 +lagi -3394 +lamp -3395 +last -3396 
+luar -3397 +medi -3398 +nast -3399 +sp -3400 +aki -3401 +duc -3402 +lee -3403 +pse -3404 +rik -3405 +baik -3406 +ikal -3407 +krip -3408 +krit -3409 +liff -3410 +mpon -3411 +ngan -3412 +odop -3413 +ogot -3414 +oksi -3415 +okus -3416 +olok -3417 +orop -3418 +oson -3419 +pers -3420 +putu -3421 +reso -3422 +roco -3423 +sang -3424 +saud -3425 +sein -3426 +sial -3427 +sian -3428 +sium -3429 +soyo -3430 +tacy -3431 +tapi -3432 +tebr -3433 +tomo -3434 +tuan -3435 +tung -3436 +uapu -3437 +uara -3438 +ubat -3439 +udai -3440 +udat -3441 +umur -3442 +unuh -3443 +unuk -3444 +upas -3445 +urah -3446 +urne -3447 +uruh -3448 +urun -3449 +usof -3450 +uyat -3451 +uzik -3452 +vent -3453 +ward -3454 +wata -3455 +wati -3456 +▁agn -3457 +▁alm -3458 +▁amy -3459 +▁art -3460 +▁atl -3461 +▁avi -3462 +▁bas -3463 +▁bay -3464 +▁bir -3465 +▁bra -3466 +▁buh -3467 +▁buj -3468 +▁cat -3469 +▁cit -3470 +▁con -3471 +▁cth -3472 +▁did -3473 +▁dij -3474 +▁dip -3475 +▁dua -3476 +▁eks -3477 +▁ela -3478 +▁end -3479 +▁fen -3480 +▁gag -3481 +▁gam -3482 +▁gel -3483 +▁hek -3484 +▁hmi -3485 +▁hos -3486 +▁ibn -3487 +▁inf -3488 +▁jay -3489 +▁jeb -3490 +▁jew -3491 +▁joh -3492 +▁jos -3493 +▁juh -3494 +▁kaj -3495 +▁kmn -3496 +▁kra -3497 +▁kub -3498 +▁leh -3499 +,” -3500 +ikh -3501 +sun -3502 +ymm -3503 +asni -3504 +elar -3505 +guru -3506 +ilas -3507 +oron -3508 +▁let -3509 +▁lin -3510 +▁log -3511 +▁lrt -3512 +▁mag -3513 +▁meg -3514 +▁mep -3515 +▁mik -3516 +▁mol -3517 +▁mon -3518 +▁mua -3519 +▁naw -3520 +▁nei -3521 +▁net -3522 +▁new -3523 +▁nga -3524 +▁nig -3525 +▁noi -3526 +▁non -3527 +▁nus -3528 +▁oto -3529 +▁pew -3530 +▁phi -3531 +▁pkr -3532 +▁pok -3533 +▁puh -3534 +▁rad -3535 +▁rek -3536 +▁ret -3537 +▁roh -3538 +▁ros -3539 +▁ruh -3540 +▁san -3541 +▁sat -3542 +▁sip -3543 +▁sut -3544 +▁tam -3545 +▁tea -3546 +▁tes -3547 +▁tit -3548 +▁tos -3549 +▁tru -3550 +▁uwa -3551 +▁wor -3552 +▁yan -3553 +▁ziz -3554 +aduan -3555 +alami -3556 +aliti -3557 +ampir -3558 +anani -3559 +ancar -3560 +andas -3561 +andum -3562 +angah 
-3563 +angis -3564 +angsa -3565 +angsi -3566 +angun -3567 +anjut -3568 +ansel -3569 +antak -3570 +antut -3571 +anwel -3572 +araan -3573 +aribb -3574 +asaan -3575 +asang -3576 +assap -3577 +atkan -3578 +awati -3579 +bajau -3580 +bisin -3581 +cakup -3582 +chael -3583 +chang -3584 +diana -3585 +dilia -3586 +dipen -3587 +egori -3588 +elles -3589 +empen -3590 +emuan -3591 +engko -3592 +engso -3593 +entik -3594 +entim -3595 +erman -3596 +estik -3597 +ezaan -3598 +gulai -3599 +duo -3600 +ght -3601 +cipt -3602 +▁dep -3603 +bahan -3604 +chool -3605 +elmey -3606 +hibur -3607 +iasan -3608 +idmat -3609 +impak -3610 +impas -3611 +inggu -3612 +insar -3613 +ipina -3614 +iskal -3615 +istan -3616 +istem -3617 +katai -3618 +keben -3619 +kegun -3620 +kuasa -3621 +lawan -3622 +lebih -3623 +libat -3624 +manuk -3625 +mbang -3626 +ology -3627 +ombak -3628 +omoon -3629 +onsep -3630 +ophyl -3631 +ormat -3632 +oyoon -3633 +rafik -3634 +taris -3635 +tawan -3636 +thion -3637 +tunku -3638 +ulang -3639 +ullah -3640 +umpul -3641 +umput -3642 +unduk -3643 +ungku -3644 +ungus -3645 +urung -3646 +using -3647 +uyung -3648 +wayat -3649 +yafie -3650 +yarah -3651 +▁abad -3652 +▁abar -3653 +▁abuk -3654 +▁adat -3655 +▁akal -3656 +▁alah -3657 +▁amar -3658 +▁amas -3659 +▁anam -3660 +▁anim -3661 +▁baan -3662 +▁badm -3663 +▁band -3664 +▁baru -3665 +▁bean -3666 +▁bege -3667 +▁berc -3668 +▁berj -3669 +▁bola -3670 +▁boro -3671 +▁budi -3672 +▁bueh -3673 +▁buul -3674 +▁cuma -3675 +▁deki -3676 +▁dewi -3677 +▁doko -3678 +▁dong -3679 +▁doom -3680 +▁dwic -3681 +▁dymm -3682 +▁ebit -3683 +▁ecli -3684 +▁educ -3685 +▁emas -3686 +▁engo -3687 +▁fair -3688 +▁form -3689 +▁gaza -3690 +▁guru -3691 +▁hawa -3692 +▁hend -3693 +▁idol -3694 +▁ijab -3695 +▁indu -3696 +▁ipoh -3697 +▁iram -3698 +▁iran -3699 +▁jawa -3700 +▁juga -3701 +▁kaki -3702 +▁kama -3703 +▁kans -3704 +▁kaut -3705 +▁kebi -3706 +▁keli -3707 +▁king -3708 +▁kira -3709 +▁kont -3710 +▁kose -3711 +▁kosm -3712 +▁kris -3713 +▁kuda -3714 +▁kula -3715 +▁kuri -3716 +▁kutu 
-3717 +▁lati -3718 +▁lidi -3719 +▁limo -3720 +▁luri -3721 +▁luwa -3722 +▁mahu -3723 +▁maka -3724 +▁mand -3725 +▁mapa -3726 +▁mari -3727 +▁maul -3728 +▁mind -3729 +▁mono -3730 +▁morp -3731 +▁mule -3732 +▁muli -3733 +▁muti -3734 +▁mutu -3735 +▁nama -3736 +▁namp -3737 +▁napu -3738 +▁nawi -3739 +▁nemp -3740 +▁nena -3741 +▁ngap -3742 +▁nget -3743 +▁niid -3744 +▁nika -3745 +▁nimb -3746 +▁norm -3747 +▁nuba -3748 +▁nump -3749 +▁odon -3750 +▁onan -3751 +▁osok -3752 +▁pait -3753 +▁para -3754 +▁perk -3755 +▁pimp -3756 +▁plat -3757 +▁pony -3758 +▁puan -3759 +▁rabi -3760 +▁ranc -3761 +▁redi -3762 +▁repa -3763 +▁reti -3764 +▁sagu -3765 +▁saiz -3766 +▁saji -3767 +▁sans -3768 +▁saud -3769 +▁sino -3770 +▁slav -3771 +▁sudu -3772 +▁sump -3773 +▁suri -3774 +▁tabi -3775 +▁tanj -3776 +▁teen -3777 +▁tekn -3778 +▁tend -3779 +▁trek -3780 +▁trop -3781 +▁tunu -3782 +▁ulam -3783 +▁urea -3784 +▁usul -3785 +▁utok -3786 +▁york -3787 +adilan -3788 +aganza -3789 +anggut -3790 +angkap -3791 +angkit -3792 +anjian -3793 +antuan -3794 +arabni -3795 +arakat -3796 +awasan -3797 +bertak -3798 +betiru -3799 +bourne -3800 +edidik -3801 +ejarah -3802 +ekelum -3803 +elumpi -3804 +emakan -3805 +embang -3806 +embila -3807 +embong -3808 +embung -3809 +endang -3810 +enekan -3811 +entang -3812 +entian -3813 +erlang -3814 +eroton -3815 +hiasan -3816 +hormat -3817 +ikiran -3818 +inakan -3819 +kannya -3820 +kripsi -3821 +lastik -3822 +layang -3823 +maqdis -3824 +mediam -3825 +minggo -3826 +nastik -3827 +negeri -3828 +ondong -3829 +ongkop -3830 +onidae -3831 +perasi -3832 +perpis -3833 +pirasi -3834 +pusing -3835 +rancis -3836 +sanaan -3837 +selang -3838 +soyoon -3839 +suling -3840 +sungai -3841 +tungan -3842 +ukuran -3843 +ulawal -3844 +ulitan -3845 +unding -3846 +ungkul -3847 +▁aktif -3848 +▁ampus -3849 +▁ansur -3850 +▁antap -3851 +▁antar -3852 +▁arnab -3853 +▁asuan -3854 +▁bagai -3855 +▁bakal -3856 +▁balas -3857 +▁batas -3858 +▁batik -3859 +▁baulu -3860 +▁bawah -3861 +▁bebas -3862 +▁bedik -3863 +▁belak -3864 
+▁betak -3865 +▁biasa -3866 +▁binan -3867 +▁bisin -3868 +▁boleh -3869 +▁boron -3870 +▁bueak -3871 +▁burau -3872 +▁carta -3873 +▁chong -3874 +▁cliff -3875 +▁coast -3876 +▁croco -3877 +▁darat -3878 +▁datar -3879 +▁dekad -3880 +▁depas -3881 +▁didie -3882 +▁dodok -3883 +▁edisi -3884 +▁eling -3885 +▁ernie -3886 +▁etnik -3887 +▁falsa -3888 +▁farah -3889 +▁filas -3890 +▁fokus -3891 +▁gagut -3892 +▁ganti -3893 +▁ghani -3894 +▁gurun -3895 +▁halal -3896 +▁halim -3897 +▁halus -3898 +▁hayat -3899 +▁helmi -3900 +▁heter -3901 +▁hindu -3902 +▁hukum -3903 +▁human -3904 +▁hussa -3905 +▁ibada -3906 +▁indah -3907 +▁indie -3908 +▁injin -3909 +▁intom -3910 +▁ionoh -3911 +▁iyotu -3912 +▁jalan -3913 +▁jambu -3914 +▁jamek -3915 +▁jaruk -3916 +▁jipun -3917 +▁judul -3918 +▁kabau -3919 +▁kadaz -3920 +▁kamus -3921 +▁kanan -3922 +▁karir -3923 +▁katig -3924 +▁kecil -3925 +▁kepel -3926 +▁keris -3927 +▁kewau -3928 +▁kimia -3929 +▁kuang -3930 +▁kuapu -3931 +▁kuasi -3932 +▁kubur -3933 +▁kudat -3934 +▁kuleh -3935 +▁kurma -3936 +▁kutub -3937 +▁kylie -3938 +▁laila -3939 +▁lamak -3940 +▁laman -3941 +▁latar -3942 +▁lebak -3943 +▁likas -3944 +▁lokan -3945 +▁macam -3946 +▁mahda -3947 +▁manas -3948 +▁mangk -3949 +▁mapak -3950 +▁maraw -3951 +▁masih -3952 +▁masin -3953 +▁matak -3954 +▁mawar -3955 +▁mayat -3956 +▁mener -3957 +▁mengg -3958 +▁mengh -3959 +▁mengk -3960 +▁metio -3961 +▁metro -3962 +▁mosok -3963 +▁mulud -3964 +▁mundu -3965 +▁namat -3966 +▁namuk -3967 +▁nantu -3968 +▁nerus -3969 +▁ngaku -3970 +▁nging -3971 +▁nilai -3972 +▁nissa -3973 +▁niusa -3974 +▁novel -3975 +▁ntawa -3976 +▁nunuk -3977 +▁objek -3978 +▁pakau -3979 +▁panut -3980 +▁paras -3981 +▁parit -3982 +▁pasir -3983 +▁pedia -3984 +▁pekam -3985 +▁pelan -3986 +▁pelgo -3987 +▁pened -3988 +▁penek -3989 +▁penel -3990 +▁penyi -3991 +▁perag -3992 +▁pesak -3993 +▁pesat -3994 +▁pilem -3995 +▁pilit -3996 +▁pipit -3997 +▁pitas -3998 +▁pitra -3999 +▁pukul -4000 +▁pulis -4001 +▁quran -4002 +▁ramai -4003 +▁rangk -4004 +▁ransa -4005 +▁right -4006 +▁rindu 
-4007 +▁sabri -4008 +▁sabun -4009 +▁salun -4010 +▁sawit -4011 +▁sayap -4012 +▁semem -4013 +▁seych -4014 +▁sinar -4015 +▁sinta -4016 +▁sioko -4017 +▁stacy -4018 +▁sudai -4019 +▁sunat -4020 +▁suntu -4021 +▁super -4022 +▁surau -4023 +▁sybil -4024 +▁tabah -4025 +▁taman -4026 +▁tamat -4027 +▁tanak -4028 +▁tangg -4029 +▁tanpa -4030 +▁tantu -4031 +▁tapak -4032 +▁taruk -4033 +▁tasik -4034 +▁tatap -4035 +▁tekon -4036 +▁tetag -4037 +▁tibau -4038 +▁tikus -4039 +▁tiner -4040 +▁titik -4041 +▁togon -4042 +▁tuhan -4043 +▁udara -4044 +▁ulama -4045 +▁umpan -4046 +▁umrah -4047 +▁ustaz -4048 +▁wings -4049 +▁world -4050 +▁yahya -4051 +▁yusof -4052 +ahuwata -4053 +amerika -4054 +andakan -4055 +ational -4056 +bagaian -4057 +elakkan -4058 +embahan -4059 +esoyoon -4060 +estoran -4061 +etahuan -4062 +lembaga -4063 +manusia -4064 +mustafa -4065 +pandung -4066 +sambung -4067 +semimon -4068 +tomobil -4069 +ulhasni -4070 +▁akibat -4071 +▁bahawa -4072 +▁baltik -4073 +▁bangga -4074 +▁bangku -4075 +▁bangsa -4076 +▁bebeli -4077 +▁beguno -4078 +▁benang -4079 +▁biskal -4080 +▁bongsu -4081 +▁bosoon -4082 +▁burung -4083 +▁cansel -4084 +▁caribb -4085 +▁dangay -4086 +▁danish -4087 +▁dediki -4088 +▁dekilo -4089 +▁destin -4090 +▁diraja -4091 +▁endiom -4092 +▁engkoh -4093 +▁engkon -4094 +▁ensedi -4095 +▁entawa -4096 +▁fatiha -4097 +▁gambar -4098 +▁gambus -4099 +▁gandum -4100 +▁golden -4101 +▁gombak -4102 +▁grafik -4103 +▁gunung -4104 +▁hampir -4105 +▁hektar -4106 +▁helios -4107 +▁hidang -4108 +▁hijrah -4109 +▁ibadat -4110 +▁ibarat -4111 +▁ilanun -4112 +▁iyopan -4113 +▁iyotuh -4114 +▁jarang -4115 +▁jenaja -4116 +▁joseph -4117 +▁kaabah -4118 +▁kampus -4119 +▁karang -4120 +▁kerang -4121 +▁khusus -4122 +▁kinaji -4123 +▁konsep -4124 +▁kreati -4125 +▁kuling -4126 +▁kunyit -4127 +▁langah -4128 +▁lantik -4129 +▁legend -4130 +▁lentik -4131 +▁letian -4132 +▁liabas -4133 +▁lokasi -4134 +▁markah -4135 +▁masing -4136 +▁meling -4137 +▁melodi -4138 +▁memper -4139 +▁meruma -4140 +▁meteko -4141 +▁nampai -4142 +▁nandar 
-4143 +▁nangga -4144 +▁ngeket -4145 +▁ngenda -4146 +▁ngeruo -4147 +▁ngguno -4148 +▁ngodop -4149 +▁ngupas -4150 +▁niajur -4151 +▁nianit -4152 +▁nintan -4153 +▁nipers -4154 +▁panday -4155 +▁payung -4156 +▁peguam -4157 +▁peketa -4158 +▁penapa -4159 +▁pengor -4160 +▁pergel -4161 +▁petani -4162 +▁pinggo -4163 +▁polska -4164 +▁produk -4165 +▁pusing -4166 +▁putera -4167 +▁rabung -4168 +▁rahmat -4169 +▁ramlee -4170 +▁rangup -4171 +▁rejiki -4172 +▁relamu -4173 +▁rohani -4174 +▁rumini -4175 +▁rungai -4176 +▁runsay -4177 +▁sabyan -4178 +▁sahaya -4179 +▁saleha -4180 +▁salleh -4181 +▁sampay -4182 +▁sampul -4183 +▁santak -4184 +▁sapind -4185 +▁sayang -4186 +▁school -4187 +▁sebgai -4188 +▁sedaka -4189 +▁sejati -4190 +▁seluar -4191 +▁semoga -4192 +▁sentim -4193 +▁sesair -4194 +▁sharif -4195 +▁siasai -4196 +▁simbol -4197 +▁sistem -4198 +▁sosial -4199 +▁spesis -4200 +▁suling -4201 +▁sungku -4202 +▁sutung -4203 +▁syarak -4204 +▁syurga -4205 +▁takkan -4206 +▁tanduk -4207 +▁tangsi -4208 +▁tarang -4209 +▁tarian -4210 +▁taubat -4211 +▁tebiat -4212 +▁tebung -4213 +▁tekook -4214 +▁tenunu -4215 +▁tetamu -4216 +▁tetemu -4217 +▁tetiak -4218 +▁tinagu -4219 +▁tinggi -4220 +▁tinubu -4221 +▁unggul -4222 +▁wassap -4223 +▁yahudi -4224 +▁yamani -4225 +▁zainal -4226 +elidikan -4227 +intangan -4228 +jerantut -4229 +kegunoon -4230 +ophyllus -4231 +▁akhirat -4232 +▁anugera -4233 +▁baharum -4234 +▁banding -4235 +▁benafas -4236 +▁bersama -4237 +▁berubah -4238 +▁bitamin -4239 +▁bungkak -4240 +▁cabaran -4241 +▁dangdut -4242 +▁dembuah -4243 +▁demburi -4244 +▁dentaun -4245 +▁eclipse -4246 +▁embunda -4247 +▁enderio -4248 +▁england -4249 +▁fatimah -4250 +▁finalis -4251 +▁francis -4252 +▁genting -4253 +▁ginunoh -4254 +▁harapan -4255 +▁hussein -4256 +▁idangan -4257 +▁janggut -4258 +▁jebatan -4259 +▁jewatan -4260 +▁kabinet -4261 +▁kalsium -4262 +▁kansang -4263 +▁keempat -4264 +▁kekanak -4265 +▁kekapeh -4266 +▁keoyoon -4267 +▁kepesan -4268 +▁keramat -4269 +▁kinunsi -4270 +▁kumbung -4271 +▁machang -4272 +▁makaleh 
-4273 +▁mangkin -4274 +▁masakan -4275 +▁mekaleh -4276 +▁melioro -4277 +▁meluman -4278 +▁meranau -4279 +▁meronok -4280 +▁mesegul -4281 +▁michael -4282 +▁mutiara -4283 +▁naungan -4284 +▁nembali -4285 +▁ngeraih -4286 +▁ngubung -4287 +▁ngulang -4288 +▁nongkob -4289 +▁organik -4290 +▁pakayan -4291 +▁palikat -4292 +▁pasaran -4293 +▁pekaian -4294 +▁pelajar -4295 +▁pelamin -4296 +▁pelawak -4297 +▁pelepah -4298 +▁pelikat -4299 +▁pelumba -4300 +▁pemebua -4301 +▁pemuzik -4302 +▁penawar -4303 +▁penemia -4304 +▁pengeet -4305 +▁perabut -4306 +▁pesisir -4307 +▁pesukan -4308 +▁petenab -4309 +▁pineneh -4310 +▁pinggir -4311 +▁plastik -4312 +▁ratusan -4313 +▁realiti -4314 +▁rendang -4315 +▁ruangan -4316 +▁salamat -4317 +▁sangkan -4318 +▁sebelah -4319 +▁sekelon -4320 +▁selepas -4321 +▁selipar -4322 +▁seluran -4323 +▁semakin -4324 +▁senarai -4325 +▁senator -4326 +▁senegul -4327 +▁seramah -4328 +▁setanga -4329 +▁setelah -4330 +▁sinaran -4331 +▁sinebut -4332 +▁sinegul -4333 +▁spesies -4334 +▁sungkak -4335 +▁tamadun -4336 +▁tanjung -4337 +▁tarikan -4338 +▁tekakan -4339 +▁tekanan -4340 +▁tekuleh -4341 +▁telepon -4342 +▁tembara -4343 +▁tentang -4344 +▁tepenek -4345 +▁tinarik -4346 +▁tinulis -4347 +▁tinutur -4348 +▁tunggal -4349 +▁tungkul -4350 +▁upacara -4351 +▁vaganza -4352 +angkerang -4353 +bertakhta -4354 +▁abdillah -4355 +▁aktiviti -4356 +▁atlantik -4357 +▁bangunan -4358 +▁bayangan -4359 +▁berdikir -4360 +▁berfikir -4361 +▁berkanji -4362 +▁bersuara -4363 +▁berunsay -4364 +▁bilangan -4365 +▁binentuk -4366 +▁bujangan -4367 +▁cervinae -4368 +▁clifford -4369 +▁dayangku -4370 +▁dembangi -4371 +▁dembilak -4372 +▁depasang -4373 +▁entorong -4374 +▁falsafah -4375 +▁gambaran -4376 +▁iskandar -4377 +▁jambatan -4378 +▁kategori -4379 +▁kawasaan -4380 +▁keadilan -4381 +▁kegunaan -4382 +▁kenangan -4383 +▁keteraan -4384 +▁komanwel -4385 +▁kosmetik -4386 +▁maembong -4387 +▁melihoro -4388 +▁mencakup -4389 +▁mengalir -4390 +▁mesampai -4391 +▁mesanang -4392 +▁mesimpon -4393 +▁mesjidil -4394 +▁metangun 
-4395 +▁mikologi -4396 +▁mohammad -4397 +▁muslimat -4398 +▁muslimin -4399 +▁nganggap -4400 +▁ngedidik -4401 +▁ngelagan -4402 +▁ngelumpi -4403 +▁ngengung -4404 +▁ngerasun -4405 +▁nianggap -4406 +▁nordiana -4407 +▁pahlawan -4408 +▁panggung -4409 +▁pasangan -4410 +▁pegacara -4411 +▁peginsar -4412 +▁pelestik -4413 +▁pelihoro -4414 +▁pelimpas -4415 +▁pelistik -4416 +▁pembunuh -4417 +▁pendapat -4418 +▁penejadi -4419 +▁penganan -4420 +▁penginum -4421 +▁pengurik -4422 +▁peranggi -4423 +▁perbagai -4424 +▁periasan -4425 +▁perintah -4426 +▁pesorong -4427 +▁pimpinan -4428 +▁pinejadi -4429 +▁pinesoyo -4430 +▁pinusing -4431 +▁rambutan -4432 +▁rembutan -4433 +▁sambutan -4434 +▁sanskrit -4435 +▁sebanani -4436 +▁selangur -4437 +▁sembatan -4438 +▁seneguul -4439 +▁serimpak -4440 +▁tandasan -4441 +▁terkecil -4442 +▁terlibat -4443 +▁terposok -4444 +▁tropical -4445 +▁tumbuhan -4446 +▁wartawan -4447 +ahuwataala -4448 +kebenyakan -4449 +perpisahan -4450 +▁automobil -4451 +▁badminton -4452 +▁bajausama -4453 +▁bedinakan -4454 +▁berangkit -4455 +▁berbongso -4456 +▁berukuran -4457 +▁betenomon -4458 +▁caribbean -4459 +▁destinasi -4460 +▁education -4461 +▁gimnastik -4462 +▁kelebihan -4463 +▁kenderaan -4464 +▁keputusan -4465 +▁kinulitan -4466 +▁komersial -4467 +▁lendangan -4468 +▁malathion -4469 +▁malaysian -4470 +▁melbourne -4471 +▁membentuk -4472 +▁mepanakan -4473 +▁merupokan -4474 +▁morpoloji -4475 +▁ngelanjut -4476 +▁nusantara -4477 +▁pembagian -4478 +▁pembuatan -4479 +▁pendakwah -4480 +▁penenekan -4481 +▁pengelego -4482 +▁pengentan -4483 +▁penghibur -4484 +▁penongkop -4485 +▁pensyarah -4486 +▁perbezaan -4487 +▁perhiasan -4488 +▁pesindung -4489 +▁pinekiton -4490 +▁rancangan -4491 +▁sedembila -4492 +▁selangkau -4493 +▁seligundi -4494 +▁sempangan -4495 +▁sesambung -4496 +▁sesangkap -4497 +▁shahelmey -4498 +▁silanggar -4499 +▁talibisin -4500 +▁teknologi -4501 +▁tengkatai -4502 +▁tenguyung -4503 +▁tinangkin -4504 +▁wikipedia -4505 +dipenanakan -4506 +▁berkhidmat -4507 +▁bertanding -4508 
+▁bertemakan -4509 +▁crocodilia -4510 +▁kemenangan -4511 +▁kepesoyoon -4512 +▁lengkubang -4513 +▁nilanggang -4514 +▁pemerintah -4515 +▁penelihoro -4516 +▁peningkoon -4517 +▁perjanjian -4518 +▁pernikahan -4519 +▁preskripsi -4520 +▁rabiulawal -4521 +▁sentimeter -4522 +▁seychelles -4523 +▁terpanjang -4524 +▁university -4525 +▁azizulhasni -4526 +▁kepersoyoon -4527 +▁minangkabau -4528 +▁nipersembah -4529 +▁pelancongan -4530 +▁pembentukan -4531 +▁pengerumput -4532 +▁pengetahuan -4533 +▁perdagangan -4534 +▁pergelangan -4535 +▁persembahan -4536 +▁sapindaceae -4537 +▁baitulmaqdis -4538 +▁jawatankuasa -4539 +▁kebangsanaan -4540 +▁kepelbagaian -4541 +▁penyelidikan -4542 +▁dwicangkerang -4543 +▁heterophyllus -4544 +▁menjadikannya -4545 +fb -4546 +gr -4547 +pk -4548 +pu -4549 +ru -4550 +wi -4551 +yw -4552 +zc -4553 +zl -4554 +”, -4555 +.), -4556 +adu -4557 +aip -4558 +apr -4559 +ayn -4560 +bor -4561 +ebi -4562 +eol -4563 +eor -4564 +esa -4565 +esm -4566 +fis -4567 +fti -4568 +gin -4569 +grs -4570 +hai -4571 +hij -4572 +jaj -4573 +jil -4574 +jol -4575 +kil -4576 +kir -4577 +lor -4578 +maj -4579 +men -4580 +mor -4581 +oja -4582 +oll -4583 +puk -4584 +ros -4585 +seb -4586 +sor -4587 +spm -4588 +sus -4589 +tab -4590 +tad -4591 +tes -4592 +tib -4593 +tri -4594 +ush -4595 +uss -4596 +uud -4597 +uus -4598 +vas -4599 +cr -4600 +but -4601 +inh -4602 +lul -4603 +ous -4604 +raj -4605 +tip -4606 +uay -4607 +wor -4608 +yol -4609 +▁ep -4610 +▁ex -4611 +▁ga -4612 +▁gy -4613 +▁hy -4614 +▁py -4615 +▁tk -4616 +▁ww -4617 +▁yb -4618 +acam -4619 +adil -4620 +agad -4621 +akir -4622 +aksi -4623 +alil -4624 +ance -4625 +anth -4626 +apis -4627 +apus -4628 +araf -4629 +arec -4630 +arki -4631 +asir -4632 +ayas -4633 +cret -4634 +demb -4635 +dygu -4636 +ebat -4637 +ebra -4638 +edit -4639 +eger -4640 +eikh -4641 +ejal -4642 +ejar -4643 +elua -4644 +elud -4645 +emui -4646 +epor -4647 +eril -4648 +erso -4649 +esen -4650 +esol -4651 +ette -4652 +face -4653 +hadi -4654 +heim -4655 +ifik -4656 +ildr -4657 +ilis 
-4658 +inyw -4659 +irai -4660 +iran -4661 +irap -4662 +irus -4663 +itak -4664 +itte -4665 +iung -4666 +kela -4667 +khus -4668 +kira -4669 +koyo -4670 +kuih -4671 +llah -4672 +lusk -4673 +mana -4674 +mara -4675 +mpau -4676 +ngon -4677 +oany -4678 +ojok -4679 +omad -4680 +osos -4681 +ouis -4682 +outh -4683 +pace -4684 +pent -4685 +poda -4686 +rani -4687 +riap -4688 +ropa -4689 +saha -4690 +sara -4691 +siar -4692 +somp -4693 +tiup -4694 +uage -4695 +ueen -4696 +uket -4697 +ulen -4698 +ulti -4699 +gag -4700 +gos -4701 +uit -4702 +adir -4703 +agum -4704 +asis -4705 +dong -4706 +elch -4707 +elim -4708 +emit -4709 +ipat -4710 +ital -4711 +itis -4712 +simp -4713 +uitm -4714 +ulus -4715 +uska -4716 +usti -4717 +wiec -4718 +work -4719 +yian -4720 +ysia -4721 +ytuk -4722 +zaki -4723 +zana -4724 +zecz -4725 +▁aco -4726 +▁act -4727 +▁ain -4728 +▁anc -4729 +▁ani -4730 +▁bia -4731 +▁boh -4732 +▁buw -4733 +▁car -4734 +▁cin -4735 +▁daw -4736 +▁dut -4737 +▁eur -4738 +▁fiq -4739 +▁gau -4740 +▁gek -4741 +▁gem -4742 +▁gut -4743 +▁hap -4744 +▁hem -4745 +▁hui -4746 +▁ilo -4747 +▁juj -4748 +▁kec -4749 +▁kij -4750 +▁kor -4751 +▁meb -4752 +▁myr -4753 +▁naj -4754 +▁ngo -4755 +▁npk -4756 +▁oro -4757 +▁pud -4758 +▁pyh -4759 +▁qun -4760 +▁sau -4761 +▁teg -4762 +▁tis -4763 +▁tor -4764 +▁ukt -4765 +▁vii -4766 +▁wit -4767 +▁www -4768 +abeth -4769 +about -4770 +adiam -4771 +agamu -4772 +aguay -4773 +aheim -4774 +ajang -4775 +akili -4776 +allul -4777 +amban -4778 +andan -4779 +anyol -4780 +areca -4781 +asana -4782 +atuan -4783 +aynag -4784 +azhad -4785 +baiki -4786 +bubur -4787 +capai -4788 +cetak -4789 +cipta -4790 +dusun -4791 +edion -4792 +egaan -4793 +ektif -4794 +elela -4795 +eliga -4796 +emand -4797 +enduo -4798 +ental -4799 +ich -4800 +lum -4801 +rak -4802 +uth -4803 +▁), -4804 +▁iu -4805 +adas -4806 +dium -4807 +ebus -4808 +emik -4809 +erit -4810 +eton -4811 +iche -4812 +jogo -4813 +line -4814 +▁iuc -4815 +▁nat -4816 +▁rez -4817 +▁rud -4818 +▁spm -4819 +eping -4820 +etnam -4821 +etrom -4822 
+eyang -4823 +filum -4824 +garet -4825 +gincu -4826 +higas -4827 +ictor -4828 +ikasi -4829 +ikian -4830 +imoni -4831 +inaat -4832 +jawab -4833 +kface -4834 +kngon -4835 +langa -4836 +logot -4837 +luang -4838 +macam -4839 +mitte -4840 +nadil -4841 +nasir -4842 +nibaw -4843 +oklat -4844 +oksid -4845 +olleg -4846 +onjol -4847 +ontok -4848 +onton -4849 +oskel -4850 +ouise -4851 +palap -4852 +penek -4853 +pital -4854 +rings -4855 +rinyw -4856 +siapa -4857 +spesi -4858 +sudin -4859 +sunam -4860 +suntu -4861 +tapuk -4862 +tasan -4863 +teluk -4864 +trict -4865 +tyman -4866 +uasai -4867 +uatik -4868 +udera -4869 +uktur -4870 +ulian -4871 +untuk -4872 +untut -4873 +uppet -4874 +uraan -4875 +urang -4876 +ushpi -4877 +usnah -4878 +ustic -4879 +yanyi -4880 +yarat -4881 +zczec -4882 +zlina -4883 +▁abid -4884 +▁amel -4885 +▁amis -4886 +▁anna -4887 +▁apoh -4888 +▁awie -4889 +▁baol -4890 +▁bapt -4891 +▁benk -4892 +▁bent -4893 +▁bero -4894 +▁biak -4895 +▁biar -4896 +▁bios -4897 +▁braz -4898 +▁buka -4899 +wh -4900 +bia -4901 +dhi -4902 +ier -4903 +pur -4904 +reb -4905 +vak -4906 +yur -4907 +▁(" -4908 +▁ac -4909 +alma -4910 +amil -4911 +amis -4912 +gout -4913 +ierz -4914 +temu -4915 +ural -4916 +▁bij -4917 +▁dur -4918 +▁jur -4919 +▁rab -4920 +▁sad -4921 +▁sih -4922 +▁tay -4923 +amily -4924 +andal -4925 +ching -4926 +lovak -4927 +mamis -4928 +onial -4929 +osomo -4930 +strum -4931 +ungur -4932 +whist -4933 +▁alun -4934 +▁apun -4935 +▁bokh -4936 +▁boyo -4937 +▁cany -4938 +▁chik -4939 +▁choo -4940 +▁cony -4941 +▁dani -4942 +▁depr -4943 +▁desm -4944 +▁dipl -4945 +▁efis -4946 +▁gaai -4947 +▁gary -4948 +▁gauk -4949 +▁gekk -4950 +▁geor -4951 +▁gimb -4952 +▁grup -4953 +▁hani -4954 +▁hemb -4955 +▁huff -4956 +▁hugh -4957 +▁inth -4958 +▁iucn -4959 +▁john -4960 +▁khat -4961 +▁kind -4962 +▁koon -4963 +▁kuiz -4964 +▁kuno -4965 +▁laem -4966 +▁laos -4967 +▁lauk -4968 +▁lebo -4969 +▁liga -4970 +▁liha -4971 +▁linn -4972 +▁ltes -4973 +▁luat -4974 +▁mahe -4975 +▁mang -4976 +▁mast -4977 +▁mimi -4978 +▁minn 
-4979 +▁nadi -4980 +▁ngab -4981 +▁ngar -4982 +▁niab -4983 +▁niir -4984 +▁nike -4985 +▁nium -4986 +▁nius -4987 +▁nooh -4988 +▁nuur -4989 +▁oleg -4990 +▁otot -4991 +▁paip -4992 +▁panj -4993 +▁penu -4994 +▁perc -4995 +▁perl -4996 +▁perp -4997 +▁port -4998 +▁pugh -4999 +aam -5000 +ima -5001 +kul -5002 +tah -5003 +vil -5004 +▁mr -5005 +awat -5006 +ding -5007 +eeng -5008 +elit -5009 +ingu -5010 +ituh -5011 +jing -5012 +king -5013 +kule -5014 +peng -5015 +yana -5016 +▁bib -5017 +▁guk -5018 +▁hin -5019 +▁mis -5020 +▁tup -5021 +▁zir -5022 +alela -5023 +aring -5024 +diing -5025 +elemb -5026 +impin -5027 +minya -5028 +pengh -5029 +tahip -5030 +uleng -5031 +uling -5032 +ulung -5033 +wkule -5034 +▁berg -5035 +▁khor -5036 +▁lemb -5037 +▁miss -5038 +▁nona -5039 +▁pyhm -5040 +▁rara -5041 +▁rcti -5042 +▁renc -5043 +▁rosn -5044 +▁rsul -5045 +▁sari -5046 +▁sauh -5047 +▁sema -5048 +▁sigo -5049 +▁silv -5050 +▁takl -5051 +▁tasy -5052 +▁teba -5053 +▁tech -5054 +▁test -5055 +▁tisu -5056 +▁toro -5057 +▁trip -5058 +▁tuas -5059 +▁unik -5060 +▁wari -5061 +abagai -5062 +adijah -5063 +agendo -5064 +ahawan -5065 +ajahmu -5066 +akulti -5067 +alaman -5068 +almaah -5069 +alonan -5070 +amaele -5071 +amping -5072 +angkul -5073 +angsar -5074 +antaan -5075 +anthus -5076 +anyang -5077 +aptera -5078 +chheim -5079 +dangai -5080 +dembua -5081 +densia -5082 +dhiyah -5083 +ebaran -5084 +ecapai -5085 +eijing -5086 +ejelis -5087 +elaksa -5088 +elergi -5089 +ematik -5090 +embara -5091 +emoson -5092 +empuan -5093 +endaki -5094 +ermann -5095 +ernaan -5096 +ertebr -5097 +eviche -5098 +foloji -5099 +"- -5100 +va -5101 +/"- -5102 +csi -5103 +cuk -5104 +hks -5105 +mud -5106 +ser -5107 +"/"- -5108 +elae -5109 +engs -5110 +eran -5111 +eris -5112 +hksn -5113 +java -5114 +mang -5115 +masi -5116 +▁bac -5117 +▁rut -5118 +▁sha -5119 +cukil -5120 +emler -5121 +irang -5122 +landa -5123 +osisi -5124 +urinh -5125 +ustas -5126 +ymeso -5127 +▁bach -5128 +▁berw -5129 +▁left -5130 +▁meth -5131 +▁ucsi -5132 +ansong -5133 +eeting 
-5134 +ejaruk -5135 +ekaran -5136 +elmaan -5137 +formes -5138 +gaggan -5139 +genous -5140 +gibang -5141 +hampar -5142 +hidang -5143 +ildren -5144 +inding -5145 +ingkot -5146 +intern -5147 +istari -5148 +jotobo -5149 +lampau -5150 +laysia -5151 +majlis -5152 +market -5153 +mawati -5154 +morfem -5155 +nganak -5156 +nology -5157 +oksida -5158 +omatik -5159 +omposi -5160 +ongsot -5161 +otogen -5162 +paceum -5163 +pemand -5164 +penghu -5165 +pentes -5166 +pinggo -5167 +polani -5168 +rajaya -5169 +rebung -5170 +rikula -5171 +runduk -5172 +seguul -5173 +serama -5174 +sidium -5175 +sihksn -5176 +smouth -5177 +suddin -5178 +tebrat -5179 +tinaat -5180 +uistik -5181 +untung -5182 +uppets -5183 +uruheb -5184 +ussala -5185 +ustics -5186 +▁aeril -5187 +▁agnez -5188 +▁aktip -5189 +▁asyur -5190 +▁atlas -5191 +▁azura -5192 +▁bahru -5193 +▁belut -5194 +▁berak -5195 +▁berri -5196 +▁biarp -5197 +▁bingk -5198 +▁birch -5199 +mt -5200 +ph -5201 +akb -5202 +ers -5203 +hol -5204 +kek -5205 +opa -5206 +ose -5207 +uku -5208 +▁fa -5209 +alon -5210 +laan -5211 +lopa -5212 +oken -5213 +pait -5214 +usin -5215 +▁afo -5216 +▁anw -5217 +▁kod -5218 +▁umb -5219 +▁zar -5220 +akbar -5221 +angai -5222 +clopa -5223 +erson -5224 +holic -5225 +khati -5226 +lerai -5227 +▁afov -5228 +▁book -5229 +▁dpmt -5230 +▁kila -5231 +▁koot -5232 +▁kuku -5233 +▁madu -5234 +akanda -5235 +ancang -5236 +elamin -5237 +ishamm -5238 +khatib -5239 +omoter -5240 +onokan -5241 +philis -5242 +tisari -5243 +utawak -5244 +▁anwar -5245 +▁berun -5246 +▁books -5247 +▁bovid -5248 +▁brach -5249 +▁bumba -5250 +▁buwau -5251 +▁calon -5252 +▁citra -5253 +▁curah -5254 +▁dadar -5255 +▁dakwa -5256 +▁dayag -5257 +▁dekan -5258 +▁dengs -5259 +▁depok -5260 +▁derur -5261 +▁desar -5262 +▁detik -5263 +▁dibub -5264 +▁didap -5265 +▁dikir -5266 +▁diput -5267 +▁disif -5268 +▁durah -5269 +▁dutai -5270 +▁ellum -5271 +▁engam -5272 +▁enzim -5273 +▁expan -5274 +▁felis -5275 +▁fenem -5276 +▁fenom -5277 +▁fiqah -5278 +▁fizik -5279 +▁getah -5280 +▁gorop -5281 
+▁hadas -5282 +▁hairi -5283 +▁hanif -5284 +▁hebat -5285 +▁hemid -5286 +▁hijau -5287 +▁hindi -5288 +▁hyper -5289 +▁ihram -5290 +▁ihsan -5291 +▁infin -5292 +▁infra -5293 +▁iropa -5294 +▁itali -5295 +▁jamil -5296 +▁jangk -5297 +▁juhur -5298 +▁kagum -5299 +aq -5300 +dn -5301 +anb -5302 +dnk -5303 +kun -5304 +nin -5305 +oen -5306 +uni -5307 +▁hb -5308 +▁ps -5309 +▁sq -5310 +▁vo -5311 +able -5312 +alat -5313 +aqad -5314 +erun -5315 +ihak -5316 +ilik -5317 +illi -5318 +khla -5319 +memb -5320 +oeni -5321 +ujud -5322 +▁jom -5323 +▁psb -5324 +▁wau -5325 +ahaja -5326 +edoom -5327 +ninak -5328 +oenix -5329 +pusal -5330 +ulhij -5331 +▁kaor -5332 +▁kdnk -5333 +▁maal -5334 +▁niut -5335 +▁panc -5336 +▁psbs -5337 +▁rawa -5338 +ankton -5339 +antiti -5340 +asiswa -5341 +ausaha -5342 +higasu -5343 +illiam -5344 +tangan -5345 +tisuni -5346 +▁buluh -5347 +▁kajap -5348 +▁kaoru -5349 +▁kapor -5350 +▁kelok -5351 +▁kelul -5352 +▁kenek -5353 +▁keras -5354 +▁kesan -5355 +▁ketio -5356 +▁ketul -5357 +▁kilat -5358 +▁kinel -5359 +▁komun -5360 +▁korok -5361 +▁kulat -5362 +▁kupuk -5363 +▁kurta -5364 +▁kusus -5365 +▁lemba -5366 +▁lepor -5367 +▁lingk -5368 +▁logam -5369 +▁logot -5370 +▁luang -5371 +▁lulus -5372 +▁lumba -5373 +▁magri -5374 +▁magus -5375 +▁malek -5376 +▁malim -5377 +▁marin -5378 +▁masak -5379 +▁masap -5380 +▁masau -5381 +▁masay -5382 +▁maser -5383 +▁medan -5384 +▁medir -5385 +▁megah -5386 +▁megat -5387 +▁mercu -5388 +▁metal -5389 +▁mimpi -5390 +▁monic -5391 +▁mosos -5392 +▁mufti -5393 +▁munaf -5394 +▁najwa -5395 +▁napis -5396 +▁narik -5397 +▁natri -5398 +▁nebar -5399 +.. -5400 +km -5401 +ny -5402 +yn -5403 +zb -5404 +’. 
-5405 +lin -5406 +lyn -5407 +nan -5408 +pri -5409 +zah -5410 +▁tp -5411 +arut -5412 +bukk -5413 +egik -5414 +ekom -5415 +itam -5416 +jamb -5417 +pula -5418 +zbek -5419 +zzah -5420 +▁doa -5421 +▁fot -5422 +▁mab -5423 +▁sai -5424 +▁tai -5425 +adzam -5426 +cinta -5427 +eclin -5428 +endak -5429 +johor -5430 +nulis -5431 +ontin -5432 +osint -5433 +prion -5434 +ungsi -5435 +▁antu -5436 +▁bern -5437 +▁gang -5438 +▁kaam -5439 +▁maba -5440 +▁muaw -5441 +▁said -5442 +▁tela -5443 +▁tele -5444 +aranya -5445 +contin -5446 +ekomun -5447 +pulaun -5448 +sibukk -5449 +unsung -5450 +▁adnan -5451 +▁aiwan -5452 +▁basri -5453 +▁berik -5454 +▁datay -5455 +▁dekap -5456 +▁dikel -5457 +▁hidup -5458 +▁hitam -5459 +▁izzah -5460 +▁kedud -5461 +▁neelo -5462 +▁nendo -5463 +▁nesol -5464 +▁netap -5465 +▁ngetu -5466 +▁ngias -5467 +▁nikah -5468 +▁nilep -5469 +▁ningo -5470 +▁numur -5471 +▁order -5472 +▁patuh -5473 +▁pedus -5474 +▁pejab -5475 +▁pelid -5476 +▁penan -5477 +▁penau -5478 +▁penet -5479 +▁penoh -5480 +▁penok -5481 +▁penuh -5482 +▁penut -5483 +▁phang -5484 +▁pikir -5485 +▁pinad -5486 +▁pinat -5487 +▁pitri -5488 +▁plate -5489 +▁pokok -5490 +▁press -5491 +▁prima -5492 +▁pudar -5493 +▁punso -5494 +▁pyhmy -5495 +▁queen -5496 +▁qunut -5497 +▁raden -5498 +▁radin -5499 +fk -5500 +yc -5501 +’, -5502 +bdr -5503 +jam -5504 +olv -5505 +rid -5506 +sad -5507 +sim -5508 +▁eh -5509 +▁gh -5510 +elek -5511 +etek -5512 +otor -5513 +oung -5514 +siti -5515 +ycle -5516 +▁grs -5517 +▁kaf -5518 +▁kon -5519 +alang -5520 +jamal -5521 +musim -5522 +olver -5523 +ounge -5524 +parai -5525 +ridge -5526 +ubual -5527 +▁cate -5528 +▁irom -5529 +▁kafe -5530 +▁konf -5531 +▁nara -5532 +▁nggo -5533 +▁niuk -5534 +▁rung -5535 +▁shai -5536 +▁syed -5537 +▁tema -5538 +▁temu -5539 +▁ters -5540 +▁zait -5541 +▁zodi -5542 +anghai -5543 +ushpig -5544 +yalang -5545 +▁arsad -5546 +▁cetek -5547 +▁cycle -5548 +▁darul -5549 +▁ehwal -5550 +▁garpu -5551 +▁ghaib -5552 +▁ngiro -5553 +▁ninea -5554 +▁redie -5555 +▁rediu -5556 +▁retis -5557 +▁ruddy 
-5558 +▁sabar -5559 +▁samad -5560 +▁satun -5561 +▁saudi -5562 +▁sciur -5563 +▁sebag -5564 +▁sebap -5565 +▁secti -5566 +▁segay -5567 +▁seisi -5568 +▁sejah -5569 +▁semim -5570 +▁semul -5571 +▁seram -5572 +▁sigol -5573 +▁sihat -5574 +▁sijil -5575 +▁simin -5576 +▁sipak -5577 +▁sipit -5578 +▁sirap -5579 +▁sivil -5580 +▁somad -5581 +▁squar -5582 +▁stage -5583 +▁strat -5584 +▁subda -5585 +▁swafo -5586 +▁tabii -5587 +▁tangk -5588 +▁taraf -5589 +▁tarud -5590 +▁teenn -5591 +▁tegas -5592 +▁terap -5593 +▁terat -5594 +▁terim -5595 +▁teruk -5596 +▁thong -5597 +▁timpa -5598 +▁tirai -5599 +/. -5600 +av -5601 +mk -5602 +hat -5603 +him -5604 +jak -5605 +med -5606 +sha -5607 +aliq -5608 +avon -5609 +bija -5610 +buas -5611 +gian -5612 +ihah -5613 +kasi -5614 +piyo -5615 +tiar -5616 +▁lee -5617 +▁nan -5618 +▁oan -5619 +▁puj -5620 +▁sah -5621 +▁sef -5622 +allah -5623 +chong -5624 +ibril -5625 +rumah -5626 +sifat -5627 +▁babi -5628 +▁bird -5629 +ekutif -5630 +hatian -5631 +itaran -5632 +linton -5633 +ombong -5634 +pingat -5635 +ulaman -5636 +usiaan -5637 +▁ajian -5638 +▁bgian -5639 +▁bobok -5640 +▁faliq -5641 +▁imjak -5642 +▁ipiyo -5643 +▁lanan -5644 +▁peeng -5645 +▁putri -5646 +▁sahel -5647 +▁sefal -5648 +▁titoo -5649 +▁tojok -5650 +▁tokyo -5651 +▁tomoh -5652 +▁torom -5653 +▁uktub -5654 +▁umbut -5655 +▁union -5656 +▁wahyu -5657 +▁wakap -5658 +▁walau -5659 +▁welch -5660 +▁wierz -5661 +▁wujud -5662 +▁yaudi -5663 +▁zarai -5664 +abangsa -5665 +ahallul -5666 +andusun -5667 +anugera -5668 +asarkan -5669 +atangan -5670 +binanan -5671 +bisiana -5672 +bukhari -5673 +calunan -5674 +chowiec -5675 +ciation -5676 +ciptaan -5677 +cyclopa -5678 +dyguard -5679 +ebabkan -5680 +eclinic -5681 +edangan -5682 +empitan -5683 +empuyat -5684 +engkaan -5685 +eolinae -5686 +eratkan -5687 +eringgi -5688 +erundau -5689 +ettyman -5690 +eyangni -5691 +gelisah -5692 +hadiran -5693 +hampton -5694 +ilopoda -5695 +imewaan -5696 +iruddin -5697 +izabeth -5698 +jajahan -5699 +], -5700 +fn -5701 +jh -5702 +uz -5703 +▁] 
-5704 +▁ا -5705 +ane -5706 +asb -5707 +jan -5708 +pin -5709 +utk -5710 +▁bs -5711 +▁fm -5712 +adem -5713 +anai -5714 +baan -5715 +dana -5716 +inat -5717 +jhea -5718 +jung -5719 +onsv -5720 +ulah -5721 +ulla -5722 +▁bod -5723 +▁bsc -5724 +▁cua -5725 +▁kru -5726 +▁qib -5727 +▁rai -5728 +▁run -5729 +▁tau -5730 +ademy -5731 +anduk -5732 +angap -5733 +asbih -5734 +canai -5735 +endul -5736 +engai -5737 +isten -5738 +▁bens -5739 +▁chen -5740 +▁itam -5741 +▁kaji -5742 +▁lamb -5743 +▁mila -5744 +▁nias -5745 +▁send -5746 +ammmad -5747 +anteri -5748 +evised -5749 +lionsv -5750 +tanduk -5751 +ullail -5752 +▁bodym -5753 +▁cuaca -5754 +▁dapat -5755 +▁diane -5756 +▁hasli -5757 +▁kasar -5758 +▁murid -5759 +▁pesta -5760 +▁ranau -5761 +▁tinat -5762 +▁unutk -5763 +▁yaman -5764 +academy -5765 +endulek -5766 +jawatan -5767 +jheains -5768 +kekasih -5769 +kerjoon -5770 +lakunan -5771 +lampung -5772 +lovakia -5773 +menurut -5774 +mutawak -5775 +nadilla -5776 +ollegev -5777 +oroplas -5778 +pelamin -5779 +perdana -5780 +pinosok -5781 +politon -5782 +rancang -5783 +revised -5784 +riapoda -5785 +sarawak -5786 +sebagai -5787 +spesies -5788 +tabilan -5789 +telepon -5790 +tenomon -5791 +unggaan -5792 +unggong -5793 +unsulan -5794 +untukmu -5795 +ussalam -5796 +venteen -5797 +yanyian -5798 +zczecin -5799 +bk -5800 +kk -5801 +ze -5802 +▁; -5803 +amg -5804 +bal -5805 +fed -5806 +jik -5807 +rep -5808 +anuk -5809 +aziz -5810 +eten -5811 +irah -5812 +ster -5813 +▁gyp -5814 +▁hil -5815 +▁jbk -5816 +▁tid -5817 +▁yen -5818 +ammum -5819 +butir -5820 +feder -5821 +lobal -5822 +uanya -5823 +ulaai -5824 +▁iris -5825 +▁iyoo -5826 +▁kili -5827 +▁ning -5828 +▁pamg -5829 +▁zila -5830 +azizah -5831 +erasan -5832 +pospol -5833 +report -5834 +unakan -5835 +▁babag -5836 +▁dahan -5837 +▁gypsi -5838 +▁hajar -5839 +▁harib -5840 +▁kubis -5841 +▁lanun -5842 +▁lawak -5843 +▁liamu -5844 +▁mujik -5845 +▁music -5846 +▁nguri -5847 +▁nibaw -5848 +▁sarap -5849 +▁sukar -5850 +▁tidak -5851 +▁titze -5852 +▁tular -5853 +emubual 
-5854 +federal -5855 +utamani -5856 +▁abidin -5857 +▁acumin -5858 +▁almari -5859 +▁almbum -5860 +▁amelia -5861 +▁annuus -5862 +▁antart -5863 +▁asmara -5864 +▁atauwa -5865 +▁baraan -5866 +▁bendah -5867 +▁bentle -5868 +▁beramu -5869 +▁berhar -5870 +▁berhij -5871 +▁berket -5872 +▁berkum -5873 +▁bermak -5874 +▁bernam -5875 +▁brachy -5876 +▁brazil -5877 +▁buddha -5878 +▁candas -5879 +▁canyon -5880 +▁catech -5881 +▁citrus -5882 +▁dabang -5883 +▁dakwah -5884 +▁danial -5885 +▁daynag -5886 +▁deakin -5887 +▁desaru -5888 +▁dijadi -5889 +▁doktor -5890 +▁durabi -5891 +▁dwngan -5892 +▁editor -5893 +▁edward -5894 +▁elergi -5895 +▁emerit -5896 +▁endiam -5897 +▁entelo -5898 +▁fairuz -5899 +pd -5900 +tr -5901 +are -5902 +hak -5903 +rar -5904 +sam -5905 +wet -5906 +arin -5907 +atai -5908 +bica -5909 +epak -5910 +erot -5911 +inar -5912 +kong -5913 +libu -5914 +maha -5915 +mila -5916 +usni -5917 +▁hot -5918 +▁phd -5919 +arkan -5920 +ecare -5921 +egala -5922 +erapa -5923 +musta -5924 +riana -5925 +tanya -5926 +ybica -5927 +▁astr -5928 +▁disk -5929 +▁masu -5930 +▁quba -5931 +▁rodh -5932 +▁stil -5933 +▁suai -5934 +epatai -5935 +itelia -5936 +samrah -5937 +segala -5938 +yokong -5939 +▁ikrar -5940 +▁ishak -5941 +▁juhar -5942 +▁junya -5943 +▁kisah -5944 +▁lantu -5945 +▁mesyu -5946 +▁osama -5947 +▁perab -5948 +▁punca -5949 +▁regis -5950 +▁round -5951 +▁subai -5952 +eyampai -5953 +ilipina -5954 +wetland -5955 +▁duktur -5956 +▁family -5957 +▁formal -5958 +▁gangsa -5959 +▁gelang -5960 +▁geliga -5961 +▁george -5962 +▁gilang -5963 +▁global -5964 +▁gundah -5965 +▁gutasi -5966 +▁haliza -5967 +▁haplor -5968 +▁hilang -5969 +▁hillar -5970 +▁hormat -5971 +▁howard -5972 +▁ibrani -5973 +▁irawan -5974 +▁jaclyn -5975 +▁jangan -5976 +▁jangka -5977 +▁jaring -5978 +▁jerman -5979 +▁jibril -5980 +▁judeth -5981 +▁jungur -5982 +▁jurdan -5983 +▁kajian -5984 +▁kanada -5985 +▁kandis -5986 +▁kanvas -5987 +▁kebios -5988 +▁keenom -5989 +▁keeten -5990 +▁kement -5991 +▁kempen -5992 +▁kenuri -5993 +▁kerabu -5994 +▁kereso 
-5995 +▁khorat -5996 +▁kiroon -5997 +▁kursus -5998 +▁lambat -5999 +.( -6000 +cep -6001 +get -6002 +ipa -6003 +ner -6004 +ory -6005 +ras -6006 +roz -6007 +▁uk -6008 +▁ym -6009 +aron -6010 +ceps -6011 +iana -6012 +inah -6013 +isya -6014 +paut -6015 +raci -6016 +ukri -6017 +▁aed -6018 +▁bla -6019 +▁cop -6020 +▁jak -6021 +▁nij -6022 +ampay -6023 +ejaam -6024 +esota -6025 +etrok -6026 +racie -6027 +▁chai -6028 +▁daat -6029 +▁dati -6030 +▁fono -6031 +▁gawa -6032 +▁kemb -6033 +▁kipa -6034 +▁luka -6035 +▁tuak -6036 +▁ukra -6037 +bitkan -6038 +cawang -6039 +istory -6040 +puddin -6041 +rozita -6042 +sunami -6043 +tunduk -6044 +umbuan -6045 +ustras -6046 +▁aedes -6047 +▁black -6048 +▁dihab -6049 +▁joget -6050 +▁lanka -6051 +▁megun -6052 +▁namun -6053 +▁nyawa -6054 +▁patut -6055 +▁tekok -6056 +▁ziana -6057 +aniceps -6058 +berpaut -6059 +etroket -6060 +▁alisya -6061 +▁awalni -6062 +▁beduso -6063 +▁bujang -6064 +▁daatuk -6065 +▁dejaam -6066 +▁fonolo -6067 +▁genuul -6068 +▁jenner -6069 +▁kepitu -6070 +▁lembah -6071 +▁lesung -6072 +▁liyana -6073 +▁louise -6074 +▁lounge -6075 +▁ltesbi -6076 +▁lulaai -6077 +▁lybica -6078 +▁madiam -6079 +▁magrib -6080 +▁mahmud -6081 +▁malayu -6082 +▁mandat -6083 +▁mangam -6084 +▁marsha -6085 +▁masang -6086 +▁medaan -6087 +▁mekale -6088 +▁meluak -6089 +▁memula -6090 +▁menara -6091 +▁mentor -6092 +▁merbah -6093 +▁merent -6094 +▁meroto -6095 +▁mesele -6096 +▁meturi -6097 +▁mindam -6098 +▁monica -6099 +ye -6100 +’- -6101 +aau -6102 +kau -6103 +rai -6104 +sek -6105 +tok -6106 +▁hi -6107 +▁ja -6108 +adik -6109 +alik -6110 +anid -6111 +ekad -6112 +esis -6113 +kiye -6114 +oman -6115 +ukuk -6116 +utub -6117 +yaak -6118 +▁bou -6119 +▁kai -6120 +▁pom -6121 +▁tum -6122 +empor -6123 +itusi -6124 +kedah -6125 +rkiye -6126 +tokoh -6127 +unsai -6128 +utube -6129 +▁amir -6130 +▁beni -6131 +▁diak -6132 +▁enom -6133 +▁high -6134 +▁lant -6135 +▁liza -6136 +▁lori -6137 +▁naat -6138 +▁olok -6139 +▁rasa -6140 +▁saka -6141 +▁sapa -6142 +▁sisa -6143 +▁suma -6144 +▁yaak -6145 
+anidah -6146 +ingkap -6147 +kelapa -6148 +▁agnes -6149 +▁begen -6150 +▁hatik -6151 +▁keben -6152 +▁linut -6153 +▁pilah -6154 +▁rawan -6155 +▁rzecz -6156 +▁setat -6157 +▁tahil -6158 +▁terag -6159 +▁turai -6160 +beradik -6161 +intahan -6162 +▁bahaya -6163 +▁belaku -6164 +▁berpad -6165 +▁bersih -6166 +▁gangau -6167 +▁hatiku -6168 +▁kaitan -6169 +▁kardah -6170 +▁kiniro -6171 +▁lantai -6172 +▁menaau -6173 +▁montok -6174 +▁murtad -6175 +▁musiba -6176 +▁nabawi -6177 +▁nampar -6178 +▁nandas -6179 +▁nangis -6180 +▁nasiat -6181 +▁nawawi -6182 +▁nawwal -6183 +▁nempah -6184 +▁nempol -6185 +▁nerang -6186 +▁nerond -6187 +▁ngabur -6188 +▁ngadir -6189 +▁ngakui -6190 +▁ngalak -6191 +▁ngangk -6192 +▁ngapit -6193 +▁ngebin -6194 +▁ngejej -6195 +▁ngelaf -6196 +▁ngeraa -6197 +▁ngerso -6198 +▁ngetan -6199 +ee -6200 +aly -6201 +hah -6202 +iol -6203 +rob -6204 +sup -6205 +vor -6206 +▁ov -6207 +afah -6208 +alak -6209 +andi -6210 +asap -6211 +atal -6212 +ekon -6213 +erak -6214 +etin -6215 +pata -6216 +ukut -6217 +▁ngu -6218 +▁pur -6219 +▁zee -6220 +akhta -6221 +endol -6222 +ibadi -6223 +nivor -6224 +onsom -6225 +▁alor -6226 +▁booh -6227 +▁daly -6228 +▁elak -6229 +▁papa -6230 +▁taim -6231 +▁west -6232 +▁yong -6233 +▁zara -6234 +ransak -6235 +uanita -6236 +▁aisah -6237 +▁asmah -6238 +▁bulau -6239 +▁buyai -6240 +▁conto -6241 +▁imran -6242 +▁kenem -6243 +▁korob -6244 +▁kulam -6245 +▁matay -6246 +▁ngisi -6247 +▁ovari -6248 +▁peded -6249 +▁purba -6250 +▁putih -6251 +▁sabau -6252 +▁warak -6253 +duanita -6254 +langkau -6255 +▁aishah -6256 +▁cawang -6257 +▁cendol -6258 +▁dekiit -6259 +▁ngapal -6260 +▁nguasa -6261 +▁ngunsi -6262 +▁ngusek -6263 +▁niagad -6264 +▁niasas -6265 +▁nigeri -6266 +▁nigger -6267 +▁nikmat -6268 +▁nimbug -6269 +▁nimbun -6270 +▁nindok -6271 +▁niolok -6272 +▁niukir -6273 +▁niusah -6274 +▁niutus -6275 +▁nonong -6276 +▁normal -6277 +▁ongkob -6278 +▁online -6279 +▁padang -6280 +▁paning -6281 +▁panjut -6282 +▁pansak -6283 +▁pantas -6284 +▁pawang -6285 +▁pejata -6286 +▁pekaja -6287 
+▁pelepa -6288 +▁pemapa -6289 +▁pemila -6290 +▁pemiri -6291 +▁pencap -6292 +▁penene -6293 +▁perind -6294 +▁perkub -6295 +▁pesigi -6296 +▁petemp -6297 +▁petiol -6298 +▁phuket -6299 +lr -6300 +aju -6301 +ngo -6302 +▁"- -6303 +▁wp -6304 +aria -6305 +asuk -6306 +ayah -6307 +dela -6308 +erlr -6309 +iaah -6310 +isla -6311 +left -6312 +ohiz -6313 +olos -6314 +onoh -6315 +▁cas -6316 +▁lah -6317 +▁rin -6318 +badan -6319 +bohiz -6320 +elang -6321 +ining -6322 +orban -6323 +▁aduk -6324 +▁berd -6325 +▁buay -6326 +▁diil -6327 +▁enko -6328 +▁kala -6329 +▁kasi -6330 +▁lusu -6331 +▁ruru -6332 +afiaah -6333 +asanya -6334 +buahan -6335 +eperlr -6336 +iningr -6337 +kempen -6338 +teraan -6339 +undati -6340 +▁badar -6341 +▁enngo -6342 +▁keket -6343 +▁keman -6344 +▁makau -6345 +▁malam -6346 +▁milan -6347 +▁niara -6348 +▁salam -6349 +▁solat -6350 +▁somon -6351 +▁tapuk -6352 +▁utoma -6353 +bahagia -6354 +bohizan -6355 +dangdut -6356 +islativ -6357 +oundati -6358 +ustasea -6359 +▁adukor -6360 +▁castan -6361 +▁dawlah -6362 +▁hashim -6363 +▁lionoh -6364 +▁memaju -6365 +▁nedion -6366 +▁ngesan -6367 +▁perala -6368 +▁piknik -6369 +▁piniko -6370 +▁plural -6371 +▁posisi -6372 +▁primat -6373 +▁pujian -6374 +▁qiblat -6375 +▁qurban -6376 +▁rahman -6377 +▁rangit -6378 +▁redieu -6379 +▁rejeki -6380 +▁republ -6381 +▁resort -6382 +▁restor -6383 +▁rezeki -6384 +▁rinong -6385 +▁rodham -6386 +▁rosmah -6387 +▁rungau -6388 +▁saloma -6389 +▁santik -6390 +▁secret -6391 +▁sedang -6392 +▁sejaht -6393 +▁selesa -6394 +▁sempur -6395 +▁semulo -6396 +▁separa -6397 +▁sesara -6398 +▁sesomo -6399 +epi -6400 +kla -6401 +ldm -6402 +que -6403 +tep -6404 +tuu -6405 +▁cm -6406 +agas -6407 +erik -6408 +ilah -6409 +ilot -6410 +indo -6411 +sepi -6412 +tldm -6413 +tong -6414 +▁cor -6415 +▁fit -6416 +▁jep -6417 +▁moj -6418 +▁yas -6419 +edean -6420 +ekilo -6421 +indak -6422 +klang -6423 +ongom -6424 +ongsi -6425 +▁dind -6426 +▁jasa -6427 +▁jawi -6428 +▁kino -6429 +▁nong -6430 +▁pene -6431 +▁pula -6432 +▁roti -6433 +▁ucap -6434 
+▁wong -6435 +asihat -6436 +clique -6437 +klausa -6438 +langit -6439 +tekilo -6440 +tepung -6441 +▁asing -6442 +▁jepun -6443 +▁kemai -6444 +▁lahad -6445 +▁lindo -6446 +▁menak -6447 +▁paksa -6448 +▁paung -6449 +▁pilot -6450 +▁rashi -6451 +▁semua -6452 +▁setuu -6453 +▁tarik -6454 +adzilah -6455 +agaskar -6456 +▁benila -6457 +▁corbin -6458 +▁fitrah -6459 +▁kongsi -6460 +▁shaari -6461 +▁shaikh -6462 +▁shakir -6463 +▁sharon -6464 +▁sheikh -6465 +▁siaran -6466 +▁soklat -6467 +▁spoken -6468 +▁square -6469 +▁status -6470 +▁subjan -6471 +▁subjek -6472 +▁sumbul -6473 +▁sunnah -6474 +▁swasta -6475 +▁syafie -6476 +▁tablik -6477 +▁taiwan -6478 +▁tandek -6479 +▁tanjak -6480 +▁tarikh -6481 +▁taunan -6482 +▁tealap -6483 +▁tebing -6484 +▁tekale -6485 +▁tenaga -6486 +▁tendes -6487 +▁tepaam -6488 +▁tepene -6489 +▁teraat -6490 +▁teriko -6491 +▁tertib -6492 +▁thanon -6493 +▁tinaat -6494 +▁tindak -6495 +▁tinuju -6496 +▁tuavon -6497 +▁tumgan -6498 +▁tuntut -6499 +.' -6500 +eda -6501 +sui -6502 +tae -6503 +wid -6504 +yid -6505 +anci -6506 +anto -6507 +ants -6508 +berj -6509 +beta -6510 +heda -6511 +inya -6512 +mati -6513 +piah -6514 +pila -6515 +useh -6516 +wide -6517 +▁jul -6518 +▁mae -6519 +▁mio -6520 +▁pru -6521 +ikoon -6522 +naung -6523 +patri -6524 +ulata -6525 +umpal -6526 +yanto -6527 +▁adik -6528 +▁cina -6529 +▁nahu -6530 +▁nant -6531 +▁saya -6532 +▁tawa -6533 +▁with -6534 +estris -6535 +itinya -6536 +iyanto -6537 +yayang -6538 +yiddin -6539 +▁batam -6540 +▁bendi -6541 +▁bunga -6542 +▁dikeh -6543 +▁hanya -6544 +▁klong -6545 +▁makna -6546 +▁manik -6547 +▁mekar -6548 +▁mpila -6549 +▁nungk -6550 +▁panda -6551 +▁peleb -6552 +▁peluk -6553 +▁ragam -6554 +▁tawaf -6555 +▁unsui -6556 +egumpal -6557 +patrica -6558 +seniman -6559 +▁balada -6560 +▁benaja -6561 +▁beraya -6562 +▁julita -6563 +▁kopiah -6564 +▁minyak -6565 +▁niarah -6566 +▁pahala -6567 +▁pasang -6568 +▁plants -6569 +▁serama -6570 +▁stemui -6571 +▁twkule -6572 +▁umbisi -6573 +▁umpama -6574 +▁unytuk -6575 +▁victor -6576 +▁walaub 
-6577 +▁wetrom -6578 +▁wolver -6579 +▁yaakob -6580 +▁yassin -6581 +▁zaiton -6582 +▁zarith -6583 +▁zodiak -6584 +▁zulhij -6585 +agaimana -6586 +ahteraan -6587 +aidillah -6588 +amansara -6589 +ambridge -6590 +anugerah -6591 +bangunan -6592 +berbagai -6593 +berjanji -6594 +berunsai -6595 +ciptakan -6596 +ciptanya -6597 +emerlang -6598 +emporari -6599 +chi -6600 +▁ub -6601 +egub -6602 +etub -6603 +isar -6604 +lian -6605 +mama -6606 +pura -6607 +upus -6608 +yama -6609 +▁alg -6610 +▁mit -6611 +▁tie -6612 +agama -6613 +ajuan -6614 +alian -6615 +ambat -6616 +ayani -6617 +emput -6618 +ganis -6619 +michi -6620 +ubong -6621 +▁gaut -6622 +▁kali -6623 +▁ngep -6624 +▁oran -6625 +▁oyoh -6626 +▁pole -6627 +▁unta -6628 +angkum -6629 +aslian -6630 +egubdi -6631 +elahan -6632 +empute -6633 +▁anuar -6634 +▁bagus -6635 +▁disko -6636 +▁kelab -6637 +▁kolam -6638 +▁lupus -6639 +▁moden -6640 +▁rabia -6641 +▁senti -6642 +▁sinok -6643 +▁virus -6644 +angkang -6645 +angkung -6646 +ungaian -6647 +▁abisno -6648 +▁asalni -6649 +▁boyama -6650 +▁galian -6651 +▁heliza -6652 +▁kamarz -6653 +▁mentuk -6654 +▁ngepam -6655 +▁niubah -6656 +▁panama -6657 +▁pemama -6658 +▁perian -6659 +▁scienc -6660 +▁sekait -6661 +▁serata -6662 +▁setuut -6663 +▁tabiat -6664 +▁tunuan -6665 +ebelahan -6666 +empangan -6667 +endingan -6668 +eperlrgo -6669 +epulauan -6670 +eriksaan -6671 +etubuhan -6672 +filipina -6673 +ganisasi -6674 +impangan -6675 +inggeris -6676 +iningrad -6677 +islative -6678 +ismawati -6679 +kelantan -6680 +khususan -6681 +mangatan -6682 +michigan -6683 +national -6684 +oksidasi -6685 +olombang -6686 +oritinya -6687 +otogenic -6688 +pedagang -6689 +pencukil -6690 +philisti -6691 +polymeso -6692 +prionail -6693 +restoran -6694 +rikulasi -6695 +rosaceae -6696 +sandakan -6697 +saudagar -6698 +selangor -6699 +ual -6700 +agih -6701 +gawa -6702 +ille -6703 +kamb -6704 +▁mic -6705 +▁ogo -6706 +ambar -6707 +kerjo -6708 +▁amri -6709 +▁capr -6710 +▁duet -6711 +▁isan -6712 +▁isbn -6713 +▁isya -6714 +▁maha -6715 
+▁maju -6716 +▁namb -6717 +▁ujud -6718 +▁wawa -6719 +▁zizi -6720 +antawa -6721 +nabawi -6722 +▁babun -6723 +▁benih -6724 +▁bibas -6725 +▁endil -6726 +▁jalil -6727 +▁keban -6728 +▁latik -6729 +▁menap -6730 +▁menek -6731 +▁menep -6732 +▁micro -6733 +▁murni -6734 +▁ngito -6735 +▁uzbek -6736 +▁wahid -6737 +▁zizie -6738 +kambing -6739 +▁diraya -6740 +▁hambar -6741 +▁kemual -6742 +▁megaya -6743 +▁menoon -6744 +▁nambah -6745 +▁penamb -6746 +▁penyar -6747 +▁perjal -6748 +▁singki -6749 +▁tenusu -6750 +angkutan -6751 +sompoton -6752 +strument -6753 +tropical -6754 +ubungkan -6755 +undangan -6756 +ungjawab -6757 +ungkinan -6758 +urangkan -6759 +usnahkan -6760 +wayatkan -6761 +▁akuatik -6762 +▁alallah -6763 +▁algebra -6764 +▁anaheim -6765 +▁ancaman -6766 +▁animasi -6767 +▁antaran -6768 +▁bahagia -6769 +▁bahanda -6770 +▁bantuan -6771 +▁baptist -6772 +▁bekawan -6773 +▁bekawin -6774 +▁belakar -6775 +▁belarus -6776 +▁belilik -6777 +▁beliung -6778 +▁bendang -6779 +▁bendera -6780 +▁bendira -6781 +▁benkayu -6782 +▁bensana -6783 +▁bentley -6784 +▁beragam -6785 +▁berguno -6786 +▁berjaya -6787 +▁bermain -6788 +▁bermula -6789 +▁berrien -6790 +▁bersaiz -6791 +▁berseri -6792 +▁bersolo -6793 +▁bersomo -6794 +▁bertiup -6795 +▁berumur -6796 +▁betitik -6797 +▁betonom -6798 +▁biarpun -6799 +edu -6800 +lui -6801 +▁," -6802 +aini -6803 +anja -6804 +asaa -6805 +esok -6806 +inea -6807 +ioes -6808 +luah -6809 +luak -6810 +sapi -6811 +tani -6812 +upak -6813 +▁aim -6814 +▁ned -6815 +▁yoo -6816 +gayad -6817 +lembu -6818 +nanak -6819 +soton -6820 +▁biru -6821 +▁duek -6822 +▁gitu -6823 +▁lalu -6824 +▁nada -6825 +▁napi -6826 +▁nejo -6827 +▁same -6828 +▁urus -6829 +▁waja -6830 +▁zafr -6831 +anking -6832 +eligan -6833 +epasan -6834 +lembut -6835 +raniza -6836 +tanian -6837 +umpama -6838 +▁belen -6839 +▁bilah -6840 +▁kamil -6841 +▁kinan -6842 +▁kiner -6843 +▁luara -6844 +▁pales -6845 +▁pardh -6846 +▁samaa -6847 +▁siput -6848 +▁tahan -6849 +▁tahap -6850 +▁tahir -6851 +andaran -6852 +emparan -6853 +kembung 
-6854 +▁alatan -6855 +▁asahan -6856 +▁begoon -6857 +▁beleng -6858 +▁berkat -6859 +▁botani -6860 +▁chelae -6861 +▁guinea -6862 +▁kalori -6863 +▁kelang -6864 +▁lumpis -6865 +▁minggu -6866 +▁ngundi -6867 +▁oranga -6868 +▁pardhu -6869 +▁pengen -6870 +▁sameon -6871 +▁zafrel -6872 +▁begerak -6873 +▁belanja -6874 +▁binanci -6875 +▁binarak -6876 +▁binayad -6877 +▁binekat -6878 +▁bingkai -6879 +▁binuang -6880 +▁biology -6881 +▁bistari -6882 +▁boijadi -6883 +▁buasboi -6884 +▁budiman -6885 +▁bumbung -6886 +▁catechu -6887 +▁cinipta -6888 +▁clinton -6889 +▁composi -6890 +▁convent -6891 +▁dekapod -6892 +▁delaksa -6893 +▁dembuak -6894 +▁demingu -6895 +▁dendang -6896 +▁derurat -6897 +▁desmond -6898 +▁dicanai -6899 +”. -6900 +oob -6901 +eluh -6902 +enuh -6903 +gaan -6904 +kiti -6905 +viti -6906 +▁akh -6907 +▁noo -6908 +▁rer -6909 +▁ufo -6910 +akiti -6911 +akoob -6912 +ampap -6913 +katan -6914 +ketul -6915 +zakat -6916 +▁awan -6917 +▁baga -6918 +▁buku -6919 +▁datu -6920 +▁enak -6921 +▁enau -6922 +▁gaji -6923 +▁jaji -6924 +▁kool -6925 +▁miyo -6926 +▁this -6927 +▁thur -6928 +▁tina -6929 +alahan -6930 +ampang -6931 +▁abadi -6932 +▁deeng -6933 +▁iklim -6934 +▁indak -6935 +▁kabul -6936 +▁keamp -6937 +▁kesej -6938 +▁malus -6939 +▁riada -6940 +▁tenah -6941 +▁tinum -6942 +gantian -6943 +hargaan -6944 +suratan -6945 +▁akhtar -6946 +▁dikaji -6947 +▁habsah -6948 +▁indung -6949 +▁kesimp -6950 +▁mangga -6951 +▁memang -6952 +▁pantun -6953 +▁sampar -6954 +▁sampin -6955 +▁sampit -6956 +▁selong -6957 +▁tejadi -6958 +▁tenata -6959 +▁tinara -6960 +▁yakoob -6961 +andarkan -6962 +pulaunya -6963 +▁azharin -6964 +▁balajar -6965 +▁dicapai -6966 +▁didakwa -6967 +▁dihabis -6968 +▁dijalan -6969 +▁dikenal -6970 +▁dilanda -6971 +▁dimakan -6972 +▁dimulai -6973 +▁dinagih -6974 +▁dindang -6975 +▁editors -6976 +▁elektif -6977 +▁embelud -6978 +▁embukut -6979 +▁empuyat -6980 +▁endilud -6981 +▁englang -6982 +▁entegah -6983 +▁eurasia -6984 +▁expansa -6985 +▁fakulti -6986 +▁farmasi -6987 +▁fazelah -6988 +▁fikiran -6989 
+▁finance -6990 +▁fizikal -6991 +▁formasi -6992 +▁gagasan -6993 +▁ganyang -6994 +▁genelar -6995 +▁gerakan -6996 +▁geroton -6997 +▁ginalak -6998 +▁ginelar -6999 +jw -7000 +pod -7001 +sea -7002 +masa -7003 +nica -7004 +pida -7005 +ratu -7006 +▁hib -7007 +▁jjw -7008 +▁nih -7009 +▁pec -7010 +▁rou -7011 +▁smk -7012 +eline -7013 +kanul -7014 +perak -7015 +perlu -7016 +quran -7017 +sesok -7018 +suang -7019 +▁adam -7020 +▁baya -7021 +▁fasa -7022 +▁grah -7023 +▁inah -7024 +▁misi -7025 +▁natu -7026 +▁pita -7027 +▁suez -7028 +anggah -7029 +anggam -7030 +angguh -7031 +anggup -7032 +annica -7033 +gayung -7034 +nabang -7035 +tropod -7036 +uturan -7037 +▁adjek -7038 +▁aisha -7039 +▁bahai -7040 +▁bahas -7041 +▁husna -7042 +▁lebin -7043 +▁lemon -7044 +▁nampu -7045 +▁parak -7046 +▁parsi -7047 +▁pecut -7048 +▁pemek -7049 +▁pemen -7050 +▁sakit -7051 +▁sawah -7052 +▁sekap -7053 +▁starz -7054 +▁telok -7055 +queline -7056 +▁bicara -7057 +▁graham -7058 +▁handal -7059 +▁island -7060 +▁kluang -7061 +▁salmah -7062 +▁sunari -7063 +▁tebata -7064 +▁tebuan -7065 +▁tebuat -7066 +▁teling -7067 +▁teliti -7068 +▁bijirin -7069 +▁graduan -7070 +▁guajava -7071 +▁gulasan -7072 +▁gypsies -7073 +▁hasliah -7074 +▁hemidac -7075 +▁hendrus -7076 +▁hermann -7077 +▁hiburan -7078 +▁hidayah -7079 +▁hidupan -7080 +▁hillary -7081 +▁hishamm -7082 +▁huffadz -7083 +▁hukuman -7084 +▁hussain -7085 +▁indomon -7086 +▁ingsang -7087 +▁integer -7088 +▁intelek -7089 +▁istilah -7090 +▁jakarta -7091 +▁jantung -7092 +▁jawapan -7093 +▁jelmaan -7094 +▁kadazan -7095 +▁kalagan -7096 +▁kalimat -7097 +▁kamilia -7098 +▁kangsar -7099 +phy -7100 +acau -7101 +ekan -7102 +kita -7103 +lica -7104 +opak -7105 +raga -7106 +ugis -7107 +▁gug -7108 +▁sug -7109 +mimon -7110 +parut -7111 +▁asso -7112 +▁booi -7113 +▁delu -7114 +▁dram -7115 +▁laku -7116 +▁leon -7117 +▁luth -7118 +▁serr -7119 +▁terh -7120 +▁utuk -7121 +ahanan -7122 +▁covid -7123 +▁idrus -7124 +▁iyang -7125 +▁kesha -7126 +▁mekir -7127 +▁merlu -7128 +▁ngeta -7129 +▁niena -7130 +▁pusik 
-7131 +▁regim -7132 +▁sepuh -7133 +▁sibor -7134 +▁sunda -7135 +▁tayar -7136 +▁terso -7137 +▁timah -7138 +▁timun -7139 +▁utaan -7140 +makanan -7141 +manapun -7142 +ortugis -7143 +pinggir -7144 +whistle -7145 +▁berman -7146 +▁buanan -7147 +▁deluut -7148 +▁dirojo -7149 +▁englis -7150 +▁murphy -7151 +▁ngacau -7152 +▁ngamal -7153 +▁pememb -7154 +▁pemuat -7155 +▁sepeng -7156 +▁sering -7157 +▁siguul -7158 +▁stesen -7159 +▁suasta -7160 +▁sugkak -7161 +▁tereso -7162 +jambatan -7163 +portugis -7164 +uruhebah -7165 +▁bersara -7166 +▁biosoni -7167 +▁bokhara -7168 +▁bushpig -7169 +▁english -7170 +▁gugusan -7171 +▁kawalan -7172 +▁kebawah -7173 +▁kecuali -7174 +▁keeping -7175 +▁kekayau -7176 +▁kelopak -7177 +▁kembali -7178 +▁kemboja -7179 +▁kenawin -7180 +▁kenerot -7181 +▁kepayas -7182 +▁kerabat -7183 +▁kerabaw -7184 +▁keteluh -7185 +▁ketioan -7186 +▁kinayau -7187 +▁kiniloh -7188 +▁kinisar -7189 +▁kinurik -7190 +▁kinusek -7191 +▁konflik -7192 +▁kontrak -7193 +▁kuburan -7194 +▁kuching -7195 +▁lambang -7196 +▁langgam -7197 +▁lapuran -7198 +▁lebaran -7199 +ipu -7200 +moh -7201 +umn -7202 +yio -7203 +ates -7204 +idea -7205 +ifah -7206 +mohd -7207 +ngit -7208 +tiwa -7209 +umup -7210 +▁doi -7211 +▁kac -7212 +▁kay -7213 +▁maz -7214 +▁otu -7215 +▁spa -7216 +bakat -7217 +itnya -7218 +pasal -7219 +ranen -7220 +▁adun -7221 +▁akir -7222 +▁arah -7223 +▁awah -7224 +▁chad -7225 +▁daau -7226 +▁elum -7227 +▁iyio -7228 +▁jika -7229 +▁orop -7230 +▁sene -7231 +▁sidi -7232 +▁wali -7233 +syafie -7234 +▁alien -7235 +▁banak -7236 +▁danau -7237 +▁ingos -7238 +▁kemas -7239 +▁kumup -7240 +▁matah -7241 +▁pekaw -7242 +▁puteh -7243 +▁romon -7244 +▁senin -7245 +▁siari -7246 +▁sujud -7247 +▁tenda -7248 +▁trang -7249 +▁trump -7250 +▁undur -7251 +▁yuran -7252 +▁bunuat -7253 +▁kerjon -7254 +▁ketika -7255 +▁maulud -7256 +▁mazhab -7257 +▁melayi -7258 +▁memban -7259 +▁mengen -7260 +▁ngawal -7261 +▁pemapi -7262 +▁states -7263 +▁syaria -7264 +▁tenipu -7265 +▁umurni -7266 +ngitokan -7267 +ulistiwa -7268 +ungguhan -7269 
+▁barisan -7270 +▁bungair -7271 +▁gandang -7272 +▁kacukan -7273 +▁kayuhan -7274 +▁kelurga -7275 +▁kendadu -7276 +▁kenendo -7277 +▁kerjaan -7278 +▁lebihan -7279 +▁lembaga -7280 +▁lewatan -7281 +▁linggar -7282 +▁madinah -7283 +▁mangkuk -7284 +▁manisan -7285 +▁manteri -7286 +▁maranao -7287 +▁mariani -7288 +▁maserba -7289 +▁matster -7290 +▁mebagal -7291 +▁medayag -7292 +▁meeting -7293 +▁mejelis -7294 +▁mekerso -7295 +▁melalui -7296 +▁memandu -7297 +▁membeza -7298 +▁memikat -7299 +.[ -7300 +otw -7301 +zch -7302 +aban -7303 +gani -7304 +list -7305 +mula -7306 +opod -7307 +▁aii -7308 +▁ruo -7309 +asiat -7310 +azura -7311 +gilan -7312 +manis -7313 +penen -7314 +sezch -7315 +ungki -7316 +▁apid -7317 +▁arau -7318 +▁aren -7319 +▁diik -7320 +▁ikoh -7321 +▁iman -7322 +▁libu -7323 +▁olod -7324 +▁pesa -7325 +▁potw -7326 +▁tasi -7327 +▁tumi -7328 +bandar -7329 +harian -7330 +making -7331 +ubahat -7332 +ungkin -7333 +▁airku -7334 +▁arena -7335 +▁azman -7336 +▁bahat -7337 +▁bayak -7338 +▁bedau -7339 +▁diman -7340 +▁ingot -7341 +▁kebat -7342 +▁kejar -7343 +▁manch -7344 +▁masig -7345 +▁match -7346 +▁meneh -7347 +▁nikki -7348 +▁osman -7349 +▁penio -7350 +▁ratus -7351 +▁tarap -7352 +▁tekio -7353 +▁tugan -7354 +▁border -7355 +▁conyon -7356 +▁csezch -7357 +▁dembay -7358 +▁dinagu -7359 +▁kinasi -7360 +▁klasik -7361 +▁maulid -7362 +▁niimon -7363 +▁pesaan -7364 +▁sabagi -7365 +▁benersi -7366 +▁berkelu -7367 +▁dembuan -7368 +▁diikuti -7369 +▁entendo -7370 +▁fantasi -7371 +▁garisan -7372 +▁kebersi -7373 +▁menegah -7374 +▁menenga -7375 +▁menerus -7376 +▁mengkak -7377 +▁menjogo -7378 +▁mentera -7379 +▁menulis -7380 +▁menusia -7381 +▁merupoi -7382 +▁methodi -7383 +▁mlaysia -7384 +▁mollusk -7385 +▁moluska -7386 +▁monarki -7387 +▁motolau -7388 +▁muljoto -7389 +▁munafik -7390 +▁mungkin -7391 +▁munsung -7392 +▁musical -7393 +▁mustafa -7394 +▁muzikni -7395 +▁nagendo -7396 +▁nakawan -7397 +▁nampung -7398 +▁nangguh -7399 +ilmu -7400 +iver -7401 +ivia -7402 +thir -7403 +▁com -7404 +▁wei -7405 +ahmad 
-7406 +ainan -7407 +ainya -7408 +banan -7409 +ester -7410 +onald -7411 +uulan -7412 +▁agro -7413 +▁awam -7414 +▁ieyo -7415 +▁nemu -7416 +▁paya -7417 +▁pogo -7418 +▁ummu -7419 +▁upin -7420 +afasmu -7421 +debagi -7422 +edikat -7423 +insang -7424 +purata -7425 +syarat -7426 +usahan -7427 +yanmar -7428 +▁baduh -7429 +▁buani -7430 +▁didik -7431 +▁iklan -7432 +▁kuruk -7433 +▁river -7434 +▁setul -7435 +bolivia -7436 +engkeng -7437 +gerakan -7438 +selepas -7439 +▁anggap -7440 +▁antler -7441 +▁atusan -7442 +▁bineli -7443 +▁dekwat -7444 +▁donald -7445 +▁nabang -7446 +▁nganak -7447 +▁phthir -7448 +▁suklat -7449 +▁sumpah -7450 +▁syukri -7451 +▁tenemu -7452 +andainya -7453 +empatnya -7454 +▁bengkok -7455 +▁berbija -7456 +▁buletin -7457 +▁efisyen -7458 +▁kekasih -7459 +▁kesangg -7460 +▁lanjang -7461 +▁lautnya -7462 +▁myanmar -7463 +▁nafasmu -7464 +▁nantang -7465 +▁nasihat -7466 +▁natrium -7467 +▁nedaran -7468 +▁neelofa -7469 +▁negarak -7470 +▁nejaruk -7471 +▁nejogon -7472 +▁nelagan -7473 +▁nelamat -7474 +▁nelasai -7475 +▁nelayan -7476 +▁neligan -7477 +▁nentang -7478 +▁network -7479 +▁ngangah -7480 +▁ngangap -7481 +▁ngelans -7482 +▁ngelasa -7483 +▁ngelebi -7484 +▁ngendak -7485 +▁ngerait -7486 +▁ngerjon -7487 +▁ngingot -7488 +▁ngorban -7489 +▁ngormat -7490 +▁ngurang -7491 +▁niidanb -7492 +▁niidang -7493 +▁niiring -7494 +▁nilalao -7495 +▁nilonon -7496 +▁ningkok -7497 +▁nirebus -7498 +▁normala -7499 +"). 
-7500 +khw -7501 +the -7502 +boso -7503 +ecap -7504 +liza -7505 +ungo -7506 +▁lth -7507 +▁meh -7508 +▁toy -7509 +ecara -7510 +elium -7511 +ramay -7512 +singo -7513 +tarap -7514 +thman -7515 +urpan -7516 +▁azza -7517 +▁gigi -7518 +▁ijin -7519 +▁kauh -7520 +▁liar -7521 +▁liat -7522 +▁moyo -7523 +▁nusa -7524 +▁pulo -7525 +▁sele -7526 +▁sulu -7527 +▁wani -7528 +aminah -7529 +derita -7530 +faizal -7531 +▁agust -7532 +▁altil -7533 +▁decap -7534 +▁karas -7535 +▁kitra -7536 +▁lapat -7537 +▁nuang -7538 +▁nunai -7539 +▁pepik -7540 +▁petio -7541 +▁sedih -7542 +▁tungg -7543 +▁wahah -7544 +kawasan -7545 +ustrian -7546 +▁bangen -7547 +▁banyak -7548 +▁beneli -7549 +▁kekura -7550 +▁keperc -7551 +▁kereta -7552 +▁kikoon -7553 +▁marang -7554 +▁melabu -7555 +▁neraka -7556 +▁nilego -7557 +▁niumum -7558 +▁othman -7559 +▁pakong -7560 +▁pentas -7561 +▁sardin -7562 +▁sempor -7563 +▁urusan -7564 +ephelium -7565 +▁agustus -7566 +▁altilis -7567 +▁bebungo -7568 +▁berbeza -7569 +▁berdiri -7570 +▁berusia -7571 +▁ceviche -7572 +▁condong -7573 +▁ekologi -7574 +▁ikonomi -7575 +▁mekelum -7576 +▁melanga -7577 +▁numpang -7578 +▁numping -7579 +▁operasi -7580 +▁pakaian -7581 +▁palesin -7582 +▁pandang -7583 +▁panduan -7584 +▁pansang -7585 +▁paraian -7586 +▁pasifik -7587 +▁pediang -7588 +▁pedoman -7589 +▁pejabat -7590 +▁pejuang -7591 +▁pekaran -7592 +▁pekayan -7593 +▁pekemit -7594 +▁pelidas -7595 +▁peluang -7596 +▁pemandi -7597 +▁pembeli -7598 +▁pembulu -7599 +bbs -7600 +dua -7601 +far -7602 +ico -7603 +ite -7604 +biri -7605 +ciri -7606 +ibit -7607 +ibut -7608 +xico -7609 +▁dio -7610 +▁puu -7611 +▁usa -7612 +abiti -7613 +agram -7614 +hantu -7615 +ister -7616 +jerat -7617 +orida -7618 +ramah -7619 +tuhan -7620 +▁area -7621 +▁bapa -7622 +▁base -7623 +▁biri -7624 +▁buak -7625 +▁drma -7626 +▁ipol -7627 +▁lama -7628 +▁mary -7629 +▁mbbs -7630 +▁paga -7631 +▁raat -7632 +▁rait -7633 +▁solo -7634 +▁syur -7635 +▁wang -7636 +dangan -7637 +ebenen -7638 +ilingi -7639 +intaan -7640 +▁azraf -7641 +▁diego -7642 +▁elite 
-7643 +▁ensan -7644 +▁karos -7645 +▁keist -7646 +▁mantu -7647 +▁mewah -7648 +▁milau -7649 +▁mimun -7650 +▁mudah -7651 +▁nanda -7652 +▁niogo -7653 +▁palma -7654 +▁ramay -7655 +▁ribut -7656 +▁sepak -7657 +▁sesap -7658 +▁siapa -7659 +▁zarah -7660 +apuskan -7661 +hidupan -7662 +kantung -7663 +▁bedoom -7664 +▁binila -7665 +▁komedi -7666 +▁kompon -7667 +▁mariam -7668 +▁memtor -7669 +▁mexico -7670 +▁moktar -7671 +▁number -7672 +▁polans -7673 +▁binibit -7674 +▁bukhari -7675 +▁grabiti -7676 +▁klorida -7677 +▁nijajah -7678 +▁ningkoo -7679 +▁ningkot -7680 +▁pasukan -7681 +▁peminat -7682 +▁penamba -7683 +▁penanam -7684 +▁penereg -7685 +▁penerlu -7686 +▁pengara -7687 +▁pengelu -7688 +▁pengesi -7689 +▁peninda -7690 +▁penioto -7691 +▁penosok -7692 +▁penutup -7693 +▁penuyan -7694 +▁pepejal -7695 +▁perabot -7696 +▁perangi -7697 +▁peratus -7698 +▁perlemb -7699 +leu -7700 +tyt -7701 +icak -7702 +musa -7703 +usno -7704 +▁elk -7705 +▁jab -7706 +▁sos -7707 +▁uli -7708 +filem -7709 +frasa -7710 +gunan -7711 +kuala -7712 +mpuan -7713 +ordan -7714 +▁baik -7715 +▁bleu -7716 +▁idea -7717 +▁reka -7718 +▁sikh -7719 +▁suap -7720 +▁susa -7721 +▁ulun -7722 +biabas -7723 +daulat -7724 +engkap -7725 +kerbau -7726 +rantau -7727 +▁apong -7728 +▁arwah -7729 +▁beker -7730 +▁cicak -7731 +▁dinda -7732 +▁fairy -7733 +▁idaan -7734 +▁ismai -7735 +▁lakun -7736 +▁layak -7737 +▁lugam -7738 +▁memia -7739 +▁mulek -7740 +▁ngusa -7741 +▁sajak -7742 +▁salad -7743 +▁setio -7744 +▁tawau -7745 +▁trans -7746 +▁vocal -7747 +abahasa -7748 +aiannya -7749 +esional -7750 +polanie -7751 +▁buatni -7752 +▁buntar -7753 +▁disuap -7754 +▁istila -7755 +▁jordan -7756 +▁kesili -7757 +▁kijang -7758 +▁kilang -7759 +▁mahram -7760 +▁masker -7761 +▁ngogoh -7762 +▁niguno -7763 +▁polone -7764 +▁ramuan -7765 +▁rebana -7766 +▁rembau -7767 +▁rumpai -7768 +▁sempit -7769 +▁single -7770 +▁subhan -7771 +▁usukan -7772 +entengah -7773 +▁bekerjo -7774 +▁besukan -7775 +▁binatan -7776 +▁bugunan -7777 +▁jabatan -7778 +▁lagenda -7779 +▁menduga -7780 
+▁neronon -7781 +▁nungkus -7782 +▁perlumb -7783 +▁persimp -7784 +▁peselap -7785 +▁petenob -7786 +▁pewangi -7787 +▁pewaris -7788 +▁phoenix -7789 +▁pienpen -7790 +▁pikiran -7791 +▁pinadul -7792 +▁pinarut -7793 +▁pinelua -7794 +▁pinemia -7795 +▁pinisah -7796 +▁plateau -7797 +▁poloney -7798 +▁pongsot -7799 +dat -7800 +ebu -7801 +gor -7802 +kit -7803 +ood -7804 +daau -7805 +fiah -7806 +haba -7807 +jawi -7808 +▁ria -7809 +afood -7810 +angie -7811 +badah -7812 +beras -7813 +datin -7814 +ionis -7815 +latok -7816 +okrat -7817 +surah -7818 +tilau -7819 +▁beja -7820 +▁empu -7821 +▁igor -7822 +▁khut -7823 +▁laur -7824 +▁lime -7825 +▁muam -7826 +inggah -7827 +malays -7828 +zionis -7829 +▁aliff -7830 +▁angie -7831 +▁bambo -7832 +▁butir -7833 +▁diter -7834 +▁gagah -7835 +▁kapak -7836 +▁laris -7837 +▁misin -7838 +▁pangi -7839 +▁pihak -7840 +▁tabak -7841 +▁talua -7842 +▁tutup -7843 +inggian -7844 +▁anding -7845 +▁angkan -7846 +▁gasing -7847 +▁haisan -7848 +▁keladi -7849 +▁mekkah -7850 +▁nerbit -7851 +▁niundi -7852 +▁pejadi -7853 +▁persik -7854 +▁petuah -7855 +▁sofiah -7856 +▁songot -7857 +▁surang -7858 +▁termas -7859 +▁tertua -7860 +ingkasan -7861 +kitingan -7862 +petaling -7863 +terutama -7864 +▁antarab -7865 +▁bertapa -7866 +▁indiana -7867 +▁kelapan -7868 +▁khutbah -7869 +▁kristen -7870 +▁mastura -7871 +▁memanuk -7872 +▁penakai -7873 +▁pesakit -7874 +▁psidium -7875 +▁puppets -7876 +▁putatan -7877 +▁rafiaah -7878 +▁ranggup -7879 +▁ranking -7880 +▁rashidi -7881 +▁reakiti -7882 +▁regimes -7883 +▁rekaman -7884 +▁rencana -7885 +▁renddan -7886 +▁residen -7887 +▁retusan -7888 +▁riwayat -7889 +▁ronokan -7890 +▁rosnani -7891 +▁sabagai -7892 +▁salasai -7893 +▁salawat -7894 +▁saliran -7895 +▁saluran -7896 +▁samapai -7897 +▁samihah -7898 +▁sangkul -7899 +kuah -7900 +kuda -7901 +ngah -7902 +ngai -7903 +para -7904 +tiff -7905 +tuju -7906 +▁ema -7907 +▁lps -7908 +▁uci -7909 +bunga -7910 +eraat -7911 +ngena -7912 +watak -7913 +▁ipin -7914 +▁keji -7915 +▁miri -7916 +▁niah -7917 +▁taop -7918 
+▁tipu -7919 +ayanya -7920 +bahasa -7921 +endiom -7922 +kerang -7923 +sesapu -7924 +umbang -7925 +▁getak -7926 +▁halis -7927 +▁jihan -7928 +▁krabi -7929 +▁kunit -7930 +▁limbo -7931 +▁monok -7932 +▁najar -7933 +▁punsa -7934 +▁punya -7935 +▁pusal -7936 +▁surat -7937 +▁tampu -7938 +▁ -7939 +a -7940 +n -7941 +i -7942 +e -7943 +u -7944 +o -7945 +t -7946 +k -7947 +m -7948 +g -7949 +l -7950 +s -7951 +r -7952 +b -7953 +d -7954 +p -7955 +y -7956 +h -7957 +. -7958 +j -7959 +' -7960 +, -7961 +w -7962 +c -7963 +- -7964 +f -7965 +z -7966 +( -7967 +1 -7968 +v -7969 +) -7970 +2 -7971 +’ -7972 +" -7973 +0 -7974 +3 -7975 +5 -7976 +8 -7977 +/ -7978 +: -7979 +6 -7980 +4 -7981 +7 -7982 +9 -7983 +q -7984 +| -7985 +; -7986 +x -7987 +– -7988 +[ -7989 +] -7990 ++ -7991 +” -7992 +ا -7993 +ن -7994 +ي -7995 diff --git a/models/vocabulary/bdr_vocabulary.parquet b/models/vocabulary/bdr_vocabulary.parquet new file mode 100644 index 0000000000000000000000000000000000000000..1d858f2e685f3c14576346f592f87f0cfcc1b2b3 --- /dev/null +++ b/models/vocabulary/bdr_vocabulary.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a58092d4262f6aa43f6194c3af31f93b2c60e2e1f1457610992d6848f4156246 +size 38048 diff --git a/models/vocabulary/bdr_vocabulary_metadata.json b/models/vocabulary/bdr_vocabulary_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..52a3b0037c94416a49a958a4e25f746e4ee49106 --- /dev/null +++ b/models/vocabulary/bdr_vocabulary_metadata.json @@ -0,0 +1,16 @@ +{ + "language": "bdr", + "vocabulary_size": 2342, + "variant": "full", + "statistics": { + "type_token_ratio": 0.20221606648199447, + "coverage": { + "top_100": 0.4034075816795052, + "top_1000": 0.7593442871779305, + "top_5000": 0.9875156528668463 + }, + "hapax_count": 2987, + "hapax_ratio": 0.5605179208106587, + "total_documents": 538 + } +} \ No newline at end of file diff --git a/models/word_markov/bdr_markov_ctx1_word.parquet b/models/word_markov/bdr_markov_ctx1_word.parquet 
new file mode 100644 index 0000000000000000000000000000000000000000..84885f2f564cea689428dd685a3b713df54e7fc8 --- /dev/null +++ b/models/word_markov/bdr_markov_ctx1_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:50da15bb2d596e55b1af74da418be234ae20d7994f359665a73457c179ace2c4 +size 149999 diff --git a/models/word_markov/bdr_markov_ctx1_word_metadata.json b/models/word_markov/bdr_markov_ctx1_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..d653c59f86c3635d975017677961fd0720ce9ba7 --- /dev/null +++ b/models/word_markov/bdr_markov_ctx1_word_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 1, + "variant": "word", + "language": "bdr", + "unique_contexts": 5241, + "total_transitions": 25815 +} \ No newline at end of file diff --git a/models/word_markov/bdr_markov_ctx2_word.parquet b/models/word_markov/bdr_markov_ctx2_word.parquet new file mode 100644 index 0000000000000000000000000000000000000000..64cce2a2432736a5e20ed72623590b3ac1016b93 --- /dev/null +++ b/models/word_markov/bdr_markov_ctx2_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a70c0745d64f93f951581ccf1e770bc0198416279a90b1c76af931bc397bb65f +size 308032 diff --git a/models/word_markov/bdr_markov_ctx2_word_metadata.json b/models/word_markov/bdr_markov_ctx2_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..a9ab68e2b7f650b144fdd81c4d3372503fbcec0c --- /dev/null +++ b/models/word_markov/bdr_markov_ctx2_word_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 2, + "variant": "word", + "language": "bdr", + "unique_contexts": 18592, + "total_transitions": 25277 +} \ No newline at end of file diff --git a/models/word_markov/bdr_markov_ctx3_word.parquet b/models/word_markov/bdr_markov_ctx3_word.parquet new file mode 100644 index 0000000000000000000000000000000000000000..95296d6327d20062a3713eb320b73fe7103ab60e --- /dev/null +++ 
b/models/word_markov/bdr_markov_ctx3_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:223447078bb309ba5d758145f41c4571806689ffaf1efe8aef015c68c49c7526 +size 388681 diff --git a/models/word_markov/bdr_markov_ctx3_word_metadata.json b/models/word_markov/bdr_markov_ctx3_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..eaad942fd12b05dc2448572e1ec6ae37f5632681 --- /dev/null +++ b/models/word_markov/bdr_markov_ctx3_word_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 3, + "variant": "word", + "language": "bdr", + "unique_contexts": 22996, + "total_transitions": 24739 +} \ No newline at end of file diff --git a/models/word_markov/bdr_markov_ctx4_word.parquet b/models/word_markov/bdr_markov_ctx4_word.parquet new file mode 100644 index 0000000000000000000000000000000000000000..2dc281bc69a0450fc16fa86ba9173dc0ef1fa0ab --- /dev/null +++ b/models/word_markov/bdr_markov_ctx4_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66c6e307074a65e79883d656ec156325dab1be18c7c2082eab0c9665ad6e3133 +size 429090 diff --git a/models/word_markov/bdr_markov_ctx4_word_metadata.json b/models/word_markov/bdr_markov_ctx4_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..522f3c1dc8c0208d60fac080c6b335caff5919ba --- /dev/null +++ b/models/word_markov/bdr_markov_ctx4_word_metadata.json @@ -0,0 +1,7 @@ +{ + "context_size": 4, + "variant": "word", + "language": "bdr", + "unique_contexts": 23590, + "total_transitions": 24201 +} \ No newline at end of file diff --git a/models/word_ngram/bdr_2gram_word.parquet b/models/word_ngram/bdr_2gram_word.parquet new file mode 100644 index 0000000000000000000000000000000000000000..e85f1f9e73e4cd1b2389c506ecdc60d43b2d1015 --- /dev/null +++ b/models/word_ngram/bdr_2gram_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3f7dbbf4075f6494f4e948bc0417ec1b327b33cb3676c74c44c5cf7b15c76e93 
+size 7711 diff --git a/models/word_ngram/bdr_2gram_word_metadata.json b/models/word_ngram/bdr_2gram_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..00521872f8a6938d64c7658b7c01705528c55274 --- /dev/null +++ b/models/word_ngram/bdr_2gram_word_metadata.json @@ -0,0 +1,7 @@ +{ + "n": 2, + "variant": "word", + "language": "bdr", + "unique_ngrams": 401, + "total_ngrams": 25815 +} \ No newline at end of file diff --git a/models/word_ngram/bdr_3gram_word.parquet b/models/word_ngram/bdr_3gram_word.parquet new file mode 100644 index 0000000000000000000000000000000000000000..81f9346099779624d88ab89f1b2d38bda6e3b4a3 --- /dev/null +++ b/models/word_ngram/bdr_3gram_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03582a17ce01397bddd0d583a4eb8301b1d9ad4a0ebaa6b95bafe86a66321950 +size 6657 diff --git a/models/word_ngram/bdr_3gram_word_metadata.json b/models/word_ngram/bdr_3gram_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..ee5751abad9243facbfde204d19b4c8cd37b4faa --- /dev/null +++ b/models/word_ngram/bdr_3gram_word_metadata.json @@ -0,0 +1,7 @@ +{ + "n": 3, + "variant": "word", + "language": "bdr", + "unique_ngrams": 271, + "total_ngrams": 25277 +} \ No newline at end of file diff --git a/models/word_ngram/bdr_4gram_word.parquet b/models/word_ngram/bdr_4gram_word.parquet new file mode 100644 index 0000000000000000000000000000000000000000..d3731e16ae01bac8a6ae98361caea17bb3ce1fb5 --- /dev/null +++ b/models/word_ngram/bdr_4gram_word.parquet @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2b39ba29b04354ddf98ba4fefae173d2708834a6a0814a4f9cd3970bc844751a +size 8746 diff --git a/models/word_ngram/bdr_4gram_word_metadata.json b/models/word_ngram/bdr_4gram_word_metadata.json new file mode 100644 index 0000000000000000000000000000000000000000..8e945d9349e1f7d7e09c83a9668c802b929b32d5 --- /dev/null +++ 
b/models/word_ngram/bdr_4gram_word_metadata.json @@ -0,0 +1,7 @@ +{ + "n": 4, + "variant": "word", + "language": "bdr", + "unique_ngrams": 346, + "total_ngrams": 24739 +} \ No newline at end of file diff --git a/visualizations/embedding_isotropy.png b/visualizations/embedding_isotropy.png new file mode 100644 index 0000000000000000000000000000000000000000..8972f7c3c4fb9f1d26ad88ab6be0a39e789af237 Binary files /dev/null and b/visualizations/embedding_isotropy.png differ diff --git a/visualizations/embedding_norms.png b/visualizations/embedding_norms.png new file mode 100644 index 0000000000000000000000000000000000000000..49c86a34115df2676619f8711fc744d7d6f029ee Binary files /dev/null and b/visualizations/embedding_norms.png differ diff --git a/visualizations/embedding_similarity.png b/visualizations/embedding_similarity.png new file mode 100644 index 0000000000000000000000000000000000000000..5810a05c0985335b2c9f6356f45ba5635d23b51b --- /dev/null +++ b/visualizations/embedding_similarity.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:da4b54bf3df5c6b313bb7952ce15e1df9bbc5b23053604a4aeca175d4f7b1ffd +size 157285 diff --git a/visualizations/markov_branching.png b/visualizations/markov_branching.png new file mode 100644 index 0000000000000000000000000000000000000000..2a641544e4446af100fff29bb49ecfbf9c5023de Binary files /dev/null and b/visualizations/markov_branching.png differ diff --git a/visualizations/markov_contexts.png b/visualizations/markov_contexts.png new file mode 100644 index 0000000000000000000000000000000000000000..9b8740c46f9ae36768477195e16f005991e754fb Binary files /dev/null and b/visualizations/markov_contexts.png differ diff --git a/visualizations/markov_entropy.png b/visualizations/markov_entropy.png new file mode 100644 index 0000000000000000000000000000000000000000..89d1cd9a560e1d486c2c52008c04473c109cad6d Binary files /dev/null and b/visualizations/markov_entropy.png differ diff --git a/visualizations/model_sizes.png 
b/visualizations/model_sizes.png new file mode 100644 index 0000000000000000000000000000000000000000..0dfe9577e0abb5f16808136c9e96b1b1d830d5fa Binary files /dev/null and b/visualizations/model_sizes.png differ diff --git a/visualizations/nearest_neighbors.png b/visualizations/nearest_neighbors.png new file mode 100644 index 0000000000000000000000000000000000000000..f874656085e79a042049c53a951e1d0910ac739c Binary files /dev/null and b/visualizations/nearest_neighbors.png differ diff --git a/visualizations/ngram_coverage.png b/visualizations/ngram_coverage.png new file mode 100644 index 0000000000000000000000000000000000000000..2529804438b5003edc1f35421f919060b4846ae5 Binary files /dev/null and b/visualizations/ngram_coverage.png differ diff --git a/visualizations/ngram_entropy.png b/visualizations/ngram_entropy.png new file mode 100644 index 0000000000000000000000000000000000000000..40cdf443a88b44432e604e5d86e58615c06d8650 Binary files /dev/null and b/visualizations/ngram_entropy.png differ diff --git a/visualizations/ngram_perplexity.png b/visualizations/ngram_perplexity.png new file mode 100644 index 0000000000000000000000000000000000000000..5ae693827e0df4be329c760d68af061e2b888b5d Binary files /dev/null and b/visualizations/ngram_perplexity.png differ diff --git a/visualizations/ngram_unique.png b/visualizations/ngram_unique.png new file mode 100644 index 0000000000000000000000000000000000000000..028d249f873aeda7213b952ebf87d68b45ff068e Binary files /dev/null and b/visualizations/ngram_unique.png differ diff --git a/visualizations/performance_dashboard.png b/visualizations/performance_dashboard.png new file mode 100644 index 0000000000000000000000000000000000000000..deca3fefbe856f4aad1e426ee42978c5955b71ca --- /dev/null +++ b/visualizations/performance_dashboard.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8fe5f82b2d9c2ebbbc14caa6d6481784bd7680974b44c98b19117310baa170f0 +size 263632 diff --git 
a/visualizations/position_encoding_comparison.png b/visualizations/position_encoding_comparison.png new file mode 100644 index 0000000000000000000000000000000000000000..7ffcab49ca872d89f20e01b732c61a9ffbe0fef7 Binary files /dev/null and b/visualizations/position_encoding_comparison.png differ diff --git a/visualizations/tokenizer_compression.png b/visualizations/tokenizer_compression.png new file mode 100644 index 0000000000000000000000000000000000000000..d6ec6567911a593ac7b1c5f0bb7f560eab0b0b32 Binary files /dev/null and b/visualizations/tokenizer_compression.png differ diff --git a/visualizations/tokenizer_fertility.png b/visualizations/tokenizer_fertility.png new file mode 100644 index 0000000000000000000000000000000000000000..7de82e4007e20d38873385568ca2758d0c9477dc Binary files /dev/null and b/visualizations/tokenizer_fertility.png differ diff --git a/visualizations/tokenizer_oov.png b/visualizations/tokenizer_oov.png new file mode 100644 index 0000000000000000000000000000000000000000..7905999cd81b29923fb45f06e7fa14dfb7e82515 Binary files /dev/null and b/visualizations/tokenizer_oov.png differ diff --git a/visualizations/tokenizer_total_tokens.png b/visualizations/tokenizer_total_tokens.png new file mode 100644 index 0000000000000000000000000000000000000000..f9949382b4d45e0d1297b827fb3e26a094ccc588 Binary files /dev/null and b/visualizations/tokenizer_total_tokens.png differ diff --git a/visualizations/top20_words.png b/visualizations/top20_words.png new file mode 100644 index 0000000000000000000000000000000000000000..0232fb8afd0fed690fc5890f6e2accb61569e8fb Binary files /dev/null and b/visualizations/top20_words.png differ diff --git a/visualizations/tsne_sentences.png b/visualizations/tsne_sentences.png new file mode 100644 index 0000000000000000000000000000000000000000..007975e3029974520cda8be1dafcc8562624e567 --- /dev/null +++ b/visualizations/tsne_sentences.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:3355188555ae4d2cabb57b512f2bc70a7c0d876410c496de0f49af0cf9e857cf +size 293446 diff --git a/visualizations/tsne_words.png b/visualizations/tsne_words.png new file mode 100644 index 0000000000000000000000000000000000000000..f2a06bbf7b9872ee9f97ae496c85f0bdb4b5e954 --- /dev/null +++ b/visualizations/tsne_words.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1f8c844164335fff6ceaacaaf5acc297c88c72a5594d9e0f06b9b662f517a47a +size 423276 diff --git a/visualizations/vocab_coverage.png b/visualizations/vocab_coverage.png new file mode 100644 index 0000000000000000000000000000000000000000..cc4868b85341a4760803fdf60a1ab0ea6897a461 Binary files /dev/null and b/visualizations/vocab_coverage.png differ diff --git a/visualizations/vocab_freq_dist.png b/visualizations/vocab_freq_dist.png new file mode 100644 index 0000000000000000000000000000000000000000..0be5513ca9de03cdc0cbef886f79ad8d219ccc73 Binary files /dev/null and b/visualizations/vocab_freq_dist.png differ diff --git a/visualizations/zipf_law.png b/visualizations/zipf_law.png new file mode 100644 index 0000000000000000000000000000000000000000..6012486d3d4fc43f5675df0d5c34302bd5c14452 --- /dev/null +++ b/visualizations/zipf_law.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:46b9d20bda11d51362dd568d189c17d719db6723e5dff252fa1da96b39ac2ca6 +size 105166