DerivedFunction/polyglot-tagger-60L-Experimental

Token Classification

language-detection

language-identification


DerivedFunction committed 3 days ago

Commit 56f4476 · verified · 1 Parent(s): 5c137a4

Update README.md

Files changed (1)

README.md +3 -3


@@ -117,12 +117,12 @@ The data composition follows a strategic curriculum:
 * **10% Mixed with Noise:** Integration of "neutral" spans including code snippets, mathematical notation, emojis, symbols, and `rot_13` text tagged as `O` or their respective source to reduce hallucination.
 ### Supported Languages and Limitations (60)
-The model supports the following ISO-coded languages. Note that Romanized versions of any language is not included in the training set, such as Romanized Russian, and Hindi:
 `af, am, ar, as, be, bg, bn, cs, da, de, el, en, es, fa, fi, fr, gu, he, hi, hu, hy, id, is, it, ja, ka, kk, km, kn, ko, la, lo, ml, mk, mn, mr, ms, my, nl, no, or, pa, pl, ps, pt, ro, ru, sd, sq, sr, sv, ta, te, th, tr, ug, uk, ur, vi, zh`
-### The model scored the following on `papulca/language-identification's test set
 |Language     | Correct  |  Total     | Accuracy    |
 |-------------|----------|-------------|--------|
 |ar           | 114     |   114       |      100.0% |

 * **10% Mixed with Noise:** Integration of "neutral" spans including code snippets, mathematical notation, emojis, symbols, and `rot_13` text tagged as `O` or their respective source to reduce hallucination.
 ### Supported Languages and Limitations (60)
+The model supports the following ISO-coded languages:
 `af, am, ar, as, be, bg, bn, cs, da, de, el, en, es, fa, fi, fr, gu, he, hi, hu, hy, id, is, it, ja, ka, kk, km, kn, ko, la, lo, ml, mk, mn, mr, ms, my, nl, no, or, pa, pl, ps, pt, ro, ru, sd, sq, sr, sv, ta, te, th, tr, ug, uk, ur, vi, zh`
+> Note that Romanized versions of languages, such as Romanized Russian and Hindi, are not included in the training set.
+### The model scored the following on `papulca/language-identification`'s test set
 |Language     | Correct  |  Total     | Accuracy    |
 |-------------|----------|-------------|--------|
 |ar           | 114     |   114       |      100.0% |
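
The `Correct`, `Total`, and `Accuracy` columns in the table above can presumably be derived from gold and predicted language labels. The following is a minimal sketch of that tally, not the author's evaluation script: the function name `per_language_accuracy` and the sample label lists are hypothetical, and it assumes one predicted label per test example.

```python
from collections import Counter


def per_language_accuracy(gold, predicted):
    """Return {language: (correct, total, accuracy_percent)}.

    `gold` and `predicted` are equal-length sequences of language codes.
    This helper is hypothetical and only illustrates the table's columns.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    total = Counter(gold)
    correct = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {
        lang: (correct[lang], n, round(100.0 * correct[lang] / n, 1))
        for lang, n in sorted(total.items())
    }


# Hypothetical labels shaped like the `ar` row: 114 examples, all correct.
gold = ["ar"] * 114
predicted = ["ar"] * 114

for lang, (c, n, acc) in per_language_accuracy(gold, predicted).items():
    print(f"|{lang} | {c} | {n} | {acc}% |")
```

Because each row is keyed by the gold label, a language with few test examples can swing by several percentage points on a single error, so `Total` is worth reading alongside `Accuracy`.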