lang-uk/fasttext_uk

Feature Extraction

dchaplinsky committed on Nov 6, 2022

Commit 3fdd67d · 1 parent: eb31495

Update README.md

Files changed (1)

README.md +15 -0

@@ -3,8 +3,23 @@ license: mit
 tags:
 - feature-extraction
 library_name: generic
 ---
 Usage
 ```
 import fasttext.util

 tags:
 - feature-extraction
 library_name: generic
+datasets:
+- ubertext2.0
+widget:
+- text: "доброго вечора ми з україни"
 ---
+_name_ is a set of pre-trained word vectors for the Ukrainian language, trained with fastText on the (as yet unreleased) UberText2.0 dataset and released by [lang-uk](https://lang.org.ua/en/). The model was trained using skipgram with 300 dimensions, character n-grams of length 2 to 5, and 15 negative samples.
+Our model improves accuracy on the word analogy task by 6.3% compared to the [Facebook Ukrainian word vectors](https://fasttext.cc/docs/en/crawl-vectors.html). The Ukrainian word analogy dataset is available [here](https://github.com/lang-uk/vecs/).
+Extrinsic evaluations were performed on two sequence labeling tasks: NER and POS tagging. The NER-UK dataset was released by lang-uk, and the Ukrainian (UD) corpus was developed by the non-profit organization Institute for Ukrainian.
+Results:
+1) spaCy NER F-score 0.818
+2) POS Flair Accuracy 0.824
+3) POS spaCy Accuracy 0.911
 Usage
 ```
 import fasttext.util