Commit 728554d by latofat (verified) · 1 Parent(s): dcafb61

Update README.md

Files changed (1)
  1. README.md +28 -1
README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # uzpostagger-cyrillic-3
 
- This model is a fine-tuned version of [coppercitylabs/uzbert-base-uncased](https://huggingface.co/coppercitylabs/uzbert-base-uncased) on an unknown dataset.
+ This model is a fine-tuned version of [coppercitylabs/uzbert-base-uncased](https://huggingface.co/coppercitylabs/uzbert-base-uncased) on [uzbekpos](https://huggingface.co/datasets/latofat/uzbekpos) dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.2715
  - Precision: 0.8763
@@ -68,3 +68,30 @@ The following hyperparameters were used during training:
  - Pytorch 2.2.0
  - Datasets 2.17.1
  - Tokenizers 0.13.3
+
+ ## Citation Information
+ ```
+ @inproceedings{bobojonova-etal-2025-bbpos,
+ title = "{BBPOS}: {BERT}-based Part-of-Speech Tagging for {U}zbek",
+ author = "Bobojonova, Latofat and
+ Akhundjanova, Arofat and
+ Ostheimer, Phil Sidney and
+ Fellenz, Sophie",
+ editor = "Hettiarachchi, Hansi and
+ Ranasinghe, Tharindu and
+ Rayson, Paul and
+ Mitkov, Ruslan and
+ Gaber, Mohamed and
+ Premasiri, Damith and
+ Tan, Fiona Anting and
+ Uyangodage, Lasitha",
+ booktitle = "Proceedings of the First Workshop on Language Models for Low-Resource Languages",
+ month = jan,
+ year = "2025",
+ address = "Abu Dhabi, United Arab Emirates",
+ publisher = "Association for Computational Linguistics",
+ url = "https://aclanthology.org/2025.loreslm-1.23/",
+ pages = "287--293",
+ abstract = "This paper advances NLP research for the low-resource Uzbek language by evaluating two previously untested monolingual Uzbek BERT models on the part-of-speech (POS) tagging task and introducing the first publicly available UPOS-tagged benchmark dataset for Uzbek. Our fine-tuned models achieve 91{\%} average accuracy, outperforming the baseline multi-lingual BERT as well as the rule-based tagger. Notably, these models capture intermediate POS changes through affixes and demonstrate context sensitivity, unlike existing rule-based taggers."
+ }
+ ```