Jamshid Ahmadov
committed on
Update README.md
README.md
CHANGED
|
@@ -1,8 +1,14 @@
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
| 3 |
---
|
| 4 |
|
| 5 |
-
# Tokenizer for
|
| 6 |
|
| 7 |
## Introduction
|
| 8 |
This tokenizer is based on data from the Mozilla Common Voice dataset: 130,000 sentences from the train and validated splits.
|
|
@@ -32,5 +38,4 @@ print(tokens)
|
|
| 32 |
The Common Voice 17.0 dataset is multilingual and also supports the Uzbek language.
|
| 33 |
|
| 34 |
## Contact
|
| 35 |
-
[Jamshid Ahmadov](https://www.linkedin.com/in/jamshid-ds)
|
| 36 |
-
|
|
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
+
datasets:
|
| 4 |
+
- mozilla-foundation/common_voice_17_0
|
| 5 |
+
language:
|
| 6 |
+
- uz
|
| 7 |
+
base_model:
|
| 8 |
+
- FacebookAI/xlm-roberta-base
|
| 9 |
---
|
| 10 |
|
| 11 |
+
# Tokenizer for Uzbek Language
|
| 12 |
|
| 13 |
## Introduction
|
| 14 |
This tokenizer is based on data from the Mozilla Common Voice dataset: 130,000 sentences from the train and validated splits.
|
|
|
|
| 38 |
The Common Voice 17.0 dataset is multilingual and also supports the Uzbek language.
|
| 39 |
|
| 40 |
## Contact
|
| 41 |
+
[Jamshid Ahmadov](https://www.linkedin.com/in/jamshid-ds)
|
|
|