Create README.md
README.md
ADDED
@@ -0,0 +1,34 @@
---
datasets:
- mesolitica/language-detection-dataset
---
# fasttext-language-detection-v2

A fastText model that classifies text as standard English, local English, standard Malay, social media Indonesian, local Malay, or other.

## How to use

```python
from huggingface_hub import hf_hub_download
import fasttext

# Download the compressed model file from the Hugging Face Hub.
filename = hf_hub_download(
    repo_id="mesolitica/fasttext-language-detection-v2",
    filename="fasttext.ftz"
)
lang_model = fasttext.load_model(filename)
# Request up to 10 labels, ranked by probability.
lang_model.predict('hello my name', k=10)
```

Output:

```python
(('__label__standard-english',
  '__label__local-english',
  '__label__standard-malay',
  '__label__socialmedia-indonesian',
  '__label__local-malay',
  '__label__other'),
 array([9.12180483e-01, 4.69220504e-02, 4.03920077e-02, 5.50693308e-04,
        1.30474637e-05, 1.07987826e-05]))
```
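
The result is a pair of parallel sequences: labels, then probabilities in the same order. A minimal sketch of turning that pair into something easier to work with is shown here. The helper names `to_label_scores` and `top_label` are hypothetical and not part of fastText or this repository, and the sample values are copied from the output above, so the snippet runs without downloading the model.

```python
from typing import Dict, Sequence, Tuple

LABEL_PREFIX = "__label__"  # prefix seen on every label in the output above


def to_label_scores(labels: Sequence[str], probs: Sequence[float]) -> Dict[str, float]:
    """Map each label (with the prefix stripped) to its probability.

    Hypothetical helper for illustration only.
    """
    scores = {}
    for label, prob in zip(labels, probs):
        if label.startswith(LABEL_PREFIX):
            label = label[len(LABEL_PREFIX):]
        scores[label] = float(prob)
    return scores


def top_label(labels: Sequence[str], probs: Sequence[float]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    scores = to_label_scores(labels, probs)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Sample values copied from the README output; in practice these would be
# the two items returned by lang_model.predict(text, k=10).
labels = (
    "__label__standard-english",
    "__label__local-english",
    "__label__standard-malay",
    "__label__socialmedia-indonesian",
    "__label__local-malay",
    "__label__other",
)
probs = [9.12180483e-01, 4.69220504e-02, 4.03920077e-02,
         5.50693308e-04, 1.30474637e-05, 1.07987826e-05]

scores = to_label_scores(labels, probs)
best, confidence = top_label(labels, probs)
print(best, round(confidence, 3))  # prints: standard-english 0.912
```

When calling `predict` on real input, note that the fastText Python binding processes one line at a time and raises a `ValueError` if the text contains a newline, so multi-line text should have its newlines replaced (for example with spaces) first.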