---
license: unlicense
---
# OkayLID
OkayLID is a fastText language identification model that is only 3 megabytes in size, meant for basic language detection. Despite its extremely small size, it can detect over 201 languages. OkayLID was trained on a smaller subset of the OpenLID dataset.

## Installation

```bash
pip install fasttext huggingface_hub
```

## Usage

```python
import numpy as np
import fasttext
from huggingface_hub import hf_hub_download

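def setup_environment():
    # Workaround for NumPy 2.x, where np.array(..., copy=False) raises an
    # error whenever a copy is required; fasttext's predict() makes that call.
    original_array = np.array
    def fixed_array(obj, *args, **kwargs):
        # Route copy=False calls to np.asarray, which copies only when needed.
        if kwargs.get("copy") is False:
            return np.asarray(obj)
        return original_array(obj, *args, **kwargs)
    np.array = fixed_array

# Apply the patch before calling model.predict().
setup_environment()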

model_path = hf_hub_download(repo_id="Cutecat6152/OkayLID", filename="OkayLID.bin")
model = fasttext.load_model(model_path)

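text = "The quick brown fox jumps over the lazy dog."
# predict() returns the top-k labels and their probabilities, in matching order.
labels, probs = model.predict(text, k=1)

# Strip the fastText "__label__" prefix to get the language code.
print(f"Language: {labels[0].replace('__label__', '')}")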
```
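
The fastText Python bindings process one line at a time, so `predict` raises a `ValueError` if the input contains a newline. The following is a minimal sketch of a wrapper that collapses whitespace first and returns several candidates with their probabilities. The helper names `clean_text` and `detect_languages` are hypothetical and not part of OkayLID or fastText; `model` is assumed to be the object loaded in the example above.

```python
def clean_text(text):
    """Collapse all whitespace, including newlines, into single spaces."""
    return " ".join(text.split())


def detect_languages(model, text, k=3):
    """Return up to k (language, probability) pairs, most likely first.

    Hypothetical helper: `model` is any object with a fastText-style
    predict(text, k=...) method returning (labels, probabilities).
    """
    labels, probs = model.predict(clean_text(text), k=k)
    return [
        (label.replace("__label__", ""), float(prob))
        for label, prob in zip(labels, probs)
    ]
```

With the model from the Usage section, `detect_languages(model, "Bonjour\ntout le monde", k=3)` would return the three most likely languages for the text with the line break replaced by a space.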