Radar-1
Radar-1 is a language detection model developed by UnderTheSea NLP.
Model Description
- Model Type: Language Detection (Text Classification)
- Task: Identify the language of input text
- Languages: Multilingual (11 languages; see Supported Languages)
- License: Apache 2.0
Supported Languages
| Code | Language |
|---|---|
| vi | Vietnamese |
| en | English |
| zh | Chinese |
| ja | Japanese |
| ko | Korean |
| fr | French |
| de | German |
| es | Spanish |
| th | Thai |
| lo | Lao |
| km | Khmer |
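When consuming the detector's output downstream, it can help to map codes back to readable names and reject anything outside the table. The following is a minimal sketch, not part of the Radar-1 package: `SUPPORTED_LANGUAGES` and `language_name` are hypothetical names introduced here, and the mapping is simply copied from the table above.

```python
# Hypothetical helper, not shipped with Radar-1: mirrors the table above.
SUPPORTED_LANGUAGES = {
    "vi": "Vietnamese",
    "en": "English",
    "zh": "Chinese",
    "ja": "Japanese",
    "ko": "Korean",
    "fr": "French",
    "de": "German",
    "es": "Spanish",
    "th": "Thai",
    "lo": "Lao",
    "km": "Khmer",
}


def language_name(code: str) -> str:
    """Return the readable name for a code, or raise if it is not in the table."""
    try:
        return SUPPORTED_LANGUAGES[code]
    except KeyError:
        raise ValueError(f"unsupported language code: {code!r}") from None


print(language_name("vi"))
```

Raising on unknown codes, rather than returning a default, makes it obvious when a result falls outside the documented language set.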
Installation
pip install underthesea
Usage
from underthesea import lang_detect

# Detect the language of a single string; the result is a code from the table above
text = "Xin chào, tôi là người Việt Nam"
language = lang_detect(text)
print(language)  # vi
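For more than one string, the same call can be applied in a loop. The sketch below tallies detected codes over a list of texts. It assumes only that the detector is a callable taking a string and returning a language code, as `lang_detect` does in the example above. `count_languages` and the stand-in `demo_detect` are hypothetical names used so the snippet runs without the model installed.

```python
from collections import Counter
from typing import Callable, Iterable


def count_languages(texts: Iterable[str], detect_fn: Callable[[str], str]) -> Counter:
    """Tally detected language codes, skipping blank strings."""
    counts: Counter = Counter()
    for text in texts:
        if not text.strip():
            continue  # nothing to classify
        counts[detect_fn(text)] += 1
    return counts


# Stand-in detector for illustration only (a fixed lookup, not a real model).
# In practice, pass underthesea's lang_detect instead.
_DEMO_LABELS = {"Xin chào": "vi", "Hello world": "en", "Good morning": "en"}


def demo_detect(text: str) -> str:
    return _DEMO_LABELS[text]


counts = count_languages(["Xin chào", "Hello world", "  ", "Good morning"], demo_detect)
print(counts["en"], counts["vi"])
```

Passing the detector in as an argument keeps the helper independent of which entry point is used (`lang_detect` or the `radar` API).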
API
from radar import RadarLangDetector, detect

# Quick detection
lang = detect("Hello world")
print(lang)  # en

# With confidence scores
detector = RadarLangDetector.load("models/radar-1")
result = detector.predict("Xin chào Việt Nam")
print(result.lang)  # vi
print(result.score)  # 0.98
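The confidence score makes it possible to discard uncertain predictions. Here is a minimal sketch of that idea, assuming only that a prediction exposes `lang` and `score` as in the example above. The `Prediction` dataclass is a hypothetical stand-in for the object returned by `detector.predict`, and both `accept_or_unknown` and the 0.8 threshold are illustrative choices, not values from the Radar-1 documentation.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical stand-in for the result of detector.predict."""
    lang: str
    score: float


def accept_or_unknown(pred, threshold: float = 0.8) -> str:
    """Return the predicted code, or "unknown" when confidence is below threshold."""
    return pred.lang if pred.score >= threshold else "unknown"


print(accept_or_unknown(Prediction(lang="vi", score=0.98)))
print(accept_or_unknown(Prediction(lang="lo", score=0.41)))
```

A suitable threshold depends on the application; short or mixed-language inputs may warrant a stricter cutoff.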
Training
python src/train.py
Technical Report
See TECHNICAL_REPORT.md for detailed methodology and evaluation.