---
license: apache-2.0
language:
- vi
- en
- zh
- ja
- ko
- fr
- de
- es
- th
- lo
- km
tags:
- language-detection
- language-identification
- vietnamese
- multilingual
library_name: underthesea
pipeline_tag: text-classification
metrics:
- accuracy
- f1
---

# Radar-1

Radar-1 is a language detection model developed by UnderTheSea NLP.

## Model Description

- **Model Type:** Language Detection (Text Classification)
- **Task:** Identify the language of input text
- **Languages:** Multilingual (11 languages, listed under Supported Languages)
- **License:** Apache 2.0

## Supported Languages

| Code | Language |
|------|----------|
| vi | Vietnamese |
| en | English |
| zh | Chinese |
| ja | Japanese |
| ko | Korean |
| fr | French |
| de | German |
| es | Spanish |
| th | Thai |
| lo | Lao |
| km | Khmer |
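
When showing detection results to users, it can help to turn the returned code into a readable name. The snippet below is a minimal sketch built only from the table above; `LANGUAGE_NAMES` and `language_name` are hypothetical helpers for illustration and are not part of `underthesea` or `radar`.

```python
# Hypothetical lookup table mirroring the Supported Languages table.
LANGUAGE_NAMES = {
    "vi": "Vietnamese",
    "en": "English",
    "zh": "Chinese",
    "ja": "Japanese",
    "ko": "Korean",
    "fr": "French",
    "de": "German",
    "es": "Spanish",
    "th": "Thai",
    "lo": "Lao",
    "km": "Khmer",
}


def language_name(code: str) -> str:
    """Return the display name for a code, or the code itself if it is not in the table."""
    return LANGUAGE_NAMES.get(code, code)


print(language_name("vi"))  # Vietnamese
print(language_name("xx"))  # xx (unknown codes pass through unchanged)
```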

## Installation

```bash
pip install underthesea
```

## Usage

```python
from underthesea import lang_detect

text = "Xin chào, tôi là người Việt Nam"
language = lang_detect(text)
print(language)  # vi
```

## API

```python
from radar import RadarLangDetector, detect

# Quick detection
lang = detect("Hello world")
print(lang)  # en

# With confidence scores
detector = RadarLangDetector.load("models/radar-1")
result = detector.predict("Xin chào Việt Nam")
print(result.lang)   # vi
print(result.score)  # 0.98
```
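
Because `predict` returns a score alongside the language code, callers may want to reject low-confidence predictions. The following is a minimal sketch, assuming only the `.lang` and `.score` attributes shown above and that a higher score means higher confidence. The `accept` helper, the `FakeResult` stand-in, and the `0.5` default threshold are hypothetical and not part of the `radar` API.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class Prediction(Protocol):
    """Shape assumed from the API snippet: a result exposing .lang and .score."""

    lang: str
    score: float


def accept(result: Prediction, min_score: float = 0.5) -> Optional[str]:
    """Return result.lang if its score reaches min_score, otherwise None."""
    return result.lang if result.score >= min_score else None


@dataclass
class FakeResult:
    """Stand-in for a detector result so the sketch runs without the model."""

    lang: str
    score: float


print(accept(FakeResult("vi", 0.98)))  # vi
print(accept(FakeResult("en", 0.30)))  # None
```

A suitable threshold depends on the input: short or mixed-language text generally calls for more caution than full sentences, so the default here is only a placeholder.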

## Training

```bash
python src/train.py
```

## Technical Report

See [TECHNICAL_REPORT.md](TECHNICAL_REPORT.md) for detailed methodology and evaluation.
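
The card's metadata lists accuracy and F1 as metrics. As a rough illustration of how these are commonly computed for language identification, the sketch below implements both in plain Python. Macro-averaging is an assumption made here, and the function names and sample labels are hypothetical; the technical report remains the reference for how the reported numbers were actually produced.

```python
from typing import Sequence


def _check(gold: Sequence[str], pred: Sequence[str]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold label."""
    _check(gold, pred)
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-language F1 over every label seen in gold or pred."""
    _check(gold, pred)
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        # Each label occurs in gold or pred, so the denominator is never zero.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Made-up labels for illustration only, not evaluation data for Radar-1.
gold = ["vi", "vi", "en", "th"]
pred = ["vi", "en", "en", "th"]
print(accuracy(gold, pred))  # 0.75
print(round(macro_f1(gold, pred), 4))
```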