# Cross_Lingual_Intent_Classifier

## Overview

This model, **Cross_Lingual_Intent_Classifier**, is a state-of-the-art Natural Language Processing (NLP) model designed for classifying user intents across multiple languages. It is trained on a massive multilingual dataset encompassing high-resource languages (English, Spanish, French, German) and medium-resource languages (Italian, Portuguese). The core capability of this model is zero-shot or few-shot transfer of classification knowledge across languages, making it highly valuable for global conversational AI and virtual assistant applications.

## Model Architecture

The model is based on the **XLM-RoBERTa (XLM-R)** architecture, specifically **XLMRobertaForSequenceClassification**.

* **Base Model:** XLM-R Large, pre-trained on 2.5 TB of filtered CommonCrawl data in 100 languages.
* **Head:** A classification head (a dense layer with tanh activation followed by a linear output projection) applied to the last hidden state's representation of the first token (`<s>`, XLM-R's equivalent of BERT's `[CLS]`).
* **Training:** Fine-tuned on a mixed-language intent classification corpus using a standard cross-entropy loss function. The multilingual pre-training allows the model to map sentences from different languages into a shared, semantically rich representation space, enabling cross-lingual generalization.
* **Intents:** Currently supports 6 core conversational intents: `Book_Flight`, `Get_Weather`, `Find_POI`, `Set_Reminder`, `Control_Device`, and `General_Query`.
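
At inference time, the head produces one logit per intent; a minimal sketch of decoding those six logits into a label is shown below. The label order and the example logits are assumptions for illustration only — the real mapping lives in the model's `id2label` config.

```python
import math

# Label order is an assumption for illustration; the real mapping
# is defined by the model's id2label config.
INTENTS = ["Book_Flight", "Get_Weather", "Find_POI",
           "Set_Reminder", "Control_Device", "General_Query"]

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_intent(logits):
    """Map the classification head's six logits to (label, confidence)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return INTENTS[best], probs[best]

# Illustrative logits, e.g. for a Spanish utterance "Reserva un vuelo":
label, confidence = decode_intent([4.1, 0.3, -1.2, 0.0, -0.5, 1.7])
```

Because the shared representation space is language-agnostic, the same decoding step applies regardless of the input language.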

## Intended Use

This model is intended for the following use cases:

* **Multilingual Chatbots:** Powering virtual assistants that need to understand user intent regardless of the input language, without requiring a separate model for each language.
* **Zero-Shot Intent Transfer:** Using the model in a new language (not seen during fine-tuning) with reasonable performance due to its multilingual pre-training.
* **Cross-Lingual Evaluation:** Benchmarking cross-lingual understanding capabilities in various NLP research projects.
* **Data Labeling:** Automated classification of large volumes of multilingual customer service queries or voice commands.
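
In the chatbot use case above, the predicted label — not the input language — drives downstream routing, so one dispatch table serves every language. The sketch below illustrates this; the handler names and return strings are hypothetical, not part of the model.

```python
# Hypothetical handlers; in a real assistant each would invoke
# the corresponding backend flow.
def book_flight(utterance: str) -> str:
    return "routing to flight-booking flow"

def general_query(utterance: str) -> str:
    return "routing to open-domain fallback"

# Dispatch table keyed by the classifier's predicted intent label.
# Intents without a dedicated handler fall back to general_query.
HANDLERS = {
    "Book_Flight": book_flight,
    "General_Query": general_query,
}

def route(utterance: str, predicted_intent: str) -> str:
    """Route an utterance by its predicted intent label.

    Language-agnostic: a French or Spanish utterance classified as
    Book_Flight reaches the same handler as an English one.
    """
    handler = HANDLERS.get(predicted_intent, general_query)
    return handler(utterance)

reply = route("Réserve un vol pour Paris", "Book_Flight")
```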

## Limitations

* **Low-Resource Languages:** While cross-lingual, performance degrades significantly for very low-resource or highly divergent languages not well-represented in the XLM-R pre-training corpus (e.g., certain African or indigenous languages).
* **Domain Shift:** The model's performance may decrease if the intents or the conversational domain are highly specialized and differ greatly from the general-purpose intents it was trained on.
* **Length Constraint:** Like most Transformer models, input sequences are typically capped at 512 tokens. Very long, multi-sentence utterances will be truncated.
* **Dialect and Code-Switching:** The model handles standard, clean text better than heavily dialectal language or instances of complex code-switching (mixing two languages in one sentence).
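
The length constraint can be sketched as follows. Real tokenizers handle this automatically (e.g. via `truncation=True` in Hugging Face tokenizers); the whitespace split here is a crude stand-in for XLM-R's subword tokenization, used only to keep the example self-contained.

```python
MAX_LEN = 512  # XLM-R's usual maximum sequence length

def truncate(tokens, max_len=MAX_LEN):
    """Drop tokens beyond the budget, reserving two slots for the
    <s> and </s> special tokens XLM-R adds around the input."""
    return tokens[: max_len - 2]

# Whitespace split as a stand-in for subword tokenization:
long_utterance = ("please set a reminder " * 200).split()  # 800 tokens
kept = truncate(long_utterance)  # everything past the budget is lost
```

Any intent cues that appear only in the truncated tail are invisible to the classifier, which is why very long utterances should be segmented before classification.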