
Cross_Lingual_Intent_Classifier

Overview

Cross_Lingual_Intent_Classifier is a Natural Language Processing (NLP) model for classifying user intents across multiple languages. It is fine-tuned on a multilingual dataset covering high-resource languages (English, Spanish, French, German) and medium-resource languages (Italian, Portuguese). Its core capability is zero-shot and few-shot transfer of classification knowledge across languages, which makes it well suited to global conversational AI and virtual-assistant applications.

Model Architecture

The model is based on the XLM-RoBERTa (XLM-R) architecture, specifically XLMRobertaForSequenceClassification.

  • Base Model: XLM-R large, pre-trained on 2.5TB of filtered CommonCrawl data in 100 languages.
  • Head: A classification head is applied to the final hidden state of the sequence-start token (<s>, XLM-R's analogue of [CLS]).
  • Training: Fine-tuned on a mixed-language intent classification corpus using a standard cross-entropy loss function. The multilingual pre-training allows the model to map sentences from different languages into a shared, semantically rich representation space, enabling cross-lingual generalization.
  • Intents: Currently supports 6 core conversational intents: Book_Flight, Get_Weather, Find_POI, Set_Reminder, Control_Device, and General_Query. A minimal loading-and-inference sketch follows this list.
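
The sketch below shows how a checkpoint like this would typically be loaded and queried with the transformers library. It is an illustration, not an official quickstart: the repo id your-org/Cross_Lingual_Intent_Classifier is a placeholder, and the label order in INTENTS is an assumption that should be checked against the checkpoint's id2label mapping.

```python
# Minimal inference sketch (assumptions: placeholder repo id, label order).
import torch
from transformers import AutoTokenizer, XLMRobertaForSequenceClassification

MODEL_ID = "your-org/Cross_Lingual_Intent_Classifier"  # hypothetical repo id

# The six intents listed above; assumed to match the model's label order.
INTENTS = ["Book_Flight", "Get_Weather", "Find_POI",
           "Set_Reminder", "Control_Device", "General_Query"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = XLMRobertaForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

def classify(utterance: str) -> str:
    """Return the predicted intent label for a single utterance."""
    inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return INTENTS[logits.argmax(dim=-1).item()]

print(classify("Book me a flight to Berlin tomorrow morning"))
# Expected (if the checkpoint behaves as described): Book_Flight
```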

Intended Use

This model is intended for the following use cases:

  • Multilingual Chatbots: Powering virtual assistants that need to understand user intent regardless of the input language, without requiring a separate model for each language.
  • Zero-Shot Intent Transfer: Applying the model to a new language (not seen during fine-tuning) with reasonable performance, thanks to its multilingual pre-training (see the sketch after this list).
  • Cross-Lingual Evaluation: Benchmarking cross-lingual understanding capabilities in various NLP research projects.
  • Data Labeling: Automated classification of large volumes of multilingual customer service queries or voice commands.
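
To make the zero-shot transfer use case concrete, the sketch below runs the (placeholder) checkpoint on Dutch and Swedish utterances via the text-classification pipeline; both languages are assumed here not to have been part of the fine-tuning corpus.

```python
# Zero-shot cross-lingual sketch; repo id is a placeholder.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-org/Cross_Lingual_Intent_Classifier",  # hypothetical repo id
)

utterances = [
    "Wat is het weer morgen in Amsterdam?",  # Dutch: a weather query
    "Ställ in en påminnelse klockan åtta",   # Swedish: set a reminder
]

for text, result in zip(utterances, classifier(utterances)):
    print(f"{text!r} -> {result['label']} ({result['score']:.2f})")
```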

Limitations

  • Low-Resource Languages: While cross-lingual, performance degrades significantly for very low-resource or highly divergent languages not well-represented in the XLM-R pre-training corpus (e.g., certain African or indigenous languages).
  • Domain Shift: The model's performance may decrease if the intents or the conversational domain are highly specialized and differ greatly from the general-purpose intents it was trained on.
  • Length Constraint: Input sequences are capped at 512 tokens (XLM-R's positional limit). Very long, multi-sentence utterances will be truncated; the snippet after this list shows how to apply the truncation explicitly.
  • Dialect and Code-Switching: The model handles standard, clean text better than heavily dialectal language or instances of complex code-switching (mixing two languages in one sentence).
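
A small sketch of guarding against the 512-token cap (again with a placeholder repo id): truncating at tokenization time ensures the model is never handed an over-length sequence.

```python
# Explicit truncation to XLM-R's 512-token limit; repo id is a placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/Cross_Lingual_Intent_Classifier")

long_utterance = " ".join(["please set a reminder"] * 300)  # far over 512 tokens
inputs = tokenizer(
    long_utterance,
    truncation=True,   # drop tokens beyond max_length
    max_length=512,    # XLM-R's usable sequence length
    return_tensors="pt",
)
print(inputs["input_ids"].shape)  # torch.Size([1, 512])
```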