---
language: ["ko", "en", "ja", "zh", "es", "fr", "de", "pt", "it", "ru", "ar", "hi", "th", "vi", "id", "tr", "nl", "pl"]
tags:
- sentence-transformers
- intent-classification
- multilingual
- distillation
- layer-pruning
library_name: sentence-transformers
pipeline_tag: sentence-similarity
license: apache-2.0
---

# Intent Classifier Student: L2_ends

Distilled multilingual sentence encoder for intent classification (Action / Recall / Other). Created by **layer pruning** from `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`.

## Model Details

| Property | Value |
|----------|-------|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 2 (of 12) |
| Layer indices | [0, 11] |
| Strategy | 2 layers, first + last (minimal) |
| Est. params | 99,741,312 |
| Est. FP32 size | 380.5 MB |
| Est. INT8 size | 95.1 MB |
| Est. INT8 + vocab pruned | 23.7 MB |

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

## Intended Use

This is a **student encoder** intended as the backbone of a lightweight 3-class intent classifier (Action / Recall / Other) for multilingual dialogue systems.

- **Action**: the user requests an action (book, order, change settings, etc.)
- **Recall**: the user asks about past events or stored information
- **Other**: greetings, chitchat, emotions, etc.
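As a sketch of that intended use, a linear head can be trained on top of the encoder's 384-dim sentence embeddings. The snippet below uses scikit-learn's `LogisticRegression` and random vectors as stand-ins for real `model.encode(...)` outputs so it runs without downloading anything; the labels and the head are illustrative and are not shipped with this model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in embeddings: in practice these would come from
# SentenceTransformer("L2_ends").encode(utterances), with shape (n, 384).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 384))
y_train = np.repeat(["Action", "Recall", "Other"], 10)

# Lightweight 3-class head on top of the frozen student encoder.
head = LogisticRegression(max_iter=1000)
head.fit(X_train, y_train)

X_new = rng.normal(size=(2, 384))   # embeddings of two new utterances
pred = head.predict(X_new)
```

Because the encoder stays frozen, only the head (3 × 384 weights plus biases) needs training data, which keeps the classifier cheap to fit and deploy.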
## Usage

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L2_ends")
# "Book a reservation for me", "What was my last order?", "Hello"
embeddings = model.encode(["예약 좀 해줘", "지난번 주문 뭐였지?", "안녕하세요"])
print(embeddings.shape)  # (3, 384)
```

## MTEB Results

### MassiveIntentClassification

**Average: 49.8%**

| Language | Score |
|----------|-------|
| ar | 42.22% |
| en | 56.13% |
| es | 48.54% |
| ko | 52.31% |

### MassiveScenarioClassification

**Average: 52.47%**

| Language | Score |
|----------|-------|
| ar | 44.35% |
| en | 59.73% |
| es | 51.11% |
| ko | 54.70% |

## Training / Distillation

This model was created via **layer pruning** (no additional training):

1. Load the teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden dim)
2. Select layers `[0, 11]`
3. Copy the embedding weights plus the selected layers' weights
4. Wrap with mean pooling to produce sentence embeddings

For deployment, vocabulary pruning (250K → ~55K tokens) and INT8 quantization are applied to meet the ≤50 MB size constraint.

## Limitations

- Layer pruning without fine-tuning may lose quality compared to proper knowledge distillation
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents
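The layer-selection step in the distillation recipe can be sketched as a pure-Python filter over state-dict keys. The key names below are modeled on XLM-RoBERTa's layout but abbreviated to one weight per layer; this illustrates the index remapping only, not the exact export script.

```python
# One representative key per component; a real XLM-R state_dict has many more.
teacher_keys = ["embeddings.word_embeddings.weight"] + [
    f"encoder.layer.{i}.attention.self.query.weight" for i in range(12)
]

keep = [0, 11]                                      # first + last layer
remap = {old: new for new, old in enumerate(keep)}  # teacher idx -> student idx

student_keys = []
for key in teacher_keys:
    parts = key.split(".")
    if parts[0] == "encoder" and parts[1] == "layer":
        idx = int(parts[2])
        if idx not in remap:
            continue                    # drop pruned layers 1..10
        parts[2] = str(remap[idx])      # renumber kept layers to 0 and 1
        key = ".".join(parts)
    student_keys.append(key)
```

After the filter, teacher layer 11 appears as student layer 1, so the student loads as an ordinary 2-layer encoder with contiguous layer indices.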
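The size estimates in the Model Details table follow from simple arithmetic, assuming 4 bytes per FP32 parameter, 1 byte per INT8 parameter, the full XLM-R vocabulary of 250,002 tokens, and 55,000 tokens kept after pruning (the last two figures are assumptions matching the card's "250K → ~55K"):

```python
params = 99_741_312
MiB = 2**20

fp32_mb = params * 4 / MiB   # 4 bytes per FP32 parameter
int8_mb = params / MiB       # 1 byte per INT8 parameter

# Vocabulary pruning removes (250,002 - 55,000) embedding rows of dim 384.
vocab_saving = (250_002 - 55_000) * 384 / MiB
pruned_mb = int8_mb - vocab_saving

print(round(fp32_mb, 1), round(int8_mb, 1), round(pruned_mb, 1))
# → 380.5 95.1 23.7
```

Note that most of the student's size is the embedding matrix, which is why vocabulary pruning (not layer pruning) is what brings the model under the 50 MB budget.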