# Intent Classifier Student: L4_top

Distilled multilingual sentence encoder for intent classification (Action / Recall / Other).

Created by layer pruning from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2.

## Model Details

| Property | Value |
|---|---|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 4 (of 12) |
| Layer indices | [8, 9, 10, 11] |
| Strategy | 4 layers, top third (semantic-focused compact) |
| Est. params | 103,283,328 |
| Est. FP32 size | 394.0 MB |
| Est. INT8 size | 98.5 MB |
| Est. INT8 + vocab pruned | 27.1 MB |
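The FP32 and INT8 size estimates follow directly from the parameter count, assuming 4 bytes and 1 byte per weight respectively (a quick sanity check, not part of the original build script):

```python
# Convert the parameter count into MB at different precisions.
PARAMS = 103_283_328

fp32_mb = PARAMS * 4 / 2**20  # 4 bytes per FP32 weight
int8_mb = PARAMS * 1 / 2**20  # 1 byte per INT8 weight

print(f"FP32: {fp32_mb:.1f} MB")  # 394.0 MB
print(f"INT8: {int8_mb:.1f} MB")  # 98.5 MB
```

The remaining gap down to 27.1 MB comes from vocabulary pruning, which shrinks the dominant 250K-row embedding matrix.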

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

## Intended Use

This is a student encoder designed to be used as the backbone for a lightweight 3-class intent classifier (Action / Recall / Other) in multilingual dialogue systems.

- **Action**: the user requests an action (book, order, change settings, etc.)
- **Recall**: the user asks about past events or stored information
- **Other**: greetings, chitchat, emotional expressions, etc.

## Usage

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L4_top")
# Korean examples: "Please make a reservation", "What was my last order?", "Hello"
embeddings = model.encode(["예약 좀 해줘", "지난번 주문 뭐였지?", "안녕하세요"])
print(embeddings.shape)  # (3, 384)
```

## MTEB Results

### MassiveIntentClassification

Average: 45.26%

| Language | Score |
|---|---|
| ar | 34.77% |
| en | 55.68% |
| es | 42.64% |
| ko | 47.96% |

### MassiveScenarioClassification

Average: 48.79%

| Language | Score |
|---|---|
| ar | 36.89% |
| en | 61.66% |
| es | 47.30% |
| ko | 49.30% |

## Training / Distillation

This model was created via layer pruning (no additional training):

  1. Load teacher: paraphrase-multilingual-MiniLM-L12-v2 (12 layers, 384 hidden)
  2. Select layers: [8, 9, 10, 11]
  3. Copy embedding weights + selected layer weights
  4. Wrap with mean pooling for sentence embeddings
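The steps above can be sketched with Hugging Face `transformers` (a sketch of the procedure, not the exact build script; it assumes the backbone exposes its transformer blocks as `encoder.layer`, as BERT-style models in `transformers` do):

```python
import torch
from transformers import AutoModel

# 1. Load the 12-layer teacher.
teacher = AutoModel.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)

# 2-3. Keep only the top 4 transformer layers (indices 8-11);
# embeddings and the selected layer weights are reused as-is.
keep = [8, 9, 10, 11]
teacher.encoder.layer = torch.nn.ModuleList(
    [teacher.encoder.layer[i] for i in keep]
)
teacher.config.num_hidden_layers = len(keep)

# 4. Mean pooling over token states yields the sentence embedding.
def mean_pool(last_hidden_state, attention_mask):
    mask = attention_mask.unsqueeze(-1).float()
    return (last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
```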

For deployment, vocabulary pruning (250K → ~55K tokens) and INT8 quantization are applied to meet the ≤50MB size constraint.
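Vocabulary pruning can be sketched as slicing the word-embedding matrix down to the token ids actually observed in a target-language corpus. `prune_embeddings` and `kept_ids` are hypothetical names for illustration; in a real pipeline the tokenizer's vocabulary must be rebuilt to match the new row order:

```python
import torch

def prune_embeddings(embedding_weight, kept_ids):
    """Keep only the embedding rows for token ids seen in the target corpus.

    embedding_weight: (vocab_size, hidden) tensor
    kept_ids: sorted list of token ids to retain
    """
    idx = torch.tensor(kept_ids, dtype=torch.long)
    return embedding_weight.index_select(0, idx)

# E.g. slicing 250K x 384 down to ~55K x 384 removes most of the
# model's parameters, since the embedding matrix dominates the count.
```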

## Limitations

- Layer pruning without fine-tuning may lose some quality compared with proper knowledge distillation
- Vocabulary pruning limits the model to the 18 target languages
- Designed for short dialogue utterances, not long documents