Intent Classifier Student: L6_top

Distilled multilingual sentence encoder for intent classification (Action / Recall / Other).

Created by layer pruning from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2.

Model Details

Property                     Value
Teacher                      paraphrase-multilingual-MiniLM-L12-v2
Architecture                 XLM-RoBERTa (pruned)
Hidden dim                   384
Layers                       6 (from 12)
Layer indices                [6, 7, 8, 9, 10, 11]
Strategy                     6 layers, top half (semantic-focused)
Est. params                  106,825,344
Est. FP32                    407.5 MB
Est. INT8                    101.9 MB
Est. INT8 + vocab pruned     30.5 MB
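The Est. FP32 and Est. INT8 figures above can be reproduced from the parameter count, assuming "MB" means 2**20 bytes and that only raw weight storage is counted (no file-format overhead). This is a small arithmetic check under those assumptions, not the author's estimation script; the helper name `size_mib` is made up for illustration.

```python
# Parameter count taken from the Model Details table.
EST_PARAMS = 106_825_344
MIB = 1024 ** 2  # assumption: the card's "MB" figures are in 2**20-byte units


def size_mib(num_params: int, bytes_per_param: int) -> float:
    """Raw weight storage in MiB, ignoring any file-format overhead."""
    return num_params * bytes_per_param / MIB


fp32_mib = size_mib(EST_PARAMS, 4)  # 32-bit floats take 4 bytes each
int8_mib = size_mib(EST_PARAMS, 1)  # 8-bit integers take 1 byte each
print(f"FP32: {fp32_mib:.1f} MiB, INT8: {int8_mib:.1f} MiB")
# prints: FP32: 407.5 MiB, INT8: 101.9 MiB
```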

Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

Intended Use

This student encoder is intended as the backbone of a lightweight 3-class intent classifier (Action / Recall / Other) for multilingual dialogue systems.

  • Action: User requests an action (book, order, change settings, etc.)
  • Recall: User asks about past events or stored information
  • Other: Greetings, chitchat, emotions, etc.
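The card does not say which classifier head sits on top of the encoder. As one possible illustration, the sketch below uses a nearest-centroid classifier over cosine similarity. This technique is swapped in here as an assumption and is not the author's method. The functions `fit_centroids` and `predict` are hypothetical helpers, and toy 3-dimensional vectors stand in for the real 384-dimensional embeddings.

```python
import math
from collections import defaultdict


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def fit_centroids(embeddings, labels):
    """Average the unit-normalized embeddings of each label into one centroid."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(_normalize(vec))
    return {
        label: _normalize([sum(col) / len(vecs) for col in zip(*vecs)])
        for label, vecs in grouped.items()
    }


def predict(centroids, embedding):
    """Return the label whose centroid has the highest cosine similarity."""
    query = _normalize(embedding)
    return max(
        centroids,
        key=lambda label: sum(q * c for q, c in zip(query, centroids[label])),
    )


# Toy vectors in place of encoder output; real inputs would have 384 dimensions.
train_vecs = [
    [1.0, 0.1, 0.0], [0.9, 0.0, 0.1],  # Action
    [0.0, 1.0, 0.1], [0.1, 0.9, 0.0],  # Recall
    [0.0, 0.1, 1.0],                   # Other
]
train_labels = ["Action", "Action", "Recall", "Recall", "Other"]

centroids = fit_centroids(train_vecs, train_labels)
print(predict(centroids, [0.8, 0.2, 0.0]))  # prints: Action
```

With the real encoder, the rows returned by `model.encode(...)` can be passed in place of the toy vectors. A trained linear or logistic-regression head would be an equally reasonable choice.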

Usage

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L6_top")

# Korean examples: "Make a reservation for me", "What did I order last time?", "Hello"
embeddings = model.encode(["예약 좀 해줘", "지난번 주문 뭐였지?", "안녕하세요"])
print(embeddings.shape)  # (3, 384)

MTEB Results

MassiveIntentClassification

Average (mean of the four languages listed): 43.47%

Language     Score
ar           30.77%
en           55.96%
es           40.81%
ko           46.34%

MassiveScenarioClassification

Average (mean of the four languages listed): 47.62%

Language     Score
ar           33.99%
en           62.04%
es           46.12%
ko           48.34%
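Both reported averages match the unweighted mean of the four per-language scores shown, which suggests they cover only ar, en, es, and ko rather than all 18 supported languages. A quick check, using only the numbers in the tables above:

```python
from statistics import mean

# Per-language accuracy (%) copied from the two MTEB tables.
intent = {"ar": 30.77, "en": 55.96, "es": 40.81, "ko": 46.34}
scenario = {"ar": 33.99, "en": 62.04, "es": 46.12, "ko": 48.34}

intent_avg = mean(list(intent.values()))
scenario_avg = mean(list(scenario.values()))
print(f"{intent_avg:.2f} {scenario_avg:.2f}")  # prints: 43.47 47.62
```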

Training / Distillation

This model was created via layer pruning (no additional training):

  1. Load teacher: paraphrase-multilingual-MiniLM-L12-v2 (12 layers, 384 hidden)
  2. Select layers: [6, 7, 8, 9, 10, 11]
  3. Copy embedding weights + selected layer weights
  4. Wrap with mean pooling for sentence embeddings
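The four steps above could be sketched roughly as follows with `transformers` and `sentence-transformers`. This is a sketch under assumptions, not the author's script: it assumes the teacher's transformer blocks are reachable as `encoder.layer` (as in BERT-style models), and the names `top_layer_indices`, `build_pruned_encoder`, and `out_dir` are made up for illustration.

```python
def top_layer_indices(num_layers: int, keep: int) -> list[int]:
    """0-based indices of the last `keep` layers, e.g. (12, 6) -> [6, ..., 11]."""
    return list(range(num_layers - keep, num_layers))


def build_pruned_encoder(teacher_name: str, keep_indices: list[int], out_dir: str):
    """Keep only the selected layers, save the result, and wrap it with mean pooling."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import torch
    from sentence_transformers import SentenceTransformer, models
    from transformers import AutoModel, AutoTokenizer

    # Step 1: load the teacher.
    teacher = AutoModel.from_pretrained(teacher_name)

    # Steps 2-3: keep the chosen blocks; embedding weights stay on the same model object.
    # Assumption: blocks live in `encoder.layer`, as in BERT-style encoders.
    kept = [teacher.encoder.layer[i] for i in keep_indices]
    teacher.encoder.layer = torch.nn.ModuleList(kept)
    teacher.config.num_hidden_layers = len(kept)

    teacher.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(teacher_name).save_pretrained(out_dir)

    # Step 4: wrap the saved encoder with mean pooling.
    word = models.Transformer(out_dir)
    pooling = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="mean")
    return SentenceTransformer(modules=[word, pooling])


TEACHER = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
KEEP = top_layer_indices(num_layers=12, keep=6)
print(KEEP)  # prints: [6, 7, 8, 9, 10, 11]
```

Calling `build_pruned_encoder(TEACHER, KEEP, "L6_top")` would download the teacher and write the pruned model to a local directory. That call is left out here because it needs network access.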

For deployment, vocabulary pruning (250K → ~55K tokens) and INT8 quantization are applied to meet the ≤50MB size constraint.
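The card does not describe how tokens are selected or which quantization scheme is used. The sketch below only illustrates the two ideas on a toy embedding table: keeping a subset of rows with an old-to-new token ID map, and symmetric per-row INT8 quantization. The helper names and the choice of symmetric per-row scaling are assumptions made for illustration.

```python
def prune_vocab(embedding_rows, keep_ids):
    """Keep only the rows for `keep_ids`; return them with an old->new ID map."""
    ordered = sorted(set(keep_ids))
    id_map = {old: new for new, old in enumerate(ordered)}
    pruned = [embedding_rows[old] for old in ordered]
    return pruned, id_map


def quantize_int8(row):
    """Symmetric per-row INT8: q = round(w / scale), with scale = max|w| / 127."""
    scale = max(abs(w) for w in row) / 127 or 1.0  # all-zero rows fall back to scale 1.0
    return [round(w / scale) for w in row], scale


def dequantize(q_row, scale):
    """Approximate reconstruction of the original float row."""
    return [q * scale for q in q_row]


# Toy 5-token, 2-dimensional table in place of the real ~250K x 384 matrix.
table = [[0.0, 0.1], [0.5, -0.5], [1.27, 0.0], [-0.3, 0.9], [0.2, 0.2]]

pruned, id_map = prune_vocab(table, keep_ids=[3, 0, 2])
q_row, scale = quantize_int8(pruned[id_map[2]])
restored = dequantize(q_row, scale)
print(id_map, q_row)
```

In a real pipeline the tokenizer vocabulary would also need to be rewritten with the same ID map, so that token IDs still point at the right embedding rows.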

Limitations

  • Layer pruning without fine-tuning may yield lower embedding quality than a student trained with knowledge distillation
  • Vocabulary pruning limits the model to the target 18 languages
  • Designed for short dialogue utterances, not long documents