Intent Classifier Student: L3_uniform

Distilled multilingual sentence encoder for intent classification (Action / Recall / Other).

Created by layer pruning from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2.

Model Details

Property                   Value
Teacher                    paraphrase-multilingual-MiniLM-L12-v2
Architecture               XLM-RoBERTa (pruned)
Hidden dim                 384
Layers                     3 (from 12)
Layer indices              [0, 6, 11]
Strategy                   3 layers, evenly spaced (ultra-compact)
Est. params                101,512,320
Est. FP32                  387.2 MB
Est. INT8                  96.8 MB
Est. INT8 + vocab pruned   25.4 MB
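
The size estimates above can be reproduced with simple arithmetic on the parameter count. This is a minimal sketch, not the author's script. It assumes that "MB" means 2**20 bytes, that FP32 uses 4 bytes per parameter and INT8 uses 1 byte (ignoring quantization scales and file overhead), and that vocabulary pruning removes exactly 195,000 embedding rows (the rounded 250K to 55K figures quoted for deployment). All variable names are illustrative.

```python
# Reproduce the size estimates from the parameter count.
PARAMS = 101_512_320                 # "Est. params" from the model details
HIDDEN_DIM = 384                     # width of one token-embedding row
REMOVED_TOKENS = 250_000 - 55_000    # assumed: rounded vocabulary sizes
MIB = 1024 ** 2                      # assumed: "MB" means 2**20 bytes

fp32_mb = PARAMS * 4 / MIB           # 4 bytes per FP32 parameter
int8_mb = PARAMS * 1 / MIB           # 1 byte per INT8 parameter

# Each removed token drops one embedding row of HIDDEN_DIM parameters.
pruned_params = PARAMS - REMOVED_TOKENS * HIDDEN_DIM
int8_pruned_mb = pruned_params * 1 / MIB

print(f"FP32: {fp32_mb:.1f} MB")                        # FP32: 387.2 MB
print(f"INT8: {int8_mb:.1f} MB")                        # INT8: 96.8 MB
print(f"INT8 + vocab pruned: {int8_pruned_mb:.1f} MB")  # INT8 + vocab pruned: 25.4 MB
```

Under these assumptions the token embedding matrix accounts for most of the parameters, which is why shrinking the vocabulary reduces the footprint far more than the layer count does.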

Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

Intended Use

This is a student encoder designed to be used as the backbone for a lightweight 3-class intent classifier (Action / Recall / Other) in multilingual dialogue systems.

  • Action: User requests an action (book, order, change settings, etc.)
  • Recall: User asks about past events or stored information
  • Other: Greetings, chitchat, emotions, etc.
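
This card does not specify the classifier head that sits on top of the encoder. As one possible illustration, the sketch below uses a nearest-centroid classifier with cosine similarity. The function names (`cosine`, `fit_centroids`, `predict`) are hypothetical, and toy 3-dimensional vectors stand in for the 384-dimensional embeddings the encoder would produce.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def fit_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}

def predict(centroids, vec):
    """Return the label whose centroid is most similar to `vec`."""
    return max(centroids, key=lambda label: cosine(centroids[label], vec))

# Toy vectors (assumption): real inputs would be encoder embeddings.
train_vecs = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1],
              [0.0, 1.0, 0.1], [0.1, 0.9, 0.0],
              [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]
train_labels = ["Action", "Action", "Recall", "Recall", "Other", "Other"]

centroids = fit_centroids(train_vecs, train_labels)
print(predict(centroids, [0.8, 0.2, 0.1]))  # Action
```

A trained linear layer or logistic regression over the embeddings would be an equally plausible head; the centroid approach is shown only because it needs no training loop.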

Usage

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L3_uniform")
embeddings = model.encode([
    "예약 좀 해줘",          # "Please make a reservation" (Action)
    "지난번 주문 뭐였지?",    # "What was my last order?" (Recall)
    "안녕하세요",            # "Hello" (Other)
])
print(embeddings.shape)  # (3, 384)

MTEB Results

MassiveIntentClassification

Average: 49.58%

Language   Score
ar         41.29%
en         56.54%
es         48.51%
ko         51.97%

MassiveScenarioClassification

Average: 52.96%

Language   Score
ar         43.83%
en         61.51%
es         51.79%
ko         54.73%

Training / Distillation

This model was created via layer pruning (no additional training):

  1. Load teacher: paraphrase-multilingual-MiniLM-L12-v2 (12 layers, 384 hidden)
  2. Select layers: [0, 6, 11]
  3. Copy embedding weights + selected layer weights
  4. Wrap with mean pooling for sentence embeddings
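
Steps 1 to 3 above can be sketched as follows. This is an illustration under stated assumptions, not the author's actual script. It assumes a Hugging Face encoder that exposes its Transformer blocks as `encoder.layer` (as `BertModel` does), and it uses a tiny randomly initialized model in place of the real teacher so that it runs without a download. The helpers `evenly_spaced` and `prune_layers` are hypothetical names, and the rounding rule in `evenly_spaced` is only one way to arrive at [0, 6, 11]. The mean pooling wrapper from step 4 is omitted.

```python
import copy

import torch
from transformers import BertConfig, BertModel

def evenly_spaced(num_layers, keep):
    """Pick `keep` (>= 2) indices spread over 0..num_layers-1, rounding halves up."""
    last, gaps = num_layers - 1, keep - 1
    return [(2 * i * last + gaps) // (2 * gaps) for i in range(keep)]

def prune_layers(teacher, indices):
    """Return a copy of `teacher` that keeps only the listed encoder layers."""
    student = copy.deepcopy(teacher)  # embeddings and layer weights are copied
    student.encoder.layer = torch.nn.ModuleList(
        [student.encoder.layer[i] for i in indices]
    )
    student.config.num_hidden_layers = len(indices)
    return student

# Tiny random stand-in for the 12-layer teacher (assumption: sizes are arbitrary).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=12,
                    num_attention_heads=4, intermediate_size=64)
teacher = BertModel(config)

indices = evenly_spaced(12, 3)
student = prune_layers(teacher, indices)
print(indices, len(student.encoder.layer))  # [0, 6, 11] 3
```

With the real teacher, the stand-in model would presumably be replaced by one loaded through `AutoModel.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")`; the pruning function itself would stay the same.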

For deployment, vocabulary pruning (250K → ~55K tokens) and INT8 quantization are applied to meet the ≤ 50 MB size constraint.
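
The card does not describe how vocabulary pruning is implemented. A common approach, assumed here, is to keep the embedding rows for special tokens and for tokens observed in target-language text, then remap token IDs to the smaller table. The sketch below shows that idea on a toy 8-token vocabulary; `prune_vocab`, `remap`, and all IDs are hypothetical, and rebuilding the tokenizer itself is not shown.

```python
def prune_vocab(embedding_rows, keep_ids):
    """Keep only the rows in `keep_ids` and build an old-to-new ID map."""
    ordered = sorted(set(keep_ids))
    old_to_new = {old: new for new, old in enumerate(ordered)}
    pruned_rows = [embedding_rows[old] for old in ordered]
    return pruned_rows, old_to_new

def remap(token_ids, old_to_new, unk_id):
    """Translate old token IDs; pruned tokens fall back to the unknown token."""
    return [old_to_new.get(t, old_to_new[unk_id]) for t in token_ids]

# Toy embedding table: 8 tokens, 2 dimensions each.
embedding_rows = [[float(i), float(i) + 0.5] for i in range(8)]
special_ids = [0, 1]       # hypothetical: padding and unknown tokens
seen_ids = [5, 3, 5, 7]    # hypothetical: IDs observed in target-language text

pruned_rows, old_to_new = prune_vocab(embedding_rows, special_ids + seen_ids)
print(len(pruned_rows))                         # 5
print(remap([3, 4, 7], old_to_new, unk_id=1))   # [2, 1, 4]
```

Token 4 was never observed, so it maps to the unknown token after pruning. This is the mechanism behind the limitation that a vocabulary-pruned model only handles the languages it was pruned for.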

Limitations

  • Because no fine-tuning follows the layer pruning, embedding quality may be lower than with knowledge distillation that trains the student
  • Vocabulary pruning limits the model to the target 18 languages
  • Designed for short dialogue utterances, not long documents