---
language:
  - ko
  - en
  - ja
  - zh
  - es
  - fr
  - de
  - pt
  - it
  - ru
  - ar
  - hi
  - th
  - vi
  - id
  - tr
  - nl
  - pl
tags:
  - sentence-transformers
  - intent-classification
  - multilingual
  - distillation
  - layer-pruning
library_name: sentence-transformers
pipeline_tag: sentence-similarity
license: apache-2.0
---

# Intent Classifier Student: L2_ends

A distilled multilingual sentence encoder for intent classification (Action / Recall / Other), created by layer pruning from `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`.

## Model Details

| Property | Value |
|---|---|
| Teacher | `paraphrase-multilingual-MiniLM-L12-v2` |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 2 (from 12) |
| Layer indices | `[0, 11]` |
| Strategy | 2 layers, first + last (minimal) |
| Est. params | 99,741,312 |
| Est. size (FP32) | 380.5 MB |
| Est. size (INT8) | 95.1 MB |
| Est. size (INT8 + vocab pruned) | 23.7 MB |
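The parameter and size estimates above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes standard XLM-R MiniLM dimensions (vocab 250,002, 514 positions, hidden 384, FFN 1,536) and counts weight matrices only, so it lands slightly below the card's exact figure:

```python
# Back-of-envelope check of the size estimates (assumptions: XLM-R tokenizer
# vocab of 250,002, 514 position slots, hidden size 384, FFN size 1,536;
# biases and LayerNorm weights are ignored, so totals are approximate).
vocab, positions, hidden, ffn, n_layers = 250_002, 514, 384, 1_536, 2

emb_params = (vocab + positions + 1) * hidden       # word + position + token-type embeddings
per_layer = 4 * hidden * hidden + 2 * hidden * ffn  # Q/K/V/O projections + FFN matrices
total = emb_params + n_layers * per_layer           # ~99.7M, close to the card's figure

fp32_mib = total * 4 / 1024**2  # 4 bytes per weight -> ~380.5 MiB
int8_mib = total * 1 / 1024**2  # 1 byte per weight  -> ~95.1 MiB
print(f"{total:,} params, FP32 ~{fp32_mib:.1f} MiB, INT8 ~{int8_mib:.1f} MiB")
```

Note that the embedding table accounts for roughly 96M of the ~99.7M weights, which is why vocabulary pruning is the main lever for shrinking the deployed model.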

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

## Intended Use

This is a student encoder designed to be used as the backbone for a lightweight 3-class intent classifier (Action / Recall / Other) in multilingual dialogue systems.

- **Action**: User requests an action (book, order, change settings, etc.)
- **Recall**: User asks about past events or stored information
- **Other**: Greetings, chitchat, emotions, etc.
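As an illustration of such a head, here is a hypothetical sketch (not part of this release): a nearest-centroid classifier over the student's 384-dim embeddings, standing in for whatever head (logistic regression, small MLP) you would actually train:

```python
import numpy as np

# Hypothetical lightweight intent head on top of the student's sentence
# embeddings. Nearest-centroid is used purely as an illustrative stand-in.
LABELS = ["Action", "Recall", "Other"]

def fit_centroids(embeddings: np.ndarray, labels: list[int]) -> np.ndarray:
    """Mean embedding per class. embeddings: (N, dim) from the encoder."""
    labels = np.asarray(labels)
    return np.stack([embeddings[labels == k].mean(axis=0) for k in range(len(LABELS))])

def classify(embeddings: np.ndarray, centroids: np.ndarray) -> list[str]:
    """Assign each utterance to the class whose centroid is most cosine-similar."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return [LABELS[i] for i in (e @ c.T).argmax(axis=1)]
```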

## Usage

```python
from sentence_transformers import SentenceTransformer

# Load from a local path or the Hub repo id
model = SentenceTransformer("L2_ends")
embeddings = model.encode([
    "์˜ˆ์•ฝ ์ข€ ํ•ด์ค˜",          # "Book it for me" (Action)
    "์ง€๋‚œ๋ฒˆ ์ฃผ๋ฌธ ๋ญ์˜€์ง€?",   # "What was my last order?" (Recall)
    "์•ˆ๋…•ํ•˜์„ธ์š”",            # "Hello" (Other)
])
print(embeddings.shape)  # (3, 384)
```

## MTEB Results

### MassiveIntentClassification

Average: 49.8%

| Language | Score |
|---|---|
| ar | 42.22% |
| en | 56.13% |
| es | 48.54% |
| ko | 52.31% |

### MassiveScenarioClassification

Average: 52.47%

| Language | Score |
|---|---|
| ar | 44.35% |
| en | 59.73% |
| es | 51.11% |
| ko | 54.70% |

## Training / Distillation

This model was created via layer pruning (no additional training):

1. Load the teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden)
2. Select layers `[0, 11]` (first and last)
3. Copy the embedding weights and the selected layers' weights
4. Wrap with mean pooling to produce sentence embeddings
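The layer-selection step can be sketched as follows. This is an illustrative snippet, not the exact script used to build the model; it assumes an HF-style encoder whose transformer blocks live in an `nn.ModuleList` (as in `model.encoder.layer` for XLM-RoBERTa), demonstrated here on toy stand-in blocks:

```python
import torch.nn as nn

def prune_to_layers(layers: nn.ModuleList, keep: list[int]) -> nn.ModuleList:
    """Keep only the encoder blocks at the given indices (e.g. [0, 11])."""
    return nn.ModuleList(layers[i] for i in keep)

# Toy stand-in for a 12-layer encoder. With a real checkpoint you would do
#   model.encoder.layer = prune_to_layers(model.encoder.layer, [0, 11])
# and update model.config.num_hidden_layers = 2 before saving.
teacher_blocks = nn.ModuleList(nn.Linear(384, 384) for _ in range(12))
student_blocks = prune_to_layers(teacher_blocks, [0, 11])
```

The selected blocks are shared by reference, so no weights are copied until the pruned model is saved.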

For deployment, vocabulary pruning (250K โ†’ ~55K tokens) and INT8 quantization are applied to meet the โ‰ค50MB size constraint.
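The 23.7MB figure from the table is consistent with that pipeline. The check below assumes roughly 55,000 retained tokens, the dimensions stated above, and 1 byte per weight after INT8 quantization, ignoring biases and LayerNorms:

```python
# Rough check of the deployed size (assumptions: ~55,000 retained tokens after
# vocabulary pruning, hidden 384, 2 encoder layers with FFN 1,536, 1 byte per
# weight after INT8 quantization; biases/LayerNorms ignored).
hidden, ffn, n_layers = 384, 1_536, 2
pruned_vocab = 55_000

emb_params = (pruned_vocab + 514 + 1) * hidden  # pruned word + position + token-type embeddings
layer_params = n_layers * (4 * hidden * hidden + 2 * hidden * ffn)
int8_mib = (emb_params + layer_params) / 1024**2  # ~23.7 MiB, under the 50MB budget
print(f"~{int8_mib:.1f} MiB after vocab pruning + INT8")
```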

## Limitations

- Layer pruning without fine-tuning may lose quality relative to proper knowledge distillation with training
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents