---
language:
  - ko
  - en
  - ja
  - zh
  - es
  - fr
  - de
  - pt
  - it
  - ru
  - ar
  - hi
  - th
  - vi
  - id
  - tr
  - nl
  - pl
tags:
  - sentence-transformers
  - intent-classification
  - multilingual
  - distillation
  - layer-pruning
library_name: sentence-transformers
pipeline_tag: sentence-similarity
license: apache-2.0
---

# Intent Classifier Student: L2_ends

A distilled multilingual sentence encoder for intent classification (Action / Recall / Other), created by layer pruning from `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`.

## Model Details

| Property | Value |
|---|---|
| Teacher | `paraphrase-multilingual-MiniLM-L12-v2` |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 2 (from 12) |
| Layer indices | `[0, 11]` |
| Strategy | 2 layers, first + last (minimal) |
| Est. params | 99,741,312 |
| Est. size (FP32) | 380.5 MB |
| Est. size (INT8) | 95.1 MB |
| Est. size (INT8 + vocab pruned) | 23.7 MB |
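The parameter and size estimates above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes standard XLM-R MiniLM dimensions (vocab 250,002, 514 positions, hidden 384, FFN 1,536) and counts weight matrices only, so it lands slightly below the card's exact figure:

```python
# Back-of-envelope check of the size estimates (assumptions: XLM-R tokenizer
# vocab of 250,002, 514 position slots, hidden size 384, FFN size 1,536;
# biases and LayerNorm weights are ignored, so totals are approximate).
vocab, positions, hidden, ffn, n_layers = 250_002, 514, 384, 1_536, 2

emb_params = (vocab + positions + 1) * hidden       # word + position + token-type embeddings
per_layer = 4 * hidden * hidden + 2 * hidden * ffn  # Q/K/V/O projections + FFN matrices
total = emb_params + n_layers * per_layer           # ~99.7M, close to the card's figure

fp32_mib = total * 4 / 1024**2  # 4 bytes per weight -> ~380.5 MiB
int8_mib = total * 1 / 1024**2  # 1 byte per weight  -> ~95.1 MiB
print(f"{total:,} params, FP32 ~{fp32_mib:.1f} MiB, INT8 ~{int8_mib:.1f} MiB")
```

Note that the embedding table accounts for roughly 96M of the ~99.7M weights, which is why vocabulary pruning is the main lever for shrinking the deployed model.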

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

## Intended Use

This is a student encoder designed to be used as the backbone for a lightweight 3-class intent classifier (Action / Recall / Other) in multilingual dialogue systems.

- **Action**: User requests an action (book, order, change settings, etc.)
- **Recall**: User asks about past events or stored information
- **Other**: Greetings, chitchat, emotions, etc.
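As an illustration of such a head, here is a hypothetical sketch (not part of this release): a nearest-centroid classifier over the student's 384-dim embeddings, standing in for whatever head (logistic regression, small MLP) you would actually train:

```python
import numpy as np

# Hypothetical lightweight intent head on top of the student's sentence
# embeddings. Nearest-centroid is used purely as an illustrative stand-in.
LABELS = ["Action", "Recall", "Other"]

def fit_centroids(embeddings: np.ndarray, labels: list[int]) -> np.ndarray:
    """Mean embedding per class. embeddings: (N, dim) from the encoder."""
    labels = np.asarray(labels)
    return np.stack([embeddings[labels == k].mean(axis=0) for k in range(len(LABELS))])

def classify(embeddings: np.ndarray, centroids: np.ndarray) -> list[str]:
    """Assign each utterance to the class whose centroid is most cosine-similar."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return [LABELS[i] for i in (e @ c.T).argmax(axis=1)]
```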

## Usage

```python
from sentence_transformers import SentenceTransformer

# Load from a local path or the Hub repo id
model = SentenceTransformer("L2_ends")
embeddings = model.encode([
    "์˜ˆ์•ฝ ์ข€ ํ•ด์ค˜",          # "Book it for me" (Action)
    "์ง€๋‚œ๋ฒˆ ์ฃผ๋ฌธ ๋ญ์˜€์ง€?",   # "What was my last order?" (Recall)
    "์•ˆ๋…•ํ•˜์„ธ์š”",            # "Hello" (Other)
])
print(embeddings.shape)  # (3, 384)
```

## MTEB Results

### MassiveIntentClassification

Average: 49.8%

| Language | Score |
|---|---|
| ar | 42.22% |
| en | 56.13% |
| es | 48.54% |
| ko | 52.31% |

### MassiveScenarioClassification

Average: 52.47%

| Language | Score |
|---|---|
| ar | 44.35% |
| en | 59.73% |
| es | 51.11% |
| ko | 54.70% |

## Training / Distillation

This model was created via layer pruning (no additional training):

1. Load the teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden)
2. Select layers `[0, 11]` (first and last)
3. Copy the embedding weights and the selected layers' weights
4. Wrap with mean pooling to produce sentence embeddings
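The layer-selection step can be sketched as follows. This is an illustrative snippet, not the exact script used to build the model; it assumes an HF-style encoder whose transformer blocks live in an `nn.ModuleList` (as in `model.encoder.layer` for XLM-RoBERTa), demonstrated here on toy stand-in blocks:

```python
import torch.nn as nn

def prune_to_layers(layers: nn.ModuleList, keep: list[int]) -> nn.ModuleList:
    """Keep only the encoder blocks at the given indices (e.g. [0, 11])."""
    return nn.ModuleList(layers[i] for i in keep)

# Toy stand-in for a 12-layer encoder. With a real checkpoint you would do
#   model.encoder.layer = prune_to_layers(model.encoder.layer, [0, 11])
# and update model.config.num_hidden_layers = 2 before saving.
teacher_blocks = nn.ModuleList(nn.Linear(384, 384) for _ in range(12))
student_blocks = prune_to_layers(teacher_blocks, [0, 11])
```

The selected blocks are shared by reference, so no weights are copied until the pruned model is saved.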

For deployment, vocabulary pruning (250K โ†’ ~55K tokens) and INT8 quantization are applied to meet the โ‰ค50MB size constraint.
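The 23.7MB figure from the table is consistent with that pipeline. The check below assumes roughly 55,000 retained tokens, the dimensions stated above, and 1 byte per weight after INT8 quantization, ignoring biases and LayerNorms:

```python
# Rough check of the deployed size (assumptions: ~55,000 retained tokens after
# vocabulary pruning, hidden 384, 2 encoder layers with FFN 1,536, 1 byte per
# weight after INT8 quantization; biases/LayerNorms ignored).
hidden, ffn, n_layers = 384, 1_536, 2
pruned_vocab = 55_000

emb_params = (pruned_vocab + 514 + 1) * hidden  # pruned word + position + token-type embeddings
layer_params = n_layers * (4 * hidden * hidden + 2 * hidden * ffn)
int8_mib = (emb_params + layer_params) / 1024**2  # ~23.7 MiB, under the 50MB budget
print(f"~{int8_mib:.1f} MiB after vocab pruning + INT8")
```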

## Limitations

- Layer pruning without fine-tuning may lose quality relative to proper knowledge distillation with training
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents