# L6_bottom

A lightweight multilingual sentence encoder optimized for intent classification, created from paraphrase-multilingual-MiniLM-L12-v2 via layer pruning plus corpus-based vocabulary pruning.
## Model Details
| Property | Value |
|---|---|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 6 / 12 |
| Layer indices | [0, 1, 2, 3, 4, 5] |
| Strategy | 6 layers, bottom half (syntactic-focused) |
| Vocab size | ~38,330 (pruned from 250K) |
| Parameters | 26,184,576 |
| Safetensors size | 98.1MB |
| Distilled | No |
## Supported Languages (18)
ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
## Quick Start

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L6_bottom")

sentences = [
    "예약 좀 해줘",          # Korean: "Make a reservation for me"
    "What did I order?",     # English
    "今日はいい天気ですね",    # Japanese: "Nice weather today, isn't it?"
    "Reserva una mesa",      # Spanish: "Reserve a table"
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 384)
```
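Since the model targets intent classification, the embeddings are typically compared against per-intent reference vectors. Below is a minimal nearest-centroid sketch; the intent labels and 2-d vectors are toy stand-ins for illustration, and in practice the vectors would come from `model.encode()` as above.

```python
import numpy as np

def classify(query_emb, centroids, labels):
    """Return the label whose centroid has the highest cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return labels[int(np.argmax(c @ q))]

# Hypothetical intents with toy 2-d "mean embeddings" (real ones are 384-d).
labels = ["book_table", "check_order"]
centroids = np.array([[0.9, 0.1], [0.1, 0.9]])

print(classify(np.array([0.8, 0.3]), centroids, labels))  # book_table
```

Centroids can be built by averaging the embeddings of a handful of example utterances per intent, which avoids training a separate classifier head.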
## MTEB Evaluation Results
Overall Average: 57.05%
### MassiveIntentClassification
Average: 54.7%
| Language | Score |
|---|---|
| ar | 46.36% |
| en | 59.84% |
| es | 56.11% |
| ko | 56.49% |
### MassiveScenarioClassification
Average: 59.39%
| Language | Score |
|---|---|
| ar | 50.55% |
| en | 64.52% |
| es | 60.31% |
| ko | 62.19% |
## Training
This model was created via layer pruning + vocabulary pruning:
- Teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden dim)
- Layer selection: `[0, 1, 2, 3, 4, 5]` (the 6 bottom layers, syntactic-focused)
- Vocab pruning: 250K -> ~38K tokens (corpus-based filtering for the 18 target languages)
- No additional training: weights are copied directly from the teacher
A distilled version of this model is also available with improved performance.
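The corpus-based vocabulary pruning step can be sketched as follows: tokenize a corpus in the target languages, keep only the embedding rows for tokens that actually occur (plus special tokens), and remap token ids. This is an illustrative sketch, not the exact pipeline used; the toy corpus and `keep_ids` argument are assumptions.

```python
import numpy as np

def prune_vocab(embedding: np.ndarray, corpus_token_ids, keep_ids=()):
    """Drop embedding rows for tokens never seen in the corpus.

    Returns the pruned embedding matrix and an old-id -> new-id mapping
    for rewriting the tokenizer.
    """
    used = set(keep_ids)  # always keep special tokens (e.g. <pad>, <s>)
    for ids in corpus_token_ids:
        used.update(ids)
    kept = sorted(used)
    old_to_new = {old: new for new, old in enumerate(kept)}
    return embedding[kept], old_to_new

# Toy example: 10-token vocab, 4-dim embeddings; corpus uses 4 distinct tokens.
emb = np.random.rand(10, 4)
corpus = [[1, 3, 3], [5, 7]]
pruned, mapping = prune_vocab(emb, corpus, keep_ids=[0])  # 0 = e.g. <pad>
print(pruned.shape)  # (5, 4)
```

For the real model the same idea shrinks the embedding table from 250K rows to ~38K while leaving the transformer layers untouched.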
## Compression Summary
| Stage | Vocab | Layers | Size |
|---|---|---|---|
| Teacher (original) | 250,002 | 12 | ~480MB |
| + Layer pruning | 250,002 | 6 | ~407MB |
| + Vocab pruning | ~38,330 | 6 | ~98MB |
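A back-of-envelope check (assuming fp32, 4 bytes per parameter, and MiB-style sizes) shows the vocab-pruning savings in the table come almost entirely from dropped embedding rows:

```python
# Figures from the compression table above; fp32 storage assumed.
hidden = 384
vocab_before, vocab_after = 250_002, 38_330

removed_params = (vocab_before - vocab_after) * hidden  # dropped embedding rows
saved_mib = removed_params * 4 / 2**20                  # 4 bytes per fp32 param

print(f"{removed_params:,} params removed, ~{saved_mib:.0f} MiB saved")
```

This lands close to the ~309MB drop between the layer-pruned (~407MB) and vocab-pruned (~98MB) checkpoints.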
## Limitations
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents
- Layer pruning may reduce performance on complex semantic tasks