# L6_uniform
Lightweight multilingual sentence encoder optimized for intent classification.
Created from `paraphrase-multilingual-MiniLM-L12-v2` via layer pruning plus corpus-based vocabulary pruning.
## Model Details
| Property | Value |
|---|---|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 6 / 12 |
| Layer indices | [0, 2, 4, 7, 9, 11] |
| Strategy | 6 layers, evenly spaced (general-purpose) |
| Vocab size | ~38,330 (pruned from 250K) |
| Parameters | 26,184,576 |
| Safetensors size | 98.1MB |
| Distilled | No |
## Supported Languages (18)
ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
## Quick Start

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L6_uniform")

sentences = [
    "예약 좀 해줘",           # Korean: "Make a reservation for me"
    "What did I order?",      # English
    "今日はいい天気ですね",    # Japanese: "Nice weather today, isn't it"
    "Reserva una mesa",       # Spanish: "Reserve a table"
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 384)
```
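Downstream, intent classification with a sentence encoder often reduces to nearest-prototype matching under cosine similarity. A minimal NumPy sketch of that matching step (the vectors below are random stand-ins, not real `model.encode` output):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    # cosine similarity between two 1-D vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in 384-dim prototype embeddings; in practice these would come
# from model.encode() over labeled example utterances per intent.
rng = np.random.default_rng(0)
prototypes = {
    "book_reservation": rng.normal(size=384),
    "order_status": rng.normal(size=384),
}

# A query embedding close to the "book_reservation" prototype.
query = prototypes["book_reservation"] + 0.1 * rng.normal(size=384)

predicted = max(prototypes, key=lambda name: cosine_sim(query, prototypes[name]))
print(predicted)  # book_reservation
```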
## MTEB Evaluation Results

Overall Average: 55.55%

### MassiveIntentClassification

Average: 52.9%
| Language | Score |
|---|---|
| ar | 42.79% |
| en | 61.83% |
| es | 52.89% |
| ko | 54.08% |
### MassiveScenarioClassification

Average: 58.2%
| Language | Score |
|---|---|
| ar | 46.87% |
| en | 67.91% |
| es | 59.42% |
| ko | 58.62% |
## Training

This model was created via layer pruning + vocabulary pruning:

- Teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden dim)
- Layer selection: `[0, 2, 4, 7, 9, 11]` (6 layers, evenly spaced, general-purpose)
- Vocab pruning: 250K -> ~38K tokens (corpus-based filtering for the 18 target languages)
- No additional training: weights are copied directly from the teacher
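The layer-pruning step above can be sketched with Hugging Face `transformers`. The helper below is illustrative, not the exact script used for this model, and is demonstrated on a freshly initialized teacher-shaped model rather than the downloaded teacher:

```python
import torch.nn as nn
from transformers import XLMRobertaConfig, XLMRobertaModel

KEEP = (0, 2, 4, 7, 9, 11)  # evenly spaced layer indices kept from the teacher

def prune_layers(model, keep=KEEP):
    # Keep only the selected encoder layers; their weights are untouched,
    # which is why no retraining is required.
    model.encoder.layer = nn.ModuleList([model.encoder.layer[i] for i in keep])
    model.config.num_hidden_layers = len(keep)
    return model

# Teacher-shaped config (384 hidden, 12 layers). To prune the real teacher,
# load it with AutoModel.from_pretrained(
#     "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2") instead.
config = XLMRobertaConfig(hidden_size=384, num_hidden_layers=12,
                          num_attention_heads=12, intermediate_size=1536)
student = prune_layers(XLMRobertaModel(config))
print(len(student.encoder.layer))  # 6
```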
A distilled version of this model is also available with improved performance.
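Corpus-based vocabulary pruning amounts to keeping only the token ids observed when tokenizing a corpus in the target languages, then slicing the embedding matrix down to those rows. A toy sketch (the corpus ids and matrix here are illustrative stand-ins):

```python
import torch

def prune_vocab(embedding_weight, corpus_token_ids, special_ids=(0, 1, 2, 3)):
    # Keep rows for special tokens plus every token id seen in the corpus;
    # return the smaller matrix and an old-id -> new-id remapping.
    keep = sorted(set(special_ids) | set(corpus_token_ids))
    old_to_new = {old: new for new, old in enumerate(keep)}
    return embedding_weight[keep].clone(), old_to_new

full = torch.randn(250_002, 384)  # teacher-sized embedding matrix
pruned, old_to_new = prune_vocab(full, corpus_token_ids=[10, 42, 99, 10])
print(pruned.shape)  # torch.Size([7, 384])
```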
## Compression Summary
| Stage | Vocab | Layers | Size |
|---|---|---|---|
| Teacher (original) | 250,002 | 12 | ~480MB |
| + Layer pruning | 250,002 | 6 | ~407MB |
| + Vocab pruning | ~38,330 | 6 | ~98MB |
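The ~26M parameter figure is consistent with a back-of-the-envelope count from the architecture numbers in the tables above (this sketch ignores biases, LayerNorm, and pooler weights, so it lands slightly under the exact total):

```python
hidden, intermediate, layers = 384, 1536, 6
vocab, max_positions = 38_330, 512

embeddings = (vocab + max_positions) * hidden   # token + position embeddings
attention = 4 * hidden * hidden                 # Q, K, V, output projections
ffn = 2 * hidden * intermediate                 # up- and down-projections
total = embeddings + layers * (attention + ffn)

print(f"~{total / 1e6:.1f}M")  # ~25.5M, close to the reported 26.2M
```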
## Limitations
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents
- Layer pruning may reduce performance on complex semantic tasks