# L6_bottom

A lightweight multilingual sentence encoder optimized for intent classification, created from paraphrase-multilingual-MiniLM-L12-v2 via layer pruning plus corpus-based vocabulary pruning.
## Model Details
| Property | Value |
|---|---|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 6 / 12 |
| Layer indices | [0, 1, 2, 3, 4, 5] |
| Strategy | 6 layers, bottom half (syntactic-focused) |
| Vocab size | ~38,330 (pruned from 250K) |
| Parameters | 26,184,576 |
| Safetensors size | 98.1MB |
| Distilled | No |
## Supported Languages (18)
ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
## Quick Start

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L6_bottom")

sentences = [
    "예약 좀 해줘",          # Korean: "Make a reservation for me"
    "What did I order?",     # English
    "今日はいい天気ですね",    # Japanese: "Nice weather today, isn't it?"
    "Reserva una mesa",      # Spanish: "Reserve a table"
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 384)
```
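Since the model targets intent classification, the embeddings are typically compared against per-intent reference vectors. Below is a minimal nearest-centroid sketch; the intent labels and 2-d vectors are toy stand-ins for illustration, and in practice the vectors would come from `model.encode()` as above.

```python
import numpy as np

def classify(query_emb, centroids, labels):
    """Return the label whose centroid has the highest cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return labels[int(np.argmax(c @ q))]

# Hypothetical intents with toy 2-d "mean embeddings" (real ones are 384-d).
labels = ["book_table", "check_order"]
centroids = np.array([[0.9, 0.1], [0.1, 0.9]])

print(classify(np.array([0.8, 0.3]), centroids, labels))  # book_table
```

Centroids can be built by averaging the embeddings of a handful of example utterances per intent, which avoids training a separate classifier head.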
## MTEB Evaluation Results
Overall Average: 57.05%
### MassiveIntentClassification
Average: 54.7%
| Language | Score |
|---|---|
| ar | 46.36% |
| en | 59.84% |
| es | 56.11% |
| ko | 56.49% |
### MassiveScenarioClassification
Average: 59.39%
| Language | Score |
|---|---|
| ar | 50.55% |
| en | 64.52% |
| es | 60.31% |
| ko | 62.19% |
## Training
This model was created via layer pruning + vocabulary pruning:
- Teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden dim)
- Layer selection: `[0, 1, 2, 3, 4, 5]` (the 6 bottom layers, syntactic-focused)
- Vocab pruning: 250K -> ~38K tokens (corpus-based filtering for the 18 target languages)
- No additional training: weights are copied directly from the teacher
A distilled version of this model is also available with improved performance.
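The corpus-based vocabulary pruning step can be sketched as follows: tokenize a corpus in the target languages, keep only the embedding rows for tokens that actually occur (plus special tokens), and remap token ids. This is an illustrative sketch, not the exact pipeline used; the toy corpus and `keep_ids` argument are assumptions.

```python
import numpy as np

def prune_vocab(embedding: np.ndarray, corpus_token_ids, keep_ids=()):
    """Drop embedding rows for tokens never seen in the corpus.

    Returns the pruned embedding matrix and an old-id -> new-id mapping
    for rewriting the tokenizer.
    """
    used = set(keep_ids)  # always keep special tokens (e.g. <pad>, <s>)
    for ids in corpus_token_ids:
        used.update(ids)
    kept = sorted(used)
    old_to_new = {old: new for new, old in enumerate(kept)}
    return embedding[kept], old_to_new

# Toy example: 10-token vocab, 4-dim embeddings; corpus uses 4 distinct tokens.
emb = np.random.rand(10, 4)
corpus = [[1, 3, 3], [5, 7]]
pruned, mapping = prune_vocab(emb, corpus, keep_ids=[0])  # 0 = e.g. <pad>
print(pruned.shape)  # (5, 4)
```

For the real model the same idea shrinks the embedding table from 250K rows to ~38K while leaving the transformer layers untouched.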
## Compression Summary
| Stage | Vocab | Layers | Size |
|---|---|---|---|
| Teacher (original) | 250,002 | 12 | ~480MB |
| + Layer pruning | 250,002 | 6 | ~407MB |
| + Vocab pruning | ~38,330 | 6 | ~98MB |
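A back-of-envelope check (assuming fp32, 4 bytes per parameter, and MiB-style sizes) shows the vocab-pruning savings in the table come almost entirely from dropped embedding rows:

```python
# Figures from the compression table above; fp32 storage assumed.
hidden = 384
vocab_before, vocab_after = 250_002, 38_330

removed_params = (vocab_before - vocab_after) * hidden  # dropped embedding rows
saved_mib = removed_params * 4 / 2**20                  # 4 bytes per fp32 param

print(f"{removed_params:,} params removed, ~{saved_mib:.0f} MiB saved")
```

This lands close to the ~309MB drop between the layer-pruned (~407MB) and vocab-pruned (~98MB) checkpoints.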
## Limitations
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents
- Layer pruning may reduce performance on complex semantic tasks