# L4_uniform
Lightweight multilingual sentence encoder optimized for intent classification.
Created from paraphrase-multilingual-MiniLM-L12-v2 via layer pruning + corpus-based vocabulary pruning.
## Model Details
| Property | Value |
|---|---|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 4 / 12 |
| Layer indices | [0, 4, 7, 11] |
| Strategy | 4 layers, evenly spaced (compact) |
| Vocab size | ~38,330 (pruned from 250K) |
| Parameters | 22,642,560 |
| Safetensors size | 84.6MB |
| Distilled | No |
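The layer-pruning strategy above (keep transformer layers 0, 4, 7, and 11 of the teacher's 12) amounts to copying a subset of encoder layers into a shallower stack. A minimal sketch in plain PyTorch; the `nn.TransformerEncoderLayer` stack is a stand-in for the teacher's XLM-RoBERTa encoder, not its real implementation:

```python
import torch.nn as nn

# Hypothetical 12-layer stack standing in for the teacher's encoder
# (384 hidden dim, as in the table above).
teacher_layers = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=384, nhead=12, batch_first=True)
    for _ in range(12)
)

# Keep only the evenly spaced layer indices listed in the model card.
keep = [0, 4, 7, 11]
student_layers = nn.ModuleList(teacher_layers[i] for i in keep)

assert len(student_layers) == 4
# The kept layers are the teacher's own modules, weights included.
assert student_layers[2] is teacher_layers[7]
```

Because the kept layers are copied verbatim (no distillation), the student inherits the teacher's weights for those four layers exactly.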
## Supported Languages (18)
ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
## Quick Start
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("L4_uniform")

sentences = [
    "예약 좀 해줘",           # Korean: "Make a reservation for me"
    "What did I order?",      # English
    "今日はいい天気ですね",    # Japanese: "Nice weather today"
    "Reserva una mesa",       # Spanish: "Reserve a table"
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 384)
```
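Since the model is optimized for intent classification, a typical downstream pattern is to embed a small set of intent exemplars once and route each incoming utterance to the nearest exemplar by cosine similarity. A minimal NumPy sketch, with random vectors standing in for `model.encode` outputs (the `classify` helper and the labels are illustrative, not part of this model's API):

```python
import numpy as np

def classify(query_emb, intent_embs, labels):
    # Normalize rows; cosine similarity then reduces to a dot product.
    q = query_emb / np.linalg.norm(query_emb)
    m = intent_embs / np.linalg.norm(intent_embs, axis=1, keepdims=True)
    return labels[int(np.argmax(m @ q))]

rng = np.random.default_rng(0)
intent_embs = rng.normal(size=(3, 384))  # stand-ins for model.encode(exemplars)
labels = ["book_table", "check_order", "weather"]

# A query embedding close to the "check_order" exemplar.
query = intent_embs[1] + 0.01 * rng.normal(size=384)
assert classify(query, intent_embs, labels) == "check_order"
```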
## MTEB Evaluation Results
Overall Average: 52.03%
### MassiveIntentClassification
Average: 50.25%
| Language | Score |
|---|---|
| ar | 41.20% |
| en | 57.63% |
| es | 49.12% |
| ko | 53.03% |
### MassiveScenarioClassification
Average: 53.82%
| Language | Score |
|---|---|
| ar | 43.82% |
| en | 61.91% |
| es | 53.64% |
| ko | 55.90% |
## Training
This model was created via layer pruning and vocabulary pruning:

- Teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384 hidden dim)
- Layer selection: `[0, 4, 7, 11]` - 4 layers, evenly spaced (compact)
- Vocab pruning: 250K -> ~38K tokens (corpus-based filtering for the 18 target languages)
- No additional training - weights are copied directly from the teacher
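The corpus-based vocabulary pruning step works by tokenizing a corpus covering the target languages, keeping only the token ids that actually occur (plus special tokens), and remapping them to a dense range so the embedding matrix can be shrunk. A toy sketch of that idea; the `prune_vocab` helper is hypothetical, not the actual script used:

```python
from collections import Counter

def prune_vocab(corpus_token_ids, always_keep=(0, 1, 2, 3)):
    """Keep token ids seen in the target-language corpus plus special
    tokens; return an old_id -> new_id mapping for the pruned vocab."""
    counts = Counter(tid for sent in corpus_token_ids for tid in sent)
    kept = sorted(set(always_keep) | set(counts))
    return {old: new for new, old in enumerate(kept)}

# Toy corpus that uses only a handful of ids from a large vocabulary.
mapping = prune_vocab([[5, 9, 9], [7, 5]])
assert mapping == {0: 0, 1: 1, 2: 2, 3: 3, 5: 4, 7: 5, 9: 6}
```

The mapping is then used to slice the teacher's embedding matrix down to the kept rows, which is where most of the 250K -> ~38K size reduction comes from.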
A distilled version of this model, with improved performance, is also available.
## Compression Summary
| Stage | Vocab | Layers | Size |
|---|---|---|---|
| Teacher (original) | 250,002 | 12 | ~480MB |
| + Layer pruning | 250,002 | 4 | ~393MB |
| + Vocab pruning | ~38,330 | 4 | ~85MB |
## Limitations
- Vocabulary pruning restricts the model to the 18 target languages
- Designed for short dialogue utterances, not long documents
- Layer pruning may reduce performance on complex semantic tasks