L2_ends
A lightweight multilingual sentence encoder derived from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 via layer pruning and vocabulary pruning, with no additional training.
Model Details
| Property | Value |
|---|---|
| Teacher | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | MiniLM-L12 (pruned) |
| Hidden dim | 384 |
| Layers | 2 / 12 |
| Layer indices | [0, 11] |
| Strategy | 2 layers, first + last (minimal) |
| Parameters | 18,614,400 (99,741,312 before vocabulary pruning) |
| Model size (FP32) | 71.0MB |
| Distilled | No |
Architecture
```
==============================================================
  TEACHER: MiniLM-L12  →  STUDENT: 2L / 38,734 vocab
==============================================================

  TEACHER                            STUDENT
  ┌─────────────────────────┐        ┌─────────────────────────┐
  │      Input Tokens       │        │      Input Tokens       │
  └────────────┬────────────┘        └────────────┬────────────┘
               │                                  │
  ┌────────────┴────────────┐        ┌────────────┴────────────┐
  │       Embeddings        │        │   Embeddings (pruned)   │
  │     vocab: 250,002      │        │      vocab: 38,734      │
  │        dim: 384         │        │        dim: 384         │
  └────────────┬────────────┘        └────────────┬────────────┘
               │                                  │
  ┌─────────────────────────┐        ┌─────────────────────────┐
  │ Layer 0                 │  ──►   │ Layer 0  ← L0           │
  ├─────────────────────────┤        ├─────────────────────────┤
  │ Layer 1                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 2                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 3                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 4                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 5                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 6                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 7                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 8                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 9                 │   ╳    │                         │
  ├─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤        │                         │
  │ Layer 10                │   ╳    │                         │
  ├─────────────────────────┤        ├─────────────────────────┤
  │ Layer 11                │  ──►   │ Layer 1  ← L11          │
  └────────────┬────────────┘        └────────────┬────────────┘
               │                                  │
  ┌────────────┴────────────┐        ┌────────────┴────────────┐
  │      Mean Pooling       │        │      Mean Pooling       │
  │    → 384d embedding     │        │    → 384d embedding     │
  └─────────────────────────┘        └─────────────────────────┘

  Size:      448.0MB (FP32) → 71.0MB (FP32)
  Params:    117,451,392 → 18,614,400
  Reduction: 84.2%
==============================================================
```
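The size and reduction figures can be reproduced from the parameter counts. The check below is a small sketch that assumes 4 bytes per FP32 parameter and that "MB" means 2**20 bytes; both are assumptions, chosen because they match the stated sizes. The helper name `fp32_size_mb` is hypothetical.

```python
# Sanity check of the figures above. Assumes 4 bytes per FP32 parameter and
# that "MB" means 2**20 bytes (an assumption; it reproduces the stated sizes).
TEACHER_PARAMS = 117_451_392
STUDENT_PARAMS = 18_614_400
TEACHER_VOCAB, STUDENT_VOCAB, HIDDEN_DIM = 250_002, 38_734, 384


def fp32_size_mb(num_params: int) -> float:
    """Size in MB of a model stored as 32-bit floats."""
    return num_params * 4 / 2**20


teacher_mb = fp32_size_mb(TEACHER_PARAMS)
student_mb = fp32_size_mb(STUDENT_PARAMS)
reduction = 1 - STUDENT_PARAMS / TEACHER_PARAMS

# Embedding rows removed by vocabulary pruning, times the hidden dimension.
embedding_params_removed = (TEACHER_VOCAB - STUDENT_VOCAB) * HIDDEN_DIM

print(f"teacher: {teacher_mb:.1f} MB, student: {student_mb:.1f} MB")
print(f"reduction: {reduction:.1%}")
print(f"embedding parameters removed: {embedding_params_removed:,}")
```

Most of the saving comes from the embedding table: shrinking the vocabulary from 250,002 to 38,734 entries removes 81,126,912 parameters on its own.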
Quick Start
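```python
from sentence_transformers import SentenceTransformer

# Load the pruned model from the Hugging Face Hub.
model = SentenceTransformer("gomyk/minilm-student-L2_ends", trust_remote_code=True)

sentences = [
    "Hello, how are you?",
    "안녕하세요",
    "Bonjour, comment allez-vous?",
]

# Each sentence is encoded into a 384-dimensional vector.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)
```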
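To compare sentences, cosine similarity between embedding vectors is a common choice. The helper below is a dependency-free sketch and is not part of the sentence-transformers API; `cosine_similarity` is a hypothetical name, and the toy 3-d vectors stand in for rows of `embeddings`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-d vectors standing in for 384-d sentence embeddings.
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]   # same direction as v1
v3 = [3.0, 0.0, -1.0]  # orthogonal to v1

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

With the Quick Start variables, `cosine_similarity(embeddings[0], embeddings[1])` would compare the first two sentences.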
Training
Created via layer pruning + vocabulary pruning (no additional training):
- Teacher: `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` (12 layers, 384d)
- Layer selection: `[0, 11]` (2 layers, first + last, minimal)
- Vocab pruning: Corpus-based filtering for target languages
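Layer selection can be pictured as filtering and renumbering the encoder weights. The sketch below is not the script used to build this model. It assumes BERT-style parameter names of the form `encoder.layer.<i>.…` and uses a plain dict with placeholder strings in place of real tensors; `prune_layers` is a hypothetical helper.

```python
import re

# Matches names such as "encoder.layer.11.attention.self.query.weight".
LAYER_KEY = re.compile(r"^(.*encoder\.layer\.)(\d+)(\..*)$")


def prune_layers(state_dict, keep):
    """Keep only the encoder layers in `keep`, renumbered 0..len(keep)-1.

    Entries that are not encoder layers (embeddings, pooler, ...) pass
    through unchanged.
    """
    new_index = {old: new for new, old in enumerate(keep)}
    pruned = {}
    for name, value in state_dict.items():
        match = LAYER_KEY.match(name)
        if match is None:
            pruned[name] = value  # not an encoder layer weight
            continue
        prefix, old, suffix = match.group(1), int(match.group(2)), match.group(3)
        if old in new_index:
            pruned[f"{prefix}{new_index[old]}{suffix}"] = value
    return pruned


# Placeholder state dict for a 12-layer encoder (strings instead of tensors).
teacher = {"embeddings.word_embeddings.weight": "E"}
for i in range(12):
    teacher[f"encoder.layer.{i}.attention.self.query.weight"] = f"W{i}"

student = prune_layers(teacher, keep=[0, 11])
print(sorted(student))
```

Teacher layer 11 ends up as student layer 1. A real pipeline would presumably also update the model config so that its layer count equals `len(keep)`.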
Supported Languages (18)
ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
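Corpus-based vocabulary filtering for these languages can be sketched as follows: tokenize text in the target languages, keep every token id that occurs (plus the special tokens), and slice the embedding matrix down to those rows. The example uses toy data and is not the pipeline used for this model; `prune_vocab`, the token ids, and the 2-d embedding rows are all hypothetical.

```python
def prune_vocab(embedding_rows, corpus_token_ids, special_ids):
    """Keep embedding rows for special tokens and tokens seen in the corpus.

    Returns (pruned_rows, old_to_new), where old_to_new maps original
    token ids to their row positions in the pruned matrix.
    """
    seen = set(special_ids)
    for sentence in corpus_token_ids:
        seen.update(sentence)
    kept = sorted(seen)  # preserve the original relative order of ids
    old_to_new = {old: new for new, old in enumerate(kept)}
    pruned_rows = [embedding_rows[old] for old in kept]
    return pruned_rows, old_to_new


# Toy setup: 8-token vocabulary with 2-d embeddings; ids 0 and 1 are special.
toy_embeddings = [[float(i), float(i) * 10] for i in range(8)]
corpus = [[0, 3, 5, 1], [0, 5, 6, 1]]  # token ids from tokenized sentences

pruned, mapping = prune_vocab(toy_embeddings, corpus, special_ids=[0, 1])
print(len(pruned), mapping)
```

The tokenizer's vocabulary would need to be remapped with the same `old_to_new` table so that token ids stay aligned with the pruned rows.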