StaRSE-512

StaRSE stands for Static Russian Sentence Embeddings. StaRSE-512 is a compact Russian sentence embedding model built on the Sentence Transformers StaticEmbedding module; it produces 512-dimensional embeddings.

The model is intended for CPU-friendly semantic similarity, clustering, classification features, and retrieval-style first-stage representations when a full Transformer encoder is too expensive to run at high throughput.

Performance

Evaluation is reported on MTEB(rus, v1.1) across 23 tasks. The headline metric, mean_task_main_score (the mean of the main score over all 23 tasks), is 51.16. The table below breaks it down by task type.

Task type	Tasks	Mean score
Classification	9	56.81
Clustering	3	51.80
MultilabelClassification	2	35.01
PairClassification	1	52.50
Reranking	2	41.88
Retrieval	3	39.09
STS	3	62.18
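
As a sanity check, the headline score can be recovered from the table. This assumes mean_task_main_score is the unweighted mean over all 23 tasks, so each task-type mean counts in proportion to its number of tasks. The sketch uses only the numbers in the table; the variable names are illustrative.

```python
# (task type, number of tasks, mean score), copied from the table above.
rows = [
    ("Classification", 9, 56.81),
    ("Clustering", 3, 51.80),
    ("MultilabelClassification", 2, 35.01),
    ("PairClassification", 1, 52.50),
    ("Reranking", 2, 41.88),
    ("Retrieval", 3, 39.09),
    ("STS", 3, 62.18),
]

total_tasks = sum(n for _, n, _ in rows)

# Weight each task-type mean by its task count, then divide by the total.
weighted_mean = sum(n * score for _, n, score in rows) / total_tasks

print(total_tasks)              # 23
print(round(weighted_mean, 2))  # 51.16
```

A plain average of the seven task-type means would give 48.47 instead, so the headline figure is consistent with per-task averaging rather than per-type averaging.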

Usage

Install Sentence Transformers:

pip install -U sentence-transformers

Load the model with trust_remote_code=True, which allows Sentence Transformers to execute custom code from the model repository.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BorisTM/starse-512", trust_remote_code=True)

sentences = [
    "Партитуры Чайковского часто звучат в консерватории.",
    "Балетная сцена хранит музыку Щелкунчика.",
    "Футбольная команда выиграла матч.",
]

embeddings = model.encode(sentences, normalize_embeddings=True)
similarities = model.similarity(embeddings, embeddings)
print(embeddings.shape)           # (3, 512)
print(tuple(similarities.shape))  # (3, 3)
print(similarities)
# tensor([[1.0000, 0.3521, 0.0626],
#         [0.3521, 1.0000, 0.0420],
#         [0.0626, 0.0420, 1.0000]])
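
For retrieval-style use, documents can be ranked against a query by dot product, which equals cosine similarity when the embeddings are L2-normalized (as with normalize_embeddings=True above). The following is a minimal sketch, not part of the model's API: top_k is a hypothetical helper, and the 2-D vectors are made-up stand-ins for real 512-dimensional embeddings so that the example runs without downloading the model.

```python
import heapq


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (index, score) pairs with the highest dot product.

    Assumes every vector is L2-normalized, so the dot product
    equals the cosine similarity.
    """
    scores = [
        (i, sum(q * d for q, d in zip(query_vec, doc)))
        for i, doc in enumerate(doc_vecs)
    ]
    return heapq.nlargest(k, scores, key=lambda pair: pair[1])


# Toy unit vectors (made-up values) standing in for model.encode(...) output.
docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query = [0.8, 0.6]

ranked = top_k(query, docs, k=2)
print([i for i, _ in ranked])  # [1, 0]
```

With the real model, query and docs would come from model.encode(..., normalize_embeddings=True). For large corpora, a single matrix product or model.similarity is likely a better fit than this pure-Python loop.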

Citation

@misc{starse2026,
  title = {TBD},
  author = {TBD},
  year = {TBD},
  url = {https://huggingface.co/BorisTM/starse-512}
}

Model size: 7.81M params (Safetensors)
Tensor type: F32
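
The parameter count and tensor type give a rough estimate of the weight footprint. This is a back-of-the-envelope sketch: it counts weights only, uses the rounded parameter count reported above, and ignores the tokenizer and any runtime overhead.

```python
params = 7.81e6       # parameter count as reported above (rounded)
bytes_per_param = 4   # F32 is a 32-bit float, i.e. 4 bytes per parameter

# Decimal megabytes (1 MB = 1e6 bytes).
size_mb = params * bytes_per_param / 1e6

print(round(size_mb, 2))  # 31.24
```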