multilingual-e5-large-instruct-rus-32768
This model is a 39.73% smaller version of intfloat/multilingual-e5-large-instruct, optimized for the Russian language by reducing the vocabulary size with the trimming method.
This trimmed model should perform similarly to the original model on Russian text while keeping only 32,768 tokens in its vocabulary and using much less memory. However, it may perform poorly on other languages, because tokens not commonly used in Russian were removed from the vocabulary.
Model Statistics
| Metric | Original | Trimmed | Reduction |
| --- | --- | --- | --- |
| Vocabulary size | 250,037 tokens | 32,768 tokens | 86.89% |
| Model size | 559,890,432 params | 337,442,816 params | 39.73% |
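The reduction percentages in the table follow directly from the original and trimmed counts. A minimal sketch of that arithmetic, where the helper name `reduction_pct` is only illustrative and not part of any library:

```python
def reduction_pct(original: int, trimmed: int) -> float:
    """Return how much smaller `trimmed` is than `original`, in percent."""
    return (1 - trimmed / original) * 100


# Counts taken from the table above.
vocab_reduction = reduction_pct(250_037, 32_768)
param_reduction = reduction_pct(559_890_432, 337_442_816)

print(f"Vocabulary reduction: {vocab_reduction:.2f}%")  # 86.89%
print(f"Parameter reduction: {param_reduction:.2f}%")   # 39.73%
```

The parameter count shrinks less than the vocabulary does because only part of the model's parameters depend on the vocabulary size.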

Mining Dataset Statistics
Usage
from sentence_transformers import SentenceTransformer

# Load the trimmed model from the Hugging Face Hub
model = SentenceTransformer("alphaedge-ai/multilingual-e5-large-instruct-rus-32768")

query = "My query in Russian"
documents = [
"Chunk in Russian",
"Chunk in Russian",
"Chunk in Russian",
]

# Encode the query and the documents with their dedicated methods
query_embeddings = model.encode_query(query)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)

# Score each document against the query
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
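To make the similarity scores easier to interpret, here is a small self-contained sketch of cosine similarity on toy vectors. It assumes the model uses cosine similarity, which is the Sentence Transformers default when no other function is configured. The vectors and the helper `cosine_similarity` are made up for illustration and are not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (hypothetical values, not from the model).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [
    [1.0, 0.0, 1.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
best_index = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_index)
```

A score close to 1 means the document points in nearly the same direction as the query, and the highest-scoring index identifies the best match.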
Citations
Multilingual E5
@article{wang2024multilingual,
title={Multilingual E5 Text Embeddings: A Technical Report},
author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Yang, Linjun and Majumder, Rangan and Wei, Furu},
journal={arXiv preprint arXiv:2402.05672},
year={2024}
}
Trimming blog post
@misc{hf_blogpost_trimming,
title={Introduction to Trimming},
author={Loïck BOURDOIS and Tom AARSEN and Bram VANROY and Christopher AKIKI and Woojun JUNG and Manuel ROMERO and Prithiv SAKTHI},
year={2026},
url={https://huggingface.co/blog/lbourdois/introduction-to-trimming},
}