How to use andreaschari/bge-m3-RU_MMARCO_TRANSLIT with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("andreaschari/bge-m3-RU_MMARCO_TRANSLIT")
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

This is a BGE-M3 model post-trained on the Russian dataset from MMARCO/v2. The queries were transliterated from Russian (Cyrillic script) into Latin script using uroman.
The model was used in the SIGIR 2025 short paper "Lost in Transliteration: Bridging the Script Gap in Neural IR".
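Because the model was trained on transliterated queries, a Cyrillic query would presumably be romanized the same way before encoding. The following is a minimal sketch of that flow, not the authors' pipeline. The helper names (`cosine_similarity`, `rank_passages`, `search`) are hypothetical. It assumes the `uroman` Python package (`pip install uroman`) with its `Uroman().romanize_string` method, and it leaves passages in their original script, which is an assumption the card does not confirm.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def search(query_ru, passages):
    """Romanize a Cyrillic query, embed it, and rank the passages (hypothetical helper).

    Heavy dependencies are imported here so the helpers above stay usable
    without them. Assumes `pip install uroman sentence-transformers`.
    """
    import uroman as ur
    from sentence_transformers import SentenceTransformer

    # Assumption: queries are romanized with uroman, as the card describes for training.
    romanizer = ur.Uroman()
    query_latin = romanizer.romanize_string(query_ru)

    model = SentenceTransformer("andreaschari/bge-m3-RU_MMARCO_TRANSLIT")
    query_vec = model.encode(query_latin).tolist()
    # Assumption: passages are encoded as given, without transliteration.
    passage_vecs = model.encode(passages).tolist()
    return query_latin, rank_passages(query_vec, passage_vecs)
```

Calling `search("какая сегодня погода", passages)` with a list of passage strings would return the romanized query together with the passage indices in ranked order. The ranking helpers are written in plain Python only to keep the scoring step explicit; `model.similarity` could be used instead.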
Base model: BAAI/bge-m3