multilingual-e5-large-finetuned-orders

Fine-tuned version of intfloat/multilingual-e5-large for the order-offer matching task.

Model Description

This model was fine-tuned on a Russian-language dataset of order-offer pairs for semantic similarity matching. On this task it substantially outperforms the base model, lifting Accuracy@1 from 49.40% to 72.93% (see Performance below).

Training Details

  • Base model: intfloat/multilingual-e5-large
  • Training data: 68,270 order-offer pairs
  • Loss function: MultipleNegativesRankingLoss
  • Epochs: 3
  • Batch size: 32
  • Learning rate: 2e-5
  • Training time: ~22 minutes on NVIDIA RTX PRO 6000
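To illustrate the loss listed above: a minimal NumPy sketch of how MultipleNegativesRankingLoss treats every other offer in a batch as a negative for each order. The embeddings are random stand-ins (not the model's actual vectors), and the scale of 20 mirrors the sentence-transformers default; this is a sketch of the objective, not the actual training script.

```python
import numpy as np

# Toy batch of 4 (order, offer) pairs with 8-dim unit embeddings.
rng = np.random.default_rng(0)
orders = rng.normal(size=(4, 8))
offers = rng.normal(size=(4, 8))
orders /= np.linalg.norm(orders, axis=1, keepdims=True)
offers /= np.linalg.norm(offers, axis=1, keepdims=True)

# MultipleNegativesRankingLoss: offer i is the positive for order i;
# the other offers in the batch serve as in-batch negatives.
scale = 20.0                        # sentence-transformers default
scores = scale * orders @ offers.T  # (4, 4) cosine-similarity logits

# Softmax cross-entropy with the diagonal as the correct class.
log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
loss = -np.diag(log_probs).mean()
print(f"in-batch ranking loss: {loss:.4f}")
```

Because negatives come for free from the batch, larger batch sizes give each pair more (and harder) negatives, which is why this loss pairs well with a batch size of 32 here.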

Performance

Metric       Base E5   Fine-tuned   Improvement
Accuracy@1   49.40%    72.93%       +23.53 pp
Accuracy@5   69.52%    91.20%       +21.67 pp
Accuracy@10  76.83%    95.11%       +18.28 pp
MRR          0.586     0.811        +0.225
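For reference, both metric families in the table can be computed from the rank of the correct offer for each order. A small self-contained sketch (the ranks below are made-up toy values, not the evaluation data):

```python
def accuracy_at_k(ranks, k):
    """Fraction of orders whose correct offer appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the correct offer (MRR)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy example: 1-indexed rank of the correct offer for 5 orders.
ranks = [1, 3, 1, 7, 2]
print(accuracy_at_k(ranks, 1))                 # 0.4
print(accuracy_at_k(ranks, 5))                 # 0.8
print(round(mean_reciprocal_rank(ranks), 3))   # 0.595
```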

Usage

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer('olegGerbylev/multilingual-e5-large-finetuned-orders')

# Important: use E5-style prefixes
orders = ["query: Кабель ВВГнг 3x2.5 | 100 м"]
offers = ["passage: Кабель ВВГнг(А)-LS 3х2,5 | 100.0 м"]

order_embeddings = model.encode(orders)
offer_embeddings = model.encode(offers)

# Cosine similarity between each order and each offer
similarity = cosine_similarity(order_embeddings, offer_embeddings)
print(f"Similarity: {similarity[0][0]:.4f}")

Input Format

This model expects E5-style prefixes:

  • For queries (orders): "query: <text>"
  • For documents (offers): "passage: <text>"
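Putting the prefixes to work in retrieval: a minimal top-k matching sketch. The embeddings below are random stand-ins for the output of model.encode(...), and the two extra offer strings are made-up examples, so only the ranking mechanics are real here.

```python
import numpy as np

orders = ["query: Кабель ВВГнг 3x2.5 | 100 м"]
offers = [
    "passage: Кабель ВВГнг(А)-LS 3х2,5 | 100.0 м",
    "passage: Труба ПНД 32 мм | 50 м",
    "passage: Провод ПВС 3х1,5 | 100 м",
]

# Stand-ins for model.encode(orders) / model.encode(offers),
# normalized so cosine similarity reduces to a dot product.
rng = np.random.default_rng(42)
order_emb = rng.normal(size=(len(orders), 1024))   # e5-large dim
offer_emb = rng.normal(size=(len(offers), 1024))
order_emb /= np.linalg.norm(order_emb, axis=1, keepdims=True)
offer_emb /= np.linalg.norm(offer_emb, axis=1, keepdims=True)

scores = order_emb @ offer_emb.T      # (1, 3) similarity matrix
top_k = np.argsort(-scores[0])[:2]    # best 2 offers for the order
for idx in top_k:
    print(f"{scores[0, idx]:+.4f}  {offers[idx]}")
```

With real embeddings from this model, the highest-scoring offer is the proposed match for the order; Accuracy@k in the table above measures how often the correct offer lands in that top-k list.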

License

MIT

Model size
0.6B parameters (F32, Safetensors)