# multilingual-e5-large-finetuned-orders

A fine-tuned version of `intfloat/multilingual-e5-large` for the order-offer matching task.
## Model Description
This model was fine-tuned on a Russian dataset of order-offer pairs for semantic similarity matching. It significantly outperforms the base model on this specific task.
## Training Details
- Base model: intfloat/multilingual-e5-large
- Training data: 68,270 order-offer pairs
- Loss function: MultipleNegativesRankingLoss
- Epochs: 3
- Batch size: 32
- Learning rate: 2e-5
- Training time: ~22 minutes on NVIDIA RTX PRO 6000
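For context, MultipleNegativesRankingLoss trains on (anchor, positive) pairs, using the other offers in each batch as implicit negatives, so the data reduces to prefixed text pairs. A minimal sketch of that preparation step — the raw data shape and the helper name are illustrative assumptions, not the actual pipeline:

```python
# Illustrative sketch: preparing (order, offer) pairs for
# MultipleNegativesRankingLoss. Each pair is (anchor, positive);
# other in-batch offers serve as negatives. The raw tuple format
# here is an assumption about the dataset, not its real schema.

def build_training_pairs(raw_pairs):
    """Attach E5-style prefixes so training matches inference."""
    return [
        (f"query: {order}", f"passage: {offer}")
        for order, offer in raw_pairs
    ]

raw_pairs = [
    ("Кабель ВВГнг 3x2.5 | 100 м", "Кабель ВВГнг(А)-LS 3х2,5 | 100.0 м"),
    ("Труба ПНД 32 мм | 50 м", "Труба ПНД ПЭ100 32x2,4 | 50 м"),
]
pairs = build_training_pairs(raw_pairs)
print(pairs[0][0])  # query: Кабель ВВГнг 3x2.5 | 100 м
```

Applying the same prefixes during training and inference is what makes the Input Format requirement below non-optional.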
## Performance
| Metric | Base E5 | Fine-tuned | Improvement |
|---|---|---|---|
| Accuracy@1 | 49.40% | 72.93% | +23.53% |
| Accuracy@5 | 69.52% | 91.20% | +21.68% |
| Accuracy@10 | 76.83% | 95.11% | +18.28% |
| MRR | 0.586 | 0.811 | +0.225 |
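For reference, Accuracy@k and MRR can be computed from an order-to-offer similarity matrix. A minimal NumPy sketch on toy scores, assuming the correct offer for order *i* sits in column *i* (this toy data is not from the evaluation set):

```python
import numpy as np

# Toy similarity scores: similarities[i, j] scores order i vs offer j.
# Assumption: the correct offer for order i is at column i.
similarities = np.array([
    [0.9, 0.2, 0.1],
    [0.3, 0.4, 0.8],   # correct offer (col 1) only ranked 2nd here
    [0.1, 0.2, 0.7],
])

# Rank of the correct offer for each order (1 = best)
cols_by_score = np.argsort(-similarities, axis=1)
ranks = np.array([
    int(np.where(cols_by_score[i] == i)[0][0]) + 1
    for i in range(len(similarities))
])

accuracy_at_1 = float(np.mean(ranks <= 1))   # fraction ranked first
accuracy_at_5 = float(np.mean(ranks <= 5))   # fraction in top 5
mrr = float(np.mean(1.0 / ranks))            # mean reciprocal rank
print(accuracy_at_1, accuracy_at_5, mrr)
```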
## Usage
```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer('olegGerbylev/multilingual-e5-large-finetuned-orders')

# Important: use E5-style prefixes
orders = ["query: Кабель ВВГнг 3x2.5 | 100 м"]
offers = ["passage: Кабель ВВГнг(А)-LS 3х2,5 | 100.0 м"]

order_embeddings = model.encode(orders)
offer_embeddings = model.encode(offers)

# Compute cosine similarity between order and offer embeddings
similarity = cosine_similarity(order_embeddings, offer_embeddings)
print(f"Similarity: {similarity[0][0]:.4f}")
```
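In practice one order is matched against many candidate offers, and the ranking step is just a cosine-similarity argsort over the offer embeddings. A minimal sketch with small toy vectors standing in for `model.encode` output (the real embeddings from this model are 1024-dimensional):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy vectors standing in for model.encode(...) output.
order_embedding = np.array([[0.9, 0.1, 0.0]])
offer_embeddings = np.array([
    [0.1, 0.9, 0.0],   # poor match
    [0.8, 0.2, 0.1],   # good match
    [0.0, 0.0, 1.0],   # unrelated
])

scores = cosine_similarity(order_embedding, offer_embeddings)[0]
top_k = np.argsort(-scores)[:2]  # indices of the 2 best offers
print(top_k)  # best match first
```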
## Input Format
This model expects E5-style prefixes:
- Queries (orders): `"query: <text>"`
- Documents (offers): `"passage: <text>"`
## License
MIT