e-Procure Product Embeddings

Bilingual (English/Arabic) sentence embeddings fine-tuned for B2B procurement product matching on the e-Procure platform.

Model Description

Fine-tuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 on 48,000 product pairs from Saudi Arabian B2B procurement catalogs. Optimized for matching purchase requests to supplier catalog items across English and Arabic.

Key Capabilities

  • Cross-lingual matching: Match English RFQ terms to Arabic product descriptions and vice versa
  • Industry-specific: Trained on construction, electrical, HVAC, plumbing, and safety equipment catalogs
  • SKU-aware: Understands product codes, part numbers, and technical specifications

Training Data

Category                English Pairs  Arabic Pairs  Cross-lingual
Construction Materials  8,200          6,100         3,400
Electrical Equipment    7,500          5,800         2,900
HVAC Systems            5,100          4,200         2,100
Plumbing Supplies       4,800          3,600         1,800
Safety Equipment        3,900          2,800         1,500
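
To see how the pairs are distributed across languages, the table can be tallied per column and per category. The following is a small sketch: the numbers are copied from the table above, while the names `pair_counts` and `row_totals` are illustrative and not part of the model's tooling.

```python
# Pair counts copied from the training data table:
# (English pairs, Arabic pairs, cross-lingual pairs)
pair_counts = {
    "Construction Materials": (8200, 6100, 3400),
    "Electrical Equipment": (7500, 5800, 2900),
    "HVAC Systems": (5100, 4200, 2100),
    "Plumbing Supplies": (4800, 3600, 1800),
    "Safety Equipment": (3900, 2800, 1500),
}

# Sum each column across all categories
english_total, arabic_total, cross_total = (
    sum(column) for column in zip(*pair_counts.values())
)

# Sum each row to get the number of pairs listed per category
row_totals = {name: sum(counts) for name, counts in pair_counts.items()}

print("English:", english_total)
print("Arabic:", arabic_total)
print("Cross-lingual:", cross_total)
```

The card does not say how these column totals relate to the 48,000 pairs quoted in the model description, for example whether a pair can be counted in more than one column.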

Usage

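from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("brijeshvadi/eprocure-product-embeddings")

# The same request, once in English and once in Arabic
queries = ["3-phase circuit breaker 400A", "قاطع دائرة ثلاثي الطور 400 أمبير"]
products = ["ABB SACE Tmax XT4 400A 3P MCCB", "Schneider NSX400N 3P 400A"]

# Encode both sides into 384-dimensional embeddings
query_emb = model.encode(queries)
product_emb = model.encode(products)

# Cosine similarity matrix with one row per query and one column per product
scores = util.cos_sim(query_emb, product_emb)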
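
Once queries and products are encoded, matching comes down to ranking the products by cosine similarity for each query. The sketch below shows that step without any dependencies. The three-dimensional vectors are made-up stand-ins for real 384-dimensional embeddings, and the names `rank_products`, `product A`, and `product B` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_products(query_vec, product_vecs, product_names):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in zip(product_names, product_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up 3-dimensional stand-ins for real embeddings (illustration only)
product_names = ["product A", "product B"]
product_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.2]]
query_vec = [1.0, 0.2, 0.0]

ranking = rank_products(query_vec, product_vecs, product_names)
best_name, best_score = ranking[0]
print(best_name, round(best_score, 3))
```

With real embeddings, `query_vec` would be one row of `query_emb` and `product_vecs` the rows of `product_emb`. For large catalogs a vectorized or indexed search is likely a better fit than this per-pair loop.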

Architecture

  • Base: paraphrase-multilingual-MiniLM-L12-v2
  • Embedding Dim: 384
  • Max Seq Length: 128
  • Pooling: Mean pooling
  • Training Loss: MultipleNegativesRankingLoss + CosineSimilarityLoss
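
The two training losses listed above can be restated in plain Python to show what each one rewards. This is a sketch under stated assumptions, not the sentence-transformers implementation: the scale factor of 20 is an assumption, and the card does not say how the two losses were combined or weighted during fine-tuning.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """In-batch negatives: anchors[i] should match positives[i] and no
    other positive in the batch. `scale` is an assumed value."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine_similarity(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        # Cross-entropy with the matching positive as the target class
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def cosine_similarity_loss(pairs, labels):
    """Mean squared error between cosine similarity and a gold score."""
    errors = [
        (cosine_similarity(a, b) - label) ** 2
        for (a, b), label in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


# Toy 2-dimensional vectors (illustration only)
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive sits near its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # positives paired with the wrong anchor

aligned_loss = multiple_negatives_ranking_loss(anchors, aligned)
swapped_loss = multiple_negatives_ranking_loss(anchors, swapped)

# Identical vectors labeled 1.0 and orthogonal vectors labeled 0.0
perfect_fit = cosine_similarity_loss(
    [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])],
    [1.0, 0.0],
)
```

The ranking loss is near zero when every anchor is closest to its own positive and grows when the pairs are swapped. The cosine loss instead needs an explicit similarity label for each pair.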

Platform Context

Built for e-Procure, a B2B procurement platform serving Saudi Arabian construction and industrial supply chains. The platform uses Next.js 15, Strapi CMS, and Redux Toolkit Query.

