metadata
license: apache-2.0
language:
- en
tags:
- ecommerce
- e-commerce
- retail
- marketplace
- shopping
- amazon
- ebay
- alibaba
- google
- rakuten
- bestbuy
- walmart
- flipkart
- wayfair
- shein
- target
- etsy
- shopify
- taobao
- asos
- carrefour
- costco
- overstock
- pretraining
- encoder
- language-modeling
- foundation-model
base_model:
- thebajajra/RexBERT-micro
pipeline_tag: text-ranking
library_name: sentence-transformers
datasets:
- thebajajra/Amazebay-Relevance
RexReranker Micro
A state-of-the-art e-commerce neural reranker based on RexBERT-micro. Given a search query and product details, it predicts a relevance score for the pair.
Features
- Output: Predicts a probability score between 0.0 and 1.0
- CrossEncoder Compatible: Works directly with Sentence Transformers CrossEncoder
- Mean Pooling: Uses mean pooling over all tokens for robust representations
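To illustrate what mean pooling over tokens means, here is a minimal plain-Python sketch. It assumes token embeddings arrive as a list of vectors alongside an attention mask that marks padding with 0. The function name `mean_pool` is hypothetical, and this is an illustration of the general technique, not the model's actual implementation.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("embeddings and mask must have the same length")
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [total / count for total in totals]


# Three token vectors; the last one is padding and is ignored.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(pooled)  # [2.0, 3.0]
```

Every real token contributes equally to the pooled vector, in contrast to approaches that read only a single special token.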
Installation
pip install transformers sentence-transformers torch
Quick Start
1. Using Hugging Face Transformers
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
model_id = "thebajajra/RexReranker-micro"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()
query = "best laptop for programming"
title = "MacBook Pro M3"
description = "Powerful laptop with M3 chip, 16GB RAM, perfect for developers and creative professionals"
inputs = tokenizer(
    f"Query: {query}",
    f"Title: {title}\nDescription: {description}",
    return_tensors="pt",
    truncation=True,
    max_length=min(model.config.max_position_embeddings, 7999),
).to(device)
with torch.no_grad():
    outputs = model(**inputs)
score = outputs.logits.squeeze(-1) # shape: [batch]
print(f"Relevance Score: {score[0].item():.4f}")
2. Using Sentence Transformers CrossEncoder
from sentence_transformers import CrossEncoder
# Load as CrossEncoder
model = CrossEncoder(
    "thebajajra/RexReranker-micro",
    trust_remote_code=True,
)
# Single prediction
query = "best laptop for programming"
document = "MacBook Pro M3 - Powerful laptop with M3 chip for developers"
score = model.predict([(query, document)])[0]
print(f"Score: {score:.4f}")
3. Batch Reranking with CrossEncoder
from sentence_transformers import CrossEncoder
model = CrossEncoder("thebajajra/RexReranker-micro", trust_remote_code=True)
query = "best laptop for programming"
documents = [
    "MacBook Pro M3 - Powerful laptop with M3 chip for developers",
    "Gaming Mouse RGB - High precision gaming mouse with 16000 DPI",
    "ThinkPad X1 Carbon - Business ultrabook with long battery life",
    "Mechanical Keyboard - Cherry MX switches for typing comfort",
    "Dell XPS 15 - Premium laptop with 4K OLED display",
]
# Get scores for all documents
pairs = [(query, doc) for doc in documents]
scores = model.predict(pairs)
# Print ranked results
print(f"Query: {query}\n")
for doc, score in sorted(zip(documents, scores), key=lambda x: x[1], reverse=True):
    print(f" {score:.4f} | {doc[:60]}")
4. Using CrossEncoder's rank() Method
from sentence_transformers import CrossEncoder
model = CrossEncoder("thebajajra/RexReranker-micro", trust_remote_code=True)
query = "wireless headphones with noise cancellation"
documents = [
    "Sony WH-1000XM5 - Industry-leading noise cancellation headphones",
    "Apple AirPods Max - Premium over-ear headphones with spatial audio",
    "Bose QuietComfort 45 - Comfortable wireless noise cancelling headphones",
    "JBL Tune 750BTNC - Affordable wireless headphones with ANC",
    "Logitech Gaming Headset - Wired gaming headphones with microphone",
]
# Rank documents
results = model.rank(query, documents, top_k=3)
print(f"Query: {query}\n")
print("Top 3 Results:")
for result in results:
    idx = result['corpus_id']
    score = result['score']
    print(f" {score:.4f} | {documents[idx][:60]}")
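Conceptually, ranking scores every (query, document) pair, sorts by score in descending order, and returns the top entries as dictionaries with `corpus_id` and `score` keys, as read in the loop above. The following plain-Python sketch mirrors that flow under stated assumptions; it is not the library's implementation. The names `rank_documents`, `score_fn`, and `toy_overlap_scorer` are hypothetical, and the toy scorer is a word-overlap stand-in so the example runs without downloading a model.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def rank_documents(
    query: str,
    documents: Sequence[str],
    score_fn: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: Optional[int] = None,
) -> List[dict]:
    """Score each (query, document) pair and return results sorted best-first."""
    pairs = [(query, doc) for doc in documents]
    scores = score_fn(pairs)
    results = [{"corpus_id": i, "score": float(s)} for i, s in enumerate(scores)]
    results.sort(key=lambda r: r["score"], reverse=True)
    return results if top_k is None else results[:top_k]


def toy_overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Hypothetical stand-in scorer: fraction of query words found in the document."""
    scores = []
    for query, doc in pairs:
        query_words = set(query.lower().split())
        doc_words = set(doc.lower().split())
        scores.append(len(query_words & doc_words) / max(len(query_words), 1))
    return scores


docs = [
    "wireless noise cancellation headphones",
    "wired gaming headset",
    "wireless headphones",
]
top = rank_documents(
    "wireless headphones with noise cancellation", docs, toy_overlap_scorer, top_k=2
)
for result in top:
    print(f"{result['score']:.4f} | {docs[result['corpus_id']]}")
```

With the real model, passing the CrossEncoder's `predict` method as `score_fn` should behave similarly, since it accepts a list of pairs and returns one score per pair.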
Input Format
The model expects query-document pairs formatted as:
| Field | Format |
|---|---|
| Text A (Query) | Query: {your search query} |
| Text B (Document) | Title: {document title}\nDescription: {document description} |
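The table above can be turned into a small formatting helper. This is a minimal sketch; the function names `format_query`, `format_document`, and `build_pair` are hypothetical, and stripping surrounding whitespace and defaulting to an empty description are assumptions rather than documented requirements.

```python
from typing import Tuple


def format_query(query: str) -> str:
    """Build Text A in the 'Query: ...' format."""
    return f"Query: {query.strip()}"


def format_document(title: str, description: str = "") -> str:
    """Build Text B in the 'Title: ...\\nDescription: ...' format."""
    return f"Title: {title.strip()}\nDescription: {description.strip()}"


def build_pair(query: str, title: str, description: str = "") -> Tuple[str, str]:
    """Return the (Text A, Text B) pair for one query and one product."""
    return format_query(query), format_document(title, description)


text_a, text_b = build_pair(
    "best laptop for programming",
    "MacBook Pro M3",
    "Powerful laptop with M3 chip, 16GB RAM",
)
print(text_a)
print(text_b)
```

The resulting pair can be passed to the tokenizer as its two text arguments, matching the Transformers example in the Quick Start.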