Surpem/Supertron2-Reranker-8B

@@ -1,62 +1,129 @@
 ---
-library_name: sentence-transformers
 license: apache-2.0
 base_model:
 - Qwen/Qwen3-VL-Reranker-8B
 pipeline_tag: text-ranking
 tags:
-- supertron2
-- reranker
-- qwen3-vl
-- text-ranking
 - cross-encoder
-language:
-- en
 ---
-# Supertron2-Reranker-8B
-Supertron2-Reranker-8B is a short fine-tune of `Qwen/Qwen3-VL-Reranker-8B` for text reranking.
-It is trained on real reranking pairs, primarily MS MARCO, for search and RAG reranking.
-## Usage
 ```python
-import torch
-from transformers import AutoModelForImageTextToText, AutoProcessor
 model_id = "Surpem/Supertron2-Reranker-8B"
-processor = AutoProcessor.from_pretrained(model_id)
-model = AutoModelForImageTextToText.from_pretrained(
-    model_id,
-    torch_dtype=torch.bfloat16,
-    device_map="auto",
-)
-query = "What is the capital of France?"
-documents = ["Paris is the capital of France.", "Mars is the red planet."]
-prompts = [
-    f"Retrieve text relevant to the user's query.\nQuery: {query}\n"
-    f"Document: {document}\n"
-    "Is this document relevant to the query? Answer yes or no:"
-    for document in documents
 ]
-inputs = processor(text=prompts, padding=True, return_tensors="pt").to(model.device)
-yes_id = processor.tokenizer.encode("yes", add_special_tokens=False)[-1]
-no_id = processor.tokenizer.encode("no", add_special_tokens=False)[-1]
-with torch.inference_mode():
-    logits = model(**inputs, return_dict=True, logits_to_keep=1).logits[:, -1, :]
-scores = (logits[:, yes_id] - logits[:, no_id]).float()
 print(scores)
 ```
-## Limitations
-This is a short 30-minute H100 fine-tune. It should be evaluated on your retrieval domain before production use.

 ---
 license: apache-2.0
+language:
+- en
 base_model:
 - Qwen/Qwen3-VL-Reranker-8B
 pipeline_tag: text-ranking
+library_name: sentence-transformers
 tags:
+- reranking
+- retrieval
+- rag
 - cross-encoder
+- qwen3-vl
+- pytorch
+---
+# **Supertron2-Reranker-8B: A Compact Cross-Encoder Reranking Model**
+## **Model Description**
+**Supertron2-Reranker-8B** is a reranking model built on top of [Qwen/Qwen3-VL-Reranker-8B](https://huggingface.co/Qwen/Qwen3-VL-Reranker-8B). It is designed to score query-document pairs for retrieval pipelines, search systems, and RAG applications where a stronger second-stage ranker is useful.
+* **Developed by:** Surpem
+* **Model type:** Cross-Encoder Reranker
+* **Architecture:** Qwen3-VL reranker, 8B parameters
+* **License:** Apache 2.0
 ---
+## **Capabilities**
+### **Search Reranking**
+Supertron2-Reranker-8B can compare a user query against candidate passages and assign relevance scores. It is intended as a second-stage reranker after a faster retriever has already selected candidate documents.
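The retrieve-then-rerank flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the model's actual pipeline: `first_stage` is a hypothetical keyword-overlap retriever standing in for any fast retriever, and `score_pairs` is a hypothetical placeholder for any pair scorer (for example, a cross-encoder's `predict` method).

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def first_stage(query: str, corpus: Sequence[str], k: int) -> List[str]:
    """Hypothetical cheap retriever: order documents by shared lowercase tokens."""
    q_tokens = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_tokens & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score each (query, candidate) pair and sort by descending score."""
    pairs = [(query, doc) for doc in candidates]
    scores = [float(s) for s in score_pairs(pairs)]
    return sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
```

With a loaded cross-encoder, the second stage would be called as `rerank(query, first_stage(query, corpus, k=50), model.predict)`; the value of `k` is an arbitrary choice here and should be tuned per application.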
+### **RAG Pipelines**
+The model can improve retrieval-augmented generation by moving the most relevant documents to the top of the candidate list, so they are the ones placed in the context window before answer generation.
+### **Question-Document Matching**
+Supertron2-Reranker-8B is useful for matching questions to passages, snippets, help-center articles, documentation chunks, and other text candidates.
+### **Instruction-Aware Retrieval**
+The model scores relevance through an instruction-style prompt, which makes it suitable for natural-language search tasks where query intent matters.
+---
+## **Get Started**
 ```python
+from sentence_transformers import CrossEncoder
 model_id = "Surpem/Supertron2-Reranker-8B"
+model = CrossEncoder(model_id)
+pairs = [
+    ("What is the capital of France?", "Paris is the capital and largest city of France."),
+    ("What is the capital of France?", "Mars is often called the red planet."),
 ]
+scores = model.predict(pairs)
 print(scores)
 ```
+Example reranking:
+```python
+query = "How do I reset my password?"
+documents = [
+    "Use the account recovery page to reset your password.",
+    "Our refund policy allows returns within 30 days.",
+    "Two-factor authentication adds extra login security.",
+]
+results = model.rank(query, documents)
+print(results)
+```
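To turn the ranking output into an ordered list of passages, a small helper like the one below may be enough. It assumes `rank` returns a list of dicts with `corpus_id` and `score` keys, as in current sentence-transformers releases; `top_documents` is a hypothetical helper name, and the scores in the example are made-up placeholders, not real model output.

```python
from typing import Dict, List, Sequence, Union

RankResult = Dict[str, Union[int, float]]


def top_documents(
    results: Sequence[RankResult],
    documents: Sequence[str],
    top_k: int = 2,
) -> List[str]:
    """Return the top_k documents, ordered by descending relevance score."""
    # Sort defensively in case the results are not already ordered.
    ordered = sorted(results, key=lambda r: r["score"], reverse=True)
    return [documents[int(r["corpus_id"])] for r in ordered[:top_k]]


# Placeholder results in the assumed output shape (scores are illustrative only).
example_documents = ["doc A", "doc B", "doc C"]
example_results = [
    {"corpus_id": 2, "score": 0.2},
    {"corpus_id": 0, "score": 0.9},
    {"corpus_id": 1, "score": 0.1},
]
print(top_documents(example_results, example_documents))
```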
+---
+## **Hardware Requirements**
+| Precision | Min VRAM | Recommended |
+|---|---|---|
+| bfloat16 | 18 GB | 24 GB+ |
+| 4-bit quantized | 6 GB | 10 GB+ |
+For larger batches or long documents, use more VRAM or reduce the batch size or maximum sequence length.
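As a rough sanity check on the table, the weights-only footprint can be estimated from the parameter count. The sketch below assumes a nominal 8e9 parameters (the exact count may differ) and ignores activations, framework overhead, and quantization metadata, which is why the table's minimums sit above these numbers.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Weights-only memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8e9  # assumption: nominal "8B" parameter count

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 2 bytes per parameter
int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # 0.5 bytes per parameter
print(f"bfloat16 weights: ~{bf16_gb:.0f} GB, 4-bit weights: ~{int4_gb:.0f} GB")
```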
+---
+## **Intended Use**
+Supertron2-Reranker-8B is intended for:
+* Search reranking
+* RAG document reranking
+* Query-passage relevance scoring
+* Documentation and knowledge-base retrieval
+* Evaluation of candidate retrieval results
+It is not intended to be used as a standalone chat model.
+---
+## **Limitations**
+* The model scores relevance; it does not generate answers.
+* It should be evaluated on your own retrieval domain before production use.
+* Long documents may need chunking before reranking.
+* Relevance scores are relative and may not be calibrated across unrelated queries.
+* The model may still rank incorrect, outdated, or unsafe content highly if it appears textually relevant.
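One possible way to handle the long-document limitation is to split each document into overlapping word windows and keep the best chunk score. This is only a sketch: the window sizes are arbitrary, max-aggregation is one choice among several, and `score_pairs` is a hypothetical placeholder for any pair scorer such as a cross-encoder's `predict` method.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def chunk_words(text: str, max_words: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of max_words words, sharing `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def score_long_document(
    query: str,
    document: str,
    score_pairs: Callable[[List[Pair]], Sequence[float]],
    max_words: int = 200,
    overlap: int = 50,
) -> float:
    """Score a document as the maximum score over its chunks."""
    chunks = chunk_words(document, max_words, overlap)
    if not chunks:
        return float("-inf")  # empty documents rank last
    scores = score_pairs([(query, chunk) for chunk in chunks])
    return max(float(s) for s in scores)
```

Word-based windows are only a proxy for token counts, so the chunk size should be chosen conservatively relative to the model's maximum sequence length.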
+---
+## **Citation**
+```bibtex
+@misc{surpem2026supertron2-reranker-8b,
+      title={Supertron2-Reranker-8B -- Compact Cross-Encoder Reranking Model},
+      author={Surpem},
+      year={2026},
+      url={https://huggingface.co/Surpem/Supertron2-Reranker-8B},
+}
+```