Text Ranking
sentence-transformers
Safetensors
English
qwen3
finance
legal
code
stem
medical
custom_code
Instructions to use zeroentropy/zerank-1-reranker with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use zeroentropy/zerank-1-reranker with sentence-transformers:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("zeroentropy/zerank-1-reranker", trust_remote_code=True)

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
```

- Notebooks
- Google Colab
- Kaggle
Create README.md
README.md
ADDED
@@ -0,0 +1,56 @@
---
license: cc-by-nc-4.0
language:
- en
base_model:
- Qwen/Qwen3-4B
pipeline_tag: text-ranking
tags:
- finance
- legal
- code
- stem
- medical
---
# zerank-1: ZeroEntropy Inc.'s SoTA reranker

zerank-1 is an open-weights reranker meant to be integrated into RAG applications, where it reranks the results of a preliminary search method such as embedding search, BM25, or hybrid search.

In our evaluations, this reranker outperforms competing rerankers, including ones twice its size, across a wide variety of task domains: finance, legal, code, STEM, medical, and conversational data. See [this post](https://evals_blog_post) for more details.

The model is trained with an innovative multi-stage pipeline that models query-document relevance scores using adjusted Elo-like ratings. See [this post](https://technical_blog_post) and our Technical Report (coming soon) for more details.
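
As a rough illustration of the idea, the following is a minimal sketch of the textbook Elo update applied to pairwise document preferences for a single query. It is not ZeroEntropy's adjusted rating scheme, which is described in the linked post. The function names, the document IDs, the starting rating of 1000, and the K-factor of 32 are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo probability that A is preferred over B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_ratings(
    comparisons: Iterable[Tuple[str, str]],
    initial: float = 1000.0,  # assumed starting rating
    k: float = 32.0,          # assumed K-factor
) -> Dict[str, float]:
    """Fold (winner, loser) document pairs for one query into Elo ratings."""
    ratings: Dict[str, float] = {}
    for winner, loser in comparisons:
        r_w = ratings.setdefault(winner, initial)
        r_l = ratings.setdefault(loser, initial)
        e_w = expected_score(r_w, r_l)
        # The winner gains exactly what the loser gives up.
        ratings[winner] = r_w + k * (1.0 - e_w)
        ratings[loser] = r_l - k * (1.0 - e_w)
    return ratings


# Hypothetical pairwise judgments: doc_mars was preferred in both comparisons.
ratings = elo_ratings([("doc_mars", "doc_venus"), ("doc_mars", "doc_jupiter")])
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)
```

In this plain form, the ratings for one query could serve as graded relevance targets. How zerank-1 adjusts and uses such ratings is left to the linked post.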
## How to Use
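
The snippet on the model page loads the model through the sentence-transformers `CrossEncoder` class. A minimal sketch along those lines follows. The `rerank`, `load_zerank`, and `demo` helpers are hypothetical names introduced here, the repository ID `zeroentropy/zerank-1-reranker` is taken from the model page rather than from this README, and the sorting assumes that a higher score means a more relevant passage.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    passages: Sequence[str],
    predict: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort passages best-first.

    Assumes higher scores indicate higher relevance.
    """
    pairs = [(query, passage) for passage in passages]
    scores = [float(score) for score in predict(pairs)]
    return sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)


def load_zerank():
    """Load the reranker; imported lazily because the dependency and weights are large."""
    from sentence_transformers import CrossEncoder

    # The model page's snippet passes trust_remote_code=True (the repo ships custom code).
    return CrossEncoder("zeroentropy/zerank-1-reranker", trust_remote_code=True)


def demo() -> None:
    """Download the model and print the example passages, best match first."""
    model = load_zerank()
    query = "Which planet is known as the Red Planet?"
    passages = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
    ]
    for passage, score in rerank(query, passages, model.predict):
        print(f"{score:.4f}  {passage}")
```

Calling `demo()` downloads the weights on first use. In a RAG pipeline, `passages` would be the candidates returned by the first-stage search, and only the top few reranked passages would be passed on to the generator.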
## Evaluations

NDCG@10 when reranking the top 100 documents retrieved by embedding search (using `text-embedding-3-small`):

| Task           | Embedding | cohere-rerank-v3.5 | Salesforce/Llama-rank-v1 | zerank-1-small | zerank-1 |
|----------------|-----------|--------------------|--------------------------|----------------|----------|
| Code           | 0.678     | 0.724              | 0.694                    | 0.730          | 0.754    |
| Conversational | 0.250     | 0.571              | 0.484                    | 0.556          | 0.596    |
| Finance        | 0.839     | 0.824              | 0.828                    | 0.861          | 0.894    |
| Legal          | 0.703     | 0.804              | 0.767                    | 0.817          | 0.821    |
| Medical        | 0.619     | 0.750              | 0.719                    | 0.773          | 0.796    |
| STEM           | 0.401     | 0.510              | 0.595                    | 0.680          | 0.694    |

BM25 and hybrid search, without and with zerank-1:

<img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/2GPVHFrI39FspnSNklhsM.png" alt="Description" width="400"/> <img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/dwYo2D7hoL8QiE8u3yqr9.png" alt="Description" width="400"/>
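
For readers unfamiliar with the metric reported in the table above, here is a minimal sketch of NDCG@k using the common linear-gain formulation (gain equal to the relevance label, discount of log2(rank + 1)). The exact gain function and relevance labels behind the reported numbers are not stated in this README, so the formulation, the function names, and the example labels are assumptions.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k results (ranks start at 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering.

    The ideal ordering is built from the same candidate list, so relevant
    documents that were never retrieved are not counted (an assumption here).
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels, in the order each system returned the documents.
before_rerank = [0, 1, 0, 1, 0]
after_rerank = [1, 1, 0, 0, 0]
print(ndcg_at_k(before_rerank), ndcg_at_k(after_rerank))
```

Reranking changes only the order of the candidates, so under this formulation any gain in NDCG@10 comes from moving relevant documents toward the top of the list.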
## Citation

**BibTeX:**

Coming soon!

**APA:**

Coming soon!