Update README.md
README.md
CHANGED
@@ -19,18 +19,16 @@ tags:
## Introduction
- This is the model trained in our paper: GroupRank: A Groupwise Reranking Paradigm
-
- Please refer our
### Highlights
-
-
-
-
- constructed negative samples corresponding to each field. Therefore, this model performs very well on the Bright LeaderBoard
- as well as the Mteb-Medical Benchmark.
## Introduction
+ This is the model trained in our paper: GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning ([📝arXiv](https://arxiv.org/pdf/2511.11653)).
+
+ Please refer to our [🧩GitHub repository](https://github.com/AQ-MedAI/Diver) for the usage of GroupRank-32B.
### Highlights
+ GroupRank is a reinforcement-learning-powered reranker that breaks the traditional **“pointwise vs. listwise”** trade-off:
+ it feeds the query together with a small group of candidates to an LLM, lets the model perform within-group comparisons, and outputs individual relevance scores, combining the flexibility of pointwise scoring with the contextual awareness of listwise ranking.
+ Training is driven by GRPO and a heterogeneous reward (NDCG + distributional alignment) that keeps scores consistent across groups, while a synthetic data pipeline eliminates the need for large human-labeled sets.
+ On the reasoning-heavy BRIGHT and R2MED benchmarks, GroupRank sets a new state of the art in **NDCG@10 (46.8 / 52.3)**, with strong generalization to classic retrieval tasks.
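
The groupwise flow described in the Highlights can be sketched roughly as follows. This is an illustrative sketch only, not the GroupRank implementation: `groupwise_rerank`, `toy_score_group`, and `ndcg_at_k` are hypothetical names, the word-overlap scorer is a stand-in for the LLM call, and the NDCG helper uses the linear-gain variant (one of several common definitions). It also assumes scores from different groups are directly comparable, which the README attributes to the training reward.

```python
import math
from typing import Callable, List, Sequence, Tuple


def groupwise_rerank(
    query: str,
    candidates: Sequence[str],
    score_group: Callable[[str, List[str]], List[float]],
    group_size: int = 4,
) -> List[Tuple[str, float]]:
    """Score candidates one small group at a time, then merge into a single ranking."""
    scored: List[Tuple[str, float]] = []
    docs = list(candidates)
    for start in range(0, len(docs), group_size):
        group = docs[start:start + group_size]
        # In the real system this would be one LLM call that sees the whole group.
        scores = score_group(query, group)
        if len(scores) != len(group):
            raise ValueError("scorer must return one score per candidate")
        scored.extend(zip(group, scores))
    # Assumption: scores are comparable across groups, so a global sort is valid.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score_group(query: str, group: List[str]) -> List[float]:
    """Hypothetical placeholder scorer: fraction of query words found in each document."""
    q_words = set(query.lower().split())
    return [
        len(q_words & set(doc.lower().split())) / max(len(q_words), 1)
        for doc in group
    ]


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """NDCG@k with linear gain; `relevances` are true labels in ranked order."""
    def dcg(rels: Sequence[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    docs = [
        "cats sleep a lot",
        "reranking with llm groups",
        "llm reranking",
        "weather today",
        "groups of cats",
    ]
    for doc, score in groupwise_rerank("llm reranking groups", docs, toy_score_group, group_size=2):
        print(f"{score:.3f}  {doc}")
```

Swapping `toy_score_group` for a function that prompts a reranker with the query and the whole group would keep the rest of the flow unchanged; see the linked repository for the actual usage of GroupRank-32B.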