prayerdan committed on
Commit 820775a · verified · 1 Parent(s): 89252f7

Update README.md

Files changed (1)
  1. README.md +7 -9
README.md CHANGED
@@ -19,18 +19,16 @@ tags:
 
  ## Introduction
 
- This is the model trained in our paper: GroupRank: A Groupwise Reranking Paradigm
- Driven by Reinforcement Learning ([📝arXiv](https://arxiv.org/pdf/2511.11653)).
- Please refer our [🧩github repository](https://github.com/AQ-MedAI/Diver) for the usage of GroupRank-32B.
 
  ### Highlights
 
- ## HighLights
- The Diver Retriever 4B model is a reasoning-intensive model designed to tackle the challenge of reasonIR and rader.
- We combined data from the fields of mathematics, coding, and healthcare.
- At the same time, we made precise matching in terms of the difficulty level of the samples, and uniquely
- constructed negative samples corresponding to each field. Therefore, this model performs very well on the Bright LeaderBoard
- as well as the Mteb-Medical Benchmark.
 
 
 
 
 
  ## Introduction
 
+ This is the model trained in our paper: GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning ([📝arXiv](https://arxiv.org/pdf/2511.11653)).
+
+ Please refer to our [🧩github repository](https://github.com/AQ-MedAI/Diver) for the usage of GroupRank-32B.
 
  ### Highlights
 
+ GroupRank is a reinforcement-learning-powered reranker that breaks the traditional **“pointwise vs. listwise”** trade-off:
+ it feeds the query together with a small group of candidates to an LLM, lets the model perform within-group comparisons, and outputs individual relevance scores—combining the flexibility of pointwise scoring with the contextual awareness of listwise ranking.
+ Training is driven by GRPO and a heterogeneous reward (NDCG + distributional alignment) that keeps scores consistent across groups, while a synthetic data pipeline eliminates the need for large human-labeled sets.
+ On the reasoning-heavy BRIGHT and R2MED benchmarks, GroupRank sets new SOTA **NDCG@10 (46.8 / 52.3)** with strong generalization to classic retrieval tasks.
 
 
 
 
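The new Highlights text mentions two ideas that a small example may make more concrete: scoring candidates in small groups, and evaluating the resulting ranking with NDCG@k. The Python sketch below is only an illustration, not the GroupRank implementation. `groupwise_rerank`, `overlap_scorer`, and `group_size` are hypothetical names, and the word-overlap scorer merely stands in for the LLM. The NDCG variant (linear gain, log2 discount, ideal ordering taken from the same candidate list) is also an assumption, since the README does not say which form the paper uses.

```python
import math
from typing import Callable, List, Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 10) -> float:
    """NDCG@k: DCG of the given ordering divided by the DCG of the ideal ordering.

    Assumption: the ideal ordering is built from the same list, i.e. every
    relevant document is among the ranked candidates.
    """
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


def groupwise_rerank(
    query: str,
    candidates: List[str],
    score_group: Callable[[str, List[str]], List[float]],
    group_size: int = 4,
) -> List[str]:
    """Score candidates one group at a time, then sort all of them by score.

    Sorting globally only makes sense if scores are comparable across groups.
    """
    scored = []
    for start in range(0, len(candidates), group_size):
        group = candidates[start:start + group_size]
        scores = score_group(query, group)  # one score per document in the group
        scored.extend(zip(group, scores))
    # sorted() is stable, so ties keep their original candidate order
    return [doc for doc, _ in sorted(scored, key=lambda pair: pair[1], reverse=True)]


def overlap_scorer(query: str, group: List[str]) -> List[float]:
    """Hypothetical stand-in for the LLM: counts words shared with the query."""
    query_words = set(query.lower().split())
    return [float(len(query_words & set(doc.lower().split()))) for doc in group]
```

With the toy scorer, `groupwise_rerank("a b", ["x", "a b", "a", "y", "b a c"], overlap_scorer, group_size=2)` returns `["a b", "b a c", "a", "x", "y"]`, and `ndcg_at_k([3, 2, 1])` is `1.0` because that ordering is already ideal.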