AQ-MedAI/Diver-GroupRank-32B

text-generation

passage ranking

Information-Retrieval

text-embeddings-inference


prayerdan committed on Nov 23, 2025

Commit 820775a · verified · 1 Parent(s): 89252f7

Update README.md

Files changed (1)

README.md +7 -9


@@ -19,18 +19,16 @@ tags:
 ## Introduction
-This is the model trained in our paper: GroupRank: A Groupwise Reranking Paradigm
-Driven by Reinforcement Learning ([📝arXiv](https://arxiv.org/pdf/2511.11653)).
-Please refer our [🧩github repository](https://github.com/AQ-MedAI/Diver) for the usage of GroupRank-32B.
 ### Highlights
-## HighLights
-The Diver Retriever 4B model is a reasoning-intensive model designed to tackle the challenge of reasonIR and rader.
-We combined data from the fields of mathematics, coding, and healthcare.
-At the same time, we made precise matching in terms of the difficulty level of the samples, and uniquely
-constructed negative samples corresponding to each field. Therefore, this model performs very well on the Bright LeaderBoard
-as well as the Mteb-Medical Benchmark.

 ## Introduction
+This is the model trained in our paper: GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning ([📝arXiv](https://arxiv.org/pdf/2511.11653)).
+Please refer to our [🧩GitHub repository](https://github.com/AQ-MedAI/Diver) for the usage of GroupRank-32B.
 ### Highlights
+GroupRank is a reinforcement-learning-powered reranker that breaks the traditional **“pointwise vs. listwise”** trade-off:
+it feeds the query together with a small group of candidates to an LLM, lets the model perform within-group comparisons, and outputs individual relevance scores—combining the flexibility of pointwise scoring with the contextual awareness of listwise ranking.
+Training is driven by GRPO and a heterogeneous reward (NDCG + distributional alignment) that keeps scores consistent across groups, while a synthetic data pipeline eliminates the need for large human-labeled sets.
+On the reasoning-heavy BRIGHT and R2MED benchmarks, GroupRank sets a new state of the art in **NDCG@10 (46.8 / 52.3)** and generalizes well to classic retrieval tasks.
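
The groupwise flow described in the Highlights (score a query against a small group of candidates, then merge the per-passage scores) can be illustrated with a minimal sketch. This is not the GroupRank implementation: `groupwise_rerank`, `toy_overlap_scorer`, and `group_size` are hypothetical names introduced here, the word-overlap scorer merely stands in for the LLM, and `ndcg_at_k` uses the linear-gain variant of DCG, which may differ from the variant used in the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer interface: given a query and one group of passages,
# return one relevance score per passage in that group.
GroupScorer = Callable[[str, Sequence[str]], List[float]]


def groupwise_rerank(
    query: str,
    passages: Sequence[str],
    score_group: GroupScorer,
    group_size: int = 4,
) -> List[Tuple[str, float]]:
    """Score passages one group at a time, then sort all of them by score."""
    scored: List[Tuple[str, float]] = []
    for start in range(0, len(passages), group_size):
        group = passages[start:start + group_size]
        scores = score_group(query, group)
        if len(scores) != len(group):
            raise ValueError("scorer must return one score per passage")
        scored.extend(zip(group, scores))
    # Merging by raw score assumes scores are comparable across groups.
    return sorted(scored, key=lambda item: item[1], reverse=True)


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """NDCG@k for graded relevances listed in ranked order (linear gain)."""
    def dcg(rels: Sequence[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def toy_overlap_scorer(query: str, group: Sequence[str]) -> List[float]:
    """Stand-in for the LLM: fraction of query words found in each passage."""
    q_words = set(query.lower().split())
    return [
        len(q_words & set(p.lower().split())) / max(len(q_words), 1)
        for p in group
    ]


if __name__ == "__main__":
    docs = ["cats purr", "dogs bark loudly", "why do cats purr", "stock prices"]
    for passage, score in groupwise_rerank(
        "why do cats purr", docs, toy_overlap_scorer, group_size=2
    ):
        print(f"{score:.2f}  {passage}")
```

Sorting the merged list by raw score only makes sense if scores from different groups are on the same scale, which is the property the README attributes to the distributional-alignment part of the reward.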