Add library_name to metadata

#1
by nielsr (HF Staff), opened
Files changed (1)
  1. README.md +19 -10
README.md CHANGED
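For context on the code touched by this diff: `rerank_documents` records each retrieved chunk's character span inside the prompt so that attention scores can later be pooled per chunk via the tokenizer's offset mapping. A minimal standalone sketch of that bookkeeping, assuming the span start is captured just before each chunk is appended (the `start = len(chunk_part)` line sits outside the hunks shown) and using hypothetical chunk texts:

```python
# Standalone sketch of the chunk-offset bookkeeping used by
# rerank_documents in the diff below: each chunk's [start, end)
# character span inside the prompt is recorded so per-chunk
# relevance can later be pooled from attention scores.
# Assumption: `start` is taken just before the chunk is appended
# (that line is outside the hunks shown); chunk texts are examples.
prompt_prefix = '<|im_start|>user\n'
retrieval_instruction = "Here are some retrieved chunks:\n\n"

chunks = ["First retrieved chunk.", "Second retrieved chunk."]
chunk_part = prompt_prefix + retrieval_instruction
chunk_ranges = []
for text in chunks:
    start = len(chunk_part)
    chunk_part += ' ' + text.strip()
    end = len(chunk_part)
    chunk_ranges.append([start, end])
    chunk_part += '\n\n'

# Each recorded span recovers its chunk (with the leading
# separator space added before it).
for (start, end), text in zip(chunk_ranges, chunks):
    assert chunk_part[start:end] == ' ' + text
```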
@@ -1,5 +1,6 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-4B-Instruct-2507
 datasets:
 - hotpotqa/hotpot_qa
 - dgslibisey/MuSiQue
@@ -8,18 +9,19 @@ datasets:
 language:
 - en
 - zh
+license: apache-2.0
 metrics:
 - accuracy
 - exact_match
 - f1
 - recall
-base_model:
-- Qwen/Qwen3-4B-Instruct-2507
 pipeline_tag: text-ranking
+library_name: transformers
 tags:
 - Rerank
 - Memory
 ---
+
 # QRRanker: Query-focused and Memory-aware Reranker for Long Context Processing
 
 <p align="center">
@@ -28,8 +30,7 @@ tags:
 <a href="https://huggingface.co/MindscapeRAG/QRRanker"><b>🤗 Models</b></a>
 </p>
 
-QRRanker is a lightweight reranking framework that leverages **Query-focused Retrieval (QR) heads** to produce continuous relevance scores, enabling effective listwise reranking with small-scale models.
-
+QRRanker is a lightweight reranking framework that leverages **Query-focused Retrieval (QR) heads** to produce continuous relevance scores, enabling effective listwise reranking with small-scale models. It was introduced in the paper [Query-focused and Memory-aware Reranker for Long Context Processing](https://huggingface.co/papers/2602.12192).
 
 ## Model Description
 
@@ -290,7 +291,7 @@ def compute_qr_scores(
 
     # Select specific QR heads
     if qr_head_list is not None:
-        head_set = [tuple(map(int, h.split('-'))) for h in qr_head_list.split(',')]
+        head_set = [tuple(map(int, h.split('-'))) for h in qr_head_list.split(',') ]
         indices = torch.tensor(head_set).to(all_head_scores.device)
         layers, heads = indices[:, 0], indices[:, 1]
         all_head_scores = all_head_scores[:, layers, heads, :]
@@ -323,8 +324,11 @@ def rerank_documents(model, tokenizer, question, paragraphs, qr_head_list, devic
         scores: Corresponding relevance scores
     """
     # Build input sequence
-    prompt_prefix = '<|im_start|>user\n'
-    retrieval_instruction = "Here are some retrieved chunks:\n\n"
+    prompt_prefix = '<|im_start|>user
+'
+    retrieval_instruction = "Here are some retrieved chunks:
+
+"
 
     chunk_part = prompt_prefix + retrieval_instruction
     chunk_ranges = []
@@ -336,9 +340,13 @@ def rerank_documents(model, tokenizer, question, paragraphs, qr_head_list, devic
         chunk_part += ' ' + text.strip()
         end = len(chunk_part)
         chunk_ranges.append([start, end])
-        chunk_part += '\n\n'
+        chunk_part += '
+
+'
 
-    query_part = f"Use the retrieved chunks to answer the user's query.\n\nQuery: {question}"
+    query_part = f"Use the retrieved chunks to answer the user's query.
+
+Query: {question}"
     full_seq = chunk_part + query_part
 
     # Tokenize with offset mapping
@@ -449,6 +457,7 @@ python qr_ranker_inference.py \
 | `--use_summary` | flag | False | Use summary field in data |
 
 
+## Citation
 
 If you use our QRRanker, please kindly cite:
 
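As an aside on the `head_set` comprehension touched above: a `qr_head_list` value is a comma-separated list of `layer-head` index pairs selecting which QR heads to score with. A minimal sketch of that parsing (the pair values below are hypothetical examples, not heads recommended by the model card):

```python
# Parses a qr_head_list string of comma-separated "layer-head" pairs
# into (layer, head) index tuples, mirroring the head_set line in
# the diff above. The pair values here are hypothetical examples.
def parse_qr_heads(qr_head_list: str) -> list[tuple[int, ...]]:
    return [tuple(map(int, h.split('-'))) for h in qr_head_list.split(',')]

print(parse_qr_heads("13-7,15-11"))  # [(13, 7), (15, 11)]
```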