File size: 6,954 Bytes
04f46f9 e5a9f84 3aad446 e5a9f84 04f46f9 f16cb07 04f46f9 a7a4bd1 04f46f9 a7a4bd1 04f46f9 e11d3d5 04f46f9 e11d3d5 e5a9f84 dff2b47 04f46f9 dff2b47 e5a9f84 dff2b47 04f46f9 e11d3d5 04f46f9 dff2b47 04f46f9 e11d3d5 04f46f9 e11d3d5 04f46f9 6c0a937 04f46f9 634e0d4 04f46f9 6c0a937 04f46f9 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 |
---
license: apache-2.0
language:
- en
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- retrieval
- query-rewriting
- reinforcement-learning
---
<h1 align="center">INF-Query-Aligner</h1>
<p align="center">
<a href="https://brightbenchmark.github.io/"><img src="https://img.shields.io/badge/BRIGHT_Benchmark-Rank_1st-8A2BE2" alt="Rank"></a>
<a href="https://huggingface.co/infly/inf-query-aligner"><img src="https://img.shields.io/badge/🤗%20Hugging%20Face-INF--Query--Aligner-blue" alt="Hugging Face"></a>
<a href="https://opensource.org/licenses/Apache-2.0"><img src="https://img.shields.io/badge/License-Apache--2.0-green.svg" alt="License"></a>
</p>
## 📖 Overview
**INF-Query-Aligner** is a specialized component of the **INF-X-Retriever** framework, designed to distill the core retrieval intent from complex, verbose, or reasoning-intensive queries. Built upon the **Qwen2.5-7B-instruct** foundation and fine-tuned via Reinforcement Learning, it transforms raw user queries into concise, search-optimized queries for dense retrieval systems.
In our experiments, a single canonical query-writing prompt was applied across all datasets to ensure consistency and reproducibility.
```python
QUERY_WRITER_PROMPT = (
"For the input query, formulating a concise search query for dense retrieval by distilling the core intent from a complex user prompt and ignoring LLM instructions."
"The response should be less than 200 words"
)
```
This model is a key enabler for **INF-X-Retriever**'s state-of-the-art performance, currently holding the **No. 1 position** on the [BRIGHT Benchmark](https://brightbenchmark.github.io/) (as of Dec 17, 2025).
For more details on the full framework, please visit the [INF-X-Retriever Repository](https://github.com/yaoyichen/INF-X-Retriever).
---
### Requirements
```bash
transformers==4.51.0
```
### Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model and tokenizer
model_name = "infly/inf-query-aligner"
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Define input query
query = "Claim in article about why insects are attracted to light\nIn this article they are addressing the reason insects are attracted to light when they say\nHeat radiation as an attractive component is refuted by the effect of LED lighting, which supplies negligible infrared radiation yet still entraps vast numbers of insects.\nI don't see why attraction to LEDs shows they're not seeking heat. Could they for example be evolutionarily programmed to associate light with heat? So that even though they don't encounter heat near/on the LEDs they still \"expect\" to?"
QUERY_WRITER_PROMPT = (
"For the input query, formulating a concise search query for dense retrieval by distilling the core intent from a complex user prompt and ignoring LLM instructions."
"The response should be less than 200 words"
)
messages = [
{
"role": "system",
"content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant.",
},
{
"role": "user",
"content": (
f"{QUERY_WRITER_PROMPT}\n\n"
f"**Input Query:**\n{query}\n"
f"**Your Output:**\n"
),
},
]
# Apply chat template
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer(
[text],
truncation=True,
max_length=8192,
return_tensors="pt"
).to(model.device)
# Generate rewritten query
generated_ids = model.generate(
**model_inputs,
max_new_tokens=512
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
---
## Performance
**INF-X-Retriever** achieves state-of-the-art results on the [BRIGHT Benchmark](https://brightbenchmark.github.io/) (as of Dec 20, 2025).
The **BRIGHT** (Benchmark for Reasoning-Intensive Grounded HT) is a rigorous text retrieval benchmark designed to evaluate the capability of retrieval models in handling questions that require intensive reasoning and cross-document synthesis. Collected from real-world sources such as StackExchange, competitive programming platforms, and mathematical competitions, it comprises complex queries spanning diverse domains like mathematics, coding, biology, economics, and robotics.
### Short document
#### Overall & Category Performance
| Model | **Avg ALL** | **StackExchange** | **Coding** | **Theorem-based** |
|:---|:---:|:---:|:---:|:---:|
| **INF-X-Retriever** | **63.4** | **68.3** | **55.3** | **57.7** |
| DIVER (v3) | 46.8 | 51.8 | 39.9 | 39.7 |
| BGE-Reasoner-0928 | 46.4 | 52.0 | 35.3 | 40.7 |
| LATTICE | 42.1 | 51.6 | 26.9 | 30.0 |
| ReasonRank | 40.8 | 46.9 | 27.6 | 35.5 |
| XDR2 | 40.3 | 47.1 | 28.5 | 32.1 |
#### Detailed Results Across 12 Datasets
| Model | Avg | Bio. | Earth. | Econ. | Psy. | Rob. | Stack. | Sus. | Leet. | Pony | AoPS | TheoQ. | TheoT. |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **INF-X-Retriever** | **63.4** | **79.8** | **70.9** | **69.9** | **73.3** | **57.7** | **64.3** | **61.9** | **56.1** | **54.5** | **51.9** | **53.1** | **67.9** |
| DIVER (v3) | 46.8 | 66.0 | 63.7 | 42.4 | 55.0 | 40.6 | 44.7 | 50.4 | 32.5 | 47.3 | 17.2 | 46.4 | 55.6 |
| BGE-Reasoner-0928 | 46.4 | 68.5 | 66.4 | 40.6 | 53.1 | 43.2 | 44.1 | 47.8 | 29.0 | 41.6 | 17.2 | 46.5 | 58.4 |
| LATTICE | 42.1 | 64.4 | 62.4 | 45.4 | 57.4 | 47.6 | 37.6 | 46.4 | 19.9 | 34.0 | 12.0 | 30.1 | 47.8 |
| ReasonRank | 40.8 | 62.7 | 55.5 | 36.7 | 54.6 | 35.7 | 38.0 | 44.8 | 29.5 | 25.6 | 14.4 | 42.0 | 50.1 |
| XDR2 | 40.3 | 63.1 | 55.4 | 38.5 | 52.9 | 37.1 | 38.2 | 44.6 | 21.9 | 35.0 | 15.7 | 34.4 | 46.2 |
### Long document
#### Detailed Results Across 8 Datasets
| Model | Avg | Bio. | Earth. | Econ. | Pony | Psy. | Rob. | Stack. | Sus. |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **INF-X-Retriever** | **54.6** | **73.2** | **59.6** | **69.3** | **12.1** | **74.3** | **55.9** | **27.8** | **64.8** |
| inf-retriever-v1-pro | 30.5 | 44.1 | 42.2 | 31.4 | 0.4 | 43.1 | 20.8 | 21.4 | 41.0 |
---
## 🖊️ Citation
If you find this model useful, please consider citing our work:
```bibtex
@misc{inf-x-retriever-2025,
title = {INF-X-Retriever},
author = {Yichen Yao, Jiahe Wan, Yuxin Hong, Mengna Zhang, Junhan Yang, Zhouyu Jiang, Qing Xu, Kuan Lu, Yinghui Xu, Wei Chu, Emma Wang, Yuan Qi},
year = {2025},
url = {https://yaoyichen.github.io/INF-X-Retriever},
publisher = {GitHub repository}
}
```
---
## 📬 Contact
Email: [eason.yyc@inftech.ai](mailto:eason.yyc@inftech.ai)
|