Text Generation
Transformers
English
query-rewriting
reasoning
retrieval
BRIGHT
GRPO
alignment
Eval Results (legacy)
Instructions to use ForwardAILabs/MQR-A1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ForwardAILabs/MQR-A1 with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="ForwardAILabs/MQR-A1")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("ForwardAILabs/MQR-A1", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use ForwardAILabs/MQR-A1 with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "ForwardAILabs/MQR-A1" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ForwardAILabs/MQR-A1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/ForwardAILabs/MQR-A1
- SGLang
How to use ForwardAILabs/MQR-A1 with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "ForwardAILabs/MQR-A1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ForwardAILabs/MQR-A1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "ForwardAILabs/MQR-A1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ForwardAILabs/MQR-A1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use ForwardAILabs/MQR-A1 with Docker Model Runner:
docker model run hf.co/ForwardAILabs/MQR-A1
File size: 6,261 Bytes
4844d29 6f94e96 4844d29 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 | ---
language:
- en
license: apache-2.0
tags:
- query-rewriting
- reasoning
- retrieval
- BRIGHT
- GRPO
- alignment
library_name: transformers
base_model: Qwen/Qwen3-4B-Instruct-2507
pipeline_tag: text-generation
datasets:
- xlangai/BRIGHT
model-index:
- name: MQR-A1
results:
- task:
type: Retrieval
dataset:
type: xlangai/BRIGHT
name: BRIGHT (Short, Pipeline)
metrics:
- type: ndcg_at_10
value: 66.9
name: nDCG@10
- task:
type: Retrieval
dataset:
type: xlangai/BRIGHT
name: BRIGHT (Long, Pipeline)
metrics:
- type: ndcg_at_10
value: 56.0
name: nDCG@10
---
<div align="center">
<h3>Built by <a href="https://huggingface.co/ForwardAILabs">Forward AI Labs</a></h3>
<p>We are an AI company that provides recruitment agents. | <a href="https://www.mira.day/">mira.day</a></p>
</div>
---
# MQR-A1: Mira Query Rewriter — Alignment v1
**MQR-A1** (Mira Query Rewriter, Alignment v1) is a GRPO-aligned query rewriting model designed for reasoning-intensive retrieval tasks. It is the alignment component of our retrieval pipeline: by rewriting queries to distill core retrieval intents and filter misleading noise, MQR-A1 enables the downstream retriever [MRE-T1](https://huggingface.co/ForwardAILabs/MRE-T1) to achieve state-of-the-art performance.
Combined as the **MQR-A1 + MRE-T1** pipeline, our system achieves **No. 1** on the [BRIGHT Benchmark](https://brightbenchmark.github.io/) across all evaluated dimensions — including both short and long document tracks — outperforming sophisticated rerankers, existing alignment models, and complex agentic pipelines.
## Highlights
- **BRIGHT Short Pipeline nDCG@10: 66.9** — No. 1 on the short document retrieval leaderboard
- **BRIGHT Long Pipeline nDCG@10: 56.0** — No. 1 on the long document retrieval leaderboard
- **Intent Distillation over Query Expansion**: Shifts the paradigm from simple additive expansion to RL-driven discriminative feature extraction
- **Immune to Semantic Traps**: Strongly filters out redundant or misleading superficial semantic noise in complex reasoning tasks
- **SNR Enhancement**: Significantly improves the signal-to-noise ratio of retrieval signals, mitigating the "feature dilution" effect under long-text inputs
## Training Methodology
MQR-A1 is trained using a three-stage approach:
### 1. Candidate Rewrite Mining
To prevent the model from converging on rigid, template-based shortcuts during SFT, we implemented a heterogeneous candidate rewrite mining strategy. For every query, we dynamically synthesized a diverse set of natural-language-structured rewrites, forcing the model to prioritize underlying retrieval intent over superficial syntactic patterns.
### 2. Cold Start (SFT)
Traditional cross-entropy loss is inherently misaligned with retrieval objectives. By injecting natural language structural features derived from the mining stage, we equip the base model with strong discriminative feature extraction capabilities, laying a stable initialization foundation for the subsequent GRPO phase.
### 3. GRPO-Driven Intent Alignment
Unlike DPO, which relies on static preference data, GRPO (Group Relative Policy Optimization) enables the model to engage in interactive learning via retrieval feedback within the actual document corpus environment. This allows the model to autonomously explore and extract highly discriminative features, achieving a fundamental leap from superficial "textual matching" to deep "intent alignment."
**Multi-Dimensional Reward Function:**
| Component | Description |
|-----------|-------------|
| Primary Reward | Dense retrieval NDCG scores |
| Constraint | Length penalty to prevent feature dilution |
| Reward Shaping | Cosine similarity with positive examples to maintain semantic grounding |
## BRIGHT Benchmark Results (Pipeline: MQR-A1 + MRE-T1)
### Short Document Retrieval (nDCG@10)
| Task | MQR-A1 + MRE-T1 |
|------|:----------------:|
| Biology | **86.7** |
| Earth Science | **78.5** |
| Economics | 69.7 |
| Psychology | **78.2** |
| Robotics | **58.4** |
| StackOverflow | **67.0** |
| Sustainable Living | **65.9** |
| LeetCode | 46.8 |
| Pony | **73.4** |
| AoPS | 45.2 |
| TheoremQA (Questions) | **60.6** |
| TheoremQA (Theorems) | **72.3** |
| **Average** | **66.9** |
### Long Document Retrieval (nDCG@10)
| Task | MQR-A1 + MRE-T1 |
|------|:----------------:|
| Biology | **77.1** |
| Earth Science | 59.0 |
| Economics | **71.2** |
| Psychology | 73.8 |
| Robotics | 46.0 |
| StackOverflow | **35.5** |
| Sustainable Living | **70.6** |
| Pony | **14.6** |
| **Average** | **56.0** |
### Comparison with Other Retrieval Pipelines (Short Documents)
| Pipeline | Avg nDCG@10 |
|----------|:-----------:|
| **MQR-A1 + MRE-T1** | **66.9** |
| INF-X-Retriever | 63.4 |
| RakanEmb4B | 52.4 |
| Nemo Retriever's Agentic Retrieval | 50.9 |
| DIVER-v3-GroupRank | 46.8 |
| BGE-Reasoner-0928 | 46.4 |
| Lattice Hierarchical Retrieval | 42.1 |
### Comparison with Other Retrieval Pipelines (Long Documents)
| Pipeline | Avg nDCG@10 |
|----------|:-----------:|
| **MQR-A1 + MRE-T1** | **56.0** |
| INF-X-Retriever | 54.6 |
## Usage
MQR-A1 rewrites user queries into intent-distilled versions optimized for dense retrieval with MRE-T1. The rewritten query preserves core retrieval signals while removing misleading surface-level noise.
**Recommended Pipeline:**
1. Pass the raw query through **MQR-A1** to obtain an intent-aligned rewrite
2. Use the rewritten query with **[MRE-T1](https://huggingface.co/ForwardAILabs/MRE-T1)** for dense retrieval
## Related Models
| Model | Description | Link |
|-------|-------------|------|
| **MRE-T1** | Reasoning-enhanced retriever (Mira Recruitment Embedding, Thought v1) | [ForwardAILabs/MRE-T1](https://huggingface.co/ForwardAILabs/MRE-T1) |
## Citation
If you use MQR-A1 in your research, please cite:
```bibtex
@misc{mqr-a1-2026,
title={MQR-A1: GRPO-Aligned Query Rewriter for Reasoning-Intensive Retrieval},
author={Forward AI Labs},
year={2026},
url={https://huggingface.co/ForwardAILabs/MQR-A1}
}
```
## License
Apache 2.0
---
**Built by [Forward AI Labs](https://huggingface.co/ForwardAILabs)** | [mira.day](https://www.mira.day/)
|