File size: 6,261 Bytes

---
language:
- en
license: apache-2.0
tags:
- query-rewriting
- reasoning
- retrieval
- BRIGHT
- GRPO
- alignment
library_name: transformers
base_model: Qwen/Qwen3-4B-Instruct-2507
pipeline_tag: text-generation
datasets:
- xlangai/BRIGHT
model-index:
- name: MQR-A1
  results:
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Short, Pipeline)
    metrics:
    - type: ndcg_at_10
      value: 66.9
      name: nDCG@10
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Long, Pipeline)
    metrics:
    - type: ndcg_at_10
      value: 56.0
      name: nDCG@10
---

<div align="center">
  <h3>Built by <a href="https://huggingface.co/ForwardAILabs">Forward AI Labs</a></h3>
  <p>We are an AI company that provides recruitment agents. | <a href="https://www.mira.day/">mira.day</a></p>
</div>

---

# MQR-A1: Mira Query Rewriter — Alignment v1

**MQR-A1** (Mira Query Rewriter, Alignment v1) is a GRPO-aligned query rewriting model designed for reasoning-intensive retrieval tasks. It is the alignment component of our retrieval pipeline: by rewriting queries to distill core retrieval intents and filter misleading noise, MQR-A1 enables the downstream retriever [MRE-T1](https://huggingface.co/ForwardAILabs/MRE-T1) to achieve state-of-the-art performance.

Combined as the **MQR-A1 + MRE-T1** pipeline, our system achieves **No. 1** on the [BRIGHT Benchmark](https://brightbenchmark.github.io/) across all evaluated dimensions — including both short and long document tracks — outperforming sophisticated rerankers, existing alignment models, and complex agentic pipelines.

## Highlights

- **BRIGHT Short Pipeline nDCG@10: 66.9** — No. 1 on the short document retrieval leaderboard
- **BRIGHT Long Pipeline nDCG@10: 56.0** — No. 1 on the long document retrieval leaderboard
- **Intent Distillation over Query Expansion**: Shifts the paradigm from simple additive expansion to RL-driven discriminative feature extraction
- **Immune to Semantic Traps**: Strongly filters out redundant or misleading superficial semantic noise in complex reasoning tasks
- **SNR Enhancement**: Significantly improves the signal-to-noise ratio of retrieval signals, mitigating the "feature dilution" effect under long-text inputs

## Training Methodology

MQR-A1 is trained using a three-stage approach:

### 1. Candidate Rewrite Mining
To prevent the model from converging on rigid, template-based shortcuts during SFT, we implemented a heterogeneous candidate rewrite mining strategy. For every query, we dynamically synthesized a diverse set of natural-language-structured rewrites, forcing the model to prioritize underlying retrieval intent over superficial syntactic patterns.

### 2. Cold Start (SFT)
Traditional cross-entropy loss is inherently misaligned with retrieval objectives. By injecting natural language structural features derived from the mining stage, we equip the base model with strong discriminative feature extraction capabilities, laying a stable initialization foundation for the subsequent GRPO phase.

### 3. GRPO-Driven Intent Alignment
Unlike DPO, which relies on static preference data, GRPO (Group Relative Policy Optimization) enables the model to engage in interactive learning via retrieval feedback within the actual document corpus environment. This allows the model to autonomously explore and extract highly discriminative features, achieving a fundamental leap from superficial "textual matching" to deep "intent alignment."

**Multi-Dimensional Reward Function:**
| Component | Description |
|-----------|-------------|
| Primary Reward | Dense retrieval NDCG scores |
| Constraint | Length penalty to prevent feature dilution |
| Reward Shaping | Cosine similarity with positive examples to maintain semantic grounding |

## BRIGHT Benchmark Results (Pipeline: MQR-A1 + MRE-T1)

### Short Document Retrieval (nDCG@10)

| Task | MQR-A1 + MRE-T1 |
|------|:----------------:|
| Biology | **86.7** |
| Earth Science | **78.5** |
| Economics | 69.7 |
| Psychology | **78.2** |
| Robotics | **58.4** |
| StackOverflow | **67.0** |
| Sustainable Living | **65.9** |
| LeetCode | 46.8 |
| Pony | **73.4** |
| AoPS | 45.2 |
| TheoremQA (Questions) | **60.6** |
| TheoremQA (Theorems) | **72.3** |
| **Average** | **66.9** |

### Long Document Retrieval (nDCG@10)

| Task | MQR-A1 + MRE-T1 |
|------|:----------------:|
| Biology | **77.1** |
| Earth Science | 59.0 |
| Economics | **71.2** |
| Psychology | 73.8 |
| Robotics | 46.0 |
| StackOverflow | **35.5** |
| Sustainable Living | **70.6** |
| Pony | **14.6** |
| **Average** | **56.0** |

### Comparison with Other Retrieval Pipelines (Short Documents)

| Pipeline | Avg nDCG@10 |
|----------|:-----------:|
| **MQR-A1 + MRE-T1** | **66.9** |
| INF-X-Retriever | 63.4 |
| RakanEmb4B | 52.4 |
| Nemo Retriever's Agentic Retrieval | 50.9 |
| DIVER-v3-GroupRank | 46.8 |
| BGE-Reasoner-0928 | 46.4 |
| Lattice Hierarchical Retrieval | 42.1 |

### Comparison with Other Retrieval Pipelines (Long Documents)

| Pipeline | Avg nDCG@10 |
|----------|:-----------:|
| **MQR-A1 + MRE-T1** | **56.0** |
| INF-X-Retriever | 54.6 |

## Usage

MQR-A1 rewrites user queries into intent-distilled versions optimized for dense retrieval with MRE-T1. The rewritten query preserves core retrieval signals while removing misleading surface-level noise.

**Recommended Pipeline:**
1. Pass the raw query through **MQR-A1** to obtain an intent-aligned rewrite
2. Use the rewritten query with **[MRE-T1](https://huggingface.co/ForwardAILabs/MRE-T1)** for dense retrieval

## Related Models

| Model | Description | Link |
|-------|-------------|------|
| **MRE-T1** | Reasoning-enhanced retriever (Mira Recruitment Embedding, Thought v1) | [ForwardAILabs/MRE-T1](https://huggingface.co/ForwardAILabs/MRE-T1) |

## Citation

If you use MQR-A1 in your research, please cite:

```bibtex
@misc{mqr-a1-2026,
  title={MQR-A1: GRPO-Aligned Query Rewriter for Reasoning-Intensive Retrieval},
  author={Forward AI Labs},
  year={2026},
  url={https://huggingface.co/ForwardAILabs/MQR-A1}
}
```

## License

Apache 2.0

---

**Built by [Forward AI Labs](https://huggingface.co/ForwardAILabs)** | [mira.day](https://www.mira.day/)