---
language:
- en
license: apache-2.0
tags:
- retrieval
- reasoning
- embedding
- BRIGHT
- information-retrieval
library_name: transformers
pipeline_tag: feature-extraction
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- xlangai/BRIGHT
model-index:
- name: MRE-T1
  results:
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Short)
    metrics:
    - type: ndcg_at_10
      value: 39.6
      name: nDCG@10
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Long)
    metrics:
    - type: ndcg_at_10
      value: 35.1
      name: nDCG@10
---

<div align="center">
  <h3>Built by <a href="https://huggingface.co/ForwardAILabs">Forward AI Labs</a></h3>
  <p>We are an AI company that provides recruitment agents. | <a href="https://www.mira.day/">mira.day</a></p>
</div>

---

# MRE-T1: Mira Reasoning Embedding — Thought v1

**MRE-T1** (Mira Reasoning Embedding, Thought v1) is the first generation of our reasoning-intensive retrieval model series. The "Thought" in T1 reflects the model's core capability — it thinks before it retrieves, generating explicit reasoning chains to deeply understand query intent before producing embeddings.

MRE-T1 achieves state-of-the-art single-model performance on the [BRIGHT benchmark](https://brightbenchmark.github.io/), which evaluates retrieval models on tasks requiring complex reasoning capabilities.

## Highlights

- **BRIGHT Short nDCG@10: 39.6** — achieves the best single-model result on the short document retrieval leaderboard
- **BRIGHT Long nDCG@10: 35.1** — achieves the best single-model result on the long document retrieval leaderboard
- **Efficient**: Built on the Qwen3-4B architecture (~4B parameters), about half the size of the 7B to 8B models it is compared against
- **Reasoning-aware**: Uses task-specific reasoning prompts with a special `<emb_token>` for embedding extraction

## Model Details

| Property | Value |
|----------|-------|
| Architecture | Qwen3ForCausalLM |
| Parameters | ~4B |
| Hidden Size | 2560 |
| Layers | 36 |
| Attention Heads | 32 (KV heads: 8) |
| Max Position | 262,144 |
| Precision | bfloat16 |
| Vocabulary | 151,670 |

## BRIGHT Benchmark Results

### Short Document Retrieval (nDCG@10)

| Task | MRE-T1 |
|------|--------|
| Biology | 55.3 |
| Earth Science | 56.5 |
| Economics | 32.9 |
| Psychology | 48.2 |
| Robotics | 33.1 |
| StackOverflow | 34.2 |
| Sustainable Living | 37.3 |
| LeetCode | 35.0 |
| Pony | 35.5 |
| AOPS | 16.7 |
| TheoremQA (Questions) | 43.3 |
| TheoremQA (Theorems) | 46.9 |
| **Average** | **39.6** |

### Long Document Retrieval (nDCG@10)

| Task | MRE-T1 |
|------|--------|
| Biology | 46.5 |
| Earth Science | 46.0 |
| Economics | 34.5 |
| Psychology | 52.7 |
| Robotics | 27.7 |
| StackOverflow | 22.2 |
| Sustainable Living | 45.2 |
| Pony | 6.3 |
| **Average** | **35.1** |
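
The **Average** rows above appear to be unweighted means of the per-task scores. The sketch below reproduces them under that assumption; the helper name `macro_average` is hypothetical, and `Decimal` is used only to avoid binary floating-point rounding surprises at the first decimal place.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-task nDCG@10 values copied from the two tables above.
short_scores = ["55.3", "56.5", "32.9", "48.2", "33.1", "34.2",
                "37.3", "35.0", "35.5", "16.7", "43.3", "46.9"]
long_scores = ["46.5", "46.0", "34.5", "52.7", "27.7", "22.2", "45.2", "6.3"]


def macro_average(scores):
    """Unweighted mean of the task scores, rounded half-up to one decimal."""
    total = sum(Decimal(s) for s in scores)
    mean = total / Decimal(len(scores))
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(macro_average(short_scores))  # 39.6
print(macro_average(long_scores))   # 35.1
```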

### Comparison with Other Models (Short, Single Model Only)

| Model | Size | BRIGHT Short nDCG@10 |
|-------|------|---------------------|
| **MRE-T1** | **~4B** | **39.6** |
| BGE-Reasoner-Embed-0928 | 8B | 38.1 |
| Seed1.5-Embedding | MoE | 27.2 |
| gte-Qwen1.5-7B-instruct | 7B | 22.5 |
| GritLM-7B | 7B | 21.0 |
| instructor-xl | 1.5B | 18.9 |
| SFR-Embedding-Mistral | 7B | 18.3 |
| e5-mistral-7b-instruct | 7B | 17.9 |

### Comparison with Other Models (Long, Single Model Only)

| Model | Size | BRIGHT Long nDCG@10 |
|-------|------|---------------------|
| **MRE-T1** | **~4B** | **35.1** |
| Google-Gecko-Text-Embedding-004 | — | 33.2 |
| gte-Qwen1.5-7B-instruct | 7B | 27.8 |
| SFR-Embedding-Mistral | 7B | 26.0 |
| e5-mistral-7b-instruct | 7B | 25.5 |
| voyage-large-2-instruct | — | 24.6 |
| Cohere-embed-english-v3.0 | — | 18.4 |
| bge-large-en-v1.5 | 335M | 14.8 |

## Usage

MRE-T1 uses task-specific system prompts for reasoning-enhanced retrieval. Each query is processed with a domain-specific instruction, and the model generates a reasoning chain followed by a special `<emb_token>` whose representation is used as the query embedding.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ForwardAILabs/MRE-T1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

# Task-specific system prompts
TASK_PROMPTS = {
    "biology": "Given a Biology post, extract and briefly describe the core underlying principle or mechanism of this biology question. You MUST end every response with <emb_token>.",
    "earth_science": "Given an Earth Science post, identify the type of Earth science question and briefly describe the core principle for solving it. You MUST end every response with <emb_token>.",
    "economics": "Given an Economics post, analyze the user's core blind spot and the applicable economic analysis framework. You MUST end every response with <emb_token>.",
    "psychology": "Given a Psychology post, extract the user's blind spot and key psychological concepts. You MUST end every response with <emb_token>.",
    "robotics": "Given a Robotics post, diagnose the core issue within the robotics environment and error logs, and point out the applicable technical principles. You MUST end every response with <emb_token>.",
    "stackoverflow": "Given a Stack Overflow post, extract the core underlying technical principle for solving the code error. You MUST end every response with <emb_token>.",
    "sustainable_living": "Given a Sustainable Living post, identify the key scientific concepts and background knowledge required for a closed-loop solution to the life phenomenon or practice. You MUST end every response with <emb_token>.",
    "leetcode": "Given a Coding problem, extract the core algorithm principle (or data structure) and general problem-solving approach. You MUST end every response with <emb_token>.",
    "pony": "Given a Pony question, locate the core knowledge points from the Pony language official documentation needed to solve the code completion problem. You MUST end every response with <emb_token>.",
    "aops": "Given a Math problem, analyze the problem type characteristics and core examination principles of this math competition problem. You MUST end every response with <emb_token>.",
    "theoremqa_questions": "Given a Math problem, analyze the problem type characteristics and core examination principles of this math competition problem. You MUST end every response with <emb_token>.",
    "theoremqa_theorems": "Given a Math problem, distill the core mathematical principles and problem-solving techniques required for the real-world scenario. You MUST end every response with <emb_token>.",
}

# Example: Generate reasoning-enhanced query embedding
task = "stackoverflow"
query = "How to fix a segmentation fault when using shared_ptr in a multithreaded C++ application?"

messages = [
    {"role": "system", "content": TASK_PROMPTS[task]},
    {"role": "user", "content": query}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

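with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)
    # Final-layer hidden state at the last input position. This is a single
    # forward pass over the prompt: no reasoning chain is generated here, so
    # this position is <emb_token> only if the chat template appends it.
    embedding = outputs.hidden_states[-1][0, -1, :]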

print(f"Embedding shape: {embedding.shape}")
```
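
The description above takes the embedding from a generated `<emb_token>`, whereas the snippet reads the last input position. As a minimal sketch of the first variant, the helper below locates `<emb_token>` in a generated token sequence. It assumes `<emb_token>` maps to a single id in the tokenizer vocabulary; the name `find_emb_position` and the toy ids are hypothetical and not part of the released code.

```python
from typing import Sequence


def find_emb_position(token_ids: Sequence[int], emb_token_id: int) -> int:
    """Return the index of the last occurrence of emb_token_id.

    Raises ValueError if the token is absent, for example when generation
    stopped before the model emitted it.
    """
    for pos in range(len(token_ids) - 1, -1, -1):
        if token_ids[pos] == emb_token_id:
            return pos
    raise ValueError("emb token id not found in sequence")


# Toy example with made-up ids: 7 stands in for <emb_token>.
# Layout: prompt tokens, reasoning tokens, <emb_token>, end-of-turn token.
sequence = [11, 12, 13, 21, 22, 7, 2]
position = find_emb_position(sequence, emb_token_id=7)
print(position)  # 5
```

With the real model, `emb_token_id` would presumably come from `tokenizer.convert_tokens_to_ids("<emb_token>")` and `token_ids` from `model.generate(**inputs)[0].tolist()`. The embedding would then be `hidden_states[-1][0, position, :]` from one forward pass over the full generated sequence with `output_hidden_states=True`.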

## Training

MRE-T1 is trained in two stages, starting from Qwen3-4B-Instruct-2507 (the `base_model` declared in the card metadata):
1. **Stage 1**: Supervised fine-tuning with task-specific reasoning prompts
2. **Stage 2**: Reinforcement learning to optimize retrieval quality

Training data is curated from diverse reasoning-intensive domains including mathematics, science, programming, and social sciences.

## Evaluation

MRE-T1 is evaluated on [BRIGHT](https://brightbenchmark.github.io/), a benchmark designed to test retrieval models on tasks that require complex reasoning. All scores reported in this card are nDCG@10.
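
For readers unfamiliar with the metric, here is a minimal sketch of nDCG@10 for a single query. It assumes binary relevance labels, the common `1 / log2(rank + 1)` discount, and a ranking without duplicate documents; the official evaluation script may differ in details. The function name `ndcg_at_k` and the document ids are illustrative only.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranked_doc_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """nDCG@k for one query with binary relevance (1 if relevant, else 0)."""
    # Discounted gain of the relevant documents found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_doc_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    # Ideal DCG: all relevant documents ranked first, capped at k.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


ranking = ["d3", "d1", "d7", "d2"]  # system output, best first
gold = {"d1", "d2"}                 # relevant documents for the query
print(round(100 * ndcg_at_k(ranking, gold), 1))
```

The per-task numbers in the tables are on a 0 to 100 scale, which corresponds to averaging this per-query value over all queries in a task and multiplying by 100.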

## Citation

If you use MRE-T1 in your research, please cite:

```bibtex
@misc{mre-t1-2026,
  title={MRE-T1: Reasoning-Enhanced Retrieval Model},
  author={{Forward AI Labs}},
  year={2026},
  url={https://huggingface.co/ForwardAILabs/MRE-T1}
}
```

## License

Apache 2.0
