---
language:
- en
license: apache-2.0
tags:
- retrieval
- reasoning
- embedding
- BRIGHT
- information-retrieval
library_name: transformers
pipeline_tag: feature-extraction
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- xlangai/BRIGHT
model-index:
- name: MRE-T1
  results:
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Short)
    metrics:
    - type: ndcg_at_10
      value: 39.6
      name: nDCG@10
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Long)
    metrics:
    - type: ndcg_at_10
      value: 35.1
      name: nDCG@10
---

<div align="center">
<h3>Built by <a href="https://huggingface.co/ForwardAILabs">Forward AI Labs</a></h3>
<p>We are an AI company that provides recruitment agents. | <a href="https://www.mira.day/">mira.day</a></p>
</div>

---

# MRE-T1: Mira Reasoning Embedding, Thought v1

**MRE-T1** (Mira Reasoning Embedding, Thought v1) is the first generation of our reasoning-intensive retrieval model series. The "Thought" in T1 reflects the model's core capability: it thinks before it retrieves, generating an explicit reasoning chain to work out the query's intent before producing an embedding.

MRE-T1 achieves state-of-the-art single-model performance on the [BRIGHT benchmark](https://brightbenchmark.github.io/), which evaluates retrieval models on tasks that require complex reasoning: an average nDCG@10 of 39.6 on short documents and 35.1 on long documents.

## Highlights

- **BRIGHT Short nDCG@10: 39.6**, the best single-model result on the short document retrieval leaderboard
- **BRIGHT Long nDCG@10: 35.1**, the best single-model result on the long document retrieval leaderboard
- **Efficient**: built on the Qwen3-4B architecture (~4B parameters), about half the size of the 7B-8B models it is compared against
- **Reasoning-aware**: uses task-specific reasoning prompts with a special `<emb_token>` for embedding extraction

## Model Details

| Property | Value |
|----------|-------|
| Architecture | Qwen3ForCausalLM |
| Parameters | ~4B |
| Hidden Size | 2560 |
| Layers | 36 |
| Attention Heads | 32 (KV heads: 8) |
| Max Position | 262,144 |
| Precision | bfloat16 |
| Vocabulary | 151,670 |
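
Because the embedding is read from the model's last hidden state, its dimensionality equals the hidden size listed above (2560). The back-of-the-envelope sketch below estimates raw vector storage from that figure. It assumes vectors are stored unchanged in bfloat16 (2 bytes per value), and the one-million-document corpus is a hypothetical example rather than a property of the model.

```python
HIDDEN_SIZE = 2560       # from the Model Details table
BYTES_PER_VALUE = 2      # assumption: vectors are kept in bfloat16


def embedding_bytes(num_vectors: int) -> int:
    """Raw storage for num_vectors embeddings, excluding any index overhead."""
    return num_vectors * HIDDEN_SIZE * BYTES_PER_VALUE


per_vector = embedding_bytes(1)
corpus = embedding_bytes(1_000_000)  # hypothetical corpus of one million documents
print(f"{per_vector} bytes per vector, {corpus / 1e9:.2f} GB per million vectors")
```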

## BRIGHT Benchmark Results

### Short Document Retrieval (nDCG@10)

| Task | MRE-T1 |
|------|--------|
| Biology | 55.3 |
| Earth Science | 56.5 |
| Economics | 32.9 |
| Psychology | 48.2 |
| Robotics | 33.1 |
| StackOverflow | 34.2 |
| Sustainable Living | 37.3 |
| LeetCode | 35.0 |
| Pony | 35.5 |
| AOPS | 16.7 |
| TheoremQA (Questions) | 43.3 |
| TheoremQA (Theorems) | 46.9 |
| **Average** | **39.6** |

### Long Document Retrieval (nDCG@10)

| Task | MRE-T1 |
|------|--------|
| Biology | 46.5 |
| Earth Science | 46.0 |
| Economics | 34.5 |
| Psychology | 52.7 |
| Robotics | 27.7 |
| StackOverflow | 22.2 |
| Sustainable Living | 45.2 |
| Pony | 6.3 |
| **Average** | **35.1** |
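
The Average rows are consistent with an unweighted mean over tasks (12 short, 8 long). A quick check, with the per-task scores copied from the two tables above; `macro_average` is just an illustrative helper name:

```python
short_scores = {
    "Biology": 55.3, "Earth Science": 56.5, "Economics": 32.9, "Psychology": 48.2,
    "Robotics": 33.1, "StackOverflow": 34.2, "Sustainable Living": 37.3,
    "LeetCode": 35.0, "Pony": 35.5, "AOPS": 16.7,
    "TheoremQA (Questions)": 43.3, "TheoremQA (Theorems)": 46.9,
}
long_scores = {
    "Biology": 46.5, "Earth Science": 46.0, "Economics": 34.5, "Psychology": 52.7,
    "Robotics": 27.7, "StackOverflow": 22.2, "Sustainable Living": 45.2, "Pony": 6.3,
}


def macro_average(scores: dict) -> float:
    """Unweighted mean: every task counts equally, regardless of its size."""
    return sum(scores.values()) / len(scores)


print(round(macro_average(short_scores), 1))  # 39.6
print(round(macro_average(long_scores), 1))   # 35.1
```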

### Comparison with Other Models (Short, Single Model Only)

| Model | Size | BRIGHT Short nDCG@10 |
|-------|------|---------------------|
| **MRE-T1** | **~4B** | **39.6** |
| BGE-Reasoner-Embed-0928 | 8B | 38.1 |
| Seed1.5-Embedding | MoE | 27.2 |
| gte-Qwen1.5-7B-instruct | 7B | 22.5 |
| GritLM-7B | 7B | 21.0 |
| instructor-xl | 1.5B | 18.9 |
| SFR-Embedding-Mistral | 7B | 18.3 |
| e5-mistral-7b-instruct | 7B | 17.9 |

### Comparison with Other Models (Long, Single Model Only)

| Model | Size | BRIGHT Long nDCG@10 |
|-------|------|---------------------|
| **MRE-T1** | **~4B** | **35.1** |
| Google-Gecko-Text-Embedding-004 | n/a | 33.2 |
| gte-Qwen1.5-7B-instruct | 7B | 27.8 |
| SFR-Embedding-Mistral | 7B | 26.0 |
| e5-mistral-7b-instruct | 7B | 25.5 |
| voyage-large-2-instruct | n/a | 24.6 |
| Cohere-embed-english-v3.0 | n/a | 18.4 |
| bge-large-en-v1.5 | 335M | 14.8 |

## Usage

MRE-T1 uses task-specific system prompts for reasoning-enhanced retrieval. Each query is paired with a domain-specific instruction as the system prompt. The model then generates a reasoning chain that ends with a special `<emb_token>`, and the hidden state at that token is used as the query embedding.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ForwardAILabs/MRE-T1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

# Task-specific system prompts
TASK_PROMPTS = {
    "biology": "Given a Biology post, extract and briefly describe the core underlying principle or mechanism of this biology question. You MUST end every response with <emb_token>.",
    "earth_science": "Given an Earth Science post, identify the type of Earth science question and briefly describe the core principle for solving it. You MUST end every response with <emb_token>.",
    "economics": "Given an Economics post, analyze the user's core blind spot and the applicable economic analysis framework. You MUST end every response with <emb_token>.",
    "psychology": "Given a Psychology post, extract the user's blind spot and key psychological concepts. You MUST end every response with <emb_token>.",
    "robotics": "Given a Robotics post, diagnose the core issue within the robotics environment and error logs, and point out the applicable technical principles. You MUST end every response with <emb_token>.",
    "stackoverflow": "Given a Stack Overflow post, extract the core underlying technical principle for solving the code error. You MUST end every response with <emb_token>.",
    "sustainable_living": "Given a Sustainable Living post, identify the key scientific concepts and background knowledge required for a closed-loop solution to the life phenomenon or practice. You MUST end every response with <emb_token>.",
    "leetcode": "Given a Coding problem, extract the core algorithm principle (or data structure) and general problem-solving approach. You MUST end every response with <emb_token>.",
    "pony": "Given a Pony question, locate the core knowledge points from the Pony language official documentation needed to solve the code completion problem. You MUST end every response with <emb_token>.",
    "aops": "Given a Math problem, analyze the problem type characteristics and core examination principles of this math competition problem. You MUST end every response with <emb_token>.",
    "theoremqa_questions": "Given a Math problem, analyze the problem type characteristics and core examination principles of this math competition problem. You MUST end every response with <emb_token>.",
    "theoremqa_theorems": "Given a Math problem, distill the core mathematical principles and problem-solving techniques required for the real-world scenario. You MUST end every response with <emb_token>.",
}

# Example: embed one query with its task-specific system prompt
task = "stackoverflow"
query = "How to fix a segmentation fault when using shared_ptr in a multithreaded C++ application?"

messages = [
    {"role": "system", "content": TASK_PROMPTS[task]},
    {"role": "user", "content": query},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Last-layer hidden state at the final input position. This single forward
# pass does not generate the reasoning chain, so the final position is the end
# of the chat prompt rather than a generated <emb_token>.
embedding = outputs.hidden_states[-1][0, -1, :]
print(f"Embedding shape: {embedding.shape}")
```
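
The snippet above runs a single forward pass over the prompt. A closer match to the flow described in this section (reason first, then read the embedding at `<emb_token>`) might look like the sketch below. It is not the authors' reference implementation: it assumes `<emb_token>` is a single token in the tokenizer vocabulary, uses greedy decoding, and picks `max_new_tokens=256` arbitrarily. `last_index_of` and `embed_with_reasoning` are hypothetical helper names.

```python
from typing import Sequence


def last_index_of(token_ids: Sequence[int], target_id: int) -> int:
    """Return the index of the last occurrence of target_id, or -1 if absent."""
    for i in range(len(token_ids) - 1, -1, -1):
        if token_ids[i] == target_id:
            return i
    return -1


def embed_with_reasoning(model, tokenizer, system_prompt, query, max_new_tokens=256):
    """Generate a reasoning chain, then return the hidden state at <emb_token>."""
    import torch  # imported lazily so last_index_of has no dependencies

    # Assumption: <emb_token> maps to exactly one vocabulary entry.
    emb_id = tokenizer.convert_tokens_to_ids("<emb_token>")
    if emb_id is None:
        raise ValueError("<emb_token> is not in the tokenizer vocabulary")

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": query},
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]

    with torch.no_grad():
        # Step 1: let the model write its reasoning chain (prompt + new tokens).
        sequence = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        # Step 2: re-run the full sequence to get hidden states at every position.
        outputs = model(input_ids=sequence, output_hidden_states=True)

    # Search only the generated part: the system prompt itself mentions <emb_token>.
    generated = sequence[0, prompt_len:].tolist()
    offset = last_index_of(generated, emb_id)
    if offset < 0:
        raise ValueError("no <emb_token> was generated; try a larger max_new_tokens")
    return outputs.hidden_states[-1][0, prompt_len + offset, :]
```

With `model`, `tokenizer`, `TASK_PROMPTS`, `task`, and `query` defined as in the snippet above, a call would be `embedding = embed_with_reasoning(model, tokenizer, TASK_PROMPTS[task], query)`.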

## Training

MRE-T1 is trained in two stages, starting from Qwen3-4B (`Qwen/Qwen3-4B-Instruct-2507`):

1. **Stage 1**: Supervised fine-tuning with task-specific reasoning prompts
2. **Stage 2**: Reinforcement learning to optimize retrieval quality

Training data is curated from diverse reasoning-intensive domains, including mathematics, science, programming, and social sciences.

## Evaluation

MRE-T1 is evaluated on [BRIGHT](https://brightbenchmark.github.io/), a benchmark designed to test retrieval models on tasks that require complex reasoning. All scores reported here are nDCG@10.
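
For reference, nDCG@10 for a single query can be computed as in the sketch below. It uses the linear-gain form of DCG with a log2 rank discount, which is an assumption about the exact variant behind the reported numbers. The document IDs and relevance labels are made up for illustration. The tables in this card report the score multiplied by 100.

```python
import math
from typing import Mapping, Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k gains (rank 1 has discount log2(2) = 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids: Sequence[str], relevance: Mapping[str, float], k: int = 10) -> float:
    """DCG of the system ranking divided by the DCG of an ideal ranking."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_doc_ids]
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical query: two relevant documents, retrieved at ranks 2 and 4.
ranking = ["d3", "d1", "d7", "d2"]
qrels = {"d1": 1, "d2": 1}
print(round(100 * ndcg_at_k(ranking, qrels), 1))  # 65.1
```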

## Citation

If you use MRE-T1 in your research, please cite:

```bibtex
@misc{mre-t1-2026,
  title={MRE-T1: Reasoning-Enhanced Retrieval Model},
  author={Forward AI},
  year={2026},
  url={https://huggingface.co/ForwardAILabs/MRE-T1}
}
```

## License

Apache 2.0