---
language:
- en
license: apache-2.0
tags:
- retrieval
- reasoning
- embedding
- BRIGHT
- information-retrieval
library_name: transformers
pipeline_tag: feature-extraction
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- xlangai/BRIGHT
model-index:
- name: MRE-T1
  results:
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Short)
    metrics:
    - type: ndcg_at_10
      value: 39.6
      name: nDCG@10
  - task:
      type: Retrieval
    dataset:
      type: xlangai/BRIGHT
      name: BRIGHT (Long)
    metrics:
    - type: ndcg_at_10
      value: 35.1
      name: nDCG@10
---

Built by Forward AI Labs

We are an AI company that provides recruitment agents. | mira.day

---

# MRE-T1: Mira Reasoning Embedding, Thought v1

**MRE-T1** (Mira Reasoning Embedding, Thought v1) is the first generation of our reasoning-intensive retrieval model series. The "Thought" in T1 reflects the model's core capability: it thinks before it retrieves, generating an explicit reasoning chain to understand query intent before producing an embedding.

MRE-T1 achieves state-of-the-art single-model performance on the [BRIGHT benchmark](https://brightbenchmark.github.io/), which evaluates retrieval models on tasks that require complex reasoning.

## Highlights

- **BRIGHT Short nDCG@10: 39.6**, the best single-model result on the short-document retrieval leaderboard
- **BRIGHT Long nDCG@10: 35.1**, the best single-model result on the long-document retrieval leaderboard
- **Efficient**: based on the Qwen3-4B architecture, smaller than the 7B-8B models in the comparison tables below
- **Reasoning-aware**: uses task-specific reasoning prompts with a special token for embedding extraction

## Model Details

| Property | Value |
|----------|-------|
| Architecture | Qwen3ForCausalLM |
| Parameters | ~4B |
| Hidden Size | 2560 |
| Layers | 36 |
| Attention Heads | 32 (KV heads: 8) |
| Max Position | 262,144 |
| Precision | bfloat16 |
| Vocabulary | 151,670 |

## BRIGHT Benchmark Results

### Short Document Retrieval (nDCG@10)

| Task | MRE-T1 |
|------|--------|
| Biology | 55.3 |
| Earth Science | 56.5 |
| Economics | 32.9 |
| Psychology | 48.2 |
| Robotics | 33.1 |
| StackOverflow | 34.2 |
| Sustainable Living | 37.3 |
| LeetCode | 35.0 |
| Pony | 35.5 |
| AOPS | 16.7 |
| TheoremQA (Questions) | 43.3 |
| TheoremQA (Theorems) | 46.9 |
| **Average** | **39.6** |

### Long Document Retrieval (nDCG@10)

| Task | MRE-T1 |
|------|--------|
| Biology | 46.5 |
| Earth Science | 46.0 |
| Economics | 34.5 |
| Psychology | 52.7 |
| Robotics | 27.7 |
| StackOverflow | 22.2 |
| Sustainable Living | 45.2 |
| Pony | 6.3 |
| **Average** | **35.1** |

### Comparison with Other Models (Short, Single Model Only)

| Model | Size | BRIGHT Short nDCG@10 |
|-------|------|---------------------|
| **MRE-T1** | **~4B** | **39.6** |
| BGE-Reasoner-Embed-0928 | 8B | 38.1 |
| Seed1.5-Embedding | MoE | 27.2 |
| gte-Qwen1.5-7B-instruct | 7B | 22.5 |
| GritLM-7B | 7B | 21.0 |
| instructor-xl | 1.5B | 18.9 |
| SFR-Embedding-Mistral | 7B | 18.3 |
| e5-mistral-7b-instruct | 7B | 17.9 |

### Comparison with Other Models (Long, Single Model Only)

| Model | Size | BRIGHT Long nDCG@10 |
|-------|------|---------------------|
| **MRE-T1** | **~4B** | **35.1** |
| Google-Gecko-Text-Embedding-004 | - | 33.2 |
| gte-Qwen1.5-7B-instruct | 7B | 27.8 |
| SFR-Embedding-Mistral | 7B | 26.0 |
| e5-mistral-7b-instruct | 7B | 25.5 |
| voyage-large-2-instruct | - | 24.6 |
| Cohere-embed-english-v3.0 | - | 18.4 |
| bge-large-en-v1.5 | 335M | 14.8 |

## Usage

MRE-T1 uses task-specific system prompts for reasoning-enhanced retrieval. Each query is paired with a domain-specific instruction; the model generates a reasoning chain followed by a special token, and the hidden representation at that token is used as the query embedding.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ForwardAILabs/MRE-T1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Task-specific system prompts, keyed by BRIGHT task name
TASK_PROMPTS = {
    "biology": "Given a Biology post, extract and briefly describe the core underlying principle or mechanism of this biology question. You MUST end every response with .",
    "earth_science": "Given an Earth Science post, identify the type of Earth science question and briefly describe the core principle for solving it. You MUST end every response with .",
    "economics": "Given an Economics post, analyze the user's core blind spot and the applicable economic analysis framework. You MUST end every response with .",
    "psychology": "Given a Psychology post, extract the user's blind spot and key psychological concepts. You MUST end every response with .",
    "robotics": "Given a Robotics post, diagnose the core issue within the robotics environment and error logs, and point out the applicable technical principles. You MUST end every response with .",
    "stackoverflow": "Given a Stack Overflow post, extract the core underlying technical principle for solving the code error. You MUST end every response with .",
    "sustainable_living": "Given a Sustainable Living post, identify the key scientific concepts and background knowledge required for a closed-loop solution to the life phenomenon or practice. You MUST end every response with .",
    "leetcode": "Given a Coding problem, extract the core algorithm principle (or data structure) and general problem-solving approach. You MUST end every response with .",
    "pony": "Given a Pony question, locate the core knowledge points from the Pony language official documentation needed to solve the code completion problem. You MUST end every response with .",
    "aops": "Given a Math problem, analyze the problem type characteristics and core examination principles of this math competition problem. You MUST end every response with .",
    "theoremqa_questions": "Given a Math problem, analyze the problem type characteristics and core examination principles of this math competition problem. You MUST end every response with .",
    "theoremqa_theorems": "Given a Math problem, distill the core mathematical principles and problem-solving techniques required for the real-world scenario. You MUST end every response with .",
}

# Example: build the prompt for a reasoning-enhanced query embedding
task = "stackoverflow"
query = "How to fix a segmentation fault when using shared_ptr in a multithreaded C++ application?"

messages = [
    {"role": "system", "content": TASK_PROMPTS[task]},
    {"role": "user", "content": query},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Single forward pass over the prompt (this snippet does not run a generation step)
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Use the final-layer hidden state at the last token position as the embedding
embedding = outputs.hidden_states[-1][0, -1, :]
print(f"Embedding shape: {embedding.shape}")
```

## Training

MRE-T1 is trained in two stages, starting from Qwen3-4B-Instruct-2507:

1. **Stage 1**: Supervised fine-tuning with task-specific reasoning prompts
2. **Stage 2**: Reinforcement learning to optimize retrieval quality

Training data is curated from diverse reasoning-intensive domains, including mathematics, science, programming, and the social sciences.

## Evaluation

MRE-T1 is evaluated on [BRIGHT](https://brightbenchmark.github.io/), a benchmark designed to test retrieval models on tasks that require complex reasoning. All scores reported above are nDCG@10.

## Citation

If you use MRE-T1 in your research, please cite:

```bibtex
@misc{mre-t1-2026,
  title={MRE-T1: Reasoning-Enhanced Retrieval Model},
  author={Forward AI},
  year={2026},
  url={https://huggingface.co/ForwardAILabs/MRE-T1}
}
```

## License

Apache 2.0

---

**Built by [Forward AI Labs](https://huggingface.co/ForwardAILabs)** | [mira.day](https://www.mira.day/)
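
For readers unfamiliar with the metric reported throughout this card, the following is a minimal sketch of how nDCG@10 can be computed for a single query, assuming the standard definition with linear gain and a log2 rank discount. The function names `dcg_at_k` and `ndcg_at_k` and the toy relevance labels are illustrative only; the official BRIGHT evaluation scripts may differ in details such as tie handling.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k relevance labels, in rank order."""
    # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked: Sequence[float], gold: Sequence[float], k: int = 10) -> float:
    """nDCG@k: DCG of the system ranking divided by the DCG of an ideal ranking.

    ranked: relevance labels of the retrieved documents, best-ranked first.
    gold:   relevance labels of every relevant document for the query.
    """
    ideal_dcg = dcg_at_k(sorted(gold, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy query (hypothetical labels): two relevant documents, retrieved at ranks 1 and 3.
score = ndcg_at_k(ranked=[1, 0, 1, 0, 0], gold=[1, 1], k=10)
print(f"nDCG@10 = {100 * score:.1f}")  # prints: nDCG@10 = 92.0
```

The function returns a value in [0, 1]; the tables in this card report the same quantity scaled to 0-100 and averaged over the queries of each task.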