---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: image-text-to-text
library_name: transformers
---
<div align="center">
<h1> MoRE: Mixture-of-Retrieval Experts for Reasoning-Guided Multimodal Knowledge Exploitation </h1>
<h5 align="center">
<a href='https://arxiv.org/abs/2505.22095'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
<a href='https://huggingface.co/hmhm1229/R1-Router'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue'></a>
<a href='https://huggingface.co/hmhm1229/R1-Router-3B'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue'></a>
Chunyi Peng<sup>1,3</sup>,
Zhipeng Xu<sup>1</sup>,
Zhenghao Liu<sup>1</sup>,
Yishan Li<sup>3</sup>,
Yukun Yan<sup>2</sup>,
Yu Gu<sup>1</sup>,
Minghe Yu<sup>1</sup>,
Ge Yu<sup>1</sup>,
Maosong Sun<sup>2</sup>
<sup>1</sup>Northeastern University, <sup>2</sup>Tsinghua University, <sup>3</sup>ModelBest Inc.
<h5 align="center"> If you find this project useful, please give us a star🌟.
</h5>
</div>
## News
- 26.04.28 Our work has been accepted by SIGIR 2026🎉!
- 25.08.22 We uploaded [MoRE-3B](https://huggingface.co/hmhm1229/MoRE-3B).
## Environment
For training, answer generation, and evaluation processes:
```bash
conda create -n router python=3.11
conda activate router
pip install -r requirements_router.txt
```
For retriever and corpus construction processes:
```bash
conda create -n retriever python=3.11
conda activate retriever
pip install -r requirements_retriever.txt
```
## Corpora Construction
For the text corpus, download `enwiki-20241020` from [Hugging Face](https://huggingface.co/datasets/hmhm1229/enwiki-20241020), then preprocess and index it with the following commands:
```bash
7z x enwiki-20241020-pages-articles-multistream.xml.zip.001
conda activate retriever
wikiextractor enwiki-20241020-pages-articles-multistream.xml.bz2 -o wiki_extracted
python wiki_preprocess.py
```
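The repository's `wiki_preprocess.py` is not reproduced here. As a rough illustration of what the extraction step leaves on disk, the sketch below reads WikiExtractor's default plain-text output, where each article is wrapped in a `<doc id="..." url="..." title="...">` ... `</doc>` block. The function names `parse_docs` and `iter_corpus` are hypothetical, and the authors' script may process the files differently.

```python
import re
from pathlib import Path

# Opening tag written by WikiExtractor in its default (non-JSON) output mode.
DOC_OPEN = re.compile(
    r'<doc id="(?P<id>[^"]*)" url="(?P<url>[^"]*)" title="(?P<title>[^"]*)">'
)


def parse_docs(text):
    """Yield one dict (id, url, title, text) per <doc ...> ... </doc> block."""
    current, lines = None, []
    for line in text.splitlines():
        match = DOC_OPEN.match(line)
        if match:
            # Start of a new article: remember its attributes.
            current, lines = match.groupdict(), []
        elif line.startswith("</doc>") and current is not None:
            # End of the article: attach the collected body and emit it.
            current["text"] = "\n".join(lines).strip()
            yield current
            current = None
        elif current is not None:
            lines.append(line)


def iter_corpus(root="wiki_extracted"):
    """Walk the output tree (e.g. wiki_extracted/AA/wiki_00) in sorted order."""
    for path in sorted(Path(root).glob("*/wiki_*")):
        yield from parse_docs(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    sample = (
        '<doc id="12" url="https://en.wikipedia.org/wiki?curid=12" title="Anarchism">\n'
        "Anarchism\n\nAnarchism is a political philosophy.\n"
        "</doc>\n"
    )
    for doc in parse_docs(sample):
        print(doc["id"], doc["title"], len(doc["text"]))
```

Calling `iter_corpus("wiki_extracted")` after the `wikiextractor` command is a quick way to confirm that the extraction produced articles before running the preprocessing script.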
For the image corpus, you can download [M-BEIR](https://huggingface.co/datasets/TIGER-Lab/M-BEIR) directly. To embed and index it, follow the [UniIR repository](https://github.com/TIGER-AI-Lab/UniIR).
For the table corpus, you can download, embed, and index Open-WikiTable by following the [Open-WikiTable repository](https://github.com/sean0042/Open_WikiTable), or directly download the version we have already preprocessed from [here](https://huggingface.co/hmhm1229/table-retriever).
## Retrievers Preparation
For the Text-Image Retriever, you can download [UniIR](https://huggingface.co/TIGER-Lab/UniIR) directly.
For the Table Retriever, you can train it by following the [Open-WikiTable repository](https://github.com/sean0042/Open_WikiTable), or download it directly from [here](https://huggingface.co/hmhm1229/table-retriever).
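The Hugging Face-hosted corpora and retrievers linked above can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed. The helper `fetch_assets`, the `assets` directory, and the one-folder-per-repository layout are illustrative choices, not part of this repository. Some of these repositories can be large, so check available disk space first.

```python
from pathlib import Path

# (repo_id, repo_type) pairs taken from the links in this README.
ASSETS = [
    ("hmhm1229/enwiki-20241020", "dataset"),  # text corpus
    ("TIGER-Lab/M-BEIR", "dataset"),          # image corpus
    ("TIGER-Lab/UniIR", "model"),             # text-image retriever
    ("hmhm1229/table-retriever", "model"),    # table corpus and retriever
]


def fetch_assets(root="assets", assets=ASSETS, download=None):
    """Download each repository into root/<repo name>; return the local paths.

    `download` defaults to huggingface_hub.snapshot_download. Passing another
    callable that accepts the same keyword arguments allows a dry run.
    """
    if download is None:
        # Imported lazily so a dry run does not require the package.
        from huggingface_hub import snapshot_download as download
    paths = {}
    for repo_id, repo_type in assets:
        target = Path(root) / repo_id.split("/")[-1]
        paths[repo_id] = download(
            repo_id=repo_id, repo_type=repo_type, local_dir=str(target)
        )
    return paths


if __name__ == "__main__":
    # Dry run: print what would be downloaded without touching the network.
    def plan(repo_id, repo_type, local_dir):
        print(f"{repo_type:8s} {repo_id} -> {local_dir}")
        return local_dir

    fetch_assets(download=plan)
```

Calling `fetch_assets()` without the `download` argument performs the real downloads.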
## Datasets
All text datasets are provided in `./datasets`. The images must be downloaded separately:
- `InfoSeek`: InfoSeek images can be downloaded from [OVEN](https://github.com/open-vision-language/oven/tree/main/image_downloads).
- `Dyn-VQA`: Dynamic VQA images can be downloaded from [DynVQA_en.202412](https://github.com/Alibaba-NLP/OmniSearch/blob/main/dataset/DynVQA_en/DynVQA_en.202412.jsonl).
- `WebQA`: WebQA images can be downloaded from [Google Drive](https://drive.google.com/drive/folders/19ApkbD5w0I5sV1IeQ9EofJRyAjKnA7tb).
## Training
If you do not want to train the model, you can download [R1-Router](https://huggingface.co/hmhm1229/R1-Router) and skip ahead to [Evaluation](#evaluation).
### Data Synthesis
If you want to use the ready-made synthetic data directly, you can skip ahead to [Step-GRPO Training](#step-grpo-training).
Otherwise, synthesize the data step by step:
```bash
bash src/data_synthesis/data_synthesis.sh
```
### Step-GRPO Training
Our training framework is based on [EasyR1](https://github.com/hiyouga/EasyR1). All you need to do is download it and replace the corresponding files with those in `./Easy-R1`.
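The file replacement can be scripted. The sketch below assumes that `./Easy-R1` mirrors the directory layout of the EasyR1 checkout, so that each file is copied onto the same relative path. That assumption should be verified against both trees before running it. The helper name `overlay` is hypothetical.

```python
import shutil
from pathlib import Path


def overlay(patch_dir, target_dir):
    """Copy every file under patch_dir onto the same relative path in target_dir.

    Existing files are overwritten, missing directories are created, and files
    that exist only in target_dir are left untouched. Returns the relative
    paths that were written, sorted, so the replaced files can be reviewed.
    """
    patch_dir, target_dir = Path(patch_dir), Path(target_dir)
    written = []
    for src in sorted(patch_dir.rglob("*")):
        if src.is_file():
            rel = src.relative_to(patch_dir)
            dst = target_dir / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            written.append(rel.as_posix())
    return written
```

For example, `overlay("./Easy-R1", "path/to/EasyR1")` returns the list of replaced files, where `path/to/EasyR1` stands for wherever EasyR1 was downloaded.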
Then start training with the command:
```bash
conda activate router
bash examples/run_qwen2_5_vl_7b_stepgrpo.sh
```
## Evaluation
We provide an evaluation pipeline for R1-Router:
```bash
bash evaluation.sh
```
Alternatively, you can evaluate the results we have already provided:
```bash
conda activate router
cd src
python evaluate.py --dataset_name all --method "r1-router3"
```
## Acknowledgement
Our work is built on the following codebases, and we are deeply grateful for their contributions.
- [EasyR1](https://github.com/hiyouga/EasyR1)
- [UniIR](https://huggingface.co/TIGER-Lab/UniIR)
- [Open-WikiTable](https://github.com/sean0042/Open_WikiTable)
- [OmniSearch](https://github.com/Alibaba-NLP/OmniSearch)
## Citation
We appreciate your citations if you find our paper relevant and useful to your research!
```
@article{peng2025r1,
title={Learning to Route Queries across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning},
author={Peng, Chunyi and Xu, Zhipeng and Liu, Zhenghao and Li, Yishan and Yan, Yukun and Wang, Shuo and Liu, Zhiyuan and Gu, Yu and Yu, Minghe and Yu, Ge and Sun, Maosong},
  year={2025},
url={https://arxiv.org/abs/2505.22095},
}
```
## Contact Us
If you have questions, suggestions, or bug reports, please email us. We will try our best to help you.
```
hm.cypeng@gmail.com
```