---
base_model:
- meta-llama/Llama-3.1-8B
language:
- en
license: cc-by-sa-4.0
pipeline_tag: feature-extraction
library_name: transformers
tags:
- sentence-transformers
---
## Model Summary

ReasonIR-8B is the first retriever specifically trained for general reasoning tasks, achieving state-of-the-art retrieval performance on BRIGHT (reasoning-intensive retrieval).
When employed for retrieval-augmented generation (RAG), ReasonIR-8B also brings substantial gains on MMLU and GPQA.

- Repository: https://github.com/facebookresearch/ReasonIR
- Paper: https://arxiv.org/abs/2504.20595
## Usage

Make sure to install `transformers>=4.47.0` first!

### Transformers
```python
from transformers import AutoModel

model = AutoModel.from_pretrained("reasonir/ReasonIR-8B", torch_dtype="auto", trust_remote_code=True)
model = model.to("cuda")
model.eval()

query = "The quick brown fox jumps over the lazy dog."
document = "The quick brown fox jumps over the lazy dog."
query_instruction = ""
doc_instruction = ""

query_emb = model.encode(query, instruction=query_instruction)
doc_emb = model.encode(document, instruction=doc_instruction)

# Similarity as the dot product between the two embeddings
sim = query_emb @ doc_emb.T
```
When using `AutoModel`, it is important to:
1. Include `trust_remote_code=True` to make sure our custom bidirectional encoding architecture is used.
2. Use `torch_dtype="auto"` so that `bf16` is activated (by default torch will use `fp32`); a quick way to verify this is shown below.
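As a quick sanity check, you can confirm what `torch_dtype="auto"` resolved to. This is a minimal sketch that reuses the `model` object loaded in the snippet above; the dtype inspection is plain PyTorch, not anything ReasonIR-specific:

```python
import torch

# Inspect the dtype of the loaded weights; with torch_dtype="auto" this
# should resolve to torch.bfloat16 rather than the torch.float32 default.
param_dtype = next(model.parameters()).dtype
print(param_dtype)
assert param_dtype == torch.bfloat16
```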
### Sentence Transformers

In addition to Transformers, you can also use this model with Sentence Transformers:
```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model_kwargs = {"torch_dtype": "auto"}
model = SentenceTransformer("reasonir/ReasonIR-8B", trust_remote_code=True, model_kwargs=model_kwargs)

query = "The quick brown fox jumps over the lazy dog."
document = "The quick brown fox jumps over the lazy dog."
query_instruction = ""
doc_instruction = ""

query_emb = model.encode(query, instruction=query_instruction)
doc_emb = model.encode(document, instruction=doc_instruction)

# similarity() applies the model's similarity function (cosine by default)
sim = model.similarity(query_emb, doc_emb)
```
It is important to also include `trust_remote_code=True` and `torch_dtype="auto"` as discussed earlier.
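The same API also covers the more realistic case of ranking several candidate documents for one query. Below is a minimal sketch under the Sentence Transformers setup above; the query and document texts are invented for illustration, and `instruction` is the keyword used in the snippets on this page:

```python
query = "What does retrieval-augmented generation do?"
documents = [
    "The quick brown fox jumps over the lazy dog.",
    "Retrieval-augmented generation grounds a language model in retrieved text.",
    "Paris is the capital of France.",
]

query_emb = model.encode(query, instruction="")
doc_embs = model.encode(documents, instruction="")

# similarity() returns a (num_queries, num_documents) tensor; take row 0
# since we encoded a single query.
scores = model.similarity(query_emb, doc_embs)[0]

# Print documents from most to least similar.
for score, doc in sorted(zip(scores.tolist(), documents), reverse=True):
    print(f"{score:.3f}  {doc}")
```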
> [!NOTE]
> There may be very slight floating point discrepancies when using the model via Sentence Transformers, caused by how the models are cast to the `bfloat16` dtype, though they should not affect results in general.
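If you want to quantify this gap yourself, here is a minimal sketch: it loads the 8B checkpoint through both code paths (which needs substantial memory), and the tensor conversions assume `encode` returns an array-like, consistent with the snippets above:

```python
import torch
from transformers import AutoModel
from sentence_transformers import SentenceTransformer

text = "The quick brown fox jumps over the lazy dog."

# Load the same checkpoint through both code paths.
hf_model = AutoModel.from_pretrained(
    "reasonir/ReasonIR-8B", torch_dtype="auto", trust_remote_code=True
).to("cuda").eval()
st_model = SentenceTransformer(
    "reasonir/ReasonIR-8B",
    trust_remote_code=True,
    model_kwargs={"torch_dtype": "auto"},
)

# Move both embeddings to CPU float32 before comparing.
hf_emb = torch.as_tensor(hf_model.encode(text, instruction="")).float().cpu().flatten()
st_emb = torch.as_tensor(st_model.encode(text, instruction="")).float().cpu().flatten()

# Cosine similarity should be ~1.0; any gap is the bf16 casting effect noted above.
print(torch.nn.functional.cosine_similarity(hf_emb, st_emb, dim=0))
```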
## Citation

```bibtex
@article{shao2025reasonir,
  title={ReasonIR: Training Retrievers for Reasoning Tasks},
  author={Rulin Shao and Rui Qiao and Varsha Kishore and Niklas Muennighoff and Xi Victoria Lin and Daniela Rus and Bryan Kian Hsiang Low and Sewon Min and Wen-tau Yih and Pang Wei Koh and Luke Zettlemoyer},
  year={2025},
  journal={arXiv preprint arXiv:2504.20595},
  url={https://arxiv.org/abs/2504.20595},
}
```