---
language:
- en
base_model:
- google-t5/t5-large
pipeline_tag: text-classification
tags:
- gen-ir
- information-retrieval
- ir
---
This repository contains one of the models analyzed in our paper [Reverse-Engineering the Retrieval Process in GenIR Models](https://dl.acm.org/doi/abs/10.1145/3726302.3730076).

### Training
The model is based on T5-large and was trained on the TriviaQA dataset as an atomic GenIR model, reproducing [DSI](https://arxiv.org/abs/2202.06991).

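In the atomic setting described in the DSI paper, every document is assigned its own token, so retrieval amounts to a single decoding step followed by ranking those document tokens. The following is a minimal sketch of that ranking step only, not the authors' code. It assumes the document tokens occupy the tail of the output vocabulary; `rank_atomic_docids`, `first_docid_index`, and the toy logits are hypothetical names and values chosen for illustration, and the actual token layout of this checkpoint should be checked before relying on it.

```python
from typing import List, Sequence, Tuple


def rank_atomic_docids(
    logits: Sequence[float],
    first_docid_index: int,
    k: int = 10,
) -> List[Tuple[int, float]]:
    """Return the top-k (docid, score) pairs from one decoding step.

    Assumption (hypothetical layout): vocabulary positions at or after
    first_docid_index are document tokens, and
    docid = position - first_docid_index.
    """
    if not 0 <= first_docid_index < len(logits):
        raise ValueError("first_docid_index must point inside the logits vector")
    # Keep only the document-token part of the vocabulary.
    doc_scores = logits[first_docid_index:]
    # Sort document offsets by descending score; ties keep the lower docid first.
    order = sorted(range(len(doc_scores)), key=lambda i: (-doc_scores[i], i))
    return [(i, doc_scores[i]) for i in order[:k]]


# Toy example: 4 ordinary tokens followed by 3 document tokens.
toy_logits = [0.1, 2.5, -1.0, 0.3, 1.2, 3.4, 0.7]
top2 = rank_atomic_docids(toy_logits, first_docid_index=4, k=2)
print(top2)  # [(1, 3.4), (0, 1.2)]
```

With a real checkpoint, `logits` would be the model's output scores at the first decoder position for a given query; ordinary text tokens (such as the 2.5 entry above) are ignored because only document tokens are valid retrieval results.
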
### Model Overview
| Model       | Hugging Face URL                                                            |
| ----------- | --------------------------------------------------------------------------- |
| NQ10k       | [DSI-large-NQ10k](https://huggingface.co/AnReu/DSI-large-NQ10k)             |
| NQ100k      | [DSI-large-NQ100k](https://huggingface.co/AnReu/DSI-large-NQ100k)           |
| NQ320k      | [DSI-large-NQ320k](https://huggingface.co/AnReu/DSI-large-NQ320k)           |
| TriviaQA    | [DSI-large-TriviaQA](https://huggingface.co/AnReu/DSI-large-TriviaQA)       |
| TriviaQA QG | [DSI-large-TriviaQA-QG](https://huggingface.co/AnReu/DSI-large-TriviaQA-QG) |

### Citation
```
@inproceedings{Reusch2025Reverse,
author = {Reusch, Anja and Belinkov, Yonatan},
title = {Reverse-Engineering the Retrieval Process in GenIR Models},
year = {2025},
isbn = {9798400715921},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3726302.3730076},
doi = {10.1145/3726302.3730076},
booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages = {668–677},
numpages = {10},
location = {Padua, Italy},
series = {SIGIR '25}
}
```