# DDRO: Generative Document Retrieval
This collection contains four generative retrieval models trained using Direct Document Relevance Optimization (DDRO), a lightweight alternative to reinforcement learning for aligning docid generation with document-level relevance through pairwise ranking.
The models are trained on two benchmark datasets (MS MARCO (MS300K) and Natural Questions (NQ320K)) with two types of document identifiers:
- PQ (Product Quantization): captures deep semantic features for complex queries.
- TU (Title + URL): leverages surface-level lexical signals for keyword-driven retrieval.
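As a rough illustration of the TU idea only (this is not the repository's actual encoding), a title+URL identifier can be built by concatenating a document's title with the lexical tokens of its URL path:

```python
# Hypothetical sketch of a Title+URL (TU) docid. The DDRO repo's real
# encoding may differ; this only illustrates a surface-level lexical id.
from urllib.parse import urlparse

def make_tu_docid(title: str, url: str, max_len: int = 100) -> str:
    """Combine a document's title and URL path into one lexical docid."""
    path = urlparse(url).path.strip("/").replace("/", " ").replace("-", " ")
    docid = f"{title.lower()} {path.lower()}".strip()
    return docid[:max_len]  # cap length, mirroring a max docid length

print(make_tu_docid("Product Quantization",
                    "https://example.com/docs/product-quantization"))
# -> product quantization docs product quantization
```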
## Models

| Dataset | Docid Type | Model Name | MRR@10 | R@10 |
|---|---|---|---|---|
| MS MARCO (MS300K) | PQ | `ddro-msmarco-pq` | 45.76 | 73.02 |
| MS MARCO (MS300K) | TU | `ddro-msmarco-tu` | 50.07 | 74.01 |
| Natural Questions (NQ320K) | PQ | `ddro-nq-pq` | 55.51 | 67.31 |
| Natural Questions (NQ320K) | TU | `ddro-nq-tu` | 45.99 | 55.98 |
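The checkpoints are standard `transformers` T5 models. A minimal sketch of loading one and beam-searching candidate docids (assuming the Hub checkpoint ships its tokenizer; the model id is the TU model from the table above):

```python
# Minimal sketch: load a DDRO checkpoint and generate candidate docids.
# Assumption: the Hub repo includes its tokenizer files.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "kiyam/ddro-msmarco-tu"
model = T5ForConditionalGeneration.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("what is product quantization", return_tensors="pt")
# Beam settings mirror the evaluation script below (num_beams 15, top 10 kept)
beams = model.generate(**inputs, num_beams=15, num_return_sequences=10,
                       max_length=100)
candidates = [tokenizer.decode(b, skip_special_tokens=True) for b in beams]
```

For ranked retrieval, the decoded `candidates` are treated as the top-k docid predictions for the query.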
## Quick Evaluation
To evaluate this model on your own data:
### 1. Set Up the Evaluation Environment

```bash
git clone https://github.com/kidist-amde/ddro.git
cd ddro
# Install dependencies (see repository for requirements)
```
### 2. Prepare Your Evaluation Data

**Option A:** Generate evaluation data from your own dataset:

```bash
python src/data/data_prep/build_t5_data/gen_eval_data_pipline.py --encoding "url_title"
```

Or use the batch script:

```bash
sbatch src/scripts/preprocess/generate_eval_data.sh
```

**Option B:** Use pre-generated encoded document IDs:

Download them from Hugging Face Datasets, which contains encoded docids for both the MS MARCO and NQ datasets in both `pq` and `url_title` formats.
### 3. Run Evaluation

```bash
# For SLURM clusters:
sbatch src/pretrain/hf_eval/slurm_submit_hf_eval.sh

# Or run directly:
python src/pretrain/hf_eval/eval_hf_docid_ranking.py \
    --per_gpu_batch_size 4 \
    --log_path logs/evaluation.log \
    --pretrain_model_path kiyam/ddro-msmarco-tu \
    --docid_path resources/datasets/processed/msmarco-data/encoded_docid/url_title_docid.txt \
    --test_file_path resources/datasets/processed/msmarco-data/eval_data_top_300k/query_dev.url_title.jsonl \
    --dataset_script_dir src/data/data_scripts \
    --num_beams 15 \
    --add_doc_num 6144 \
    --max_seq_length 64 \
    --max_docid_length 100 \
    --use_docid_rank True \
    --docid_format msmarco \
    --lookup_fallback True
```
### 4. Key Parameters

- `--encoding`: Use `"url_title"` for this model (or `"pq"` for PQ models)
- `--docid_format`: Use `"msmarco"` for MS MARCO models, `"nq"` for Natural Questions models
- `--pretrain_model_path`: Replace with the specific model you want to evaluate
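For reference, the MRR@10 and R@10 numbers reported in the table above can be computed from per-query ranked docid lists as follows (the helper names are illustrative, not the repository's code):

```python
# Illustrative metric helpers: MRR@k and Recall@k over a ranked docid list.
def mrr_at_k(ranked: list, relevant: set, k: int = 10) -> float:
    """Reciprocal rank of the first relevant docid within the top k."""
    for rank, docid in enumerate(ranked[:k], start=1):
        if docid in relevant:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked: list, relevant: set, k: int = 10) -> float:
    """Fraction of relevant docids retrieved within the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for docid in ranked[:k] if docid in relevant)
    return hits / len(relevant)

print(mrr_at_k(["d3", "d7", "d1"], {"d7"}))     # -> 0.5
print(recall_at_k(["d3", "d7", "d1"], {"d7", "d9"}))  # -> 0.5
```

Both metrics are averaged over all evaluation queries to produce the table values.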
**Full setup and evaluation instructions:** [GitHub Repository](https://github.com/kidist-amde/ddro)
## Model Architecture
- Base: T5-base
- Training: Supervised Fine-tuning (SFT) + Pairwise Ranking (Direct L2R)
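The pairwise ranking step can be sketched as a DPO-style loss over the log-likelihoods the model assigns to a relevant versus a non-relevant docid. This is a sketch under assumptions: the function signature, the reference-model terms, and `beta` are illustrative, not the paper's exact formulation.

```python
# Sketch of a pairwise (DPO-style) ranking objective over docid
# sequence log-likelihoods. All names and the beta value are
# illustrative assumptions, not the DDRO repo's actual API.
import math

def pairwise_ranking_loss(logp_pos: float, logp_neg: float,
                          ref_logp_pos: float, ref_logp_neg: float,
                          beta: float = 0.5) -> float:
    """-log sigmoid(beta * [(logp_pos - ref_logp_pos) - (logp_neg - ref_logp_neg)])"""
    margin = beta * ((logp_pos - ref_logp_pos) - (logp_neg - ref_logp_neg))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Raising the relevant docid's likelihood relative to the reference model
# widens the margin and drives the loss toward zero; no reward model or
# RL rollout is involved, only this direct pairwise objective.
```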
## Citation

If you use these models, please cite:

```bibtex
@inproceedings{mekonnen2025lightweight,
  title     = {Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval},
  author    = {Mekonnen, Kidist Amde and Tang, Yubao and de Rijke, Maarten},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages     = {1327--1338},
  year      = {2025}
}
```
## Highlights
- No reinforcement learning or reward modeling
- Lightweight and efficient optimization
- Public checkpoints for reproducibility