# DDRO: Generative Document Retrieval
This collection contains four generative retrieval models trained using Direct Document Relevance Optimization (DDRO), a lightweight alternative to reinforcement learning for aligning docid generation with document-level relevance through pairwise ranking.
The models are trained on two benchmark datasets (MS MARCO (MS300K) and Natural Questions (NQ320K)) with two types of document identifiers:
- PQ (Product Quantization): captures deep semantic features for complex queries.
- TU (Title + URL): leverages surface-level lexical signals for keyword-driven retrieval.
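As a rough illustration of the TU idea only (this is not the repository's actual encoding), a title+URL identifier can be built by concatenating a document's title with the lexical tokens of its URL path:

```python
# Hypothetical sketch of a Title+URL (TU) docid. The DDRO repo's real
# encoding may differ; this only illustrates a surface-level lexical id.
from urllib.parse import urlparse

def make_tu_docid(title: str, url: str, max_len: int = 100) -> str:
    """Combine a document's title and URL path into one lexical docid."""
    path = urlparse(url).path.strip("/").replace("/", " ").replace("-", " ")
    docid = f"{title.lower()} {path.lower()}".strip()
    return docid[:max_len]  # cap length, mirroring a max docid length

print(make_tu_docid("Product Quantization",
                    "https://example.com/docs/product-quantization"))
# -> product quantization docs product quantization
```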
## Models

| Dataset | Docid Type | Model Name | MRR@10 | R@10 |
|---|---|---|---|---|
| MS MARCO (MS300K) | PQ | `ddro-msmarco-pq` | 45.76 | 73.02 |
| MS MARCO (MS300K) | TU | `ddro-msmarco-tu` | 50.07 | 74.01 |
| Natural Questions (NQ320K) | PQ | `ddro-nq-pq` | 55.51 | 67.31 |
| Natural Questions (NQ320K) | TU | `ddro-nq-tu` | 45.99 | 55.98 |
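The checkpoints are standard `transformers` T5 models. A minimal sketch of loading one and beam-searching candidate docids (assuming the Hub checkpoint ships its tokenizer; the model id is the TU model from the table above):

```python
# Minimal sketch: load a DDRO checkpoint and generate candidate docids.
# Assumption: the Hub repo includes its tokenizer files.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "kiyam/ddro-msmarco-tu"
model = T5ForConditionalGeneration.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("what is product quantization", return_tensors="pt")
# Beam settings mirror the evaluation script below (num_beams 15, top 10 kept)
beams = model.generate(**inputs, num_beams=15, num_return_sequences=10,
                       max_length=100)
candidates = [tokenizer.decode(b, skip_special_tokens=True) for b in beams]
```

For ranked retrieval, the decoded `candidates` are treated as the top-k docid predictions for the query.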
## Quick Evaluation
To evaluate this model on your own data:
### 1. Set Up the Evaluation Environment

```bash
git clone https://github.com/kidist-amde/ddro.git
cd ddro
# Install dependencies (see repository for requirements)
```
### 2. Prepare Your Evaluation Data

**Option A:** Generate evaluation data from your own dataset:

```bash
python src/data/data_prep/build_t5_data/gen_eval_data_pipline.py --encoding "url_title"
```

Or use the batch script:

```bash
sbatch src/scripts/preprocess/generate_eval_data.sh
```

**Option B:** Use pre-generated encoded document IDs:

Download them from Hugging Face Datasets, which contains encoded docids for both the MS MARCO and NQ datasets in both `pq` and `url_title` formats.
### 3. Run Evaluation

```bash
# For SLURM clusters:
sbatch src/pretrain/hf_eval/slurm_submit_hf_eval.sh

# Or run directly:
python src/pretrain/hf_eval/eval_hf_docid_ranking.py \
    --per_gpu_batch_size 4 \
    --log_path logs/evaluation.log \
    --pretrain_model_path kiyam/ddro-msmarco-tu \
    --docid_path resources/datasets/processed/msmarco-data/encoded_docid/url_title_docid.txt \
    --test_file_path resources/datasets/processed/msmarco-data/eval_data_top_300k/query_dev.url_title.jsonl \
    --dataset_script_dir src/data/data_scripts \
    --num_beams 15 \
    --add_doc_num 6144 \
    --max_seq_length 64 \
    --max_docid_length 100 \
    --use_docid_rank True \
    --docid_format msmarco \
    --lookup_fallback True
```
### 4. Key Parameters

- `--encoding`: Use `"url_title"` for this model (or `"pq"` for PQ models)
- `--docid_format`: Use `"msmarco"` for MS MARCO models, `"nq"` for Natural Questions models
- `--pretrain_model_path`: Replace with the specific model you want to evaluate
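For reference, the MRR@10 and R@10 numbers reported in the table above can be computed from per-query ranked docid lists as follows (the helper names are illustrative, not the repository's code):

```python
# Illustrative metric helpers: MRR@k and Recall@k over a ranked docid list.
def mrr_at_k(ranked: list, relevant: set, k: int = 10) -> float:
    """Reciprocal rank of the first relevant docid within the top k."""
    for rank, docid in enumerate(ranked[:k], start=1):
        if docid in relevant:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked: list, relevant: set, k: int = 10) -> float:
    """Fraction of relevant docids retrieved within the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for docid in ranked[:k] if docid in relevant)
    return hits / len(relevant)

print(mrr_at_k(["d3", "d7", "d1"], {"d7"}))     # -> 0.5
print(recall_at_k(["d3", "d7", "d1"], {"d7", "d9"}))  # -> 0.5
```

Both metrics are averaged over all evaluation queries to produce the table values.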
**Full setup and evaluation instructions:** [GitHub Repository](https://github.com/kidist-amde/ddro)
## Model Architecture
- Base: T5-base
- Training: Supervised Fine-tuning (SFT) + Pairwise Ranking (Direct L2R)
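The pairwise ranking step can be sketched as a DPO-style loss over the log-likelihoods the model assigns to a relevant versus a non-relevant docid. This is a sketch under assumptions: the function signature, the reference-model terms, and `beta` are illustrative, not the paper's exact formulation.

```python
# Sketch of a pairwise (DPO-style) ranking objective over docid
# sequence log-likelihoods. All names and the beta value are
# illustrative assumptions, not the DDRO repo's actual API.
import math

def pairwise_ranking_loss(logp_pos: float, logp_neg: float,
                          ref_logp_pos: float, ref_logp_neg: float,
                          beta: float = 0.5) -> float:
    """-log sigmoid(beta * [(logp_pos - ref_logp_pos) - (logp_neg - ref_logp_neg)])"""
    margin = beta * ((logp_pos - ref_logp_pos) - (logp_neg - ref_logp_neg))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Raising the relevant docid's likelihood relative to the reference model
# widens the margin and drives the loss toward zero; no reward model or
# RL rollout is involved, only this direct pairwise objective.
```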
## Citation

If you use these models, please cite:

```bibtex
@inproceedings{mekonnen2025lightweight,
  title     = {Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval},
  author    = {Mekonnen, Kidist Amde and Tang, Yubao and de Rijke, Maarten},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages     = {1327--1338},
  year      = {2025}
}
```
## Highlights
- No reinforcement learning or reward modeling
- Lightweight and efficient optimization
- Public checkpoints for reproducibility