Update README.md

The models are trained on two benchmark datasets (**MS MARCO (MS300K)** and **Natural Questions (NQ320K)**).

| Natural Questions (NQ320K) | PQ | `ddro-nq-pq` | 55.51 | 67.31 |
| Natural Questions (NQ320K) | TU | `ddro-nq-tu` | 45.99 | 55.98 |

---
### 🚀 Quick Evaluation

To evaluate this model on your own data:
#### 1. Setup the Evaluation Environment

```bash
git clone https://github.com/kidist-amde/ddro.git
cd ddro
# Install dependencies (see repository for requirements)
```
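
As a quick sanity check before running anything, you can confirm the core libraries import cleanly (assuming the eval scripts depend on `torch` and `transformers`; the repository's requirements file is the authoritative list):

```python
# Environment sanity check (assumption: torch and transformers are the
# core dependencies; see the repo's requirements for the full list).
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```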
#### 2. Prepare Your Evaluation Data

**Option A:** Generate evaluation data from your own dataset:

```bash
python src/data/data_prep/build_t5_data/gen_eval_data_pipline.py --encoding "url_title"
```

Or use the batch script:

```bash
sbatch src/scripts/preprocess/generate_eval_data.sh
```

**Option B:** Use pre-generated encoded document IDs:
Download from [HuggingFace Datasets](https://huggingface.co/datasets/kiyam/ddro-docids), which contains encoded docids for both MS MARCO and NQ datasets in both `pq` and `url_title` formats.
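
As an alternative to a manual download, one possible way to fetch the docids programmatically is with `huggingface_hub` (the `local_dir` below is a hypothetical choice, not a path the repo mandates):

```python
# Hedged sketch: fetch the pre-generated docids with huggingface_hub.
# local_dir is hypothetical; point it wherever your eval setup expects.
from huggingface_hub import snapshot_download

path = snapshot_download(
    repo_id="kiyam/ddro-docids",
    repo_type="dataset",  # this repo is a dataset, not a model
    local_dir="resources/datasets/ddro-docids",
)
print("Docids downloaded to:", path)
```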
#### 3. Run Evaluation

```bash
# For SLURM clusters:
sbatch src/pretrain/hf_eval/slurm_submit_hf_eval.sh

# Or run directly:
python src/pretrain/hf_eval/eval_hf_docid_ranking.py \
  --per_gpu_batch_size 4 \
  --log_path logs/evaluation.log \
  --pretrain_model_path kiyam/ddro-msmarco-tu \
  --docid_path resources/datasets/processed/msmarco-data/encoded_docid/url_title_docid.txt \
  --test_file_path resources/datasets/processed/msmarco-data/eval_data_top_300k/query_dev.url_title.jsonl \
  --dataset_script_dir src/data/data_scripts \
  --num_beams 15 \
  --add_doc_num 6144 \
  --max_seq_length 64 \
  --max_docid_length 100 \
  --use_docid_rank True \
  --docid_format msmarco \
  --lookup_fallback True
```
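
For intuition about what the evaluation reports: once the model's ranked docid predictions are collected per query, metrics such as Hits@k and MRR reduce to a few lines. The sketch below is illustrative only, not the scoring code inside `eval_hf_docid_ranking.py`:

```python
# Illustrative metric definitions (not the repo's implementation).
def hits_at_k(ranked_docids, gold_docid, k):
    """1.0 if the gold docid appears in the top-k predictions, else 0.0."""
    return 1.0 if gold_docid in ranked_docids[:k] else 0.0

def mrr_at_k(ranked_docids, gold_docid, k=10):
    """Reciprocal of the gold docid's rank within the top-k, else 0.0."""
    for rank, docid in enumerate(ranked_docids[:k], start=1):
        if docid == gold_docid:
            return 1.0 / rank
    return 0.0

# Toy example: the gold docid is ranked third.
preds = ["D102", "D7", "D55", "D9"]
print(hits_at_k(preds, "D55", 1))   # 0.0
print(hits_at_k(preds, "D55", 10))  # 1.0
print(mrr_at_k(preds, "D55"))       # 0.3333...
```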
#### 4. Key Parameters

- `--encoding`: Use `"url_title"` for this model (or `"pq"` for PQ models)
- `--docid_format`: Use `"msmarco"` for MS MARCO models, `"nq"` for Natural Questions models
- `--pretrain_model_path`: Replace with the specific model you want to evaluate

📖 **Full setup and evaluation instructions**: [GitHub Repository](https://github.com/kidist-amde/ddro)
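
For a quick smoke test outside the full pipeline, the checkpoint loads like any seq2seq `transformers` model. The sketch below makes two assumptions: the checkpoint ships a usable tokenizer, and unconstrained beam search stands in for the eval script's docid-constrained decoding, so outputs are raw identifier strings rather than validated docids:

```python
# Minimal smoke test (assumptions noted above; NOT the repo's eval pipeline).
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "kiyam/ddro-msmarco-tu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

query = "what is generative retrieval"
inputs = tokenizer(query, return_tensors="pt", truncation=True, max_length=64)

# Mirror the eval command's settings: 15 beams, generated identifier
# strings capped at roughly --max_docid_length tokens.
outputs = model.generate(
    **inputs,
    num_beams=15,
    num_return_sequences=10,
    max_new_tokens=100,
    early_stopping=True,
)
for rank, docid_string in enumerate(
    tokenizer.batch_decode(outputs, skip_special_tokens=True), start=1
):
    print(f"{rank:2d}. {docid_string}")
```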
---

### 🏗️ Model Architecture

- **Base**: T5-base

…

- No reinforcement learning or reward modeling
- Lightweight and efficient optimization
- Public checkpoints for reproducibility