kiyam committed on
Commit 51df762 · verified · 1 Parent(s): 41f6aae

Update README.md

Files changed (1): README.md (+54 −8)

README.md CHANGED
@@ -21,14 +21,62 @@ The models are trained on two benchmark datasets (**MS MARCO (MS300K)** and **Na
 | Natural Questions (NQ320K) | PQ | `ddro-nq-pq` | 55.51 | 67.31 |
 | Natural Questions (NQ320K) | TU | `ddro-nq-tu` | 45.99 | 55.98 |

-
 ---

-### 🚀 Intended Uses
-- Generative document retrieval and ranking
-- Open-domain question answering
-- Semantic and keyword-oriented search
-- Research and benchmarking in Information Retrieval (IR)

 ### 🏗️ Model Architecture
 - **Base**: T5-base

@@ -55,5 +103,3 @@ If you use these models, please cite:
 - No reinforcement learning or reward modeling
 - Lightweight and efficient optimization
 - Public checkpoints for reproducibility
-
----
 | Natural Questions (NQ320K) | PQ | `ddro-nq-pq` | 55.51 | 67.31 |
 | Natural Questions (NQ320K) | TU | `ddro-nq-tu` | 45.99 | 55.98 |

 ---

+### 🚀 Quick Evaluation
+
+To evaluate this model on your own data:
+
+#### 1. Set Up the Evaluation Environment
+```bash
+git clone https://github.com/kidist-amde/ddro.git
+cd ddro
+# Install dependencies (see the repository for requirements)
+```
+
+#### 2. Prepare Your Evaluation Data
+**Option A:** Generate evaluation data from your own dataset:
+```bash
+python src/data/data_prep/build_t5_data/gen_eval_data_pipline.py --encoding "url_title"
+```
+Or use the batch script:
+```bash
+sbatch src/scripts/preprocess/generate_eval_data.sh
+```
+
+**Option B:** Use pre-generated encoded document IDs:
+Download them from [HuggingFace Datasets](https://huggingface.co/datasets/kiyam/ddro-docids), which contains encoded docids for both the MS MARCO and NQ datasets in both `pq` and `url_title` formats.
+
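The encoded-docid files referenced above map each document identifier to its encoded docid string. If you want to sanity-check a file before running the evaluation, here is a minimal loader sketch. It assumes a tab-separated `docid<TAB>encoded_docid` line layout, which is an assumption you should verify against the actual files in the dataset.

```python
# Minimal loader for an encoded-docid file.
# ASSUMPTION: one "docid<TAB>encoded_docid" pair per line; verify
# this layout against the real files from kiyam/ddro-docids.
def load_encoded_docids(lines):
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        docid, encoded = line.split("\t", 1)
        mapping[docid] = encoded
    return mapping

# Toy example with made-up entries (not real ddro docids):
toy = [
    "D100\thttps://example.com Example Title",
    "D101\t42,17,8",
]
print(load_encoded_docids(toy)["D100"])
```

In practice you would pass `open(path)` instead of the toy list; the function iterates any sequence of lines.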
+#### 3. Run Evaluation
+```bash
+# For SLURM clusters:
+sbatch src/pretrain/hf_eval/slurm_submit_hf_eval.sh
+
+# Or run directly:
+python src/pretrain/hf_eval/eval_hf_docid_ranking.py \
+  --per_gpu_batch_size 4 \
+  --log_path logs/evaluation.log \
+  --pretrain_model_path kiyam/ddro-msmarco-tu \
+  --docid_path resources/datasets/processed/msmarco-data/encoded_docid/url_title_docid.txt \
+  --test_file_path resources/datasets/processed/msmarco-data/eval_data_top_300k/query_dev.url_title.jsonl \
+  --dataset_script_dir src/data/data_scripts \
+  --num_beams 15 \
+  --add_doc_num 6144 \
+  --max_seq_length 64 \
+  --max_docid_length 100 \
+  --use_docid_rank True \
+  --docid_format msmarco \
+  --lookup_fallback True
+```
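The script above ranks docids with beam search (`--num_beams 15`). If you post-process its ranked lists yourself, the standard generative-retrieval metrics such as Hits@K and MRR@K can be computed as below. This is a generic sketch of those metrics, not the repository's own evaluation code.

```python
def hits_at_k(ranked, relevant, k):
    """Fraction of queries whose relevant docid appears in the top-k."""
    return sum(1 for r, rel in zip(ranked, relevant) if rel in r[:k]) / len(ranked)

def mrr_at_k(ranked, relevant, k):
    """Mean reciprocal rank of the relevant docid, cut off at rank k."""
    total = 0.0
    for r, rel in zip(ranked, relevant):
        for i, d in enumerate(r[:k]):
            if d == rel:
                total += 1.0 / (i + 1)
                break
    return total / len(ranked)

# Toy ranked lists for two queries (made-up docids):
ranked = [["d3", "d1", "d2"], ["d5", "d4", "d9"]]
relevant = ["d1", "d9"]
print(hits_at_k(ranked, relevant, 1))   # 0.0
print(hits_at_k(ranked, relevant, 3))   # 1.0
print(mrr_at_k(ranked, relevant, 10))
```

Whether the numbers in the table above correspond to Hits@K or MRR@K is defined by the repository's evaluation script; check its output naming before comparing.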
+
+#### 4. Key Parameters
+- `--encoding`: Use `"url_title"` for this model (or `"pq"` for PQ models)
+- `--docid_format`: Use `"msmarco"` for MS MARCO models, `"nq"` for Natural Questions models
+- `--pretrain_model_path`: Replace with the specific model you want to evaluate
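The flag combinations implied by the list above can be summarized in a small lookup table. This is a hypothetical helper, not part of the ddro repository, and `kiyam/ddro-msmarco-pq` is a name inferred from the pattern of the other checkpoints; verify that it exists before relying on it.

```python
# Hypothetical mapping from checkpoint name to evaluation flags,
# derived from the "Key Parameters" notes above.
# ASSUMPTION: "kiyam/ddro-msmarco-pq" is inferred, not confirmed.
MODEL_FLAGS = {
    "kiyam/ddro-msmarco-tu": {"encoding": "url_title", "docid_format": "msmarco"},
    "kiyam/ddro-msmarco-pq": {"encoding": "pq", "docid_format": "msmarco"},
    "kiyam/ddro-nq-tu": {"encoding": "url_title", "docid_format": "nq"},
    "kiyam/ddro-nq-pq": {"encoding": "pq", "docid_format": "nq"},
}

print(MODEL_FLAGS["kiyam/ddro-nq-pq"]["encoding"])  # pq
```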
+
+📋 **Full setup and evaluation instructions**: [GitHub Repository](https://github.com/kidist-amde/ddro)
+
+---

 ### 🏗️ Model Architecture
 - **Base**: T5-base

 - No reinforcement learning or reward modeling
 - Lightweight and efficient optimization
 - Public checkpoints for reproducibility