BinKhoaLe1812 committed
Commit b040943 · verified · 1 Parent(s): 2cd2d6e

Update README.md
Files changed (1): README.md +89 -3
---
license: apache-2.0
datasets:
- BeIR/scidocs
- miriad/miriad-4.4M
- BioASQ-b
language:
- en
base_model:
- BAAI/bge-reranker-v2-gemma
pipeline_tag: text-classification
tags:
- medical
- merge
- rerank
---

# MedSwin/MedSwin-Reranker-bge-gemma — Fine-tuned Biomedical & EMR Context Ranking

- **Developed by:** Medical Swinburne University of Technology AI Team
- **Funded by:** [Swinburne University of Technology](https://www.swinburne.edu.au)
- **Language(s):** English
- **License:** Apache 2.0

## Overview
This reranker supports two use cases:

1. **RAG Context Reranking**
   Re-rank candidate passages retrieved from a VectorDB (initial recall via embeddings), improving final context selection for downstream medical LLM reasoning.

2. **EMR Profile Reranking**
   Re-rank patient historical information (e.g., past assessments, diagnoses, medications) to surface the most clinically relevant records for a given current assessment.

The reranker outputs a **direct relevance score** for each *(query, passage)* pair and can be used as a drop-in “second-stage” ranking component after embedding-based retrieval.

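The second-stage step can be sketched as follows. `score_pair` is a hypothetical stand-in for the model's relevance scorer (in practice it would call the fine-tuned cross-encoder), and the token-overlap toy scorer exists only to make the sketch self-contained and runnable:

```python
from typing import Callable, List, Tuple

def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Second-stage reranking: score every (query, passage) pair
    with the cross-encoder and keep the highest-scoring top_k."""
    scored = [(p, score_pair(query, p)) for p in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-in scorer for illustration only: token-overlap count.
# A real deployment would score with the fine-tuned reranker model.
def toy_score(query: str, passage: str) -> float:
    return float(len(set(query.lower().split()) & set(passage.lower().split())))

candidates = [
    "Metformin is first-line therapy for type 2 diabetes.",
    "Aspirin reduces fever and mild pain.",
    "Insulin therapy in type 1 diabetes management.",
]
top = rerank("first-line therapy for type 2 diabetes", candidates, toy_score, top_k=2)
```

The embedding retriever supplies `candidates` (high recall); the reranker only has to re-order that short list (high precision), which is why the quadratic pairwise scoring stays affordable.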
---

## Why a Reranker?
Embedding retrieval is fast and scalable, but it can miss nuanced relevance signals (clinical relationships, subtle terminology, long-context dependencies).
A reranker improves precision by explicitly scoring each candidate passage against the query, typically yielding better top-k context for medical QA and decision support.

---

## Base Model
- **Model**: [BAAI/bge-reranker-v2-gemma](https://huggingface.co/BAAI/bge-reranker-v2-gemma)
- **Fine-tuning strategy**: **LoRA** (parameter-efficient fine-tuning) with gradient checkpointing and mixed precision (fp16/bf16 depending on GPU).
- **Rationale**: Gemma-based rerankers generally provide strong relevance modeling and support longer contexts than smaller rerankers.
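The LoRA update is simple to state: a frozen projection `W` is augmented with a trainable low-rank product `B @ A`, scaled by `alpha / r`, so only a small fraction of parameters is trained. A minimal numerical sketch of that math (illustrative only, not the actual training code; all names and shapes here are assumptions):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=32, r=8):
    """Adapted layer: frozen base projection W plus the low-rank
    LoRA branch (alpha/r) * B @ A, where A is (r, d_in) and
    B is (d_out, r). Only A and B receive gradients."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 4
W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # trainable, random init
B = np.zeros((d_out, r))             # trainable, zero init
x = rng.normal(size=(2, d_in))

# With B initialised to zero, the LoRA branch contributes nothing,
# so the adapted layer exactly matches the frozen base at step 0.
adapted = lora_forward(x, W, A, B, r=r)
```

Zero-initialising `B` is the standard LoRA trick: fine-tuning starts from exactly the base model's behaviour and only drifts as `B` is updated.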
---

## Training Data (Offline, Local)
We fine-tune using **open HF datasets** stored locally on HPC:

### 1) BioASQ (Generated Queries)
- Used as: (query, document) positives; negatives sampled from a rolling buffer.
- Specialised to handle the complex terminology and high precision required for Task B (Biomedical Semantic QA). The reranker acts as a critical second stage in a two-stage retrieval system, filtering initial candidate lists from a PubMed-indexed retriever so that the highest-ranked documents contain the specific evidence needed for factoid and 'ideal' answer generation.

### 2) MIRIAD (Medical IR Instruction Dataset)
- Used as: (question → passage) positives; negatives sampled from a rolling buffer.
- Trained on [MIRIAD's 4.4M](https://huggingface.co/datasets/miriad/miriad-4.4M) literature-grounded QA pairs, the model learns to distinguish between highly similar clinical concepts. This specialisation reduces medical hallucinations and ensures that the most scientifically accurate evidence is prioritised in a multi-stage retrieval pipeline for healthcare professionals.

### 3) SciDocs
- Trained on this multi-task dataset, including citation prediction and co-citation analysis, the model learns to capture nuanced semantic relationships that standard bi-encoders miss. The resulting reranker serves as a high-accuracy second stage in a two-stage retrieval pipeline, significantly improving top-k relevance for complex scholarly queries.

---

## Methodology
### Data Construction (Triplets)
The training corpus is converted into reranker triplets:
```json
{
  "query": "clinical question",
  "pos": ["relevant passage 1", "relevant passage 2"],
  "neg": ["irrelevant passage A", "irrelevant passage B"],
  "source": "dataset_name"
}
```

* **Positives**: from dataset relevance labels or paired question–passage examples.
* **Negatives**: sampled from an in-memory rolling buffer (fast, scalable offline).
* Output splits: **train / val / test** created in one run.

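The rolling-buffer negative sampling above can be sketched as follows. This is an assumption about the construction logic, not the project's actual script: recently seen positives from *other* examples serve as cheap negatives, so no full-corpus scan or extra retrieval pass is needed:

```python
import random
from collections import deque

def build_triplets(pairs, buffer_size=1000, n_neg=2, seed=0):
    """pairs: iterable of (query, positive_passage, source).
    Negatives for each example are drawn from a rolling buffer of
    recently seen passages; examples that arrive before the buffer
    holds enough candidates are skipped."""
    rng = random.Random(seed)
    buffer = deque(maxlen=buffer_size)
    triplets = []
    for query, pos, source in pairs:
        pool = [p for p in buffer if p != pos]  # never reuse the positive
        if len(pool) >= n_neg:
            triplets.append({
                "query": query,
                "pos": [pos],
                "neg": rng.sample(pool, n_neg),
                "source": source,
            })
        buffer.append(pos)
    return triplets

# Hypothetical toy corpus, matching the JSON schema above.
pairs = [(f"clinical question {i}", f"passage {i}", "demo") for i in range(10)]
triplets = build_triplets(pairs)
```

One caveat of this scheme: buffer negatives are random, not "hard" negatives mined by the retriever, which trades some training signal for speed and simplicity in an offline pipeline.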
### Evaluation

Evaluation computes IR ranking metrics by scoring each query against its *(pos + neg)* candidates:

* **nDCG@10:** 0.60+
* **MRR@10:** 0.50+
* **MAP@10:** 0.40+
* **Hit@1:** 0.40+
* Metrics reported overall and broken down by data source.
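These are standard IR metrics over the reranker's ordering of each query's candidate list. For reference, minimal implementations (assuming binary relevance labels sorted by reranker score):

```python
import math

def dcg(rels):
    """Discounted cumulative gain: relevance discounted by log2 of rank."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

def ndcg_at_k(ranked_rels, k=10):
    """DCG of the predicted ordering, normalised by the ideal ordering."""
    idcg = dcg(sorted(ranked_rels, reverse=True)[:k])
    return dcg(ranked_rels[:k]) / idcg if idcg > 0 else 0.0

def mrr_at_k(ranked_rels, k=10):
    """Reciprocal rank of the first relevant candidate."""
    for i, r in enumerate(ranked_rels[:k]):
        if r > 0:
            return 1.0 / (i + 1)
    return 0.0

def average_precision_at_k(ranked_rels, k=10):
    """Mean of precision values at each relevant rank (MAP per query)."""
    hits, total_prec = 0, 0.0
    for i, r in enumerate(ranked_rels[:k]):
        if r > 0:
            hits += 1
            total_prec += hits / (i + 1)
    n_rel = sum(1 for r in ranked_rels if r > 0)
    return total_prec / n_rel if n_rel else 0.0

def hit_at_1(ranked_rels):
    """1.0 if the top-ranked candidate is relevant, else 0.0."""
    return 1.0 if ranked_rels and ranked_rels[0] > 0 else 0.0

# ranked_rels encodes candidate relevance after sorting by reranker
# score, e.g. [1, 0, 1] = relevant, irrelevant, relevant.
```

Per-query values are averaged overall and within each data source to produce the reported breakdown.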