Commit 8e6de34 · BinKhoaLe1812 · verified · 1 Parent(s): 16c50b5

Update index.html

Files changed (1)
  1. index.html +173 -389
index.html CHANGED
@@ -3,7 +3,7 @@
3
  <head>
4
  <meta charset="UTF-8" />
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
- <title>MedSwin — Multi-Agent Biomedical LLM + Retrieval for EMR & Guidelines</title>
7
  <meta name="description" content="MedSwin: evidence-constrained, auditable multi-agent clinical QA with two-stage biomedical retrieval, calibrated reranking, and distilled 7B medical LLM for deployable decision support." />
8
  <link rel="icon" href="assets/logo.svg">
9
 
@@ -434,59 +434,57 @@
434
  </section>
435
 
436
  <!-- Overview -->
437
- <section id="overview" class="section">
438
- <div class="container">
439
- <div class="flex flex-col lg:flex-row lg:items-end lg:justify-between gap-6 mb-10" data-aos="fade-up">
440
- <div>
441
- <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">Overview</h2>
442
- <p class="mt-3 text-slate-300 max-w-3xl">
443
- MedSwin treats clinical QA as an <span class="text-slate-100 font-semibold">evidence-constrained pipeline</span>, producing:
444
- an answer, a compact evidence bundle under a context budget, and a structured trace for audit and safety review.
445
- </p>
446
- </div>
447
- <div class="flex flex-wrap gap-2">
448
- <span class="badge"><i data-lucide="clipboard-list" class="icon"></i> Answer</span>
449
- <span class="badge"><i data-lucide="files" class="icon"></i> Evidence bundle</span>
450
- <span class="badge"><i data-lucide="route" class="icon"></i> Trace</span>
451
- </div>
452
- </div>
453
-
454
- <div class="grid md:grid-cols-3 gap-6">
455
- <div class="card glass" data-aos="zoom-in-up">
456
- <div class="card-body">
457
- <div class="flex items-center gap-2 font-extrabold text-lg"><i data-lucide="users" class="icon"></i> Specialised Agents</div>
458
- <p class="mt-3 text-slate-300">
459
- A role-based team exchanges typed artifacts:
460
- Query Normaliser, Evidence Retriever, EMR Summariser, Guideline Synthesiser, and Safety Critic.
461
  </p>
462
- <div class="mt-4 text-xs text-slate-400">Outcome: modularity + clear responsibility boundaries.</div>
463
  </div>
464
- </div>
465
-
466
- <div class="card glass" data-aos="zoom-in-up" data-aos-delay="100">
467
- <div class="card-body">
468
- <div class="flex items-center gap-2 font-extrabold text-lg"><i data-lucide="search-check" class="icon"></i> Two-Stage Retrieval</div>
469
- <p class="mt-3 text-slate-300">
470
- Hybrid dense+lexical candidate generation, then long-context biomedical reranking with calibrated scores for
471
- deterministic evidence inclusion policies.
472
  </p>
473
- <div class="mt-4 text-xs text-slate-400">Outcome: fewer critical omissions under token budget.</div>
474
  </div>
475
- </div>
476
-
477
- <div class="card glass" data-aos="zoom-in-up" data-aos-delay="200">
478
- <div class="card-body">
479
- <div class="flex items-center gap-2 font-extrabold text-lg"><i data-lucide="cpu" class="icon"></i> Deployable 7B LLM</div>
480
- <p class="mt-3 text-slate-300">
481
- A compact medical model is trained with large-scale augmentation (SFT) then refined via hard/soft-label KD from
482
- a larger instructor, enabling institution-controlled deployment.
483
  </p>
484
- <div class="mt-4 text-xs text-slate-400">Outcome: practical inference without sacrificing alignment.</div>
485
  </div>
486
  </div>
487
  </div>
488
  </div>
489
- </section>
490
 
491
  <!-- Contributions -->
492
  <section id="contributions" class="section">
@@ -865,374 +863,160 @@ sequenceDiagram
865
  </section>
866
 
867
  <!-- Retrieval -->
868
- <section id="retrieval" class="section">
869
- <div class="container">
870
- <div class="flex flex-col lg:flex-row lg:items-end lg:justify-between gap-6 mb-10" data-aos="fade-up">
871
- <div>
872
- <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">Two-Stage Retrieval & Calibrated Reranking</h2>
873
- <p class="mt-3 text-slate-300 max-w-3xl">
874
- MedSwin selects a compact, diverse evidence set under budget using hybrid retrieval, a long-context biomedical reranker,
875
- and policy-aware selection with sufficiency constraints.
876
- </p>
877
- </div>
878
- <div class="flex flex-wrap gap-2">
879
- <span class="badge"><i data-lucide="database" class="icon"></i> Dense + BM25</span>
880
- <span class="badge"><i data-lucide="badge-check" class="icon"></i> Calibrated p</span>
881
- <span class="badge"><i data-lucide="layers" class="icon"></i> MMR diversity</span>
882
- </div>
883
- </div>
884
-
885
- <div class="grid lg:grid-cols-2 gap-6">
886
- <!-- Accordion: stage 1/2/policy -->
887
- <div class="card glass" data-aos="fade-up">
888
- <div class="card-body">
889
- <h3 class="text-xl font-extrabold tracking-tight">Pipeline (click to expand)</h3>
890
- <div class="mt-5 space-y-3">
891
-
892
- <div class="accordion-item rounded-2xl border border-white/10 bg-slate-950/40 p-4">
893
- <button class="accordion-btn font-extrabold" type="button">
894
- <span class="inline-flex items-center gap-2"><i data-lucide="scan-search" class="icon"></i> Stage 1 — Candidate generation</span>
895
- <i data-lucide="chevron-down" class="icon"></i>
896
- </button>
897
- <div class="accordion-panel">
898
- <p class="mt-3 text-sm text-slate-300">
899
- Retrieve top-K dense candidates using ANN over biomedical embeddings, then union with BM25 results to handle rare terms and abbreviations.
900
- </p>
901
- <div class="codeblock mt-3">
902
- <button class="copy-btn" data-copy="#codeStage1"><i data-lucide="copy" class="icon"></i> Copy</button>
903
- <pre id="codeStage1"><code>// Candidate pool
904
- C(q) = TopK'( Cdense(q) ∪ Clex(q) )
905
- K' ≥ K</code></pre>
906
- </div>
907
- </div>
908
- </div>
909
-
910
- <div class="accordion-item rounded-2xl border border-white/10 bg-slate-950/40 p-4">
911
- <button class="accordion-btn font-extrabold" type="button">
912
- <span class="inline-flex items-center gap-2"><i data-lucide="badge-check" class="icon"></i> Stage 2 — Long-context reranking</span>
913
- <i data-lucide="chevron-down" class="icon"></i>
914
- </button>
915
- <div class="accordion-panel">
916
- <p class="mt-3 text-sm text-slate-300">
917
- A pointwise LLM reranker scores each (query, passage) pair and provides a calibrated probability used by downstream policy checks.
918
- </p>
919
- <div class="codeblock mt-3">
920
- <button class="copy-btn" data-copy="#codeCalib"><i data-lucide="copy" class="icon"></i> Copy</button>
921
- <pre id="codeCalib"><code>// Calibrated probability (Platt / temperature scaling)
922
- p_cal(q,d) = σ( (ℓ(q,d) − b) / T )</code></pre>
923
- </div>
924
- </div>
925
- </div>
926
-
927
- <div class="accordion-item rounded-2xl border border-white/10 bg-slate-950/40 p-4">
928
- <button class="accordion-btn font-extrabold" type="button">
929
- <span class="inline-flex items-center gap-2"><i data-lucide="sliders" class="icon"></i> Policy-aware selection (budget + sufficiency)</span>
930
- <i data-lucide="chevron-down" class="icon"></i>
931
- </button>
932
- <div class="accordion-panel">
933
- <p class="mt-3 text-sm text-slate-300">
934
- Fuse calibrated reranker probability with dense/lexical signals and lightweight clinical priors, then select a diverse set under budget.
935
- Accept only if EMR and CPG sufficiency targets are met; otherwise trigger “retrieve-more.”
936
- </p>
937
- <div class="mt-3 rounded-2xl border border-white/10 bg-slate-950/40 p-4 text-sm text-slate-200">
938
- <div class="font-extrabold mb-2">Fusion score (illustrative)</div>
939
- <div class="math">
940
- \( S(q,d)= \alpha p_{cal} + \beta \tilde{s}_{emb} + \gamma \tilde{s}_{lex} + \rho f_{recency} + \eta f_{section} + \zeta f_{source} \)
941
- </div>
942
- <div class="text-xs text-slate-400 mt-2">Rendered with KaTeX · weights are interpretable and non-negative.</div>
943
- </div>
944
- </div>
945
- </div>
946
-
947
  </div>
948
- </div>
949
- </div>
950
-
951
- <!-- Visual: selection checklist -->
952
- <div class="card glass" data-aos="fade-up" data-aos-delay="120">
953
- <div class="card-body">
954
- <h3 class="text-xl font-extrabold tracking-tight">What “evidence sufficiency” means in practice</h3>
955
- <p class="mt-3 text-slate-300">
956
- The orchestrator treats sufficiency as a gate: if the selected bundle lacks required guideline and EMR coverage above calibrated thresholds, it will not synthesise a confident answer.
957
  </p>
958
-
959
- <div class="mt-5 grid gap-3">
960
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
961
- <div class="flex items-center justify-between">
962
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="book-open-check" class="icon"></i> CPG coverage</div>
963
- <span class="badge"><i data-lucide="check" class="icon"></i> required</span>
964
- </div>
965
- <p class="text-sm text-slate-300 mt-2">Ensure key recommendations & contraindications are present (not just background text).</p>
966
- </div>
967
-
968
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
969
- <div class="flex items-center justify-between">
970
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="file-heart" class="icon"></i> EMR coverage</div>
971
- <span class="badge"><i data-lucide="check" class="icon"></i> required</span>
972
- </div>
973
- <p class="text-sm text-slate-300 mt-2">Include patient-specific meds/labs/history signals needed to avoid unsafe generalisations.</p>
974
- </div>
975
-
976
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
977
- <div class="flex items-center justify-between">
978
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="layers" class="icon"></i> Diversity under budget</div>
979
- <span class="badge"><i data-lucide="check" class="icon"></i> recommended</span>
980
- </div>
981
- <p class="text-sm text-slate-300 mt-2">Use MMR-style selection to avoid redundant passages and preserve coverage breadth.</p>
982
- </div>
983
-
984
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
985
- <div class="flex items-center justify-between">
986
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="shield-alert" class="icon"></i> Safety critique</div>
987
- <span class="badge"><i data-lucide="check" class="icon"></i> required</span>
988
- </div>
989
- <p class="text-sm text-slate-300 mt-2">Detect missing evidence, conflicts, or contraindication risks; request retrieve-more if needed.</p>
990
- </div>
991
- </div>
992
-
993
- <div class="mt-5 rounded-2xl border border-white/10 bg-slate-950/40 p-4">
994
- <div class="flex items-center gap-2 font-extrabold"><i data-lucide="terminal" class="icon"></i> Example trace fields</div>
995
- <p class="text-sm text-slate-300 mt-2">doc_id · guideline_version · section_tags · chunk_offsets · scores · thresholds · tool_calls</p>
996
- </div>
997
  </div>
998
- </div>
999
- </div>
1000
- </div>
1001
- </section>
1002
-
1003
- <!-- Training -->
1004
- <section id="training" class="section">
1005
- <div class="container">
1006
- <div class="flex flex-col lg:flex-row lg:items-end lg:justify-between gap-6 mb-10" data-aos="fade-up">
1007
- <div>
1008
- <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">Data, Training & Distillation</h2>
1009
- <p class="mt-3 text-slate-300 max-w-3xl">
1010
- MedSwin’s deployable 7B model is produced via SFT on augmented biomedical QA, then KD (hard + soft labels) from a larger instructor,
1011
- using PEFT techniques to fit modest GPU footprints.
1012
- </p>
1013
- </div>
1014
- <div class="flex flex-wrap gap-2">
1015
- <span class="badge"><i data-lucide="wand-2" class="icon"></i> SFT</span>
1016
- <span class="badge"><i data-lucide="git-merge" class="icon"></i> KD</span>
1017
- <span class="badge"><i data-lucide="bolt" class="icon"></i> QLoRA/LoRA</span>
1018
- </div>
1019
- </div>
1020
-
1021
- <div class="grid lg:grid-cols-2 gap-6">
1022
- <!-- Timeline / stepper -->
1023
- <div class="card glass" data-aos="fade-up">
1024
- <div class="card-body">
1025
- <h3 class="text-xl font-extrabold tracking-tight">Pipeline Timeline</h3>
1026
- <p class="mt-2 text-slate-300">A readable progression from data → model → deployable checkpoints.</p>
1027
-
1028
- <div class="mt-6 space-y-3">
1029
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1030
- <div class="flex items-center gap-3">
1031
- <span class="badge">A</span>
1032
- <div class="font-extrabold">Augmentation & QA gates</div>
1033
- </div>
1034
- <p class="text-sm text-slate-300 mt-2">
1035
- Paraphrasing + multi-variant formatting, back-translation, style standardisation, PHI scrubbing, deduplication,
1036
- and medical consistency checks to prevent semantic drift.
1037
- </p>
1038
- </div>
1039
-
1040
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1041
- <div class="flex items-center gap-3">
1042
- <span class="badge">B</span>
1043
- <div class="font-extrabold">SFT (instruction alignment)</div>
1044
- </div>
1045
- <p class="text-sm text-slate-300 mt-2">
1046
- Student learns consistent instruction following and robust clinical writing styles from mixed supervision sources.
1047
- </p>
1048
- </div>
1049
-
1050
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1051
- <div class="flex items-center gap-3">
1052
- <span class="badge">C</span>
1053
- <div class="font-extrabold">Knowledge Distillation (KD)</div>
1054
- </div>
1055
- <p class="text-sm text-slate-300 mt-2">
1056
- Hard labels expand coverage; soft labels preserve calibration/uncertainty. Training uses a combined CE + KL objective at temperature τ.
1057
- </p>
1058
- </div>
1059
-
1060
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1061
- <div class="flex items-center gap-3">
1062
- <span class="badge">D</span>
1063
- <div class="font-extrabold">Model Merging</div>
1064
- </div>
1065
- <p class="text-sm text-slate-300 mt-2">
1066
- Weight-space merging can combine SFT robustness and KD teacher-aligned behaviour without extra full training passes.
1067
- </p>
1068
- </div>
1069
  </div>
1070
  </div>
1071
- </div>
1072
-
1073
- <!-- Tabs: SFT / KD / Reranker training -->
1074
- <div class="card glass" data-aos="fade-up" data-aos-delay="120">
1075
- <div class="card-body">
1076
- <div class="flex items-center justify-between gap-4">
1077
- <h3 class="text-xl font-extrabold tracking-tight">Training Modules</h3>
1078
- <span class="text-xs text-slate-400">Click tabs</span>
1079
- </div>
1080
-
1081
- <div class="mt-5 flex flex-wrap gap-2" role="tablist" aria-label="Training tabs">
1082
- <button class="tab active" data-tab="train-sft" role="tab" aria-selected="true">SFT</button>
1083
- <button class="tab" data-tab="train-kd" role="tab" aria-selected="false">KD</button>
1084
- <button class="tab" data-tab="train-rer" role="tab" aria-selected="false">Reranker</button>
1085
- </div>
1086
-
1087
- <div class="mt-5">
1088
- <div id="train-sft" class="tabpanel">
1089
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1090
- <div class="flex items-center gap-2 font-extrabold"><i data-lucide="wand-2" class="icon"></i> Supervised Fine-Tuning</div>
1091
- <p class="text-sm text-slate-300 mt-2">
1092
- Optimises token-level cross-entropy over instruction-formatted examples; stratified mixing reduces overfitting to any single genre.
1093
- </p>
1094
- <div class="mt-3 text-xs text-slate-400">
1095
- Focus: instruction following, neutral clinical tone, and robust completion behaviour.
1096
- </div>
1097
- </div>
1098
- </div>
1099
-
1100
- <div id="train-kd" class="tabpanel hidden">
1101
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1102
- <div class="flex items-center gap-2 font-extrabold"><i data-lucide="git-merge" class="icon"></i> Knowledge Distillation</div>
1103
- <p class="text-sm text-slate-300 mt-2">
1104
- Combines hard labels (teacher completions) and soft labels (teacher token distributions) to transfer calibrated reasoning.
1105
- </p>
1106
- <div class="codeblock mt-3">
1107
- <button class="copy-btn" data-copy="#codeKD"><i data-lucide="copy" class="icon"></i> Copy</button>
1108
- <pre id="codeKD"><code>// Per-step objective (illustrative)
1109
- L_t = α * CE(y_t) + (1-α) * τ^2 * KL( p_T(·|τ) || p_S(·) )</code></pre>
1110
- </div>
1111
- <div class="mt-3 text-xs text-slate-400">
1112
- Storage efficiency: top-k teacher log-probs per step (renormalised) keeps KD tractable.
1113
- </div>
1114
- </div>
1115
- </div>
1116
-
1117
- <div id="train-rer" class="tabpanel hidden">
1118
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1119
- <div class="flex items-center gap-2 font-extrabold"><i data-lucide="badge-check" class="icon"></i> Biomedical Reranker</div>
1120
- <p class="text-sm text-slate-300 mt-2">
1121
- LoRA-adapts a long-context pointwise LLM reranker for biomedical relevance scoring. Calibration enables threshold-based policy use.
1122
- </p>
1123
- <ul class="mt-3 space-y-2 text-sm text-slate-300">
1124
- <li class="flex gap-2"><i data-lucide="dot" class="icon"></i> Long passages (guidelines / multi-paragraph evidence)</li>
1125
- <li class="flex gap-2"><i data-lucide="dot" class="icon"></i> Calibrated probabilities for inclusion thresholds</li>
1126
- <li class="flex gap-2"><i data-lucide="dot" class="icon"></i> PEFT-friendly for site-specific constraints</li>
1127
- </ul>
1128
- </div>
1129
- </div>
1130
- </div>
1131
-
1132
- <div class="mt-5 flex flex-wrap gap-2">
1133
- <a class="chip" target="_blank" rel="noopener noreferrer" href="https://huggingface.co/collections/MedSwin/finetuning">
1134
- <i data-lucide="wand-2" class="icon"></i> Fine-tune Collection
1135
- </a>
1136
- <a class="chip" target="_blank" rel="noopener noreferrer" href="https://huggingface.co/collections/MedSwin/rag">
1137
- <i data-lucide="database" class="icon"></i> RAG Collection
1138
- </a>
1139
- <a class="chip" target="_blank" rel="noopener noreferrer" href="https://huggingface.co/spaces/MedSwin/Augmentation">
1140
- <i data-lucide="filter" class="icon"></i> Ingestion Pipeline
1141
- </a>
1142
  </div>
1143
  </div>
1144
  </div>
1145
  </div>
1146
  </div>
1147
- </section>
1148
-
1149
- <!-- Evaluation -->
1150
- <section id="evaluation" class="section">
1151
- <div class="container">
1152
- <div class="flex flex-col lg:flex-row lg:items-end lg:justify-between gap-6 mb-10" data-aos="fade-up">
1153
- <div>
1154
- <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">Evaluation & Safety</h2>
1155
- <p class="mt-3 text-slate-300 max-w-3xl">
1156
- MedSwin evaluates beyond generic RAG metrics by emphasising retrieval quality, guideline coverage, and answer faithfulness—plus runtime guardrails.
1157
- </p>
1158
- </div>
1159
- <div class="flex flex-wrap gap-2">
1160
- <span class="badge"><i data-lucide="clipboard-check" class="icon"></i> Faithfulness</span>
1161
- <span class="badge"><i data-lucide="book-open-check" class="icon"></i> Guideline coverage</span>
1162
- <span class="badge"><i data-lucide="shield-check" class="icon"></i> Guardrails</span>
1163
- </div>
1164
- </div>
1165
-
1166
- <!-- Counters -->
1167
- <div class="grid md:grid-cols-3 gap-6">
1168
- <div class="card glass" data-aos="fade-up">
1169
- <div class="card-body text-center">
1170
- <div class="text-4xl font-extrabold tracking-tight metric-value" data-count="500000">0</div>
1171
- <div class="text-sm text-slate-400 mt-1">Augmented supervision scale (illustrative)</div>
1172
  </div>
1173
- </div>
1174
- <div class="card glass" data-aos="fade-up" data-aos-delay="120">
1175
- <div class="card-body text-center">
1176
- <div class="text-4xl font-extrabold tracking-tight metric-value" data-count="5">0</div>
1177
- <div class="text-sm text-slate-400 mt-1">Core agents in audit loop</div>
1178
  </div>
1179
- </div>
1180
- <div class="card glass" data-aos="fade-up" data-aos-delay="240">
1181
- <div class="card-body text-center">
1182
- <div class="text-4xl font-extrabold tracking-tight metric-value" data-count="2">0</div>
1183
- <div class="text-sm text-slate-400 mt-1">End-to-end benchmark families</div>
1184
  </div>
1185
  </div>
1186
  </div>
1187
 
1188
- <div class="mt-8 grid lg:grid-cols-2 gap-6">
1189
- <div class="card glass" data-aos="fade-up">
1190
- <div class="card-body">
1191
- <h3 class="text-xl font-extrabold tracking-tight">What’s measured</h3>
1192
- <div class="mt-4 space-y-3">
1193
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1194
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="search" class="icon"></i> Retrieval quality</div>
1195
- <p class="text-sm text-slate-300 mt-2">How well the evidence bundle matches the clinical information need under budget.</p>
1196
- </div>
1197
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1198
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="book-open" class="icon"></i> Guideline coverage</div>
1199
- <p class="text-sm text-slate-300 mt-2">Presence of actionable recommendations + contraindications, not just generic background.</p>
1200
- </div>
1201
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1202
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="check-check" class="icon"></i> Answer faithfulness</div>
1203
- <p class="text-sm text-slate-300 mt-2">Does the final answer stay grounded in retrieved evidence and cite what it used?</p>
1204
- </div>
1205
  </div>
1206
  </div>
1207
- </div>
1208
-
1209
- <div class="card glass" data-aos="fade-up" data-aos-delay="120">
1210
- <div class="card-body">
1211
- <h3 class="text-xl font-extrabold tracking-tight">Runtime guards</h3>
1212
- <p class="mt-2 text-slate-300">
1213
- At inference time, MedSwin prioritises safety and transparency: when evidence is weak or incomplete, it avoids confident recommendations.
1214
  </p>
1215
-
1216
- <div class="mt-4 grid gap-3">
1217
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1218
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="help-circle" class="icon"></i> Clarify vs answer</div>
1219
- <p class="text-sm text-slate-300 mt-2">If sufficiency fails, the system requests missing context or expands retrieval.</p>
1220
- </div>
1221
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1222
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="quote" class="icon"></i> Citation-required output</div>
1223
- <p class="text-sm text-slate-300 mt-2">Answers are paired with evidence references and trace-friendly provenance fields.</p>
1224
- </div>
1225
- <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
1226
- <div class="font-extrabold inline-flex items-center gap-2"><i data-lucide="shield-alert" class="icon"></i> Safety critique stage</div>
1227
- <p class="text-sm text-slate-300 mt-2">Detect missing contraindications and unsafe advice before final response.</p>
1228
- </div>
1229
  </div>
1230
  </div>
1231
  </div>
1232
  </div>
1233
-
1234
  </div>
1235
- </section>
 
1236
 
1237
 
1238
  <!-- Team -->
 
3
  <head>
4
  <meta charset="UTF-8" />
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
+ <title>MedSwin — Project Introduction</title>
7
  <meta name="description" content="MedSwin: evidence-constrained, auditable multi-agent clinical QA with two-stage biomedical retrieval, calibrated reranking, and distilled 7B medical LLM for deployable decision support." />
8
  <link rel="icon" href="assets/logo.svg">
9
 
 
434
  </section>
435
 
436
  <!-- Overview -->
437
+ <div class="container">
438
+ <div class="grid lg:grid-cols-5 gap-8 items-start">
439
+ <!-- Left: narrative -->
440
+ <div class="lg:col-span-3" data-aos="fade-up">
441
+ <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">Overview</h2>
442
+ <p class="mt-4 text-slate-300 leading-relaxed">
443
+ MedSwin frames clinical QA as an <span class="font-semibold text-slate-100">evidence-constrained decision pipeline</span>.
444
+ Every answer is gated by evidence sufficiency, bounded by a strict context budget, and accompanied by a
445
+ replayable trace suitable for audit and safety review.
446
+ </p>
447
+
448
+ <div class="mt-6 grid sm:grid-cols-3 gap-4">
449
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
450
+ <div class="font-extrabold inline-flex items-center gap-2">
451
+ <i data-lucide="message-square" class="icon"></i> Answer
452
+ </div>
453
+ <p class="text-sm text-slate-300 mt-2">
454
+ Clinically phrased, uncertainty-aware output generated only when evidence gates are satisfied.
455
  </p>
456
  </div>
457
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
458
+ <div class="font-extrabold inline-flex items-center gap-2">
459
+ <i data-lucide="files" class="icon"></i> Evidence bundle
460
+ </div>
461
+ <p class="text-sm text-slate-300 mt-2">
462
+ Compact EMR + guideline passages selected under token and diversity constraints.
463
  </p>
464
  </div>
465
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
466
+ <div class="font-extrabold inline-flex items-center gap-2">
467
+ <i data-lucide="route" class="icon"></i> Trace
468
+ </div>
469
+ <p class="text-sm text-slate-300 mt-2">
470
+ Structured artifact log: retrieval, ranking, policies, safety checks.
471
  </p>
472
  </div>
473
  </div>
474
  </div>
475
+
476
+ <!-- Right: compact system summary -->
477
+ <aside class="lg:col-span-2 rounded-3xl border border-white/10 bg-slate-950/40 p-6" data-aos="fade-up">
478
+ <div class="font-extrabold text-lg tracking-tight mb-3">Why MedSwin is different</div>
479
+ <ul class="space-y-3 text-sm text-slate-300">
480
+ <li class="flex gap-2"><i data-lucide="check" class="icon text-emerald-300"></i> Refuses to answer when evidence is insufficient</li>
481
+ <li class="flex gap-2"><i data-lucide="check" class="icon text-emerald-300"></i> Explicit EMR + CPG coverage requirements</li>
482
+ <li class="flex gap-2"><i data-lucide="check" class="icon text-emerald-300"></i> Deterministic retrieval policies (no silent guessing)</li>
483
+ <li class="flex gap-2"><i data-lucide="check" class="icon text-emerald-300"></i> Local-deployable, auditable by design</li>
484
+ </ul>
485
+ </aside>
486
  </div>
487
+ </div>
488
 
489
  <!-- Contributions -->
490
  <section id="contributions" class="section">
 
863
  </section>
864
 
865
  <!-- Retrieval -->
866
+ <div class="container">
867
+ <div class="grid lg:grid-cols-5 gap-8 items-start">
868
+ <div class="lg:col-span-3" data-aos="fade-up">
869
+ <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">
870
+ Two-Stage Retrieval & Calibrated Reranking
871
+ </h2>
872
+ <p class="mt-4 text-slate-300 leading-relaxed">
873
+ Evidence selection is separated into recall-oriented candidate generation and precision-oriented reranking.
874
+ This avoids early truncation while enabling deterministic, policy-aware inclusion decisions.
875
+ </p>
876
+
877
+ <div class="mt-6 space-y-4">
878
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
879
+ <div class="font-extrabold inline-flex items-center gap-2">
880
+ <i data-lucide="scan-search" class="icon"></i> Stage 1 — Candidate generation
881
  </div>
882
+ <p class="text-sm text-slate-300 mt-2">
883
+ Dense ANN retrieval is unioned with BM25 to preserve rare clinical terms, abbreviations, and lab-specific phrasing.
884
  </p>
885
  </div>
886
+
887
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
888
+ <div class="font-extrabold inline-flex items-center gap-2">
889
+ <i data-lucide="badge-check" class="icon"></i> Stage 2 — Long-context reranking
890
  </div>
891
+ <p class="text-sm text-slate-300 mt-2">
892
+ A biomedical LLM reranker scores each passage and outputs calibrated probabilities usable as policy thresholds.
893
+ </p>
894
  </div>
895
+
896
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
897
+ <div class="font-extrabold inline-flex items-center gap-2">
898
+ <i data-lucide="sliders" class="icon"></i> Policy-aware selection
899
  </div>
900
+ <p class="text-sm text-slate-300 mt-2">
901
+ Final selection enforces EMR + guideline sufficiency, diversity (MMR-style), and a strict token budget.
902
+ </p>
903
  </div>
904
  </div>
905
  </div>
906
+
907
+ <!-- Right: sufficiency gate -->
908
+ <aside class="lg:col-span-2 rounded-3xl border border-white/10 bg-slate-950/40 p-6" data-aos="fade-up">
909
+ <div class="font-extrabold text-lg tracking-tight mb-3">Evidence acceptance gate</div>
910
+ <ul class="space-y-3 text-sm text-slate-300">
911
+ <li class="flex gap-2"><i data-lucide="book-open-check" class="icon"></i> Required guideline recommendations present</li>
912
+ <li class="flex gap-2"><i data-lucide="file-heart" class="icon"></i> Patient-specific EMR signals included</li>
913
+ <li class="flex gap-2"><i data-lucide="layers" class="icon"></i> Redundancy reduced under budget</li>
914
+ <li class="flex gap-2"><i data-lucide="shield-alert" class="icon"></i> Safety critic approves synthesis</li>
915
+ </ul>
916
+ </aside>
917
  </div>
918
+ </div>
919
+
920
+ <!-- Training -->
921
+ <div class="container">
922
+ <div class="grid lg:grid-cols-5 gap-8 items-start">
923
+ <div class="lg:col-span-3" data-aos="fade-up">
924
+ <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">
925
+ Data, Training & Distillation
926
+ </h2>
927
+ <p class="mt-4 text-slate-300 leading-relaxed">
928
+ MedSwin’s deployable 7B model is trained for reliability rather than raw scale,
929
+ combining large-scale augmentation, supervised fine-tuning, and knowledge distillation.
930
+ </p>
931
+
932
+ <div class="mt-6 space-y-4">
933
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
934
+ <div class="font-extrabold">A · Data augmentation</div>
935
+ <p class="text-sm text-slate-300 mt-2">
936
+ Paraphrasing, formatting variants, deduplication, and medical consistency checks expand coverage without semantic drift.
937
+ </p>
 
938
  </div>
939
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
940
+ <div class="font-extrabold">B · Supervised fine-tuning</div>
941
+ <p class="text-sm text-slate-300 mt-2">
942
+ Aligns the student to clinical instruction style, tone control, and structured answers.
943
+ </p>
944
  </div>
945
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
946
+ <div class="font-extrabold">C · Knowledge distillation</div>
947
+ <p class="text-sm text-slate-300 mt-2">
948
+ Hard labels expand task coverage; soft labels preserve calibration and uncertainty from a larger instructor.
949
+ </p>
950
  </div>
951
  </div>
952
  </div>
953
+
954
+ <!-- Right: why KD -->
955
+ <aside class="lg:col-span-2 rounded-3xl border border-white/10 bg-slate-950/40 p-6" data-aos="fade-up">
956
+ <div class="font-extrabold text-lg tracking-tight mb-3">Why distillation?</div>
957
+ <ul class="space-y-3 text-sm text-slate-300">
958
+ <li class="flex gap-2"><i data-lucide="cpu" class="icon"></i> Enables local inference on modest GPUs</li>
959
+ <li class="flex gap-2"><i data-lucide="shield-check" class="icon"></i> Preserves calibrated reasoning behaviour</li>
960
+ <li class="flex gap-2"><i data-lucide="settings" class="icon"></i> PEFT-friendly (LoRA / QLoRA)</li>
961
+ <li class="flex gap-2"><i data-lucide="lock" class="icon"></i> Institution-controlled deployment</li>
962
+ </ul>
963
+ </aside>
964
+ </div>
965
+ </div>
966
 
967
+ <!-- Evaluation -->
968
+ <div class="container">
969
+ <div class="grid lg:grid-cols-5 gap-8 items-start">
970
+ <div class="lg:col-span-3" data-aos="fade-up">
971
+ <h2 class="text-3xl lg:text-4xl font-extrabold tracking-tight">
972
+ Evaluation & Safety
973
+ </h2>
974
+ <p class="mt-4 text-slate-300 leading-relaxed">
975
+ MedSwin evaluates clinical QA systems beyond answer accuracy, focusing on evidence quality,
976
+ guideline compliance, and runtime safety behaviour.
977
+ </p>
978
+
979
+ <div class="mt-6 grid sm:grid-cols-3 gap-4">
980
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
981
+ <div class="font-extrabold inline-flex items-center gap-2">
982
+ <i data-lucide="search" class="icon"></i> Retrieval quality
983
  </div>
984
+ <p class="text-sm text-slate-300 mt-2">
985
+ Evidence relevance and coverage under a fixed token budget.
986
+ </p>
987
  </div>
988
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
989
+ <div class="font-extrabold inline-flex items-center gap-2">
990
+ <i data-lucide="book-open-check" class="icon"></i> Guideline coverage
991
+ </div>
992
+ <p class="text-sm text-slate-300 mt-2">
993
+ Presence of actionable recommendations and contraindications.
994
  </p>
995
+ </div>
996
+ <div class="rounded-2xl border border-white/10 bg-slate-950/40 p-4">
997
+ <div class="font-extrabold inline-flex items-center gap-2">
998
+ <i data-lucide="check-check" class="icon"></i> Faithfulness
999
  </div>
1000
+ <p class="text-sm text-slate-300 mt-2">
1001
+ Final answers remain grounded in cited evidence only.
1002
+ </p>
1003
  </div>
1004
  </div>
1005
  </div>
1006
+
1007
+ <!-- Right: safety behaviour -->
1008
+ <aside class="lg:col-span-2 rounded-3xl border border-white/10 bg-slate-950/40 p-6" data-aos="fade-up">
1009
+ <div class="font-extrabold text-lg tracking-tight mb-3">Runtime safety behaviour</div>
1010
+ <ul class="space-y-3 text-sm text-slate-300">
1011
+ <li class="flex gap-2"><i data-lucide="help-circle" class="icon"></i> Clarifies when evidence is missing</li>
1012
+ <li class="flex gap-2"><i data-lucide="quote" class="icon"></i> Enforces citation-required answers</li>
1013
+ <li class="flex gap-2"><i data-lucide="shield-alert" class="icon"></i> Safety critic checks contraindications</li>
1014
+ <li class="flex gap-2"><i data-lucide="users" class="icon"></i> Designed for human-in-the-loop use</li>
1015
+ </ul>
1016
+ </aside>
1017
  </div>
1018
+ </div>
1019
+
1020
 
1021
 
1022
  <!-- Team -->