Update README.md
README.md
CHANGED
@@ -1,10 +1,225 @@
---
title: README
emoji: 🩺
colorFrom: green
colorTo: gray
sdk: static
pinned: false
---

# COS30018 – Multi-Agent Clinical Reasoning System
_Swinburne University of Technology_ | _Unit: COS30018 Intelligent Systems_
**Team:** Liam · Henry · Hai · Dylan · Vinh

> 🧠🩺 A safety-first, agentic medical assistant that coordinates domain-specialist agents (Diagnostics, Pharmacology, Triage) via a reasoning orchestrator, with retrieval-augmented generation and rigorous evaluation.

---

## Table of Contents
- [Overview](#overview)
- [Core Capabilities](#core-capabilities)
- [Architecture](#architecture)
- [Data, Training & Reproducibility](#data-training--reproducibility)
- [Evaluation](#evaluation)
- [Safety, Ethics & Compliance](#safety-ethics--compliance)
- [Tech Stack](#tech-stack)
- [Repository Structure](#repository-structure)
- [Roadmap](#roadmap)
- [Team](#team)
- [Academic Context](#academic-context)
- [Disclaimer](#disclaimer)
- [Citation](#citation)
- [License](#license)

---

## Overview
This project builds a **multi-agent system** for clinical decision support. A central **reasoning orchestrator** (MCP) routes user queries to **domain-specialist agents** (Diagnostics, Pharmacology, and Triage), then fuses their outputs with evidence from an **Agentic RAG** pipeline over EMR/EHR data and **PubMed** literature. The system prioritizes **accuracy, traceability, and safety**, backed by strict evaluation and retrieval safety rails.

---

## Core Capabilities
- **Specialist Agents**
  - **Diagnostics:** differential reasoning, red-flag detection, uncertainty disclosure.
  - **Pharmacology:** drug–drug and drug–condition interactions, dosing ranges, contraindications.
  - **Triage:** urgency stratification, disposition options, escalation triggers.

- **Reasoning Orchestrator (MCP)**
  - Tool-aware planning, routing, and self-critique with **self-consistency**.
  - Chain-of-thought stays hidden; the user sees concise, cited rationales.

- **Agentic RAG (Node & Graph RAG)**
  - Retrieval over **EMR/EHR** + **PubMed** with verifiable **citations**.
  - **Safety rails**: source whitelisting, section-aware chunking, query rewriting.
  - **Real-time updates** to reflect the latest clinical literature.

- **Modeling & Optimization**
  - **500k+ curated & synthetic cases** for domain adaptation.
  - **Knowledge Distillation**, **LoRA/QLoRA** with **GRPO** (reasoning-oriented).
  - Data augmentation: **QAC paraphrasing/chunking**, **self-consistency**, **counterfactuals**, **back-translation**.
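
The self-consistency idea mentioned above can be illustrated with a small voting sketch: sample several candidate answers, keep the most frequent one, and report the vote share as a rough agreement score. This is only an illustration under stated assumptions. `self_consistent_answer` and `fake_sampler` are hypothetical names, and the canned outputs stand in for stochastic model calls; none of this is taken from the project's code.

```python
from collections import Counter
from typing import Callable, List, Tuple


def self_consistent_answer(
    sample_fn: Callable[[str], str], question: str, n_samples: int = 5
) -> Tuple[str, float]:
    """Sample several answers and return the most common one with its vote share."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Normalise case and whitespace so trivially different strings count as one vote.
    answers: List[str] = [sample_fn(question).strip().lower() for _ in range(n_samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples


# Hypothetical stand-in for a stochastic model call: it replays canned outputs.
_canned = iter(["Sepsis", "sepsis", "Pneumonia", "sepsis ", "Sepsis"])


def fake_sampler(question: str) -> str:
    return next(_canned)


answer, agreement = self_consistent_answer(fake_sampler, "Most likely diagnosis?", 5)
print(answer, agreement)
```

A low agreement score is one possible trigger for the uncertainty disclosure listed under the Diagnostics agent, although the threshold would need to be chosen empirically.
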

---

## Architecture
```mermaid
flowchart LR
  U[Clinician / User] -->|case, symptoms, meds| ORCH(MCP Orchestrator)
  ORCH -->|route| AG1[Diagnostics Agent]
  ORCH -->|route| AG2[Pharmacology Agent]
  ORCH -->|route| AG3[Triage Agent]

  subgraph RAG[Agentic RAG]
    Q[Query Router] --> RET["Retriever (Node & Graph)"]
    RET --> KB1[(EMR/EHR)]
    RET --> KB2[(PubMed)]
    RET --> SR[Safety Rails: filters, provenance, sectioning]
  end

  AG1 --> RAG
  AG2 --> RAG
  AG3 --> RAG
  RAG --> AG1
  RAG --> AG2
  RAG --> AG3

  AG1 --> FUSE[Evidence Fusion & Self-Consistent Reasoning]
  AG2 --> FUSE
  AG3 --> FUSE
  FUSE --> OUT[Final Report: summary, citations, cautions]

  subgraph EVAL[Evaluation & QA]
    M1[MedMCQA]
    M2[PubMedQA]
    SIM["Semantic Similarity Audits (biomedical embeddings)"]
    LR[Early-Stopping + LR Scheduling]
  end
  OUT -. logged .-> EVAL
```
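
The flow in the diagram (route, consult specialist agents, retrieve evidence, fuse into a report) can be sketched in a few lines. This is a toy illustration, not the project's orchestrator: the keyword routing rule, the `retrieve` stub, the `AgentResult` type, and every function name are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentResult:
    agent: str
    finding: str
    citations: List[str] = field(default_factory=list)


def retrieve(query: str) -> List[str]:
    """Stand-in for the Agentic RAG call; returns a placeholder citation id."""
    return [f"placeholder-source-for:{query}"]


def diagnostics_agent(case: str) -> AgentResult:
    return AgentResult("diagnostics", f"differential for: {case}", retrieve(case))


def pharmacology_agent(case: str) -> AgentResult:
    return AgentResult("pharmacology", f"interaction check for: {case}", retrieve(case))


def triage_agent(case: str) -> AgentResult:
    return AgentResult("triage", f"urgency estimate for: {case}", retrieve(case))


AGENTS: Dict[str, Callable[[str], AgentResult]] = {
    "diagnostics": diagnostics_agent,
    "pharmacology": pharmacology_agent,
    "triage": triage_agent,
}

# Toy routing rule (assumption): keyword match, falling back to every agent.
ROUTE_KEYWORDS = {
    "diagnostics": ("symptom", "pain", "fever"),
    "pharmacology": ("dose", "drug", "medication"),
    "triage": ("urgent", "emergency", "severe"),
}


def route(case: str) -> List[str]:
    text = case.lower()
    hits = [name for name, words in ROUTE_KEYWORDS.items() if any(w in text for w in words)]
    return hits or list(AGENTS)


def fuse(results: List[AgentResult]) -> Dict[str, object]:
    """Merge agent findings into one report with de-duplicated citations."""
    citations = sorted({c for r in results for c in r.citations})
    return {
        "summary": [f"{r.agent}: {r.finding}" for r in results],
        "citations": citations,
        "cautions": ["Decision support only; requires clinician review."],
    }


def run(case: str) -> Dict[str, object]:
    return fuse([AGENTS[name](case) for name in route(case)])


report = run("Severe chest pain, currently on medication X")
print(report["summary"])
```

A real router would presumably be model-driven rather than keyword-based; the point of the sketch is only the shape of the hand-off between routing, agents, and fusion.
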

---

## Data, Training & Reproducibility

* **Corpora:** 500k+ curated & synthetic clinical cases spanning multiple specialties.
* **Distillation & Fine-Tuning:** Teacher–student **Knowledge Distillation**; **LoRA/QLoRA** adapters for efficient specialization; **GRPO** to bias the model toward faithful, stepwise reasoning.
* **Augmentation:** QAC paraphrasing & chunking, self-consistency sampling, **counterfactual** case generation, and **back-translation** for robustness.
* **Reproducible Training (HPC):**

  * Deterministic seeds, pinned package versions, mixed-precision logs.
  * Checkpointing & artifact tracking; early stopping with learning-rate scheduling.
  * Config-driven runs (`configs/…`) and run sheets for auditability.
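
To make the reproducibility items concrete, here is a minimal, framework-free sketch of seeded runs plus early stopping with a reduce-on-plateau learning-rate schedule. The function names, default hyperparameters, and loss values are all invented for illustration; the project's actual training loop and configs are not shown in this README.

```python
import random
from typing import List, Tuple


def simulated_losses(seed: int, epochs: int = 5) -> List[float]:
    """Toy loss curve; a fixed seed makes the 'run' repeatable."""
    rng = random.Random(seed)
    return [round(1.0 / epoch + rng.uniform(0.0, 0.05), 4) for epoch in range(1, epochs + 1)]


def train_with_early_stopping(
    val_losses: List[float],
    lr: float = 1e-3,
    patience: int = 3,
    lr_patience: int = 1,
    lr_factor: float = 0.5,
) -> Tuple[int, float, float]:
    """Replay per-epoch validation losses; return (stop_epoch, best_loss, final_lr)."""
    best = float("inf")
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale > lr_patience:
                lr *= lr_factor  # reduce-on-plateau style schedule
            if stale >= patience:
                return epoch, best, lr  # stop early
    return len(val_losses), best, lr


# Same seed, same curve: the property that deterministic seeding is meant to give.
same_curve = simulated_losses(30018) == simulated_losses(30018)

losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.9]
stop_epoch, best_loss, final_lr = train_with_early_stopping(losses)
print(same_curve, stop_epoch, best_loss, final_lr)
```

In a real setup the seed, patience, and schedule would live in a config file rather than in code, which is what makes a run auditable after the fact.
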

---

## Evaluation

* **Benchmarks:** **MedMCQA**, **PubMedQA** for domain generalization and reading comprehension.
* **Semantic Audits:** Biomedical-embedding similarity checks against gold rationales and evidence.
* **Runtime Guards:** Refusal policies for out-of-scope requests; uncertainty flags when evidence is weak.
* **Reporting:** Per-agent precision/recall, citation coverage, hallucination rate, decision time.

> *Goal: measurably reduce hallucinations while preserving coverage and answerability, with transparent citations and cautions.*
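
The reporting metrics above are not defined in this README, so the sketch below uses plausible working definitions, stated here as assumptions: citation coverage is the share of claims with at least one citation, and hallucination rate is the share of claims an annotator marked as unsupported. The claim records and label sets are made up for the example.

```python
from typing import Dict, List, Set


def precision_recall(predicted: Set[str], gold: Set[str]) -> Dict[str, float]:
    """Set-based precision and recall for one agent's predicted labels."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return {"precision": precision, "recall": recall}


def citation_coverage(claims: List[dict]) -> float:
    """Share of claims that carry at least one citation (assumed definition)."""
    if not claims:
        return 0.0
    return sum(1 for c in claims if c["citations"]) / len(claims)


def hallucination_rate(claims: List[dict]) -> float:
    """Share of claims marked unsupported by the evidence (assumed definition)."""
    if not claims:
        return 0.0
    return sum(1 for c in claims if not c["supported"]) / len(claims)


# Made-up annotated claims for illustration only.
claims = [
    {"text": "claim 1", "citations": ["src-a"], "supported": True},
    {"text": "claim 2", "citations": ["src-b"], "supported": True},
    {"text": "claim 3", "citations": [], "supported": False},
    {"text": "claim 4", "citations": ["src-c"], "supported": True},
]

scores = precision_recall({"a", "b", "c"}, {"a", "b", "d", "e"})
coverage = citation_coverage(claims)
halluc = hallucination_rate(claims)
print(scores, coverage, halluc)
```
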

---

## Safety, Ethics & Compliance

* **Not a medical device**; educational/research use only (see Disclaimer).
* **Retrieval Safety Rails:** source whitelisting, date scopes, section filters, and citation requirements.
* **Privacy:** de-identification pipelines for EMR/EHR; least-privilege access patterns.
* **Compliance Mindset:** aligns with **Australian Privacy Principles (APPs)** and general healthcare data-handling norms.
* **Human-in-the-Loop:** outputs framed as **clinical decision support**, not directives.
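
The retrieval safety rails listed above (source whitelisting, date scopes, section filters) amount to a filter over retrieved passages. The sketch below shows one way such a filter could look. The `Passage` type, the internal EMR host name, the placeholder URLs, and the cutoff date are all assumptions for illustration, not the project's configuration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set
from urllib.parse import urlparse

# Example whitelist (assumption): the public PubMed host plus a made-up internal EMR host.
ALLOWED_HOSTS: Set[str] = {"pubmed.ncbi.nlm.nih.gov", "emr.example.internal"}


@dataclass(frozen=True)
class Passage:
    text: str
    source_url: str  # doubles as the citation; passages without a valid URL are rejected
    published: date
    section: str


def passes_rails(p: Passage, earliest: date, allowed_sections: Set[str]) -> bool:
    """Apply whitelist, date-scope, and section filters to one passage."""
    host = urlparse(p.source_url).hostname or ""
    return host in ALLOWED_HOSTS and p.published >= earliest and p.section in allowed_sections


def filter_passages(
    passages: List[Passage], earliest: date, allowed_sections: Set[str]
) -> List[Passage]:
    return [p for p in passages if passes_rails(p, earliest, allowed_sections)]


# Placeholder passages; the URLs and article ids are not real records.
candidates = [
    Passage("Recent finding", "https://pubmed.ncbi.nlm.nih.gov/00000000/", date(2023, 5, 1), "results"),
    Passage("Outdated finding", "https://pubmed.ncbi.nlm.nih.gov/00000001/", date(2009, 1, 1), "results"),
    Passage("Blog post", "https://health-blog.example.com/post", date(2024, 1, 1), "results"),
    Passage("Thanks to funders", "https://pubmed.ncbi.nlm.nih.gov/00000002/", date(2024, 1, 1), "acknowledgements"),
]

kept = filter_passages(candidates, earliest=date(2015, 1, 1), allowed_sections={"results", "methods"})
print([p.text for p in kept])
```

Filtering before generation, rather than after, means a rejected source can never be quoted in the final report.
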

---

## Tech Stack

* **Agent Framework:** MCP-style orchestrator for tool/agent routing.
* **RAG:** Node & Graph RAG; biomedical embeddings; citation enforcement.
* **LLM Adaptation:** LoRA/QLoRA, KD, GRPO; augmentation toolchain.
* **Infra:** HPC training; experiment tracking; containerized services (dev/prod).
* **Eval:** MedMCQA, PubMedQA harnesses; semantic similarity metrics.
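
As a rough picture of what a semantic similarity metric involves, the sketch below compares an answer embedding with a gold-rationale embedding using cosine similarity and a pass threshold. The vectors are tiny hand-written examples and the 0.8 threshold is arbitrary; a real audit would use embeddings from a biomedical encoder, which this README does not name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def passes_audit(answer_vec: Sequence[float], gold_vec: Sequence[float], threshold: float = 0.8) -> bool:
    """Flag whether an answer embedding is close enough to the gold embedding."""
    return cosine_similarity(answer_vec, gold_vec) >= threshold


same = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same, orthogonal, passes_audit([1.0, 1.0], [1.0, 0.0]))
```
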

---

## Repository Structure

```
├─ apps/
│  ├─ orchestrator/          # MCP planner/router, self-consistency, fusion
│  ├─ agent_diagnostics/     # differential logic, red flags, uncertainty
│  ├─ agent_pharmacology/    # DDIs, dosing, contraindications
│  └─ agent_triage/          # urgency/disposition policy
├─ rag/
│  ├─ pipelines/             # Node & Graph RAG, safety rails
│  ├─ build_index.py         # EMR/EHR + PubMed indexing
│  └─ server.py              # retrieval API with citation payloads
├─ training/
│  ├─ datasets/              # loaders for curated/synthetic cases
│  ├─ augmentation/          # QAC, counterfactuals, back-translation
│  ├─ finetune/              # KD, LoRA/QLoRA, GRPO loops
│  └─ configs/               # seeds, LR schedules, early-stopping
├─ eval/
│  ├─ medmcqa/               # benchmark harness
│  ├─ pubmedqa/              # benchmark harness
│  └─ semantics/             # embedding audits, hallucination metrics
├─ docs/                     # design notes, risk register, policies
├─ scripts/                  # utilities, data prep, CI hooks
├─ requirements.txt
├─ .env.example
└─ LICENSE
```

---

## Roadmap

* [ ] Expand specialty coverage (cardiology, oncology, paediatrics).
* [ ] Add guideline-aware retrieval (e.g., section targeting for dosing tables).
* [ ] Continual-learning loop with clinician feedback and calibration tracking.
* [ ] UI for evidence graphs and counterfactual "what-if" explorations.
* [ ] Stress tests for adversarial prompts and data drift.

---

## Team

**Swinburne COS30018 – Group Members**
Liam · Henry · Hai · Dylan · Vinh

---

## Academic Context

This project demonstrates COS30018 learning outcomes in **intelligent systems** by integrating:

* **Agent architectures** (specialist agents + central planner),
* **Search & knowledge representation** (graph-structured RAG),
* **Machine learning** (distillation, parameter-efficient fine-tuning),
* **Evaluation & ethics** (benchmarking, safety rails, HIL oversight).

---

## Disclaimer

This system is **for research and educational purposes only**. It **does not provide medical advice** and must **not** be used to diagnose, treat, or manage real patients. Always consult qualified healthcare professionals.

---

## Citation

If you reference this work in academic contexts:

```
Swinburne University of Technology, COS30018 Team (2025).
Multi-Agent Clinical Reasoning System with Agentic RAG and Safety Rails.
https://huggingface.co/MedAI-COS30018
```

---

## License
Apache-2.0.