| --- |
| title: OncoAgent |
| emoji: 𧬠|
| colorFrom: red |
| colorTo: blue |
| sdk: gradio |
| sdk_version: 5.31.0 |
| app_file: app.py |
| pinned: false |
| license: apache-2.0 |
| short_description: Multi-Agent Oncology Triage powered by AMD MI300X |
| --- |
| |
| # 𧬠OncoAgent β Multi-Agent Oncology Triage System |
|
|
|  |
|  |
|  |
|  |
|  |
|
|
| > **AMD Developer Hackathon 2026** Β· Powered by AMD Instinctβ’ MI300X Β· ROCm 7.2 |
|
|
| ## π 100% Open-Source: Democratizing Oncology |
| OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to: |
| - **Guarantee Patient Privacy:** Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital. |
| - **Foster Global Contribution:** Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base. |
|
|
| OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat **unstructured data blindness** in primary care oncology. It leverages a tier-adaptive architecture featuring **Qwen 3.5-9B** (Speed Triage) and **Qwen 3.6-27B** (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop. |
|
|
| --- |
|
|
| ## ποΈ Architecture |
|
|
| ``` |
| ββββββββββ βββββββββββ βββββββββββ ββββββββββββββ ββββββββββββββ βββββββββββ |
| β Router ββββΆβIngestionββββΆβCorrectiveββββΆβ Specialist βββββββ Critic β β Formatterβ |
| β(Triage)β β (PHI) β β RAG β β (Qwen 9B/ β β(Reflexion β β(Output) β |
| ββββββββββ βββββββββββ βββββββββββ β 27B) ββββββΆβ Validation)β βββββββββββ |
| β β β ββββββββββββββ ββββββββββββββ β² |
| β β β β β β |
| βΌ βΌ βΌ βΌ βΌ β |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββ |
| β Fallback Node β β HITL Gate β |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β(Acuity Chk)β |
| ββββββββββββββ |
| ``` |
|
|
| **Key Components:** |
|
|
| | Module | Description | |
| |--------|-------------| |
| | `data_prep/` | Dataset builder: PMC-Patients/OncoCoT β Strict JSONL (Llama 3 chat template) | |
| | `rag_engine/` | The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization. | |
| | `agents/` | The "Reasoning": LangGraph multi-agent orchestration (Router β Corrective RAG β Specialist β Critic β HITL Gate). | |
| | `ui/` | The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output. | |
|
|
| --- |
|
|
| ## π§ Dual-Tier Model Strategy (Qwen) |
|
|
| To maximize the compute capabilities of the **AMD MI300X**, OncoAgent implements a dynamic **Dual-Tier** routing strategy using the Qwen model family. **Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types** (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning: |
|
|
| - **Tier 1: Qwen 3.5-9B (Speed Triage):** A lightweight, extremely fast model used by the `Router` to assess initial complexity, perform simple triage, and handle low-risk queries. |
| - **Tier 2: Qwen 3.6-27B (Deep Reasoning):** The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops. |
|
|
| --- |
|
|
| ## β‘ Hardware Target |
|
|
| - **GPU:** AMD Instinctβ’ MI300X (192GB HBM3) |
| - **Software Stack:** ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention |
| - **Models:** `Qwen/Qwen3.5-9B` (Speed Triage) & `Qwen/Qwen3.6-27B-Instruct` (Deep Reasoning) |
| - **Precision:** QLoRA 4-bit NormalFloat4 via `bitsandbytes` (ROCm compatible) |
|
|
| --- |
|
|
| ## π Quick Start |
|
|
| ```bash |
| # 1. Clone and setup |
| git clone <repo-url> |
| cd OncoAgent |
| |
| # 2. Install dependencies |
| python -m venv .venv |
| source .venv/bin/activate |
| pip install -r requirements.txt |
| |
| # 3. Start Inference Server (vLLM on Docker) |
| # This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention |
| docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \ |
| --model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1 |
| |
| # 4. Configure environment & Run UI |
| cp .env.example .env |
| # Set VLLM_API_BASE=http://localhost:8000/v1 in .env |
| python -m ui.app |
| ``` |
|
|
| --- |
|
|
| ## π Project Structure |
|
|
| ``` |
| βββ docs/ # Documentation & research |
| β βββ research/ # Deep Research analysis documents |
| β βββ ADR/ # Architectural Decision Records |
| β βββ oncoagent_master_directive.md |
| β βββ antigravity_rules.md |
| βββ data_prep/ # Dataset preparation (Fase 0) |
| βββ rag_engine/ # RAG ingestion & retrieval (Fase 0-3) |
| βββ agents/ # LangGraph orchestration (Fase 3) |
| βββ ui/ # Gradio frontend (Fase 4) |
| βββ tests/ # Unit & integration tests |
| βββ scripts/ # Utility scripts |
| βββ logs/ # Paper log & social media log |
| βββ requirements.txt # Pinned dependencies |
| βββ Dockerfile # HF Spaces deployment |
| ``` |
|
|
| --- |
|
|
| ## π©Ί Safety Guarantees |
|
|
| - **Reflexion-based Critic Loop:** A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages. |
| - **Human-In-The-Loop (HITL) Gate:** An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations). |
| - **Corrective RAG:** The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing. |
| - **Zero-PHI:** Regex-based PII redaction before any processing |
| - **Reproducibility:** Fixed seeds (`torch.manual_seed(42)`) across all ML scripts |
|
|
| --- |
|
|
| ## π License |
|
|
| This project was built for the AMD Developer Hackathon 2026. |
|
|
| --- |
|
|
| ## π₯ Team |
|
|
| Built with β€οΈ and AMD Instinct MI300X. |
|
|