🧬 OncoAgent v1.0 — 9B (Tier 1)
QLoRA Fine-tuned LoRA Adapter for Clinical Oncology Triage
AMD Developer Hackathon 2026 · Trained on AMD Instinct™ MI300X · ROCm 7.2
Model Description
OncoAgent v1.0 9B is a LoRA adapter for Qwen/Qwen3.5-9B, fine-tuned with QLoRA and specialized for clinical oncology triage and treatment recommendation.
This is the Tier 1 (fast triage) model in the OncoAgent multi-agent system, optimized for:
- Rapid cancer type classification and routing
- Clinical entity extraction (symptoms, staging, biomarkers)
- First-pass treatment recommendations based on NCCN/ESMO guidelines
Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Method | QLoRA (4-bit NormalFloat4) |
| Framework | Unsloth + PEFT + TRL |
| Hardware | AMD Instinct™ MI300X (192GB HBM3) |
| Software | ROCm 7.2 · PyTorch 2.3+ |
| LoRA Rank | 32 |
| LoRA Alpha | 32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Samples | 240,168 (+ 26,686 eval) |
| Max Sequence Length | 2,048 tokens |
| Batch Size | 8 (gradient accumulation: 2 → effective: 16) |
| Learning Rate | 2e-4 (cosine schedule) |
| Epochs | 1 |
| Precision | BF16 (native MI300X) |
| Seed | 42 (reproducible) |
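The batch-size, sample-count, and LoRA rows above imply a few derived quantities. The following is a minimal sketch of that arithmetic using only figures from the table. It assumes the final partial batch is kept and that standard (non rank-stabilized) LoRA scaling is used, neither of which the card states.

```python
import math

# Values copied from the training table above.
train_samples = 240_168
eval_samples = 26_686
batch_size = 8
grad_accum_steps = 2
lora_rank = 32
lora_alpha = 32

# Effective batch size = batch size x gradient accumulation steps.
effective_batch = batch_size * grad_accum_steps

# Optimizer steps for the single epoch (assumption: last partial batch is kept).
steps_per_epoch = math.ceil(train_samples / effective_batch)

# Standard LoRA scaling applied to the low-rank update: alpha / r.
lora_scaling = lora_alpha / lora_rank

# Share of the combined dataset held out for evaluation.
eval_fraction = eval_samples / (train_samples + eval_samples)

print(effective_batch, steps_per_epoch, lora_scaling, round(eval_fraction, 3))
```

With rank and alpha both set to 32, the scaling factor is 1.0, so the adapter update is added to the base weights without extra amplification or damping.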
Dataset
Trained on MaximoLopezChenlo/OncoAgent-Clinical-266K, a curated oncology dataset combining:
- PMC-Patients — Real clinical case presentations
- PubMedQA — Evidence-based medical Q&A
- OncoCoT — Chain-of-thought oncology reasoning (synthetic)
- NCCN/ESMO Guidelines — Structured guideline extracts
Usage
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")

# Load LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(
    base_model,
    "MaximoLopezChenlo/OncoAgent-v1.0-9B",
)

# Inference
messages = [
    {"role": "system", "content": "You are a clinical oncology specialist."},
    {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker so the model answers
    return_dict=True,            # return input_ids together with attention_mask
    return_tensors="pt",
).to(model.device)               # keep inputs on the same device as the model
outputs = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens, skipping the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
vLLM Deployment (AMD MI300X)
# Serve with vLLM on ROCm
# --max-lora-rank must cover the adapter's rank (32); vLLM's default limit is 16
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-9B \
    --enable-lora \
    --lora-modules oncoagent=MaximoLopezChenlo/OncoAgent-v1.0-9B \
    --max-lora-rank 32 \
    --dtype bfloat16 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.45
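Once the server is running, the adapter can be queried through vLLM's OpenAI-compatible chat completions route by passing the LoRA name as the `model` field. The sketch below uses only the standard library. It assumes the server listens on vLLM's default port 8000 on localhost, and the helper names `build_chat_request` and `send` are illustrative, not part of OncoAgent.

```python
import json
import urllib.request

# Assumptions: default vLLM port on localhost, and the adapter name "oncoagent"
# registered through --lora-modules in the serve command above.
BASE_URL = "http://localhost:8000/v1"
ADAPTER_NAME = "oncoagent"


def build_chat_request(user_text, max_tokens=1024):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": ADAPTER_NAME,  # using the LoRA name routes the call through the adapter
        "messages": [
            {"role": "system", "content": "You are a clinical oncology specialist."},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running server and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Building the request needs no network access; sending it does.
request = build_chat_request("55yo female, Grade 1 endometrioid adenocarcinoma...")
print(request.full_url)
```

Calling `send(request)` against a live server returns the generated text. Setting `model` to `Qwen/Qwen3.5-9B` instead would query the base model without the adapter.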
Architecture
OncoAgent v1.0 9B serves as the Tier 1 model in a dual-tier architecture:
Clinical Case → Router → [Tier 1: 9B] → Specialist → Critic → Output
                  ↓
           (Complex cases)
                  ↓
           [Tier 2: 27B] → Specialist → Critic → Output
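The flow in the diagram can be sketched in a few lines of Python. This is an illustrative outline only, not the OncoAgent implementation: the `is_complex` flag, the tier labels, and the placeholder `specialist` and `critic` functions are all hypothetical, and the card does not describe how the router judges complexity.

```python
from dataclasses import dataclass


@dataclass
class ClinicalCase:
    text: str
    is_complex: bool  # hypothetical flag; the real complexity criterion is not documented


def route(case: ClinicalCase) -> str:
    """Router step: send complex cases to Tier 2, everything else to Tier 1."""
    return "tier2-27b" if case.is_complex else "tier1-9b"


def specialist(case: ClinicalCase, tier: str) -> str:
    """Placeholder for the specialist agent backed by the chosen tier's model."""
    return f"[{tier}] draft recommendation for: {case.text}"


def critic(draft: str) -> str:
    """Placeholder for the critic agent that reviews the specialist's draft."""
    return f"{draft} (reviewed)"


def run_pipeline(case: ClinicalCase) -> str:
    """Clinical Case -> Router -> Specialist -> Critic -> Output."""
    tier = route(case)
    return critic(specialist(case, tier))


print(run_pipeline(ClinicalCase("routine case", is_complex=False)))
print(run_pipeline(ClinicalCase("complex case", is_complex=True)))
```

Both tiers share the same Specialist and Critic stages in this sketch. Only the model behind the specialist changes, which mirrors the two parallel rows of the diagram.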
Links
- 🔗 Demo: HF Space
- 🔗 GitHub: maximolopezchenlo-lab/OncoAgent
- 🔗 Tier 2 Model: OncoAgent-v1.0-27B
- 🔗 Dataset: OncoAgent-Clinical-266K
Citation
@misc{oncoagent2026,
title={OncoAgent: Multi-Agent Oncology Triage System},
author={Lopez Chenlo, Maximo},
year={2026},
howpublished={AMD Developer Hackathon 2026},
url={https://github.com/maximolopezchenlo-lab/OncoAgent}
}
License
Apache 2.0 — This adapter is for research and educational purposes only. Not intended for direct clinical use without professional medical oversight.