🧬 OncoAgent v1.0 — 9B (Tier 1)

QLoRA Fine-tuned LoRA Adapter for Clinical Oncology Triage

AMD Developer Hackathon 2026 · Trained on AMD Instinct™ MI300X · ROCm 7.2

Model Description

OncoAgent v1.0 9B is a LoRA adapter for Qwen/Qwen3.5-9B, fine-tuned with QLoRA and specialized for clinical oncology triage and treatment recommendation.

This is the Tier 1 (fast triage) model in the OncoAgent multi-agent system, optimized for:

  • Rapid cancer type classification and routing
  • Clinical entity extraction (symptoms, staging, biomarkers)
  • First-pass treatment recommendations based on NCCN/ESMO guidelines

Training Details

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Method | QLoRA (4-bit NormalFloat4) |
| Framework | Unsloth + PEFT + TRL |
| Hardware | AMD Instinct™ MI300X (192 GB HBM3) |
| Software | ROCm 7.2 · PyTorch 2.3+ |
| LoRA Rank | 32 |
| LoRA Alpha | 32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Samples | 240,168 (+ 26,686 eval) |
| Max Sequence Length | 2,048 tokens |
| Batch Size | 8 (gradient accumulation: 2 → effective: 16) |
| Learning Rate | 2e-4 (cosine schedule) |
| Epochs | 1 |
| Precision | BF16 (native MI300X) |
| Seed | 42 (reproducible) |
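
The values in the table relate to one another by simple arithmetic. The sketch below recomputes the effective batch size, the LoRA scaling factor (alpha / rank, the usual LoRA convention), the train/eval split, and an approximate optimizer step count for the single epoch. It assumes one device, no sequence packing, and that the final partial batch is kept; none of these is stated in the table, so the step count is an estimate rather than a logged figure.

```python
import math

# Values copied from the training table; key names are illustrative.
hparams = {
    "r": 32,                           # LoRA rank
    "lora_alpha": 32,                  # LoRA alpha
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "train_samples": 240_168,
    "eval_samples": 26_686,
    "epochs": 1,
}

# Effective batch size = per-device batch x gradient accumulation steps
# (assumption: a single device, so no data-parallel multiplier).
effective_batch = (
    hparams["per_device_batch_size"] * hparams["gradient_accumulation_steps"]
)

# Standard LoRA scaling applied to the low-rank update: alpha / r.
lora_scaling = hparams["lora_alpha"] / hparams["r"]

# Optimizer steps for one epoch, keeping the last partial batch (assumption).
steps_per_epoch = math.ceil(hparams["train_samples"] / effective_batch)
total_steps = steps_per_epoch * hparams["epochs"]

# Train + eval should add up to the dataset size implied by its "266K" name.
total_samples = hparams["train_samples"] + hparams["eval_samples"]
eval_fraction = hparams["eval_samples"] / total_samples

print(f"effective batch size: {effective_batch}")
print(f"LoRA scaling (alpha/r): {lora_scaling}")
print(f"estimated optimizer steps: {total_steps}")
print(f"total samples: {total_samples} (eval share {eval_fraction:.1%})")
```

With rank and alpha both at 32 the scaling factor is 1.0, so the adapter update is applied unscaled.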

Dataset

Trained on MaximoLopezChenlo/OncoAgent-Clinical-266K, a curated oncology dataset combining:

  • PMC-Patients — Real clinical case presentations
  • PubMedQA — Evidence-based medical Q&A
  • OncoCoT — Chain-of-thought oncology reasoning (synthetic)
  • NCCN/ESMO Guidelines — Structured guideline extracts

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "MaximoLopezChenlo/OncoAgent-v1.0-9B",
)

# Inference
messages = [
    {"role": "system", "content": "You are a clinical oncology specialist."},
    {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
]
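inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn so the model answers
    return_dict=True,
    return_tensors="pt",
).to(model.device)  # device_map="auto" places the model; inputs must follow it
outputs = model.generate(**inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))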

vLLM Deployment (AMD MI300X)

# Serve with vLLM on ROCm
# --max-lora-rank must be at least the adapter's rank (32)
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-9B \
    --enable-lora \
    --lora-modules oncoagent=MaximoLopezChenlo/OncoAgent-v1.0-9B \
    --max-lora-rank 32 \
    --dtype bfloat16 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.45
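
Once the server is running, the adapter is selected by passing its registered name (`oncoagent`, from `--lora-modules`) as the `model` field of an OpenAI-style chat request. The following is a minimal client sketch using only the standard library. It assumes the server listens on vLLM's default `http://localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of this repository.

```python
import json
import urllib.request

# Assumption: vLLM's default host and port; adjust if the server was started with --host/--port.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(case_text, max_tokens=1024):
    """Build an OpenAI-style chat completion payload that targets the LoRA adapter."""
    return {
        # "oncoagent" is the adapter name registered via --lora-modules;
        # using "Qwen/Qwen3.5-9B" here would query the base model instead.
        "model": "oncoagent",
        "messages": [
            {"role": "system", "content": "You are a clinical oncology specialist."},
            {"role": "user", "content": case_text},
        ],
        "max_tokens": max_tokens,
    }


def send_chat_request(payload, base_url=BASE_URL):
    """POST the payload to the chat completions endpoint and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat_request(build_chat_request(case_text))` with a case description returns the assistant message as a string; it requires the server above to be reachable.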

Architecture

OncoAgent v1.0 9B serves as the Tier 1 model in a dual-tier architecture:

Clinical Case → Router → [Tier 1: 9B] → Specialist → Critic → Output
                    ↓
              (Complex cases)
                    ↓
              [Tier 2: 27B] → Specialist → Critic → Output
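
The flow in the diagram can be sketched as a small pipeline of callables. This is a hypothetical illustration only: the card does not describe how the router decides that a case is complex, so the `is_complex` flag, the tier keys, and the function names below are all assumptions standing in for the real components.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Case:
    text: str
    # Hypothetical flag: the real router's complexity criteria are not documented here.
    is_complex: bool = False


def route(case: Case) -> str:
    """Send complex cases to Tier 2 (27B) and everything else to Tier 1 (9B)."""
    return "tier2-27b" if case.is_complex else "tier1-9b"


def run_pipeline(
    case: Case,
    models: Dict[str, Callable[[str], str]],
    specialist: Callable[[str], str],
    critic: Callable[[str], str],
) -> str:
    """Router -> tier model -> Specialist -> Critic -> output, as in the diagram."""
    tier = route(case)
    draft = models[tier](case.text)
    refined = specialist(draft)
    return critic(refined)


# Stub components so the sketch runs without any model loaded.
stub_models = {
    "tier1-9b": lambda text: f"T1:{text}",
    "tier2-27b": lambda text: f"T2:{text}",
}
stub_specialist = lambda draft: f"{draft}|specialist"
stub_critic = lambda draft: f"{draft}|critic"

print(run_pipeline(Case("routine case"), stub_models, stub_specialist, stub_critic))
print(run_pipeline(Case("hard case", is_complex=True), stub_models, stub_specialist, stub_critic))
```

Keeping the tier models behind a plain mapping means either tier can be swapped for a local PEFT model or a served endpoint without changing the pipeline.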

Citation

@misc{oncoagent2026,
  title={OncoAgent: Multi-Agent Oncology Triage System},
  author={Lopez Chenlo, Maximo},
  year={2026},
  howpublished={AMD Developer Hackathon 2026},
  url={https://github.com/maximolopezchenlo-lab/OncoAgent}
}

License

Apache 2.0 — This adapter is for research and educational purposes only. Not intended for direct clinical use without professional medical oversight.
