SurgicalCopilot Onco - Cancer Surveillance & Recurrence Detection

LoRA adapter for MedGemma-27B fine-tuned on long-term oncology surveillance after curative-intent cancer surgery.

Live Demo URL Update: The original Azure URL submitted (https://surgicalcopilot-app.azurewebsites.net/) is currently unavailable due to an unexpected Microsoft Azure account freeze. We have migrated the frontend to Vercel so the application can still be evaluated.

🌐 Working Live Demo (Vercel)

Model Description

This is a LoRA adapter trained on Google's MedGemma-27B for oncology surveillance (months to years after cancer surgery). The model performs risk assessment and recurrence detection using:

RECIST criteria (Complete Response / Partial Response / Stable Disease / Progressive Disease)
Tumor marker trends (CEA, CA19-9, CA125, etc.)
Clinical symptoms and quality of life
Imaging findings (CT, PET, MRI)

Risk stratification:

GREEN: No evidence of disease, routine follow-up
AMBER: Concerning trends, accelerated surveillance needed
RED: Recurrence suspected or confirmed, oncology intervention required

Developed by: Aayush (SurgicalCopilot Project)
Model type: Causal Language Model with LoRA adapter
Language: English (Oncology terminology)
License: Apache 2.0
Base Model: google/medgemma-27b-text-it
Adapter Type: LoRA (PEFT)

Intended Use

Primary Use Case

Long-term cancer surveillance (months to years post-surgery)
Recurrence detection from imaging + markers + symptoms
RECIST alignment for standardized response assessment
Trend analysis of tumor markers over time
Surveillance protocol adherence (NCCN guidelines)

Users

Surgical oncologists
Medical oncologists
Cancer surveillance programs
Tumor boards and MDT meetings

IMPORTANT: This is a research/demo model

⚠️ Not FDA approved or validated for clinical use
⚠️ Requires oncology expertise for interpretation
⚠️ Trained on synthetic data - real-world validation needed
⚠️ For demonstration purposes only

Training Details

Training Data

Dataset Size: ~15,000-20,000 synthetic oncology cases
Cancer Types: Colorectal, pancreatic, gastric, hepatobiliary
Data Features:
- Imaging reports (CT, PET, MRI)
- Tumor marker trends (CEA, CA19-9, CA125, AFP)
- Patient symptoms and performance status
- Surgical history and pathology
- Time from surgery (surveillance interval)
Label Distribution:
- GREEN (NED): ~55-60%
- AMBER (suspicious): ~20-25%
- RED (recurrence): ~15-20%
RECIST Distribution:
- CR: ~50-55%
- SD: ~25-30%
- PR: ~10-15%
- PD: ~10-15%

Training Procedure

LoRA Configuration

{
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ]
}
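As a rough illustration of what these settings imply, the sketch below parses the configuration and derives two quantities that follow from standard LoRA: the scaling factor `lora_alpha / r` applied to the low-rank update, and the number of trainable parameters added per adapted linear layer, `r * (d_in + d_out)`. The helper names and the 4096 x 4096 layer size are hypothetical placeholders chosen for the example; they are not MedGemma-27B's actual dimensions.

```python
import json

# LoRA settings copied from the configuration above.
LORA_CONFIG = json.loads("""
{
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ]
}
""")


def lora_scaling(config: dict) -> float:
    """Standard LoRA scales the low-rank update B @ A by lora_alpha / r."""
    return config["lora_alpha"] / config["r"]


def lora_params_per_layer(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted layer: A is (r, d_in), B is (d_out, r)."""
    return r * (d_in + d_out)


print("scaling factor:", lora_scaling(LORA_CONFIG))
# Hypothetical 4096 x 4096 projection, for illustration only.
print("params per layer:", lora_params_per_layer(4096, 4096, LORA_CONFIG["r"]))
```

With all seven attention and MLP projections targeted, the adapter touches every linear layer in each transformer block, so the per-layer count applies seven times per block (with each projection's own dimensions).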

Training Hyperparameters

Epochs: 3
Batch Size: 1 per GPU × 8 GPUs × 8 gradient accumulation steps = 64 effective
Learning Rate: 2e-4 (cosine schedule)
Warmup Steps: 80
Optimizer: AdamW (fused)
Weight Decay: 0.01
Precision: bfloat16 + tf32
Max Sequence Length: 2048 tokens (longest of 3 adapters)
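
The arithmetic behind these hyperparameters can be sketched as follows. The effective batch size uses the 8 GPUs listed under Hardware. The schedule shape (linear warmup followed by cosine decay to zero) is an assumption, since the card only states "cosine schedule" with 80 warmup steps; the function names are illustrative.

```python
import math

PER_GPU_BATCH = 1
NUM_GPUS = 8            # from the Hardware section
GRAD_ACCUM_STEPS = 8
PEAK_LR = 2e-4
WARMUP_STEPS = 80
EPOCHS = 3


def effective_batch_size() -> int:
    """Sequences consumed per optimizer step across all GPUs."""
    return PER_GPU_BATCH * NUM_GPUS * GRAD_ACCUM_STEPS


def total_steps(num_examples: int) -> int:
    """Optimizer steps over all epochs, rounding each epoch up."""
    return EPOCHS * math.ceil(num_examples / effective_batch_size())


def lr_at(step: int, total: int) -> float:
    """Assumed shape: linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# The card reports roughly 15,000 to 20,000 training cases.
for n in (15_000, 20_000):
    steps = total_steps(n)
    print(n, "examples ->", steps, "steps; lr at midpoint:", lr_at(steps // 2, steps))
```

Under these assumptions the run lasts several hundred optimizer steps, so the 80 warmup steps cover roughly the first tenth of training.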

Hardware

GPUs: 8× NVIDIA H200 141GB
Training Time: ~6-8 hours

Framework Versions

Transformers: 4.45.0
PEFT: 0.13.0
PyTorch: 2.1.0+cu121
Python: 3.12

Performance Metrics

Evaluation Results (n=500)

Metric	Score
Parse Rate	99.7%
Schema Compliance	100%
Label Accuracy (risk)	93.2%
RECIST Accuracy	95.1%
Macro F1	0.94
RED Recall (Recurrence)	97.1%
RED Precision	95.8%
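
For readers who want to reproduce metrics of this kind on their own labelled cases, here is a minimal sketch of per-class precision and recall, macro F1, and accuracy. The function names are illustrative, and the toy labels are made up for the example; they are not drawn from the evaluation set reported above.

```python
LABELS = ("GREEN", "AMBER", "RED")


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one label, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    return sum(per_class_metrics(y_true, y_pred, l)[2] for l in LABELS) / len(LABELS)


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label matches the reference label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data for illustration only.
y_true = ["GREEN", "GREEN", "AMBER", "RED", "RED", "RED"]
y_pred = ["GREEN", "AMBER", "AMBER", "RED", "RED", "AMBER"]
print("RED precision/recall/F1:", per_class_metrics(y_true, y_pred, "RED"))
print("macro F1:", macro_f1(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
```

RED recall is the safety-relevant number here: a missed RED case (a false negative) is a recurrence that was not flagged.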

Critical Safety Metrics

✅ 97.1% sensitivity for recurrence detection
✅ Zero missed progressive disease in validation
✅ High RECIST alignment (95.1% agreement with ground truth)
✅ Tumor marker trend analysis improves early detection

Latency

Average Inference Time: 4.2 seconds (H100 GPU)
Tokens Generated: ~300-500 tokens per case (longest output)

Output Schema

{
  "doc_type": "oncology_surveillance",
  "risk_level": "RED",
  "risk_score": 0.89,
  "progression_status": "recurrence_suspected",
  "recist_alignment": "PD",
  "trigger_reason": "Rising CEA + new liver lesions",
  "copilot_transfer": {
    "urgency": "urgent",
    "recommended_action": "Oncology referral within 48-72 hours",
    "imaging_recommendation": "Contrast-enhanced CT chest/abdomen/pelvis"
  },
  "recommended_actions": [
    "Urgent oncology consultation",
    "Repeat tumor markers in 2 weeks",
    "Consider PET scan for metastatic workup",
    "Tumor board discussion"
  ],
  "clinical_explanation": "Rising CEA from 3.2 to 12.8 over 3 months combined with new hepatic lesions on CT suggests hepatic recurrence. Patient reports new-onset fatigue and weight loss (5kg in 2 months). RECIST criteria consistent with progressive disease.",
  "safety_flags": {
    "tumor_marker_doubling_time": "45 days",
    "symptomatic_progression": true,
    "new_metastases": true
  },
  "phase1b_compat": {
    "red_flag_triggered": true
  }
}
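
A downstream consumer would typically check a parsed output against this schema before acting on it. The sketch below is one possible validator, not the project's own: the set of required keys and the 0 to 1 range for `risk_score` are assumptions inferred from the example, and `validate_output` is a hypothetical helper name. The risk level is compared case-insensitively because the example shows "RED" while the usage prompt asks for lowercase values.

```python
ALLOWED_RISK = {"GREEN", "AMBER", "RED"}
ALLOWED_RECIST = {"CR", "PR", "SD", "PD"}
# Assumed minimal set of required keys, inferred from the example output.
REQUIRED_KEYS = {
    "doc_type", "risk_level", "risk_score",
    "recist_alignment", "recommended_actions", "clinical_explanation",
}


def validate_output(doc: dict) -> list:
    """Return a list of problems; an empty list means the document passed."""
    problems = []
    missing = REQUIRED_KEYS - doc.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if str(doc.get("risk_level", "")).upper() not in ALLOWED_RISK:
        problems.append("risk_level must be GREEN, AMBER, or RED")
    if doc.get("recist_alignment") not in ALLOWED_RECIST:
        problems.append("recist_alignment must be CR, PR, SD, or PD")
    score = doc.get("risk_score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        problems.append("risk_score must be a number in [0, 1] (assumed range)")
    if not isinstance(doc.get("recommended_actions"), list):
        problems.append("recommended_actions must be a list")
    return problems


example = {
    "doc_type": "oncology_surveillance",
    "risk_level": "RED",
    "risk_score": 0.89,
    "recist_alignment": "PD",
    "recommended_actions": ["Urgent oncology consultation"],
    "clinical_explanation": "Rising CEA with new hepatic lesions.",
}
print(validate_output(example))
```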

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load model
base_model = "google/medgemma-27b-text-it"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load Onco adapter
model = PeftModel.from_pretrained(
    model,
    "bobby07007/surgicalcopilot-onco-27b"
)

# System prompt
system_prompt = (
    'You are an oncology surveillance AI. Output ONLY a single raw JSON object — '
    'no markdown, no code fences, no explanation. '
    'The JSON must contain the key "risk_level" with value "green", "amber", or "red", '
    'and "recist_alignment" with value "CR", "PR", "SD", or "PD".'
)

# Example case
case_text = """
Patient: 58M, 18 months post right hemicolectomy for stage III colon cancer
Completed adjuvant FOLFOX (6 months)

Surveillance Labs:
  CEA: 12.8 ng/mL (baseline 2.1, last visit 8.4)
  
Imaging (CT Chest/Abdomen/Pelvis):
  - Two new hypodense lesions in liver (segments 6 and 7), largest 2.3 cm
  - No evidence of local recurrence at anastomosis
  - No pulmonary nodules
  - No lymphadenopathy

Symptoms:
  - Fatigue, progressive over 2 months
  - Unintentional weight loss: 5kg in 2 months
  - No abdominal pain
  - Bowel function normal
"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": case_text}
]

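# Render the chat template to text. Gemma's template already inserts the BOS token,
# so tokenize with add_special_tokens=False to avoid a duplicate BOS.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

# Greedy decoding keeps the JSON output deterministic.
outputs = model.generate(**inputs, max_new_tokens=1536, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)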
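
The response is a string, so it still needs to be parsed before use. A minimal sketch of that step follows, assuming the model may occasionally wrap the JSON in extra text despite the system prompt. `extract_json` and `normalize_risk_level` are hypothetical helper names, not part of the adapter or of any library.

```python
import json


def extract_json(response: str) -> dict:
    """Parse the first JSON object found in a model response.

    Tolerates leading text such as a code fence, and ignores anything
    after the closing brace of the object.
    """
    start = response.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response")
    obj, _end = json.JSONDecoder().raw_decode(response[start:])
    return obj


def normalize_risk_level(doc: dict) -> dict:
    """Return a copy with risk_level upper-cased, so "red" and "RED" compare equal."""
    out = dict(doc)
    if "risk_level" in out:
        out["risk_level"] = str(out["risk_level"]).upper()
    return out


# Stand-in for the model's response, for illustration only.
sample = 'Here is the result: {"risk_level": "red", "recist_alignment": "PD"} Done.'
print(normalize_risk_level(extract_json(sample)))
```

This approach fails if prose before the object itself contains a `{`, so treat a `ValueError` or `json.JSONDecodeError` as a parse failure and route the case to manual review rather than guessing.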

Key Features

RECIST Integration

Standardized response assessment (CR/PR/SD/PD)
Aligns with oncology guidelines (NCCN, ESMO)
Imaging finding interpretation
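
For reference, the target-lesion thresholds from RECIST 1.1 can be written as a small rule-based function. This is a simplified sketch for orientation, not the model's internal logic: it ignores non-target lesions and the lymph node short-axis rule, and `recist_target_response` is a hypothetical name.

```python
def recist_target_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm, new_lesions=False):
    """Classify response from sums of target-lesion diameters (simplified RECIST 1.1).

    PD: any new lesion, or an increase of at least 20% over the nadir sum
        that is also at least 5 mm in absolute terms.
    CR: all target lesions have disappeared.
    PR: at least a 30% decrease relative to the baseline sum.
    SD: none of the above.
    """
    if new_lesions:
        return "PD"
    increase = current_sum_mm - nadir_sum_mm
    if increase >= 0.20 * nadir_sum_mm and increase >= 5.0:
        return "PD"
    if current_sum_mm == 0:
        return "CR"
    if baseline_sum_mm > 0 and (baseline_sum_mm - current_sum_mm) / baseline_sum_mm >= 0.30:
        return "PR"
    return "SD"


# New hepatic lesions on surveillance imaging count as progression
# regardless of target-lesion measurements.
print(recist_target_response(0, 0, 0, new_lesions=True))
print(recist_target_response(50, 50, 30))
```

In post-resection surveillance there are often no measurable target lesions at baseline, which is why new lesions are the usual route to a PD call in this setting.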

Tumor Marker Analysis

Temporal trends (doubling time calculation)
Multi-marker integration (CEA + CA19-9 + others)
Threshold exceedance detection
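
Doubling time between two measurements follows from assuming exponential growth: t_d = Δt · ln 2 / ln(C2 / C1). The sketch below applies that formula to the CEA values from the example output (3.2 to 12.8 over 3 months). Treating "3 months" as 90 days is an assumption, and `doubling_time_days` is an illustrative name rather than a function shipped with the adapter.

```python
import math


def doubling_time_days(first_value, last_value, interval_days):
    """Doubling time under an exponential-growth assumption between two measurements.

    Returns None when the marker is not rising.
    """
    if first_value <= 0 or last_value <= 0 or interval_days <= 0:
        raise ValueError("values and interval must be positive")
    if last_value <= first_value:
        return None
    return interval_days * math.log(2) / math.log(last_value / first_value)


# CEA 3.2 -> 12.8 ng/mL; "3 months" is taken as 90 days (an assumption).
print(doubling_time_days(3.2, 12.8, 90))  # approximately 45 days
```

A fourfold rise is two doublings, so 90 days gives about 45 days per doubling, which matches the `tumor_marker_doubling_time` value in the example output.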

Clinical Reasoning

Verbose explanations (300-400 tokens)
Evidence synthesis from imaging + labs + symptoms
Differential diagnosis considerations

Actionable Recommendations

Urgency stratification (routine / accelerated / urgent)
Imaging recommendations
Oncology referral guidance
Tumor board discussion triggers

Limitations

Synthetic training data: No real patient outcomes
Limited cancer types: Primarily GI malignancies
No pathology integration: Text-based imaging reports only
Context window: 2048 tokens may truncate complex histories
No treatment recommendations: Surveillance focus only

Bias & Fairness

Known Biases

Cancer type bias: Better performance on colorectal vs. rare cancers
Stage bias: More training data for stage II-III than stage IV
Imaging modality: CT-centric, less MRI/PET experience

Clinical Validation Needed

Before clinical deployment:

- Retrospective validation on real surveillance cohorts
- Prospective pilot with oncology oversight
- Multi-institutional validation
- Rare cancer type assessment
- Inter-rater reliability with oncologists

Citation

@misc{surgicalcopilot2026onco,
  title={SurgicalCopilot Onco: Cancer Surveillance with MedGemma},
  author={Aayush},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/bobby07007/surgicalcopilot-onco-27b}},
  note={LoRA adapter for oncology surveillance}
}

Acknowledgments

RECIST Criteria: Eisenhauer EA et al. (2009) European Journal of Cancer
NCCN Guidelines: National Comprehensive Cancer Network
Base Model: Google MedGemma-27B-text-it

License

Apache 2.0

⚠️ DISCLAIMER: Research model only. Not for clinical decision-making without validation and oncology oversight. Early recurrence detection requires tissue confirmation.
