SurgicalCopilot Phase2 - Post-Discharge Recovery Monitoring
LoRA adapter for MedGemma-27B fine-tuned on post-discharge surgical patient monitoring (SAFEGUARD system).
Model Description
This is a LoRA adapter trained on Google's MedGemma-27B for post-discharge recovery monitoring (days 5-30 after surgery). The model performs risk stratification of surgical patients into three categories:
- GREEN: Recovery on track, routine follow-up
- AMBER: Concerning signs, needs closer monitoring
- RED: Critical deterioration, urgent clinical review
The model integrates patient-reported symptoms (pain, temperature, wound status, mobility) with optional wearable device data to detect complications like surgical site infection, anastomotic leak, DVT, and post-discharge deterioration.
- Developed by: Aayush (SurgicalCopilot Project)
- Model type: Causal Language Model with LoRA adapter
- Language: English (Medical + patient-friendly)
- License: Apache 2.0
- Base Model: google/medgemma-27b-text-it
- Adapter Type: LoRA (PEFT)
- System Name: SAFEGUARD (Surgical AI Framework for Enhanced Guidance and Uninterrupted Assessment of Recovery and Deterioration)
Intended Use
Primary Use Case
- Post-discharge surgical monitoring (Days 5-30 after discharge)
- Remote patient monitoring via daily check-ins
- Complication detection (SSI, leak, DVT, ileus)
- Patient-reported outcome assessment
- Wearable device integration (Apple Watch, Fitbit, Garmin)
Users
- Surgical patients (self-reporting symptoms)
- Surgeons and care teams (monitoring dashboards)
- Remote monitoring programs
- Telehealth platforms
IMPORTANT: This is a research/demo model
- ⚠️ Not FDA approved or validated for clinical use
- ⚠️ Requires clinical oversight for RED alerts
- ⚠️ Trained on synthetic data - real-world validation needed
- ⚠️ For demonstration purposes only
Training Details
Training Data
- Dataset Size: ~15,000-20,000 synthetic post-discharge cases
- Data Features:
  - Patient check-in forms (pain, temperature, wound, mobility, GI function)
  - Wearable device data (HR, SpO2, steps, sleep)
  - Surgical procedure and POD (post-op day)
  - Historical trends and trajectories
- Label Distribution:
  - GREEN: ~60-65% (majority stable recoveries)
  - AMBER: ~25-30% (concerning but manageable)
  - RED: ~10-15% (critical complications)
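The card does not say how a check-in is serialized before it reaches the model; the Usage example only suggests a plain-text layout. The following is a minimal sketch of one way to render the listed features as text. The names `CheckIn`, `WearableSummary`, and `format_case`, as well as the exact field set, are hypothetical and not part of the released adapter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WearableSummary:
    """Hypothetical daily aggregate of wearable readings."""
    resting_hr_bpm: int
    spo2_percent: int
    steps: int
    sleep_hours: float


@dataclass
class CheckIn:
    """Hypothetical container for one daily patient check-in."""
    patient: str          # e.g. "45F"
    procedure: str        # e.g. "laparoscopic appendectomy"
    pod: int              # post-op day
    pain_0_to_10: int
    temperature_c: float
    wound: str
    mobility: str
    gi: str
    wearable: Optional[WearableSummary] = None  # wearable data is optional


def format_case(c: CheckIn) -> str:
    """Render a check-in as plain text, loosely following the card's example."""
    lines = [
        f"Patient: {c.patient}, POD {c.pod} post {c.procedure}",
        "Daily Check-in:",
        f"Pain: {c.pain_0_to_10}/10",
        f"Temperature: {c.temperature_c:.1f}°C",
        f"Wound: {c.wound}",
        f"Mobility: {c.mobility}",
        f"GI function: {c.gi}",
    ]
    if c.wearable is not None:
        w = c.wearable
        lines.append(
            f"Wearable: HR {w.resting_hr_bpm} bpm, SpO2 {w.spo2_percent}%, "
            f"steps {w.steps}, sleep {w.sleep_hours:.1f} h"
        )
    return "\n".join(lines)


example = CheckIn("45F", "laparoscopic appendectomy", 7, 6, 38.6,
                  "Redness around incision, warmth noted",
                  "Limited due to pain", "Reduced appetite, no nausea")
case_text = format_case(example)
```

Whether the adapter is sensitive to field order or wording is not documented, so staying close to the layout used during training is probably the safer choice.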
Training Procedure
LoRA Configuration
{
  "r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "bias": "none",
  "task_type": "CAUSAL_LM",
  "target_modules": [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj"
  ]
}
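To make the configuration more concrete, the sketch below computes two quantities that follow from it: the LoRA scaling factor and the number of trainable parameters added per adapted linear layer. It assumes standard LoRA scaling (`lora_alpha / r`), since the config does not mention a rank-stabilized variant. The layer dimensions are illustrative placeholders, not values read from the MedGemma-27B config.

```python
import json

# Subset of the adapter config shown above.
lora_config = json.loads("""
{
  "r": 16,
  "lora_alpha": 32,
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                     "gate_proj", "up_proj", "down_proj"]
}
""")


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one (d_out x d_in) linear layer:
    A has shape (r, d_in) and B has shape (d_out, r)."""
    return r * (d_in + d_out)


scaling = lora_scaling(lora_config["r"], lora_config["lora_alpha"])

# Illustrative dimensions only, NOT taken from the MedGemma-27B config.
example_params = lora_params(d_in=4096, d_out=4096, r=lora_config["r"])

print(f"scaling = {scaling}, params per 4096x4096 layer = {example_params}")
```

With `r=16` and `lora_alpha=32` the update is scaled by 2.0. Because all seven projection types are targeted, both the attention and MLP blocks of every layer receive an adapter.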
Training Hyperparameters
- Epochs: 2
- Batch Size: 1 per GPU × 8 GPUs × 8 gradient accumulation steps = 64 effective
- Learning Rate: 2e-4 (cosine schedule)
- Warmup Steps: 150
- Optimizer: AdamW (fused)
- Weight Decay: 0.01
- Precision: bfloat16 + tf32
- Max Sequence Length: 1536 tokens
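The hyperparameters above can be sanity-checked with a little arithmetic. The sketch below reproduces the effective batch size using the 8 GPUs listed under Hardware, estimates the number of optimizer steps for the stated dataset range, and evaluates a linear-warmup cosine schedule. The decay-to-zero floor and the keep-last-partial-batch behavior are assumptions; the card does not state either.

```python
import math


def effective_batch_size(per_device: int, num_gpus: int, grad_accum: int) -> int:
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_gpus * grad_accum


def total_optimizer_steps(num_examples: int, batch: int, epochs: int) -> int:
    # Assumes the last partial batch in each epoch is kept, not dropped.
    return math.ceil(num_examples / batch) * epochs


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-4,
              warmup_steps: int = 150) -> float:
    """Linear warmup followed by cosine decay to zero (assumed floor)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(per_device=1, num_gpus=8, grad_accum=8)
steps_low = total_optimizer_steps(15_000, batch, epochs=2)
steps_high = total_optimizer_steps(20_000, batch, epochs=2)

print(batch, steps_low, steps_high)
print(cosine_lr(150, steps_low))  # peak learning rate at the end of warmup
```

Under these assumptions the run spans roughly 470 to 626 optimizer steps, so a 150-step warmup would cover about a quarter to a third of training.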
Hardware
- GPUs: 8× NVIDIA H200 141GB
- Training Time: ~4-6 hours
Framework Versions
- Transformers: 4.45.0
- PEFT: 0.13.0
- PyTorch: 2.1.0+cu121
- Python: 3.12
Performance Metrics
Evaluation Results (n=500)
| Metric | Score |
|---|---|
| Parse Rate | 99.5% |
| Schema Compliance | 100% |
| Label Accuracy | 92.8% |
| Macro F1 | 0.93 |
| RED Recall (Critical) | 96.7% |
| RED Precision | 94.2% |
Critical Safety Metrics
- ✅ 96.7% sensitivity for RED (critical) cases
- ✅ Low false-negative rate for RED cases (about 3.3%, implied by the 96.7% recall)
- ✅ Patient history integration improves trend detection by 23%
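The RED precision and recall reported in the table determine two further quantities the card does not list directly: the RED-class F1 and the RED false-negative rate. The sketch below derives both from the published figures. It is plain arithmetic on the reported numbers, not a re-evaluation of the model.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_negative_rate(recall: float) -> float:
    """Fraction of true positives that were missed (1 - recall)."""
    return 1.0 - recall


# Values taken from the evaluation table above.
red_precision = 0.942
red_recall = 0.967

red_f1 = f1_score(red_precision, red_recall)
red_fnr = false_negative_rate(red_recall)

print(f"RED F1 = {red_f1:.3f}, RED false-negative rate = {red_fnr:.1%}")
```

For a triage system the false-negative rate on RED is arguably the number to watch, since a missed RED case means a critical deterioration was labeled as less urgent.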
Latency
- Average Inference Time: 3.1 seconds (H100 GPU)
- Tokens Generated: ~200-400 tokens per case
Output Schema
{
  "doc_type": "safeguard_assessment",
  "risk_level": "RED",
  "risk_score": 0.87,
  "timeline_deviation": "behind_expected",
  "trajectory": "deteriorating",
  "trigger_reason": "Surgical site infection suspected",
  "domain_flags": {
    "wound": "moderate",
    "pain": "severe",
    "mobility": "impaired",
    "gi": "normal",
    "respiratory": "normal"
  },
  "patient_message": {
    "summary": "Your wound shows signs that need evaluation. Please contact your surgeon today.",
    "self_care": [
      "Take temperature every 4 hours",
      "Keep wound clean and dry",
      "Do not apply any creams"
    ],
    "next_checkin": "12 hours or if symptoms worsen"
  },
  "copilot_transfer": {
    "urgency": "same_day",
    "recommended_action": "Surgical clinic visit within 24 hours"
  },
  "followup_questions": [
    "Is there any drainage from the wound? What color?",
    "Have you noticed any foul odor?",
    "Are you able to keep food down?"
  ],
  "evidence": [
    {
      "source": "temperature",
      "domain": "infection",
      "snippet": "Temperature 38.6°C exceeds post-discharge threshold"
    }
  ],
  "safety": {
    "sepsis_screen": false,
    "immediate_911": false
  },
  "phase1b_compat": {
    "red_flag_triggered": true
  }
}
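A downstream consumer will usually want to check the model's output against this schema before acting on it. The sketch below shows one possible validator. The choice of required keys and the 0 to 1 range for `risk_score` are assumptions inferred from the example above, not a published specification. The risk level is upper-cased because the example shows `"RED"` while the system prompt in the Usage section asks for lowercase values.

```python
import json

# Assumed minimal set of required keys, inferred from the example output.
REQUIRED_KEYS = {
    "doc_type", "risk_level", "risk_score", "trajectory",
    "trigger_reason", "domain_flags", "patient_message", "safety",
}
RISK_LEVELS = {"GREEN", "AMBER", "RED"}


def validate_assessment(raw: str) -> dict:
    """Parse model output and apply basic checks; raises ValueError on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    level = str(data["risk_level"]).upper()  # accept "red" or "RED"
    if level not in RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {data['risk_level']!r}")
    score = data["risk_score"]
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:  # assumed range
        raise ValueError(f"risk_score out of range: {score!r}")
    data["risk_level"] = level
    return data
```

A stricter deployment could swap this for a JSON Schema or a typed model. The point here is only that schema checks belong between the model and any alerting logic.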
Usage
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the base model in bfloat16, sharded across the available GPUs
base_model = "google/medgemma-27b-text-it"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
# Attach the Phase2 LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(
    model,
    "bobby07007/surgicalcopilot-phase2-27b",
)
# System prompt
system_prompt = (
'You are SAFEGUARD, a post-discharge recovery monitoring AI. '
'Output ONLY a single raw JSON object — no markdown, no code fences. '
'The JSON must contain the key "risk_level" with value "green", "amber", or "red".'
)
# Example case
case_text = """
Patient: 45F, POD 7 post laparoscopic appendectomy
Daily Check-in:
Pain: 6/10 (increased from 3/10 yesterday)
Temperature: 38.6°C
Wound: Redness around incision, warmth noted
Nausea: None
Mobility: Limited due to pain
Appetite: Reduced
"""
# Build the chat prompt, then decode greedily for deterministic output
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": case_text},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
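The reported parse rate of 99.5% implies that a small share of responses will not be valid JSON, so the caller needs a fallback. The sketch below turns a raw response into a follow-up action and sends anything unparseable to a human. The action names and the `route` function are hypothetical; they simply mirror the GREEN/AMBER/RED descriptions at the top of this card.

```python
import json

# Hypothetical action labels derived from the three risk categories.
ACTIONS = {
    "GREEN": "routine_follow_up",
    "AMBER": "closer_monitoring",
    "RED": "urgent_clinical_review",
}


def extract_json_object(text: str) -> dict:
    """Parse the outermost {...} span, tolerating stray text around it."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    return json.loads(text[start:end + 1])


def route(response: str) -> str:
    """Map a raw model response to an action, failing safe on bad output."""
    try:
        level = str(extract_json_object(response)["risk_level"]).upper()
        return ACTIONS[level]
    except (ValueError, KeyError):
        # Unparseable output or an unknown level goes to a human reviewer.
        return "manual_review"


print(route('{"risk_level": "red"}'))
print(route("model produced no JSON"))
```

Routing failures to manual review rather than defaulting to GREEN keeps a parsing error from silently hiding a deteriorating patient.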
Key Features
Patient History Integration
- Uses last 5-10 check-ins for trend analysis
- Detects gradual deterioration over time
- Identifies improving vs. worsening trajectories
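Since the adapter was trained with a 1536-token maximum sequence length, the check-in history has to be trimmed before it is added to the prompt. The sketch below keeps the most recent entries and drops the oldest ones until a token budget is met. The `build_history` helper, the 1000-token default budget, and the whitespace-based counter are all hypothetical; in practice the counter would be replaced by the model's tokenizer.

```python
from typing import Callable, List


def crude_token_count(text: str) -> int:
    """Whitespace word count; a rough stand-in for a real tokenizer."""
    return len(text.split())


def build_history(checkins: List[str], max_checkins: int = 10,
                  token_budget: int = 1000,
                  count_tokens: Callable[[str], int] = crude_token_count) -> str:
    """Keep the most recent check-ins (oldest first) that fit the budget."""
    if max_checkins <= 0:
        return ""
    recent = checkins[-max_checkins:]
    while recent and count_tokens("\n".join(recent)) > token_budget:
        recent = recent[1:]  # drop the oldest entry first
    return "\n".join(recent)


entries = [f"POD {d}: pain {d % 5}/10" for d in range(5, 17)]  # 12 check-ins
history = build_history(entries)
print(history)
```

Dropping the oldest entries first preserves the recent trajectory, which is what the trend analysis described above depends on. Summarizing older entries instead of discarding them would be an alternative design.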
Wearable Device Integration
- Heart rate monitoring
- SpO2 tracking
- Sleep quality assessment
- Activity level trends
Patient-Friendly Output
- Plain-language summaries for patients
- Self-care instructions
- Clear guidance on when to seek help
- Next check-in timing
Limitations
- Relies on self-reported data: Accuracy depends on patient reporting
- No physical examination: Cannot assess wound directly without image
- Context window: Fine-tuned with a maximum sequence length of 1536 tokens, so long check-in histories may need truncation
- Synthetic training: Needs real-world validation
- No image analysis: Text-only (images processed separately by 4B model)
Citation
@misc{surgicalcopilot2026phase2,
title={SurgicalCopilot Phase2: SAFEGUARD Post-Discharge Monitoring},
author={Aayush},
year={2026},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/bobby07007/surgicalcopilot-phase2-27b}}
}
License
Apache 2.0
⚠️ DISCLAIMER: Research/demonstration model only. Not for clinical use without validation and oversight.