SurgicalCopilot Phase2 - Post-Discharge Recovery Monitoring

LoRA adapter for MedGemma-27B fine-tuned on post-discharge surgical patient monitoring (SAFEGUARD system).

Model Description

This is a LoRA adapter trained on Google's MedGemma-27B for post-discharge recovery monitoring (days 5-30 after surgery). The model performs risk stratification of surgical patients into three categories:

  • GREEN: Recovery on track, routine follow-up
  • AMBER: Concerning signs, needs closer monitoring
  • RED: Critical deterioration, urgent clinical review

The model integrates patient-reported symptoms (pain, temperature, wound status, mobility) with optional wearable device data to detect complications like surgical site infection, anastomotic leak, DVT, and post-discharge deterioration.

  • Developed by: Aayush (SurgicalCopilot Project)
  • Model type: Causal Language Model with LoRA adapter
  • Language: English (Medical + patient-friendly)
  • License: Apache 2.0
  • Base Model: google/medgemma-27b-text-it
  • Adapter Type: LoRA (PEFT)
  • System Name: SAFEGUARD (Surgical AI Framework for Enhanced Guidance and Uninterrupted Assessment of Recovery and Deterioration)

Intended Use

Primary Use Case

  • Post-discharge surgical monitoring (Days 5-30 after discharge)
  • Remote patient monitoring via daily check-ins
  • Complication detection (SSI, leak, DVT, ileus)
  • Patient-reported outcome assessment
  • Wearable device integration (Apple Watch, Fitbit, Garmin)

Users

  • Surgical patients (self-reporting symptoms)
  • Surgeons and care teams (monitoring dashboards)
  • Remote monitoring programs
  • Telehealth platforms

IMPORTANT: This is a research/demo model

  • ⚠️ Not FDA approved or validated for clinical use
  • ⚠️ Requires clinical oversight for RED alerts
  • ⚠️ Trained on synthetic data - real-world validation needed
  • ⚠️ For demonstration purposes only

Training Details

Training Data

  • Dataset Size: ~15,000-20,000 synthetic post-discharge cases
  • Data Features:
    • Patient check-in forms (pain, temperature, wound, mobility, GI function)
    • Wearable device data (HR, SpO2, steps, sleep)
    • Surgical procedure and POD (post-op day)
    • Historical trends and trajectories
  • Label Distribution:
    • GREEN: ~60-65% (majority stable recoveries)
    • AMBER: ~25-30% (concerning but manageable)
    • RED: ~10-15% (critical complications)
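With this skew toward GREEN, inverse-frequency class weighting is one common way to keep the minority RED class from being under-learned. The card does not state whether any reweighting was used during training, so the sketch below is purely illustrative, using the midpoints of the proportions above:

```python
def inverse_frequency_weights(proportions):
    """Weight each class by 1 / frequency, normalized so the weights average 1."""
    raw = {label: 1.0 / p for label, p in proportions.items()}
    scale = len(raw) / sum(raw.values())
    return {label: w * scale for label, w in raw.items()}

# Midpoints of the label distribution above (illustrative only)
weights = inverse_frequency_weights({"GREEN": 0.625, "AMBER": 0.275, "RED": 0.125})
# RED receives the largest weight, GREEN the smallest
```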

Training Procedure

LoRA Configuration

{
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ]
}
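The JSON above maps one-to-one onto a PEFT LoraConfig. A minimal reconstruction, with values copied from the card:

```python
from peft import LoraConfig

# Same values as the adapter configuration above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```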

Training Hyperparameters

  • Epochs: 2
  • Batch Size: 1 per GPU × 8 gradient accumulation steps × 8 GPUs = 64 effective
  • Learning Rate: 2e-4 (cosine schedule)
  • Warmup Steps: 150
  • Optimizer: AdamW (fused)
  • Weight Decay: 0.01
  • Precision: bfloat16 + tf32
  • Max Sequence Length: 1536 tokens
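The hyperparameters above can be expressed as a transformers TrainingArguments object. This is a hypothetical reconstruction, not the actual training script; output_dir is an assumption, and the per-GPU batch size combines with 8-step gradient accumulation across the 8 GPUs listed below to give the 64 effective batch:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameters above; output_dir is assumed
args = TrainingArguments(
    output_dir="safeguard-phase2",
    num_train_epochs=2,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    optim="adamw_torch_fused",
    weight_decay=0.01,
    bf16=True,
    tf32=True,
)
```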

Hardware

  • GPUs: 8× NVIDIA H200 141GB
  • Training Time: ~4-6 hours

Framework Versions

  • Transformers: 4.45.0
  • PEFT: 0.13.0
  • PyTorch: 2.1.0+cu121
  • Python: 3.12

Performance Metrics

Evaluation Results (n=500)

Metric                   Score
Parse Rate               99.5%
Schema Compliance        100%
Label Accuracy           92.8%
Macro F1                 0.93
RED Recall (Critical)    96.7%
RED Precision            94.2%

Critical Safety Metrics

  • 96.7% sensitivity for RED (critical) cases
  • Low false negative rate for complications
  • Patient history integration improves trend detection by 23%
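The RED recall and precision reported above are standard per-class counts. A self-contained sketch with toy labels (not the actual evaluation set):

```python
def recall_precision(y_true, y_pred, positive="RED"):
    """Recall and precision for a single class of interest."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision

# Toy labels: 4 true RED cases, 3 caught, 1 missed, 1 false alarm
y_true = ["RED", "RED", "RED", "RED", "AMBER", "GREEN"]
y_pred = ["RED", "RED", "RED", "AMBER", "RED", "GREEN"]
red_recall, red_precision = recall_precision(y_true, y_pred)  # 0.75, 0.75
```

For a safety-critical triage task, RED recall (missed critical cases) is the metric to watch more closely than overall accuracy.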

Latency

  • Average Inference Time: 3.1 seconds (H100 GPU)
  • Tokens Generated: ~200-400 tokens per case

Output Schema

{
  "doc_type": "safeguard_assessment",
  "risk_level": "RED",
  "risk_score": 0.87,
  "timeline_deviation": "behind_expected",
  "trajectory": "deteriorating",
  "trigger_reason": "Surgical site infection suspected",
  "domain_flags": {
    "wound": "moderate",
    "pain": "severe",
    "mobility": "impaired",
    "gi": "normal",
    "respiratory": "normal"
  },
  "patient_message": {
    "summary": "Your wound shows signs that need evaluation. Please contact your surgeon today.",
    "self_care": [
      "Take temperature every 4 hours",
      "Keep wound clean and dry",
      "Do not apply any creams"
    ],
    "next_checkin": "12 hours or if symptoms worsen"
  },
  "copilot_transfer": {
    "urgency": "same_day",
    "recommended_action": "Surgical clinic visit within 24 hours"
  },
  "followup_questions": [
    "Is there any drainage from the wound? What color?",
    "Have you noticed any foul odor?",
    "Are you able to keep food down?"
  ],
  "evidence": [
    {
      "source": "temperature",
      "domain": "infection",
      "snippet": "Temperature 38.6°C exceeds post-discharge threshold"
    }
  ],
  "safety": {
    "sepsis_screen": false,
    "immediate_911": false
  },
  "phase1b_compat": {
    "red_flag_triggered": true
  }
}
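Downstream code should verify the structure before acting on a response. A minimal validation sketch: the key names come from the schema above, but which keys to treat as required is an assumption, not something the card specifies:

```python
REQUIRED_KEYS = {"doc_type", "risk_level", "risk_score", "patient_message"}
RISK_LEVELS = {"GREEN", "AMBER", "RED"}

def validate_assessment(obj):
    """Check the minimum structure needed before acting on a model response."""
    if not REQUIRED_KEYS <= obj.keys():
        return False
    # Accept either case, since the schema shows "RED" but the system
    # prompt asks for lowercase values
    if obj["risk_level"].upper() not in RISK_LEVELS:
        return False
    if not 0.0 <= obj["risk_score"] <= 1.0:
        return False
    return True
```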

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load model
base_model = "google/medgemma-27b-text-it"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load Phase2 adapter
model = PeftModel.from_pretrained(
    model,
    "bobby07007/surgicalcopilot-phase2-27b"
)

# System prompt
system_prompt = (
    'You are SAFEGUARD, a post-discharge recovery monitoring AI. '
    'Output ONLY a single raw JSON object — no markdown, no code fences. '
    'The JSON must contain the key "risk_level" with value "green", "amber", or "red".'
)

# Example case
case_text = """
Patient: 45F, POD 7 post laparoscopic appendectomy
Daily Check-in:
  Pain: 6/10 (increased from 3/10 yesterday)
  Temperature: 38.6°C
  Wound: Redness around incision, warmth noted
  Nausea: None
  Mobility: Limited due to pain
  Appetite: Reduced
"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": case_text}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding; slice off the prompt tokens so only the model's JSON remains
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
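The system prompt asks for a raw JSON object with no fences, but stripping stray fences before parsing is a cheap safeguard against formatting drift. A hypothetical parse_response helper (not part of this repository):

```python
import json
import re

def parse_response(text):
    """Extract the first JSON object from model output, tolerating stray code fences."""
    text = re.sub(r"```(?:json)?", "", text).strip()
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])

# Works whether or not the model wrapped its answer in fences
assessment = parse_response('```json\n{"risk_level": "amber", "risk_score": 0.42}\n```')
```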

Key Features

Patient History Integration

  • Uses last 5-10 check-ins for trend analysis
  • Detects gradual deterioration over time
  • Identifies improving vs. worsening trajectories
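The trajectory field in the output schema could be derived from the stored check-ins. The card does not describe the actual heuristic, so the following is a hypothetical sketch that compares recent against earlier pain scores:

```python
def trajectory(pain_scores, tolerance=0.5):
    """Hypothetical trend heuristic: compare the mean of the later half
    of check-ins against the earlier half."""
    half = len(pain_scores) // 2
    earlier = sum(pain_scores[:half]) / half
    recent = sum(pain_scores[half:]) / (len(pain_scores) - half)
    if recent > earlier + tolerance:
        return "deteriorating"
    if recent < earlier - tolerance:
        return "improving"
    return "stable"

trajectory([3, 3, 4, 5, 6, 6])  # rising pain across six check-ins -> "deteriorating"
```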

Wearable Device Integration

  • Heart rate monitoring
  • SpO2 tracking
  • Sleep quality assessment
  • Activity level trends

Patient-Friendly Output

  • Plain-language summaries for patients
  • Self-care instructions
  • Clear guidance on when to seek help
  • Next check-in timing

Limitations

  • Relies on self-reported data: Accuracy depends on patient reporting
  • No physical examination: Cannot assess the wound directly without an image
  • Context window: Limited to 1536 tokens
  • Synthetic training: Needs real-world validation
  • No image analysis: Text-only (images processed separately by 4B model)

Citation

@misc{surgicalcopilot2026phase2,
  title={SurgicalCopilot Phase2: SAFEGUARD Post-Discharge Monitoring},
  author={Aayush},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/bobby07007/surgicalcopilot-phase2-27b}}
}

License

Apache 2.0


⚠️ DISCLAIMER: Research/demonstration model only. Not for clinical use without validation and oversight.
