GVHD Severity Prediction - Final Results

Best Model: Stacking Ensemble (CatBoost + XGBoost + Neural Net)

Metric                      Value
AUC                         0.7083 ± 0.0117
Baseline AUC                0.7034
Improvement                 +0.0049 (+0.7%)
Brier (Platt Calibrated)    0.2019
Optimal Threshold           0.544
Sensitivity                 68.4%
Specificity                 58.8%
PPV                         76.9%
NPV                         48.3%
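
The threshold-dependent metrics above (sensitivity, specificity, PPV, NPV) all follow from the confusion matrix at the chosen cutoff. The sketch below shows one way to compute them; it is not taken from the pipeline. The function name `threshold_metrics`, the toy labels and probabilities, and the `>=` convention for calling a case positive are all assumptions made for illustration.

```python
def threshold_metrics(y_true, p_pred, threshold=0.544):
    """Confusion-matrix metrics at a probability cutoff.

    Assumption: a case is predicted positive when p >= threshold.
    The default 0.544 is the Optimal Threshold reported above.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, p_pred):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    # No guard against empty classes: this toy sketch assumes both
    # classes and both predicted labels occur at least once.
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical toy data, not the study cohort.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
p_pred = [0.9, 0.6, 0.5, 0.7, 0.3, 0.55, 0.2, 0.4]

metrics = threshold_metrics(y_true, p_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the positive class is, so they do not transfer directly to a cohort with a different prevalence.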

Model Comparison

Model       AUC Mean   AUC Std   Fold 1   Fold 2   Fold 3   Fold 4   Fold 5
CatBoost    0.6963     ±0.0105   0.700    0.711    0.699    0.679    0.694
XGBoost     0.6986     ±0.0126   0.705    0.711    0.704    0.675    0.698
NeuralNet   0.6870     ±0.0088   0.692    0.699    0.698    0.677    0.681
Stacking    0.7083     ±0.0117   0.714    0.722    0.714    0.688    0.703
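
The summary columns can be reproduced from the per-fold values. As a quick check on the Stacking row, the sketch below computes the mean and the population standard deviation of the five fold AUCs. Using the population form (dividing by n rather than n - 1) is an assumption; it is the one that appears to match the reported ±0.0117. Because the fold values are rounded to three decimals, only approximate agreement is expected.

```python
import statistics

# Fold AUCs for the Stacking row, copied from the table above.
stacking_folds = [0.714, 0.722, 0.714, 0.688, 0.703]

mean_auc = statistics.fmean(stacking_folds)
# Population standard deviation (divides by n). This is an assumption
# about how the "AUC Std" column was computed.
std_auc = statistics.pstdev(stacking_folds)

print(f"mean AUC = {mean_auc:.4f}, std = {std_auc:.4f}")
```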

Calibration

Method                 Brier Score
Raw                    0.2150
Platt Scaling          0.2019
Isotonic Regression    0.2024
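
For reference, the Brier score is the mean squared difference between the predicted probability and the 0/1 outcome, and Platt scaling fits a sigmoid `sigmoid(a * s + b)` to the model's scores `s`. The sketch below illustrates both in plain Python on hypothetical, deliberately overconfident toy predictions. It is not the pipeline's implementation: the gradient-descent fit, the learning rate, and the data are assumptions. In practice a library routine such as scikit-learn's `CalibratedClassifierCV(method="sigmoid")` would be the more usual choice.

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared error between predicted probabilities and outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p, eps=1e-6):
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return math.log(p / (1 - p))


def fit_platt(scores, y_true, lr=0.1, n_iter=5000):
    """Fit a, b in sigmoid(a * s + b) by gradient descent on log-loss."""
    a, b = 1.0, 0.0  # start from the identity mapping on the logit scale
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, y_true):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical overconfident predictions (not study data).
y_true = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
p_raw = [0.99, 0.98, 0.97, 0.96, 0.90, 0.10, 0.05, 0.03, 0.02, 0.01]

scores = [logit(p) for p in p_raw]
a, b = fit_platt(scores, y_true)
p_cal = [sigmoid(a * s + b) for s in scores]

brier_raw = brier_score(y_true, p_raw)
brier_cal = brier_score(y_true, p_cal)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"Brier raw = {brier_raw:.4f}, Brier calibrated = {brier_cal:.4f}")
```

This toy example fits and scores the calibrator on the same points for brevity. A real evaluation would fit the calibrator on held-out or out-of-fold predictions to avoid an optimistic Brier score.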

Key Improvements

  1. Feature Engineering: interactions, polynomials, log transforms, missingness indicators
  2. GPU Acceleration: Tesla T4 for CatBoost and Neural Net
  3. Stacking Ensemble: Logistic Regression meta-learner on OOF predictions
  4. Probability Calibration: Platt scaling for clinical deployment
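
The stacking step (item 3) can be sketched as follows. This is an illustration under stated assumptions, not the code in gvhd_gpu_pipeline.py. To keep it dependent on scikit-learn alone, `GradientBoostingClassifier`, `RandomForestClassifier`, and `MLPClassifier` stand in for CatBoost, XGBoost, and the neural net. The data is synthetic, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; the real pipeline uses pre-transplant features.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)

# Stand-in base learners (assumption: substitutes for CatBoost/XGBoost/NN).
base_models = {
    "tree_a": GradientBoostingClassifier(random_state=0),
    "tree_b": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold (OOF) probabilities: each row is predicted by a model
# that never saw that row during training.
oof = np.column_stack([
    cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    for model in base_models.values()
])

# Logistic regression meta-learner on the OOF matrix, itself scored with
# cross-validation so the reported AUC is not computed on its training rows.
meta_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
stacked = cross_val_predict(
    LogisticRegression(), oof, y, cv=meta_cv, method="predict_proba"
)[:, 1]

stacked_auc = roc_auc_score(y, stacked)
print(f"Stacked AUC on synthetic data: {stacked_auc:.3f}")
```

Training the meta-learner on OOF predictions rather than in-sample predictions keeps it from rewarding whichever base model overfits the training rows most.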

Files

  • gvhd_gpu_pipeline.py - Complete pipeline code (every line commented)
  • result_comparison_final.csv - Model comparison table
  • GVHD_Final_Report.ipynb - Jupyter notebook with tables
  • calibration_plot.png - Calibration curve

Honest Ceiling

Models restricted to pre-transplant features plateau at AUC ≈ 0.71. Any claimed AUC above 0.75 would require post-transplant biomarkers (Day 7-14).

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

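# The files listed above are a script, a CSV, a notebook, and a plot rather than
# transformers weights, so this sketch downloads the repository files instead.
import os

import pandas as pd
from huggingface_hub import snapshot_download

repo_id = "cuimiandashi/gvhd-analysis"
local_dir = snapshot_download(repo_id=repo_id)  # returns the local snapshot path

# Load the model comparison table listed under Files.
results = pd.read_csv(os.path.join(local_dir, "result_comparison_final.csv"))
print(results)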
