metadata
license: mit
tags:
  - medical
  - cancer
  - ct-scan
  - risk-prediction
  - healthcare
  - pytorch
  - vision
datasets:
  - NLST
metrics:
  - auc
  - c-index
language:
  - en
library_name: transformers
pipeline_tag: image-classification

Sybil - Lung Cancer Risk Prediction

Model Description

Sybil is a validated deep learning model that predicts future lung cancer risk from a single low-dose chest CT (LDCT) scan. The model was published in the Journal of Clinical Oncology and estimates the risk of lung cancer for each of the 1 to 6 years following the scan.

Key Features

  • Single Scan Analysis: Requires only one LDCT scan
  • Multi-Year Prediction: Provides risk scores for years 1-6
  • Validated Performance: Evaluated on a held-out NLST test set and on external cohorts from MGH (United States) and CGMH (Taiwan)
  • Ensemble Approach: Uses 5 models for robust predictions
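As a rough illustration of the ensemble idea, the sketch below averages per-year risk scores across several models. It assumes plain averaging, which this card does not confirm; `average_ensemble` is a hypothetical helper, not part of the package, and the numbers are made up.

```python
from statistics import fmean


def average_ensemble(per_model_scores):
    """Average per-year risk scores across ensemble members.

    per_model_scores: one inner list of yearly risk scores per model.
    Simple averaging is an assumption made for illustration only.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    n_years = len(per_model_scores[0])
    if any(len(scores) != n_years for scores in per_model_scores):
        raise ValueError("all models must report the same number of years")
    # zip(*...) regroups the scores by year, so each year is averaged separately
    return [fmean(year_scores) for year_scores in zip(*per_model_scores)]


# Made-up scores from two hypothetical ensemble members over three years
combined = average_ensemble([[0.01, 0.02, 0.04], [0.03, 0.04, 0.06]])
```

Averaging per year keeps the output the same length as each member's output, so downstream code does not need to know how many models were combined.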

Model Details

  • Developed by: MIT CSAIL & Mass General Cancer Center (Original)
  • Adapted by: Lab-Rasool (Hugging Face version)
  • Model type: 3D Convolutional Neural Network
  • Architecture: 3D ResNet-18 with multi-attention pooling
  • Input: LDCT scans (200 slices × 256×256 pixels)
  • Output: 6 risk scores (years 1-6)
  • License: MIT

Performance Metrics

Dataset        1-Year AUC   6-Year AUC
NLST Test      0.94         0.86
MGH            0.86         0.75
CGMH Taiwan    0.94         0.80
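For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen patient who developed cancer received a higher risk score than a randomly chosen patient who did not. The sketch below computes it by direct pair counting on made-up labels and scores. `roc_auc` is a hypothetical helper written for this illustration, not the evaluation code behind the table.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = cancer within the horizon, 0 = no cancer
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of the 4 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for an illustration; rank-based formulations give the same value more efficiently on large cohorts.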

Usage

from huggingface_sybil import SybilHFWrapper, SybilConfig

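# Default configuration, shown for reference (it is not passed to the calls below)
config = SybilConfig()
# Load the pretrained model
model = SybilHFWrapper.from_pretrained("Lab-Rasool/sybil")

# Paths to the DICOM files of the scan ("..." stands for the remaining files)
dicom_paths = ["scan1.dcm", "scan2.dcm", ...]

# Run inference; risk_scores holds one score per year (years 1-6)
output = model(dicom_paths=dicom_paths)
risk_scores = output.risk_scores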

# Display results
for year, score in enumerate(risk_scores, 1):
    print(f"Year {year}: {score:.1%} risk")
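The usage example expects a list of DICOM file paths. A minimal sketch for building that list from a folder follows, assuming all files of one scan sit directly in a single directory and use the `.dcm` extension. `collect_dicom_paths` is a hypothetical helper, not part of the package. Sorting by file name is only for reproducibility; whether the wrapper orders slices itself is not stated in this card.

```python
from pathlib import Path


def collect_dicom_paths(series_dir):
    """Return the sorted paths of all .dcm files directly inside series_dir."""
    folder = Path(series_dir)
    if not folder.is_dir():
        raise NotADirectoryError(f"{series_dir} is not a directory")
    # Match the extension case-insensitively (.dcm, .DCM, ...)
    paths = sorted(str(p) for p in folder.iterdir() if p.suffix.lower() == ".dcm")
    if not paths:
        raise FileNotFoundError(f"no .dcm files found in {series_dir}")
    return paths
```

The returned list could then be passed as `dicom_paths` in the example above.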

Intended Use

Primary Use Cases

  • Risk stratification in lung cancer screening programs
  • Research on lung cancer prediction models
  • Clinical decision support (with appropriate oversight)
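One way risk stratification might be wired up is to map a risk score to a tier using cutoffs chosen by the screening program. The sketch below is only an illustration: `risk_tier` is a hypothetical helper, and the cutoffs and labels are placeholders, not values taken from the paper or from any guideline.

```python
from bisect import bisect_right


def risk_tier(score, cutoffs, labels):
    """Map a risk score to a tier label.

    cutoffs must be ascending, and labels needs one more entry than cutoffs.
    A score equal to a cutoff falls into the higher tier.
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("labels must have exactly one more entry than cutoffs")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be in ascending order")
    return labels[bisect_right(cutoffs, score)]


# Placeholder cutoffs and labels, for illustration only
tier = risk_tier(0.03, cutoffs=[0.01, 0.05], labels=["low", "intermediate", "high"])
```

Keeping the cutoffs as arguments rather than constants leaves the choice of thresholds with the people responsible for the screening protocol.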

Users

  • Healthcare providers
  • Medical researchers
  • Screening program coordinators

Out of Scope

  • Diagnosis of existing cancer
  • Use with non-LDCT imaging (X-rays, MRI)
  • Sole basis for clinical decisions

Training Data

Trained on the National Lung Screening Trial (NLST) dataset:

  • ~50,000 participants
  • Ages 55-74
  • Current/former heavy smokers
  • Up to 3 annual LDCT scans per participant

Ethical Considerations

⚠️ Medical AI Notice: This model should supplement, not replace, clinical judgment. Always consider:

  • Complete patient history
  • Other risk factors
  • Current screening guidelines
  • Need for human oversight

Limitations

  • Optimized for the screening-eligible population (ages 55-80)
  • Requires LDCT scans specifically
  • Performance may vary across different CT scanners
  • Not validated for non-screening populations

Citation

Original Paper:

@article{mikhael2023sybil,
  title={Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography},
  author={Mikhael, Peter G and Wohlwend, Jeremy and Yala, Adam and Karstens, Ludvig and Xiang, Justin and Takigami, Angelo K and Bourgouin, Patrick P and Chan, PuiYee and Mrah, Sofiane and Amayri, Wael and others},
  journal={Journal of Clinical Oncology},
  volume={41},
  number={12},
  pages={2191--2200},
  year={2023},
  publisher={American Society of Clinical Oncology}
}

Acknowledgments

This Hugging Face implementation is based on the original work by Peter G. Mikhael, Jeremy Wohlwend, and the team at MIT CSAIL and Massachusetts General Hospital. The original model and code are available on GitHub.

Model Card Contact

  • For questions about this Hugging Face implementation: Lab-Rasool
  • For questions about the original model: See the original repository