license: mit
tags:
  - medical
  - cancer
  - ct-scan
  - risk-prediction
  - healthcare
  - pytorch
  - vision
datasets:
  - NLST
metrics:
  - auc
  - c-index
language:
  - en
library_name: transformers
pipeline_tag: image-classification

Sybil - Lung Cancer Risk Prediction

Model Description

Sybil is a validated deep learning model that predicts future lung cancer risk from a single low-dose chest CT (LDCT) scan. The model is described in a 2023 Journal of Clinical Oncology paper (see Citation) and produces a risk estimate for each horizon from 1 to 6 years after the scan.

Key Features

Single Scan Analysis: Requires only one LDCT scan
Multi-Year Prediction: Provides risk scores for years 1-6
Validated Performance: Evaluated on the NLST test set and on external cohorts from MGH and CGMH (Taiwan)
Ensemble Approach: Uses 5 models for robust predictions
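
To illustrate the ensemble idea, the sketch below averages per-year risk scores from several models. Unweighted averaging is an assumption made for this example; this card does not say how the 5 models are combined, and `average_ensemble` is a hypothetical helper, not part of the package. The scores are made up.

```python
from statistics import fmean
from typing import Sequence


def average_ensemble(per_model_scores: Sequence[Sequence[float]]) -> list[float]:
    """Average risk scores year by year across ensemble members.

    Hypothetical helper: assumes every model returns one score per year
    and that members are combined by an unweighted mean.
    """
    if not per_model_scores:
        raise ValueError("need at least one model's scores")
    n_years = len(per_model_scores[0])
    if any(len(scores) != n_years for scores in per_model_scores):
        raise ValueError("all models must return the same number of years")
    # zip(*...) regroups the scores so each tuple holds one year's values.
    return [fmean(year_scores) for year_scores in zip(*per_model_scores)]


# Made-up scores for two models over three years, for illustration only.
combined = average_ensemble([[0.10, 0.20, 0.30], [0.30, 0.40, 0.50]])
```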

Model Details

Developed by: MIT CSAIL & Mass General Cancer Center (Original)
Adapted by: Lab-Rasool (Hugging Face version)
Model type: 3D Convolutional Neural Network
Architecture: 3D ResNet-18 with multi-attention pooling
Input: LDCT scans (200 slices × 256×256 pixels)
Output: 6 risk scores (years 1-6)
License: MIT
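
The input and output sizes listed above can be turned into simple sanity checks. This is a minimal sketch: the function names are hypothetical, the axis order (slices, height, width) is assumed, and treating each score as a probability in [0, 1] is an assumption. Whether the wrapper resamples scans to this size internally is not stated here.

```python
from typing import Sequence

# Values taken from the Model Details list above.
EXPECTED_INPUT_SHAPE = (200, 256, 256)  # assumed order: slices, height, width
NUM_YEARS = 6


def check_volume_shape(shape: Sequence[int]) -> bool:
    """Return True if a volume's shape matches the documented input size."""
    return tuple(shape) == EXPECTED_INPUT_SHAPE


def check_risk_scores(scores: Sequence[float]) -> bool:
    """Return True if there is one score per year and each lies in [0, 1].

    Assumption: scores are probabilities.
    """
    return len(scores) == NUM_YEARS and all(0.0 <= s <= 1.0 for s in scores)
```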

Performance Metrics

Dataset	1-Year AUC	6-Year AUC
NLST Test	0.94	0.86
MGH	0.86	0.75
CGMH Taiwan	0.94	0.80
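
For readers unfamiliar with the metric, AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison on toy data. It is not the evaluation code behind the table, and `auc` is a hypothetical helper.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, not model output: 1 = cancer diagnosed within the horizon.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration; rank-based implementations are the usual choice for large cohorts.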

Usage

from huggingface_sybil import SybilHFWrapper, SybilConfig

# Load the configuration and the pretrained model from the Hugging Face Hub
config = SybilConfig()
model = SybilHFWrapper.from_pretrained("Lab-Rasool/sybil")

# Paths to the DICOM files of a single LDCT scan
# ("..." stands for the remaining files; replace it with real paths)
dicom_paths = ["scan1.dcm", "scan2.dcm", ...]

# Run inference
output = model(dicom_paths=dicom_paths)
risk_scores = output.risk_scores

# Print one risk score per year (years 1-6)
for year, score in enumerate(risk_scores, start=1):
    print(f"Year {year}: {score:.1%} risk")
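
For the "Prepare DICOM files" step, listing every file by hand is tedious. The sketch below gathers the paths from a directory using only the standard library. `collect_dicom_paths` is a hypothetical helper, and it assumes one scan per directory with files ending in `.dcm`.

```python
from pathlib import Path


def collect_dicom_paths(series_dir: str) -> list[str]:
    """Return sorted paths of all .dcm files directly inside series_dir.

    Hypothetical helper: assumes one series per directory and a .dcm
    extension. Sorting is by filename only; it does not read DICOM
    headers, so it is not guaranteed to match anatomical slice order.
    """
    directory = Path(series_dir)
    if not directory.is_dir():
        raise NotADirectoryError(series_dir)
    return sorted(str(p) for p in directory.glob("*.dcm"))
```

The returned list could then be passed as `dicom_paths` in the usage example above.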

Intended Use

Primary Use Cases

Risk stratification in lung cancer screening programs
Research on lung cancer prediction models
Clinical decision support (with appropriate oversight)

Users

Healthcare providers
Medical researchers
Screening program coordinators

Out of Scope

Diagnosis of existing cancer
Use with non-LDCT imaging (X-rays, MRI)
Sole basis for clinical decisions

Training Data

Trained on the National Lung Screening Trial (NLST) dataset:

~50,000 participants
Ages 55-74
Current/former heavy smokers
Up to 3 annual LDCT scans per participant

Ethical Considerations

⚠️ Medical AI Notice: This model should supplement, not replace, clinical judgment. Always consider:

Complete patient history
Other risk factors
Current screening guidelines
Need for human oversight

Limitations

Optimized for a screening-eligible population (55-80 years); note that the NLST training cohort was aged 55-74
Requires LDCT scans specifically
Performance may vary across different CT scanners
Not validated for non-screening populations

Citation

Original Paper:

@article{mikhael2023sybil,
  title={Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography},
  author={Mikhael, Peter G and Wohlwend, Jeremy and Yala, Adam and Karstens, Ludvig and Xiang, Justin and Takigami, Angelo K and Bourgouin, Patrick P and Chan, PuiYee and Mrah, Sofiane and Amayri, Wael and others},
  journal={Journal of Clinical Oncology},
  volume={41},
  number={12},
  pages={2191--2200},
  year={2023},
  publisher={American Society of Clinical Oncology}
}

Acknowledgments

This Hugging Face implementation is based on the original work by Peter G. Mikhael, Jeremy Wohlwend, and the team at MIT CSAIL and Massachusetts General Hospital. The original model and code are available on GitHub.

Model Card Contact

For questions about this Hugging Face implementation: Lab-Rasool
For questions about the original model: See the original repository