	---
	license: mit
	tags:
	- medical
	- cancer
	- ct-scan
	- risk-prediction
	- healthcare
	- pytorch
	- vision
	datasets:
	- NLST
	metrics:
	- auc
	- c-index
	language:
	- en
	library_name: transformers
	pipeline_tag: image-classification
	---

	# Sybil - Lung Cancer Risk Prediction

	## Model Description

	Sybil is a validated deep learning model that predicts future lung cancer risk from a single low-dose chest CT (LDCT) scan. The model, published in the Journal of Clinical Oncology, estimates cancer risk over horizons of 1 to 6 years after the scan.

	### Key Features
	- Single Scan Analysis: Requires only one LDCT scan
	- Multi-Year Prediction: Provides risk scores for years 1-6
	- Validated Performance: Tested across multiple institutions globally
	- Ensemble Approach: Uses 5 models for robust predictions
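The ensemble idea above can be illustrated with a minimal sketch. It assumes the five members' per-year risk scores are combined by a plain arithmetic mean; the card does not state the actual combination rule, and `ensemble_average` is a hypothetical helper, not part of the Sybil code.

```python
from statistics import fmean


def ensemble_average(per_model_scores):
    """Average per-year risk scores across ensemble members.

    per_model_scores: one list of per-year scores per model.
    Assumption: a simple mean; the real ensemble may combine or
    calibrate scores differently.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    n_years = len(per_model_scores[0])
    if any(len(scores) != n_years for scores in per_model_scores):
        raise ValueError("every model must report the same number of years")
    # zip(*...) regroups the scores by year, so each mean spans all models
    return [fmean(year_scores) for year_scores in zip(*per_model_scores)]
```

Averaging tends to smooth out the quirks of any single member, which is one common reason to ship several models instead of one.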

	## Model Details

	- Developed by: MIT CSAIL & Mass General Cancer Center (Original)
	- Adapted by: Lab-Rasool (Hugging Face version)
	- Model type: 3D Convolutional Neural Network
	- Architecture: 3D ResNet-18 with multi-attention pooling
	- Input: LDCT scans (200 slices × 256×256 pixels)
	- Output: 6 risk scores (years 1-6)
	- License: MIT
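As a rough illustration of the input and output sizes listed above, the following sketch checks a volume shape and a score list against them. The `(slices, height, width)` axis order and both helper names are assumptions made for this example; the wrapper may resample or reorder scans internally.

```python
# Sizes taken from the model details; the axis order is an assumption.
EXPECTED_SHAPE = (200, 256, 256)  # (slices, height, width)
NUM_YEARS = 6


def check_volume_shape(shape):
    """Return True if the shape matches the documented input size."""
    return tuple(shape) == EXPECTED_SHAPE


def check_risk_scores(scores):
    """Return True if there is one probability-like score per year."""
    return len(scores) == NUM_YEARS and all(0.0 <= s <= 1.0 for s in scores)
```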

	## Performance Metrics

	| Dataset | 1-Year AUC | 6-Year AUC |
	|---------|------------|------------|
	| NLST Test | 0.94 | 0.86 |
	| MGH | 0.86 | 0.75 |
	| CGMH Taiwan | 0.94 | 0.80 |
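For readers unfamiliar with the metric, an AUC can be read as the probability that a randomly chosen patient who developed cancer received a higher risk score than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison; `roc_auc` is a hypothetical helper and the numbers in the usage note are toy values, not results from the paper.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for cancer within the horizon, 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

With toy labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive/negative pairs are ordered correctly, so the function returns 0.75. The quadratic loop is fine for illustration; a rank-based implementation would be preferable on large cohorts.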

	## Usage

	```python
	from huggingface_sybil import SybilHFWrapper, SybilConfig

	# Load model
	config = SybilConfig()
	model = SybilHFWrapper.from_pretrained("Lab-Rasool/sybil")

	# Prepare DICOM files
	dicom_paths = ["scan1.dcm", "scan2.dcm", ...]

	# Get predictions
	output = model(dicom_paths=dicom_paths)
	risk_scores = output.risk_scores

	# Display results (one risk score per year, starting at year 1)
	for year, score in enumerate(risk_scores, 1):
	    print(f"Year {year}: {score:.1%} risk")
	```
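In practice a CT series is usually a folder of many DICOM files and not a hand-written list. A minimal sketch for building the `dicom_paths` list from such a folder follows; `collect_dicom_paths` is a hypothetical helper, and it assumes the files carry a `.dcm` extension and sit directly inside one directory.

```python
from pathlib import Path


def collect_dicom_paths(series_dir):
    """Return sorted paths (as strings) of the .dcm files in series_dir.

    Assumption: one directory holds exactly one CT series. Sorting is by
    file name only, which need not match the anatomical slice order.
    """
    directory = Path(series_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"not a directory: {series_dir}")
    paths = sorted(
        str(p)
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".dcm"
    )
    if not paths:
        raise FileNotFoundError(f"no .dcm files found in {series_dir}")
    return paths
```

The returned list could then be passed as `dicom_paths=` in the call shown above. Whether the wrapper reorders slices using DICOM metadata is not stated in this card, so it is worth verifying before relying on file-name order.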

	## Intended Use

	### Primary Use Cases
	- Risk stratification in lung cancer screening programs
	- Research on lung cancer prediction models
	- Clinical decision support (with appropriate oversight)

	### Users
	- Healthcare providers
	- Medical researchers
	- Screening program coordinators

	### Out of Scope
	- Diagnosis of existing cancer
	- Use with non-LDCT imaging (X-rays, MRI)
	- Sole basis for clinical decisions

	## Training Data

	Trained on the National Lung Screening Trial (NLST) dataset:
	- ~50,000 participants
	- Ages 55-74
	- Current/former heavy smokers
	- 3 annual LDCT scans

	## Ethical Considerations

	⚠️ Medical AI Notice: This model should supplement, not replace, clinical judgment. Always consider:
	- Complete patient history
	- Other risk factors
	- Current screening guidelines
	- Need for human oversight

	## Limitations

	- Optimized for the screening-eligible population (ages 55-80)
	- Requires LDCT scans specifically
	- Performance may vary across different CT scanners
	- Not validated for non-screening populations

	## Citation

	Original Paper:
	```bibtex
	@article{mikhael2023sybil,
	  title={Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography},
	  author={Mikhael, Peter G and Wohlwend, Jeremy and Yala, Adam and Karstens, Ludvig and Xiang, Justin and Takigami, Angelo K and Bourgouin, Patrick P and Chan, PuiYee and Mrah, Sofiane and Amayri, Wael and others},
	  journal={Journal of Clinical Oncology},
	  volume={41},
	  number={12},
	  pages={2191--2200},
	  year={2023},
	  publisher={American Society of Clinical Oncology}
	}
	```

	## Acknowledgments

	This Hugging Face implementation is based on the original work by Peter G. Mikhael, Jeremy Wohlwend, and the team at MIT CSAIL and Massachusetts General Hospital. The original model and code are available on [GitHub](https://github.com/reginabarzilaygroup/Sybil).

	## Model Card Contact

	- For questions about this Hugging Face implementation: Lab-Rasool
	- For questions about the original model: see the [original repository](https://github.com/reginabarzilaygroup/Sybil)