---
license: mit
tags:
- medical
- cancer
- ct-scan
- risk-prediction
- healthcare
- pytorch
- vision
datasets:
- NLST
metrics:
- auc
- c-index
language:
- en
library_name: transformers
pipeline_tag: image-classification
---

# Sybil - Lung Cancer Risk Prediction

## Model Description

Sybil is a validated deep learning model that predicts future lung cancer risk from a single low-dose chest CT (LDCT) scan. Published in the Journal of Clinical Oncology, this model can assess cancer risk over a 1-6 year timeframe.

### Key Features

- **Single Scan Analysis**: Requires only one LDCT scan
- **Multi-Year Prediction**: Provides risk scores for years 1-6
- **Validated Performance**: Tested across multiple institutions globally
- **Ensemble Approach**: Uses 5 models for robust predictions

## Model Details

- **Developed by**: MIT CSAIL & Mass General Cancer Center (Original)
- **Adapted by**: Lab-Rasool (Hugging Face version)
- **Model type**: 3D Convolutional Neural Network
- **Architecture**: 3D ResNet-18 with multi-attention pooling
- **Input**: LDCT scans (200 slices × 256×256 pixels)
- **Output**: 6 risk scores (years 1-6)
- **License**: MIT

## Performance Metrics

| Dataset | 1-Year AUC | 6-Year AUC |
|---------|------------|------------|
| NLST Test | 0.94 | 0.86 |
| MGH | 0.86 | 0.75 |
| CGMH Taiwan | 0.94 | 0.80 |

## Usage

```python
from huggingface_sybil import SybilHFWrapper, SybilConfig

# Load model
config = SybilConfig()
model = SybilHFWrapper.from_pretrained("Lab-Rasool/sybil")

# Prepare DICOM files
dicom_paths = ["scan1.dcm", "scan2.dcm", ...]

# Get predictions
output = model(dicom_paths=dicom_paths)
risk_scores = output.risk_scores

# Display results
for year, score in enumerate(risk_scores, 1):
    print(f"Year {year}: {score:.1%} risk")
```

## Intended Use

### Primary Use Cases

- Risk stratification in lung cancer screening programs
- Research on lung cancer prediction models
- Clinical decision support (with appropriate oversight)

### Users

- Healthcare providers
- Medical researchers
- Screening program coordinators

### Out of Scope

- Diagnosis of existing cancer
- Use with non-LDCT imaging (X-rays, MRI)
- Sole basis for clinical decisions

## Training Data

Trained on the National Lung Screening Trial (NLST) dataset:

- ~50,000 participants
- Ages 55-74
- Current/former heavy smokers
- 3 annual LDCT scans

## Ethical Considerations

⚠️ **Medical AI Notice**: This model should supplement, not replace, clinical judgment. Always consider:

- Complete patient history
- Other risk factors
- Current screening guidelines
- Need for human oversight

## Limitations

- Optimized for screening-eligible population (55-80 years)
- Requires LDCT scans specifically
- Performance may vary across different CT scanners
- Not validated for non-screening populations

## Citation

**Original Paper:**

```bibtex
@article{mikhael2023sybil,
  title={Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography},
  author={Mikhael, Peter G and Wohlwend, Jeremy and Yala, Adam and Karstens, Ludvig and Xiang, Justin and Takigami, Angelo K and Bourgouin, Patrick P and Chan, PuiYee and Mrah, Sofiane and Amayri, Wael and others},
  journal={Journal of Clinical Oncology},
  volume={41},
  number={12},
  pages={2191--2200},
  year={2023},
  publisher={American Society of Clinical Oncology}
}
```

## Acknowledgments

This Hugging Face implementation is based on the original work by Peter G. Mikhael, Jeremy Wohlwend, and the team at MIT CSAIL and Massachusetts General Hospital.
Original model and code available at [GitHub](https://github.com/reginabarzilaygroup/Sybil).

## Model Card Contact

For questions about this Hugging Face implementation: Lab-Rasool

For questions about the original model: See the [original repository](https://github.com/reginabarzilaygroup/Sybil)
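
## Appendix: Input Preparation Sketch

The Usage section passes a list of DICOM file paths to the model and prints six yearly risk scores. The sketch below shows one way to build that list from a directory and to format the scores. Both `collect_dicom_paths` and `format_risk_report` are hypothetical helpers written for illustration; they are not part of the Sybil package or this wrapper. The sketch assumes one scan series per directory, files ending in `.dcm`, and risk scores that arrive as six plain floats, as the display loop in the Usage section suggests.

```python
from pathlib import Path


def collect_dicom_paths(series_dir):
    """Return sorted paths of the .dcm files directly inside series_dir.

    Hypothetical helper, not part of the Sybil package.
    """
    paths = sorted(
        str(p)
        for p in Path(series_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".dcm"
    )
    if not paths:
        raise FileNotFoundError(f"No .dcm files found in {series_dir}")
    return paths


def format_risk_report(risk_scores):
    """Format six yearly risk scores (assumed to be floats in [0, 1])."""
    scores = list(risk_scores)
    if len(scores) != 6:
        # The model card lists six outputs, one per year.
        raise ValueError(f"Expected 6 risk scores, got {len(scores)}")
    return [
        f"Year {year}: {score:.1%} risk"
        for year, score in enumerate(scores, 1)
    ]
```

The list returned by `collect_dicom_paths` could then be passed as `dicom_paths` in the Usage example. Sorting by filename only makes the list order reproducible; it does not guarantee anatomical slice order, and this card does not state how the wrapper orders slices internally.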