metadata
license: cc-by-nc-nd-4.0
language:
  - en
tags:
  - ensemble
  - medical-imaging
  - ipf
  - survival-prediction
  - ct-scan

ORACLE-IPF: CT-based IPF Survival Prediction

This repository hosts an ensemble of 5 models trained for idiopathic pulmonary fibrosis (IPF) survival analysis using chest CT scans.

Repository Structure

ORACLE-IPF/
├── README.md                                          # Model card (this file)
├── inference/                                         # Inference pipeline
│   ├── README.md                                      # Inference documentation
│   ├── checkpoints/                                   # Pre-trained model weights
│   │   ├── weight1.ckpt                               # Ensemble model 1
│   │   ├── weight2.ckpt                               # Ensemble model 2
│   │   ├── weight3.ckpt                               # Ensemble model 3
│   │   ├── weight4.ckpt                               # Ensemble model 4
│   │   └── weight5.ckpt                               # Ensemble model 5
│   ├── config/
│   │   ├── checkpoints.yaml                           # Checkpoint configuration
│   │   └── patients.yaml                              # Sample patient configuration
│   ├── requirements.txt                               # Python dependencies
│   ├── run_inference.sh                               # Shell script to run inference
│   ├── src/
│   │   ├── __init__.py
│   │   ├── ensemble.py                                # Ensemble prediction logic
│   │   └── inference.py                               # Main inference script
│   └── tests/
│       ├── __init__.py
│       └── test_results.py                            # Result validation tests
│
└── training/                                          # Training pipeline
    ├── README.md                                      # Training documentation
    ├── .gitignore                                     # Git ignore rules
    ├── .huggingfaceignore                             # HuggingFace ignore rules
    ├── config/
    │   └── train_config.yaml                          # Training hyperparameters
    ├── scripts/
    │   ├── prepare_data.sh                            # Data preparation script
    │   └── train.sh                                   # Training launch script
    ├── training_data_sample/                          # Sample training data (de-identified)
    │   └── ct{1,2,3}/                                 # Sample CT cases
    │       ├── ct.npy                                 # CT volume (D, H, W)
    │       ├── meta.json                              # DICOM metadata
    │       ├── LungTexture.Obj.Honeycomb.npy          # Honeycomb pattern mask
    │       ├── LungTexture.Obj.Reticular.npy          # Reticular pattern mask
    │       └── LungTexture.Obj.Normal.npy             # Normal lung mask
    └── src/
        ├── __init__.py
        ├── config.py                                  # Model configuration (ModelArgs)
        ├── dataset.py                                 # PyTorch dataset for IPF data
        ├── prepare_masks.py                           # IPF mask preparation script
        ├── train.py                                   # Main training script
        └── models/
            ├── __init__.py
            ├── oracle.py                              # ORACLE model architecture
            ├── cumulative_probability_layer.py        # Survival probability layer
            ├── pooling_layer.py                       # Feature pooling layer
            └── regressor.py                           # Regression head

Model Architecture

  • Architecture: ORACLE / ORACLEDoubleDensity
  • Backbone: R3D-18 (pretrained on Kinetics-400)
  • Input: CT volume (B, 3, 200, 256, 256) with clinical features (sex, age)
  • Output: Tensor of shape (B, 6) holding cumulative survival probabilities at annual horizons from 1 to 5 years, plus one additional probability for the interval beyond 5 years
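
As a rough illustration of how five per-model outputs of shape (B, 6) could be combined, the sketch below averages them element-wise. The function name `average_ensemble` and the use of a plain arithmetic mean are assumptions for illustration; the actual combination rule in `inference/src/ensemble.py` is not described in this card and may differ.

```python
import numpy as np

NUM_HORIZONS = 6  # 1- to 5-year horizons plus the beyond-5-year interval


def average_ensemble(member_outputs):
    """Average per-model outputs of shape (B, 6) into one (B, 6) array.

    Assumption: a plain arithmetic mean across ensemble members.
    """
    stacked = np.stack([np.asarray(p, dtype=np.float64) for p in member_outputs])
    if stacked.ndim != 3 or stacked.shape[-1] != NUM_HORIZONS:
        raise ValueError(
            f"expected (n_models, B, {NUM_HORIZONS}), got {stacked.shape}"
        )
    return stacked.mean(axis=0)


# Stand-in outputs for 5 ensemble members on a batch of 2 scans
# (random numbers, not real model predictions).
rng = np.random.default_rng(0)
fake_outputs = [rng.uniform(0.0, 1.0, size=(2, NUM_HORIZONS)) for _ in range(5)]
ensemble_probs = average_ensemble(fake_outputs)
print(ensemble_probs.shape)  # (2, 6)
```

In a real run, the stand-in arrays would be replaced by the outputs of the five checkpoints evaluated on the same batch.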

Data Format

Input Data

  • CT volume: 3D numpy array (D, H, W) in Hounsfield Units
  • Lung texture masks (for training only): Binary masks for Honeycomb, Reticular, and Normal patterns (obtained from AVIEW, Coreline, Seoul, Korea)
  • Metadata: Patient demographics (sex, age at CT scan)
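
The inputs listed above match the file layout of the sample cases under `training/training_data_sample/`. A minimal loading sketch follows; `load_case` is a hypothetical helper, not part of the repository, and it assumes the masks are voxel-aligned with the CT volume. The key names inside `meta.json` are not documented here, so the metadata is returned as a plain dictionary.

```python
import json
from pathlib import Path

import numpy as np

MASK_NAMES = ("Honeycomb", "Reticular", "Normal")


def load_case(case_dir):
    """Load one case directory laid out like training_data_sample/ct1/."""
    case_dir = Path(case_dir)
    ct = np.load(case_dir / "ct.npy")  # (D, H, W), Hounsfield Units
    if ct.ndim != 3:
        raise ValueError(f"expected a 3D volume, got shape {ct.shape}")

    with open(case_dir / "meta.json", encoding="utf-8") as f:
        meta = json.load(f)  # returned as-is; key names are not assumed

    masks = {}
    for name in MASK_NAMES:
        path = case_dir / f"LungTexture.Obj.{name}.npy"
        if not path.exists():  # masks are only needed for training
            continue
        mask = np.load(path).astype(bool)
        # Assumption: each mask has the same (D, H, W) shape as the CT volume.
        if mask.shape != ct.shape:
            raise ValueError(f"{name} mask shape {mask.shape} != CT shape {ct.shape}")
        masks[name] = mask
    return ct, meta, masks
```

Calling `load_case("training/training_data_sample/ct1")` from the repository root would return the volume, the metadata dictionary, and whichever masks are present.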

Processed Data

  • IPF mask (i.e., fibrosis mask): Union of Honeycomb and Reticular patterns
  • Fibrosis density: Ratio computed from Fibrosis (Honeycomb or Reticular) and Normal masks
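
The two processed quantities can be sketched with NumPy as below. The union follows directly from the definition above. The density formula, fibrosis voxels divided by fibrosis plus normal voxels, is an assumption, since this card only says the ratio is computed from the two masks; the function names are hypothetical.

```python
import numpy as np


def fibrosis_mask(honeycomb, reticular):
    """IPF (fibrosis) mask: voxel-wise union of the two pattern masks."""
    return np.logical_or(
        np.asarray(honeycomb, dtype=bool), np.asarray(reticular, dtype=bool)
    )


def fibrosis_density(honeycomb, reticular, normal):
    """Fraction of labeled lung voxels that are fibrotic.

    Assumption: density = fibrosis / (fibrosis + normal) by voxel count.
    The exact ratio used by the training code is not stated in this card.
    """
    n_fibrosis = int(fibrosis_mask(honeycomb, reticular).sum())
    n_normal = int(np.asarray(normal, dtype=bool).sum())
    total = n_fibrosis + n_normal
    return n_fibrosis / total if total else 0.0


# Toy 1x2x3 volume: 2 fibrotic voxels (union) and 4 normal voxels.
honeycomb = np.array([[[1, 0, 0], [0, 0, 0]]])
reticular = np.array([[[1, 1, 0], [0, 0, 0]]])
normal = np.array([[[0, 0, 1], [1, 1, 1]]])
density = fibrosis_density(honeycomb, reticular, normal)
print(density)  # 2 fibrotic voxels out of 6 labeled voxels
```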

Access and Usage Conditions

The distribution of these model weights is subject to the following conditions. By requesting and receiving access, you agree that:

  • The weights are provided solely for non-commercial research and educational purposes.
  • Any attempt to use the model for commercial applications, product development, or deployment beyond academic research requires separate written permission from the authors.
  • Users must respect privacy standards and must not attempt to identify or re-identify any individual from the model outputs.
  • Any use of the model in academic work must include a formal citation of the associated publication:

    [TBU: DOI]

  • The model has been developed for academic and research purposes and has not undergone regulatory validation for clinical deployment.
  • The authors and affiliated institution retain all intellectual property rights and disclaim any responsibility for misuse.

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.