---
license: mit
language:
- en
tags:
- medical-imaging
- self-supervised-learning
- mae
- swin-transformer
- 3d-vision
- pytorch
- ct
- opscc
---

# 3D Swin Transformer MAE for OPSCC CT Pretraining

Self-supervised masked autoencoder (MAE) with a **3D Swin Transformer** backbone, trained on cropped OPSCC neck CT volumes. Includes asymmetry-aware loss weighting (airway + soft-tissue features) and overfitting monitoring via augmented-pair cosine similarity.

## Model Details

- **Architecture**: 3D Swin Transformer encoder + lightweight asymmetric decoder + auxiliary asymmetry prediction heads
- **Input shape**: 1×60×128×128 (single-channel CT volumes, intensities normalized to [0, 1])
- **Pretraining objective**: masked reconstruction (75% masking ratio) + auxiliary asymmetry regression
- **Drop path rate**: linear schedule up to 0.1
- **Training**: AdamW, lr = 1e-4, batch size 2 (adjustable), early stopping + cosine-similarity monitoring

## Intended Use & Limitations

**Primary use**: pretraining foundation for downstream OPSCC tasks (staging, segmentation, outcome prediction).

**Not intended for**: direct clinical diagnosis without fine-tuning and validation.

**Limitations**:

- Trained on a limited cohort (TCIA-derived OPSCC cases)
- Assumes cropped, skull-base-to-thoracic-inlet volumes
- Asymmetry heuristics are rule-based and may miss subtle cases
- No multi-modal / contrast-enhanced support yet

## How to Use

```bash
# 1. Clone the repo
git clone https://huggingface.co/jdmayfield/opscc-ct-mae-swin-pretrain
cd opscc-ct-mae-swin-pretrain

# 2. Install dependencies
pip install -r requirements.txt

# 3. Train (or resume from a checkpoint)
python train_mae_swin3d.py \
    --data-dir /path/to/your/cropped_volumes \
    --output-dir ./checkpoints \
    --epochs 100 \
    --batch-size 2 \
    --lr 1e-4
```
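## Example: Patch Masking (Illustrative)

The 75% masking ratio from Model Details can be sketched in NumPy. This is a minimal illustration, not the repo's code: the patch size (4×16×16) and the function name are assumptions chosen so the 60×128×128 volume divides evenly into patches.

```python
import numpy as np

def random_patch_mask(shape=(60, 128, 128), patch=(4, 16, 16),
                      mask_ratio=0.75, seed=0):
    """Randomly split the patch grid into masked and visible patches.

    Illustrative sketch: patch size and signature are assumptions,
    not the repo's actual API.
    """
    rng = np.random.default_rng(seed)
    # Number of non-overlapping patches along each axis: (15, 8, 8) here
    grid = tuple(s // p for s, p in zip(shape, patch))
    n_patches = int(np.prod(grid))                # 960 patches total
    n_masked = int(round(mask_ratio * n_patches)) # 720 hidden at 75%
    perm = rng.permutation(n_patches)
    masked_idx = perm[:n_masked]   # patches the decoder must reconstruct
    visible_idx = perm[n_masked:]  # patches the encoder actually sees
    return grid, masked_idx, visible_idx

grid, masked_idx, visible_idx = random_patch_mask()
print(grid, len(masked_idx), len(visible_idx))  # (15, 8, 8) 720 240
```

With these assumed numbers, the encoder processes only 240 of 960 patches per volume, which is the main source of MAE's pretraining efficiency.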
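## Example: Asymmetry-Aware Loss Weighting (Illustrative)

The asymmetry-aware loss weighting can be sketched as a weighted masked-reconstruction MSE. This is a hedged sketch under stated assumptions: the function name, the weight map, and the weighting scheme (upweighting voxels flagged by the airway/soft-tissue heuristics) are illustrative, not the repo's implementation.

```python
import numpy as np

def weighted_masked_mse(pred, target, mask, feature_map, feature_weight=2.0):
    """MSE over masked voxels, upweighted where heuristics flag features.

    mask:        1.0 where voxels were hidden from the encoder
                 (MAE computes reconstruction loss only there)
    feature_map: 1.0 where rule-based airway/soft-tissue heuristics fire
                 (an assumption for illustration)
    """
    weights = 1.0 + (feature_weight - 1.0) * feature_map
    err = weights * mask * (pred - target) ** 2
    # Average over masked voxels; guard against an empty mask
    return float(err.sum() / np.maximum(mask.sum(), 1.0))

# Dummy data shaped like one model input volume
rng = np.random.default_rng(0)
vol = rng.random((60, 128, 128)).astype(np.float32)      # CT in [0, 1]
pred = vol + 0.01                                        # toy reconstruction
mask = (rng.random(vol.shape) < 0.75).astype(np.float32) # 75% masked
feat = (rng.random(vol.shape) < 0.10).astype(np.float32) # flagged regions
loss = weighted_masked_mse(pred, vol, mask, feat)
```

Restricting the loss to masked voxels matches the standard MAE objective; the feature weighting simply biases reconstruction quality toward anatomically asymmetry-relevant regions.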