---
title: AMDRisk
emoji: 📉
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
license: mit
short_description: Predicting risk of late AMD using Deep Learning
---

# AMD Risk Prediction

PyTorch reimplementation of the DeepSeeNet-based AMD progression risk framework from Peng et al., *npj Digital Medicine* 2020, **“Predicting risk of late age-related macular degeneration using deep learning.”**

- Extracts DeepSeeNet hidden features from drusen and pigment abnormality models.
- Fits Cox proportional hazards models using image-derived features plus age and smoking status.

## Feature Extraction

- Input: paired baseline CFPs
  - `LE_PATHNAME`
  - `RE_PATHNAME`
- Models used:
  - `deepseenet/weights/drus.pt`
  - `deepseenet/weights/pig.pt`
- Image preprocessing:
  - RGB conversion
  - validation transform from `deepseenet/augmentations.py`
  - default input size: `1024 × 1024`
- Feature source:
  - penultimate layer of each DeepSeeNet classifier
  - final linear layer input captured by forward hook
- Feature layout:

  ```text
  LE_DRUS_000 ... LE_DRUS_127
  RE_DRUS_000 ... RE_DRUS_127
  LE_PIG_000 ... LE_PIG_127
  RE_PIG_000 ... RE_PIG_127
  ```

- Total feature dimension:

  ```text
  128 × 2 models × 2 eyes = 512 features / patient
  ```

- Output:

  ```text
  data/areds1_deepseenet_features.npz
  ```

- Stored arrays:
  - `features`: `(N, 512)`
  - `patids`: `(N,)`
  - `feature_names`: `(512,)`

## Cox Model Training

- Survival labels from endpoint-specific JSON files:
  - `Status_late_amd`
  - `Status_anyga`
  - `Status_nv`
- Time-to-event column:
  - `Survival_in_years`
- Predefined fold split:
  - train: folds `3, 4, 5, 6, 7, 8, 9`
  - validation: fold `2`
  - test: folds `0, 1`
- Stage 1 baseline:
  - structured grading features only

  ```text
  LE_DRUS
  RE_DRUS
  LE_PIG
  RE_PIG
  age
  smkever
  ```

- Stage 2 DeepSeeNet features:
  - selected hidden features from 512-dimensional feature vector
  - plus `age`
  - plus `smkever`
- Preprocessing:
  - low-variance feature filtering
  - train-only `StandardScaler`
  - same scaler applied to validation/test
- Cox model:
  - `lifelines.CoxPHFitter`
  - L2 penalization
  - default penalizer: `0.01`
- Feature-selection tweaks:
  - global top-k selection
    - rank features by univariate train-set concordance
    - examples: `--top-k 8`, `--top-k 16`
  - block-balanced top-k selection
    - select top-k features separately from each block:

    ```text
    LE_DRUS_*
    RE_DRUS_*
    LE_PIG_*
    RE_PIG_*
    ```

- Block-balanced example:

  ```text
  --top-k-per-block 4

  4 LE_DRUS features
  4 RE_DRUS features
  4 LE_PIG features
  4 RE_PIG features
  + age
  + smkever
  = 18 total features
  ```

- Rationale:
  - avoids one highly correlated feature block dominating global top-k
  - improves stability of Cox fitting
  - closer in spirit to grouped/correlation-aware feature selection
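
The block-balanced selection described above can be illustrated with a small, dependency-free sketch. It is not the repository's code: the function names `concordance`, `univariate_score`, and `block_balanced_top_k` are hypothetical, the concordance is a simplified Harrell's C that skips pairs with tied event times, and the actual implementation may instead rely on a library routine such as `lifelines.utils.concordance_index`. Only the block prefixes and the `age` / `smkever` covariates are taken from this README.

```python
from collections import defaultdict

# Block prefixes as listed in the feature layout of this README.
BLOCKS = ("LE_DRUS", "RE_DRUS", "LE_PIG", "RE_PIG")


def concordance(values, times, events):
    """Simplified Harrell's C for one feature (higher value = higher risk).

    Assumption: pairs with tied event times are skipped entirely.
    """
    concordant = 0.0
    comparable = 0
    n = len(values)
    for i in range(n):
        if not events[i]:
            continue  # the earlier member of a pair must be an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if values[i] > values[j]:
                    concordant += 1.0
                elif values[i] == values[j]:
                    concordant += 0.5  # tied feature values count as half
    return concordant / comparable if comparable else 0.5


def univariate_score(values, times, events):
    """Direction-agnostic ranking score in [0.5, 1]."""
    c = concordance(values, times, events)
    return max(c, 1.0 - c)


def block_balanced_top_k(scores, k):
    """Return the k best-scoring feature names from each block, in block order."""
    by_block = defaultdict(list)
    for name, score in scores.items():
        block = name.rsplit("_", 1)[0]  # "LE_DRUS_017" -> "LE_DRUS"
        by_block[block].append((score, name))
    selected = []
    for block in BLOCKS:
        # Sort by descending score; break ties by name for determinism.
        ranked = sorted(by_block[block], key=lambda item: (-item[0], item[1]))
        selected.extend(name for _, name in ranked[:k])
    return selected


# Toy scores standing in for train-set univariate concordances (made-up values).
scores = {f"{block}_{i:03d}": 0.5 + i / 1000 for block in BLOCKS for i in range(128)}
selected = block_balanced_top_k(scores, k=4) + ["age", "smkever"]
print(len(selected))  # 18
```

In a real run the scores would be computed on the training folds only, so that validation and test folds do not influence which features are kept.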