	---
	title: AMDRisk
	emoji: 📉
	colorFrom: indigo
	colorTo: pink
	sdk: gradio
	sdk_version: 6.14.0
	python_version: '3.13'
	app_file: app.py
	pinned: false
	license: mit
	short_description: Predicting risk of late AMD using Deep Learning
	---

	# AMD Risk Prediction

	PyTorch reimplementation of the DeepSeeNet-based AMD progression risk framework from Peng et al., npj Digital Medicine 2020, “Predicting risk of late age-related macular degeneration using deep learning.”

	- Extracts DeepSeeNet hidden features from drusen and pigment abnormality models.
	- Fits Cox proportional hazards models using image-derived features plus age and smoking status.

	## Feature Extraction

	- Input: paired baseline color fundus photographs (CFPs)
	  - `LE_PATHNAME` (left eye)
	  - `RE_PATHNAME` (right eye)
	- Models used:
	  - `deepseenet/weights/drus.pt`
	  - `deepseenet/weights/pig.pt`
	- Image preprocessing:
	  - RGB conversion
	  - validation transform from `deepseenet/augmentations.py`
	  - default input size: `1024 × 1024`
	- Feature source:
	  - penultimate layer of each DeepSeeNet classifier
	  - final linear layer input captured by forward hook
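
As an illustration of the feature source described above, the sketch below captures the input to a final linear layer with a PyTorch forward hook. It uses a small stand-in network, not the actual DeepSeeNet architecture: `TinyClassifier`, its layer sizes, the `fc` attribute name, and the helper `extract_penultimate` are all assumptions made for this example.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a DeepSeeNet classifier (not the real architecture)."""

    def __init__(self, hidden_dim: int = 128, num_classes: int = 3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, hidden_dim),
            nn.ReLU(),
        )
        self.fc = nn.Linear(hidden_dim, num_classes)  # final linear layer

    def forward(self, x):
        return self.fc(self.backbone(x))


def extract_penultimate(model: nn.Module, final_layer: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Return the tensor that is fed into `final_layer` for a batch of images."""
    captured = {}

    def hook(module, inputs, output):
        # `inputs` is the tuple of positional arguments passed to the layer.
        captured["features"] = inputs[0].detach()

    handle = final_layer.register_forward_hook(hook)
    try:
        model.eval()
        with torch.no_grad():
            model(images)
    finally:
        handle.remove()  # always detach the hook, even if the forward pass fails
    return captured["features"]


model = TinyClassifier()
batch = torch.rand(2, 3, 64, 64)  # small random batch standing in for preprocessed CFPs
features = extract_penultimate(model, model.fc, batch)
```

Hooking the layer input rather than editing the model keeps the classifier weights and forward pass untouched, so the same checkpoint can serve both classification and feature extraction.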

	- Feature layout:
	```text
	LE_DRUS_000 ... LE_DRUS_127
	RE_DRUS_000 ... RE_DRUS_127
	LE_PIG_000 ... LE_PIG_127
	RE_PIG_000 ... RE_PIG_127
	```

	- Total feature dimension:
	```text
	128 × 2 models × 2 eyes = 512 features / patient
	```
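
The layout and the 512-feature total above can be reproduced in a few lines of Python. The helper name `build_feature_names` is illustrative and may not match the repository code; the block order follows the layout listed above.

```python
def build_feature_names(dim: int = 128) -> list[str]:
    """Build names in the order LE_DRUS, RE_DRUS, LE_PIG, RE_PIG."""
    names = []
    for model in ("DRUS", "PIG"):      # drusen block first, then pigment
        for eye in ("LE", "RE"):       # left eye, then right eye
            names.extend(f"{eye}_{model}_{i:03d}" for i in range(dim))
    return names


feature_names = build_feature_names()
print(len(feature_names))  # 128 * 2 models * 2 eyes
```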

	- Output:
	```text
	data/areds1_deepseenet_features.npz
	```

	- Stored arrays:
	  - `features`: `(N, 512)`
	  - `patids`: `(N,)`
	  - `feature_names`: `(512,)`
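
A minimal sketch of writing and reading an archive with the three arrays listed above, using NumPy. The data here is synthetic and the file is written to a temporary directory; the placeholder names and patient IDs are assumptions, and the actual extraction script may store the arrays differently.

```python
import os
import tempfile

import numpy as np

n_patients = 5
rng = np.random.default_rng(0)

features = rng.normal(size=(n_patients, 512)).astype(np.float32)  # synthetic features
patids = np.arange(1000, 1000 + n_patients)                       # synthetic patient IDs
feature_names = np.array([f"F_{i:03d}" for i in range(512)])      # placeholder names

path = os.path.join(tempfile.mkdtemp(), "features_demo.npz")
np.savez_compressed(path, features=features, patids=patids, feature_names=feature_names)

# Read the arrays back by key; the context manager closes the file handle.
with np.load(path) as data:
    loaded_features = data["features"]
    loaded_patids = data["patids"]
    loaded_names = data["feature_names"]
```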


	## Cox Model Training

	- Survival labels from endpoint-specific JSON files:
	  - `Status_late_amd`
	  - `Status_anyga`
	  - `Status_nv`

	- Time-to-event column:
	  - `Survival_in_years`

	- Predefined fold split:
	  - train: folds `3, 4, 5, 6, 7, 8, 9`
	  - validation: fold `2`
	  - test: folds `0, 1`
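
The fold assignment above can be expressed as a simple lookup. This sketch assumes each patient record carries an integer `fold` field; that field name and the synthetic records are hypothetical.

```python
TRAIN_FOLDS = {3, 4, 5, 6, 7, 8, 9}
VAL_FOLDS = {2}
TEST_FOLDS = {0, 1}


def assign_split(fold: int) -> str:
    """Map a fold index to its predefined split."""
    if fold in TRAIN_FOLDS:
        return "train"
    if fold in VAL_FOLDS:
        return "val"
    if fold in TEST_FOLDS:
        return "test"
    raise ValueError(f"unexpected fold index: {fold}")


# Synthetic records: 20 patients spread evenly over folds 0..9.
records = [{"patid": 1000 + i, "fold": i % 10} for i in range(20)]

splits = {"train": [], "val": [], "test": []}
for record in records:
    splits[assign_split(record["fold"])].append(record["patid"])
```

Keeping the split tied to fixed fold indices means every endpoint and every feature set is evaluated on the same patients.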

	- Stage 1 baseline:
	  - structured grading features only

	```text
	LE_DRUS
	RE_DRUS
	LE_PIG
	RE_PIG
	age
	smkever
	```

	- Stage 2 DeepSeeNet features:
	  - selected hidden features from 512-dimensional feature vector
	  - plus `age`
	  - plus `smkever`

	- Preprocessing:
	  - low-variance feature filtering
	  - train-only `StandardScaler`
	  - same scaler applied to validation/test
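
The preprocessing steps above could look roughly like the following with scikit-learn. The use of `VarianceThreshold` and the threshold value `1e-8` are assumptions, since the source does not state how low-variance filtering is done; the data is synthetic.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 6))
X_val = rng.normal(size=(20, 6))
X_test = rng.normal(size=(30, 6))

# Make one column constant so the filter has something to drop.
for X in (X_train, X_val, X_test):
    X[:, 2] = 1.0

# Fit the filter and the scaler on the training split only.
selector = VarianceThreshold(threshold=1e-8)  # threshold is an assumed value
X_train_f = selector.fit_transform(X_train)
scaler = StandardScaler().fit(X_train_f)

# Reuse the fitted objects unchanged on every split.
X_train_s = scaler.transform(X_train_f)
X_val_s = scaler.transform(selector.transform(X_val))
X_test_s = scaler.transform(selector.transform(X_test))
```

Fitting both steps on the training split alone keeps validation and test statistics out of the model inputs.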

	- Cox model:
	  - `lifelines.CoxPHFitter`
	  - L2 penalization
	  - default penalizer: `0.01`

	- Feature-selection tweaks:
	  - global top-k selection
	    - rank features by univariate train-set concordance
	    - examples: `--top-k 8`, `--top-k 16`
	  - block-balanced top-k selection
	    - select top-k features separately from each block:

	```text
	LE_DRUS_*
	RE_DRUS_*
	LE_PIG_*
	RE_PIG_*
	```

	- Block-balanced example:

	```text
	--top-k-per-block 4

	4 LE_DRUS features
	4 RE_DRUS features
	4 LE_PIG features
	4 RE_PIG features
	+ age
	+ smkever
	= 18 total features
	```
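
The block-balanced selection above can be sketched as follows. The concordance function is a plain Harrell's C written out in NumPy, and scoring each feature by `max(c, 1 - c)` so that direction does not matter is an assumption; the repository may compute univariate concordance differently. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def concordance(score, time, event):
    """Harrell's C: share of comparable pairs where the higher score fails earlier."""
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # censored subjects cannot anchor a comparable pair
        later = time > time[i]  # subjects still event-free after subject i's event
        comparable += int(later.sum())
        concordant += float((score[i] > score[later]).sum())
        concordant += 0.5 * float((score[i] == score[later]).sum())
    return concordant / comparable if comparable else 0.5


def top_k_per_block(X, names, time, event, k):
    """Pick the k best-ranked features from each block (LE_DRUS, RE_DRUS, ...)."""
    blocks = {}
    for j, name in enumerate(names):
        block = name.rsplit("_", 1)[0]  # "LE_DRUS_017" -> "LE_DRUS"
        c = concordance(X[:, j], time, event)
        blocks.setdefault(block, []).append((max(c, 1.0 - c), name))
    selected = []
    for scored in blocks.values():
        scored.sort(reverse=True)  # best concordance first
        selected.extend(name for _, name in scored[:k])
    return selected


# Synthetic training data: 10 features per block instead of 128, to keep it small.
rng = np.random.default_rng(0)
names = [f"{eye}_{m}_{i:03d}" for m in ("DRUS", "PIG") for eye in ("LE", "RE") for i in range(10)]
X = rng.normal(size=(80, len(names)))
time = rng.exponential(5.0, size=80)
event = rng.integers(0, 2, size=80).astype(bool)

selected = top_k_per_block(X, names, time, event, k=4) + ["age", "smkever"]
```

Ranking should use the training split only, so that the selected feature set does not depend on validation or test outcomes.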

	- Rationale:
	  - avoids one highly correlated feature block dominating global top-k
	  - improves stability of Cox fitting
	  - closer in spirit to grouped/correlation-aware feature selection