PrazNeuro committed on
Commit bda8d0e · verified · 1 Parent(s): e386fee

Update README.md

  1. README.md +94 -69
README.md CHANGED
@@ -1,69 +1,94 @@
1
- Project: PRECISE-GBM - Model training & retraining helpers
2
-
3
- Overview
4
-
5
- This repository contains code to train models (Gaussian Mixture labelling + SVM and ensemble classifiers) and to persist all artifacts required to reproduce or retrain models on new data. It includes:
6
-
7
- - `Scenario_heldout_final_PRECISE.py`: training pipeline producing `.joblib` models and metadata JSONs (selected features, best params, CV results).
8
- - `retrain_helper.py`: CLI utility to rebuild pipelines, set best params and retrain using saved selected-features and params JSONs. Supports JSON/YAML config files and auto-detection of model type.
9
- - `README_RETRAIN.md`: detailed retrain examples and a notebook cell.
10
-
11
- This repo also includes helper files to make it ready for GitHub:
12
- - `requirements.txt`: Python dependencies
13
- - `.gitignore`: recommended ignores (models, caches, logs)
14
- - `LICENSE`: MIT license
15
- - GitHub Actions workflow for CI (pytest smoke test)
16
-
17
- Getting started (Windows PowerShell)
18
-
19
- 1) Create and activate a virtual environment
20
-
21
- ```powershell
22
- python -m venv .venv
23
- .\.venv\Scripts\Activate.ps1
24
- ```
25
-
26
- 2) Install dependencies
27
-
28
- ```powershell
29
- pip install --upgrade pip
30
- pip install -r requirements.txt
31
- ```
32
-
33
- 3) Run training (note: the training script reads data from absolute paths configured in the script; adjust them or run from an environment where those files are present)
34
-
35
- ```powershell
36
- python Scenario_heldout_final_PRECISE.py
37
- ```
38
-
39
- The training script will create model files under `models_LM22/` and `models_GBM/` and write metadata JSONs next to each joblib model (selected features, params, cv results) as well as group-level JSON summaries.
40
-
41
- Retraining
42
-
43
- See `README_RETRAIN.md` for detailed CLI and notebook examples. Short example:
44
-
45
- ```powershell
46
- python retrain_helper.py `
47
- --model-prefix "models_GBM/scenario_1/GBM_scen1_Tcell" `
48
- --train-csv "data\new_train.csv" `
49
- --label-col "label"
50
- ```
51
-
52
- Notes
53
-
54
- - The training script contains hard-coded absolute paths to data files. Before running on another machine, update the `scenarios_*` file paths or place the datasets in the same paths.
55
- - Retrain helper auto-detects model type when `--model-type` is omitted by looking for `{prefix}_svm_params.json` or `{prefix}_ens_params.json`.
56
- - YAML config support for retrain requires PyYAML (`pip install pyyaml`).
57
-
58
- CI
59
-
60
- A basic GitHub Actions workflow runs a pytest smoke test to check that the retrain helper imports and that basic pipeline construction works. It does not run heavy training.
61
-
62
- Contributing
63
-
64
- See `CONTRIBUTING.md` for guidance on opening issues and PRs.
65
-
66
- License
67
-
68
- This project is released under the MIT License; see `LICENSE`.
69
-
1
+ <p align="center"> <b> Predictive Radiomics for Evaluation of Cancer Immune SignaturE in Glioblastoma | PRECISE-GBM </b> </p>
2
+
3
+ <p align="center">
4
+ <img src="PRECISE-GBM_GUI_logo%20(1).png" alt="PRECISE-GBM Logo">
5
+ </p>
6
+
7
+ <b> Project: PRECISE-GBM - Model training & retraining helpers </b>
8
+
9
+ Overview
10
+
11
+ This repository contains code to train models (Gaussian Mixture labelling + SVM and ensemble classifiers) and to persist all artifacts required to reproduce or retrain models on new data. It includes:
12
+
13
+ - `Scenario_heldout_final_PRECISE.py`: training pipeline producing `.joblib` models and metadata JSONs (selected features, best params, CV results).
14
+ - `retrain_helper.py`: CLI utility to rebuild pipelines, set best params and retrain using saved selected-features and params JSONs. Supports JSON/YAML config files and auto-detection of model type.
15
+ - `README_RETRAIN.md`: detailed retrain examples and a notebook cell.
16
+
17
+ This repo also includes helper files to make it ready for GitHub:
18
+ - `requirements.txt`: Python dependencies
19
+ - `.gitignore`: recommended ignores (models, caches, logs)
20
+ - `LICENSE`: MIT license
21
+ - GitHub Actions workflow for CI (pytest smoke test)
22
+
23
+ Getting started (Windows PowerShell)
24
+
25
+ 1) Create and activate a virtual environment
26
+
27
+ ```powershell
28
+ python -m venv .venv
29
+ .\.venv\Scripts\Activate.ps1
30
+ ```
31
+
32
+ 2) Install dependencies
33
+
34
+ ```powershell
35
+ pip install --upgrade pip
36
+ pip install -r requirements.txt
37
+ ```
38
+
39
+ 3) Run training (note: the training script reads data from absolute paths configured in the script; adjust them or run from an environment where those files are present)
40
+
41
+ ```powershell
42
+ python Scenario_heldout_final_PRECISE.py
43
+ ```
44
+
45
+ The training script will create model files under `models_LM22/` and `models_GBM/` and write metadata JSONs next to each joblib model (selected features, params, cv results) as well as group-level JSON summaries.
46
+
47
+ Retraining
48
+
49
+ See `README_RETRAIN.md` for detailed CLI and notebook examples. Short example:
50
+
51
+ ```powershell
52
+ python retrain_helper.py `
53
+ --model-prefix "models_GBM/scenario_1/GBM_scen1_Tcell" `
54
+ --train-csv "data\new_train.csv" `
55
+ --label-col "label"
56
+ ```
57
+
58
+ Notes
59
+
60
+ - The training script contains hard-coded absolute paths to data files. Before running on another machine, update the `scenarios_*` file paths or place the datasets in the same paths.
61
+ - Retrain helper auto-detects model type when `--model-type` is omitted by looking for `{prefix}_svm_params.json` or `{prefix}_ens_params.json`.
62
+ - YAML config support for retrain requires PyYAML (`pip install pyyaml`).
63
+
64
+ CI
65
+
66
+ A basic GitHub Actions workflow runs a pytest smoke test to check that the retrain helper imports and that basic pipeline construction works. It does not run heavy training.
67
+
68
+ Contributing
69
+
70
+ See `CONTRIBUTING.md` for guidance on opening issues and PRs.
71
+
72
+ License
73
+
74
+ This project is released under the MIT License; see `LICENSE`.
75
+
76
+ Citation
77
+ Please cite the following when using this repository.
78
+
79
+ 2025
80
+
81
+ • Ghimire P, Modat M, Booth T. Predictive radiogenomic AI Model for patient stratification in brain tumor immunotherapy trials. Neuro-Oncology. Oct 2025; 26(Suppl_3): iii58–iii59. https://doi.org/10.1093/neuonc/noaf193.188
82
+
83
+ • Ghimire P, Modat M, Booth T. Radiogenomic AI model predicts immune status in IDH wildtype glioblastoma: PRECISE-GBM study. RCR open. Jan 2025; 3(1): 100234.
84
+
85
+ 2024
86
+
87
+ • Ghimire P, Modat M, Booth T. A machine learning based predictive radiomics for evaluation of cancer immune signature in glioblastoma: the PRECISE-GBM study. Neuro-Oncology. Oct 2024; 26(suppl_5): v25.
88
+
89
+ • Ghimire P, Modat M, Booth T. A radiogenomic machine learning based study to identify Predictive Radiomics for Evaluation of Cancer Immune SignaturE in IDHw Glioblastoma. Neuro-Oncology. Oct 2024; 26(suppl_7): vii3.
90
+
91
+
92
+
93
+
94
+
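
The README's Notes say the retrain helper auto-detects the model type when `--model-type` is omitted, by looking for `{prefix}_svm_params.json` or `{prefix}_ens_params.json`. A minimal sketch of how such a check might work is below. It is an illustration only, not the code in `retrain_helper.py`: the function name `detect_model_type`, the return values `"svm"` and `"ens"`, and the SVM-first precedence when both files exist are all assumptions.

```python
from pathlib import Path


def detect_model_type(prefix: str) -> str:
    """Guess the model type from which params JSON exists for the prefix.

    Hypothetical helper: the name, the return values, and the SVM-first
    precedence are assumptions, not taken from retrain_helper.py.
    """
    # The two file-name patterns come from the README's Notes section.
    candidates = {
        "svm": Path(f"{prefix}_svm_params.json"),
        "ens": Path(f"{prefix}_ens_params.json"),
    }
    # Dicts preserve insertion order, so "svm" is checked before "ens".
    for model_type, params_path in candidates.items():
        if params_path.is_file():
            return model_type
    raise FileNotFoundError(
        f"No params JSON found for prefix {prefix!r}; "
        "pass the model type explicitly instead."
    )
```

With the prefix from the README example, `detect_model_type("models_GBM/scenario_1/GBM_scen1_Tcell")` would return `"svm"` or `"ens"` depending on which params file the training run wrote. Raising when neither file exists, instead of silently picking a default, keeps a mistyped prefix from retraining the wrong kind of model.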