Commit 171c85a by gprolcastelo · verified · 1 parent: 245b43f

Update README.md

Files changed (1)
  1. README.md +218 -204
README.md CHANGED
@@ -1,204 +1,218 @@
1
- # Pretrained Models
2
-
3
- This directory contains pretrained VAE and reconstruction network models obtained during the WP3 of the EVENFLOW EU project.
4
-
5
- These models have been trained on a pre-processed version of the bulk RNA-Seq TCGA datasets of either KIRC or BRCA, independently (see data availability in the respective section).
6
-
7
- ## Available Models
8
-
9
- ### KIRC (Kidney Renal Clear Cell Carcinoma)
10
-
11
- **Location**: `KIRC/`
12
-
13
- *Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17987300)
14
-
15
- **Model Files**:
16
- - `20250321_VAE_idim8516_md512_feat256mse_relu.pth` - VAE weights
17
- - `network_reconstruction.pth` - Reconstruction network weights
18
- - `network_dims.csv` - Network architecture specifications
19
-
20
- **Model Specifications**:
21
- - Input dimension: 8,516 genes
22
- - VAE architecture:
23
- - Middle dimension: 512
24
- - Latent dimension: 256
25
- - Loss function: MSE
26
- - Activation: ReLU
27
- - Reconstruction network: [8954, 3512, 824, 3731, 8954]
28
- - Training: Beta-VAE with 3 cycles, 600 epochs total
29
-
30
- ### BRCA (Breast Invasive Carcinoma)
31
-
32
- **Location**: `BRCA/`
33
-
34
- *Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17986123)
35
-
36
- **Model Files**:
37
- - `20251209_VAE_idim8954_md1024_feat512mse_relu.pth` - VAE weights
38
- - `network_reconstruction.pth` - Reconstruction network weights
39
- - `network_dims.csv` - Network architecture specifications
40
-
41
- **Model Specifications**:
42
- - Input dimension: 8,954 genes
43
- - VAE architecture:
44
- - Middle dimension: 1,024
45
- - Latent dimension: 512
46
- - Loss function: MSE
47
- - Activation: ReLU
48
- - Reconstruction network: [8954, 3104, 790, 4027, 8954]
49
- - Training: Beta-VAE with 3 cycles, 600 epochs total
50
-
51
- ## Usage
52
-
53
- ### Loading Models in Python
54
-
55
- See [renalprog](https://www.github.com/gprolcastelo/renalprog) for the needed VAE and NetworkReconstruction objects.
56
-
57
-
58
- ```python
59
- import torch
60
- import pandas as pd
61
- import json
62
- from pathlib import Path
63
- import huggingface_hub as hf
64
- from renalprog.modeling.train import VAE, NetworkReconstruction
65
-
66
- # Configuration
67
- cancer_type = "KIRC" # or "BRCA"
68
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
69
-
70
- # ============================================================================
71
- # Load VAE Model
72
- # ============================================================================
73
-
74
- # Download VAE config
75
- vae_config_path = hf.hf_hub_download(
76
- repo_id="gprolcastelo/evenflow_models",
77
- filename=f"{cancer_type}/config.json"
78
- )
79
-
80
- # Load configuration
81
- with open(vae_config_path, "r") as f:
82
-     vae_config = json.load(f)
83
-
84
- print(f"VAE Configuration: {vae_config}")
85
-
86
- # Download VAE model weights
87
- if cancer_type == "KIRC":
88
-     vae_filename = "KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth"
89
- elif cancer_type == "BRCA":
90
-     vae_filename = "BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth"
91
- else:
92
-     raise ValueError(f"Unknown cancer type: {cancer_type}")
93
-
94
- vae_model_path = hf.hf_hub_download(
95
- repo_id="gprolcastelo/evenflow_models",
96
- filename=vae_filename
97
- )
98
-
99
- # Initialize and load VAE
100
- model_vae = VAE(
101
- input_dim=vae_config["INPUT_DIM"],
102
- mid_dim=vae_config["MID_DIM"],
103
- features=vae_config["LATENT_DIM"]
104
- ).to(device)
105
-
106
- checkpoint_vae = torch.load(vae_model_path, map_location=device, weights_only=False)
107
- model_vae.load_state_dict(checkpoint_vae)
108
- model_vae.eval()
109
-
110
- print(f"VAE model loaded successfully from {cancer_type}")
111
-
112
- # ============================================================================
113
- # Load Reconstruction Network
114
- # ============================================================================
115
-
116
- # Download network dimensions
117
- network_dims_path = hf.hf_hub_download(
118
- repo_id="gprolcastelo/evenflow_models",
119
- filename=f"{cancer_type}/network_dims.csv"
120
- )
121
-
122
- # Load network dimensions
123
- network_dims = pd.read_csv(network_dims_path)
124
- layer_dims = network_dims.values.tolist()[0]
125
-
126
- print(f"Reconstruction Network dimensions: {layer_dims}")
127
-
128
- # Download reconstruction network weights
129
- recnet_model_path = hf.hf_hub_download(
130
- repo_id="gprolcastelo/evenflow_models",
131
- filename=f"{cancer_type}/network_reconstruction.pth"
132
- )
133
-
134
- # Initialize and load Reconstruction Network
135
- model_recnet = NetworkReconstruction(layer_dims=layer_dims).to(device)
136
- checkpoint_recnet = torch.load(recnet_model_path, map_location=device, weights_only=False)
137
- model_recnet.load_state_dict(checkpoint_recnet)
138
- model_recnet.eval()
139
-
140
- print(f"Reconstruction Network loaded successfully from {cancer_type}")
141
-
142
- # ============================================================================
143
- # Use the models
144
- # ============================================================================
145
-
146
- # Example: Apply VAE to your data
147
- # your_data = torch.tensor(your_data_array).float().to(device)
148
- # with torch.no_grad():
149
- #     vae_output = model_vae(your_data)
150
- #     recnet_output = model_recnet(vae_output)
151
-
152
- ```
153
-
154
- ## Citation
155
-
156
- !!! warning "Warning"
157
- This citation is temporary. It will be updated when a pre-print is released.
158
-
159
- If you use these pretrained models, please cite:
160
-
161
- ```bibtex
162
- @software{renalprog2024,
163
- title = {RenalProg: A Deep Learning Framework for Kidney Cancer Progression Modeling},
164
- author = {[Guillermo Prol-Castelo, Elina Syrri, Nikolaos Manginas, Vasileos Manginas, Nikos Katzouris, Davide Cirillo, George Paliouras, Alfonso Valencia]},
165
- year = {2025},
166
- url = {https://github.com/gprolcas/renalprog},
167
- note = {Preprint in preparation}
168
- }
169
- ```
170
-
171
- ## Training Details
172
-
173
- These models were trained using:
174
- - Random seed: 2023
175
- - Train/test split: 80/20
176
- - Optimizer: Adam
177
- - Learning rate: 1e-4
178
- - Batch size: 8
179
- - Beta annealing (for VAE): 3 cycles with 0.5 ratio
180
-
181
- ## Model Performance
182
-
183
- **KIRC Model**:
184
- - Reconstruction loss (test): ~1.1
185
-
186
- **BRCA Model**:
187
- - Reconstruction loss (test): ~0.9
188
-
189
- ## License
190
-
191
- These pretrained models are provided under the same Apache 2.0 license.
192
-
193
- ## Contact
194
-
195
- For questions about the pretrained models, please:
196
- 1. Check the [documentation](https://gprolcastelo.github.io/renalprog/)
197
- 2. Open an issue on [GitHub](https://github.com/gprolcastelo/renalprog/issues)
198
- 3. Contact the authors
199
-
200
- ---
201
-
202
- **Last Updated**: December 2025
203
- **Version**: 1.0.0-alpha
204
-
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ pipeline_tag: tabular-regression
6
+ tags:
7
+ - VAE
8
+ - bioinformatics
9
+ - TCGA
10
+ - ccRCC
11
+ - KIRC
12
+ - cancer
13
+ ---
14
+
15
+
16
+ # Pretrained Models
17
+
18
+ This directory contains pretrained VAE and reconstruction network models developed in WP3 of the EVENFLOW EU project.
19
+
20
+ Each model was trained independently on a pre-processed version of the TCGA bulk RNA-Seq dataset for either KIRC or BRCA (see the data availability link in the respective section).
21
+
22
+ ## Available Models
23
+
24
+ ### KIRC (Kidney Renal Clear Cell Carcinoma)
25
+
26
+ **Location**: `KIRC/`
27
+
28
+ *Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17987300)
29
+
30
+ **Model Files**:
31
+ - `20250321_VAE_idim8516_md512_feat256mse_relu.pth` - VAE weights
32
+ - `network_reconstruction.pth` - Reconstruction network weights
33
+ - `network_dims.csv` - Network architecture specifications
34
+
35
+ **Model Specifications**:
36
+ - Input dimension: 8,516 genes
37
+ - VAE architecture:
38
+ - Middle dimension: 512
39
+ - Latent dimension: 256
40
+ - Loss function: MSE
41
+ - Activation: ReLU
42
+ - Reconstruction network: [8954, 3512, 824, 3731, 8954]
43
+ - Training: Beta-VAE with 3 cycles, 600 epochs total
44
+
45
+ ### BRCA (Breast Invasive Carcinoma)
46
+
47
+ **Location**: `BRCA/`
48
+
49
+ *Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17986123)
50
+
51
+ **Model Files**:
52
+ - `20251209_VAE_idim8954_md1024_feat512mse_relu.pth` - VAE weights
53
+ - `network_reconstruction.pth` - Reconstruction network weights
54
+ - `network_dims.csv` - Network architecture specifications
55
+
56
+ **Model Specifications**:
57
+ - Input dimension: 8,954 genes
58
+ - VAE architecture:
59
+ - Middle dimension: 1,024
60
+ - Latent dimension: 512
61
+ - Loss function: MSE
62
+ - Activation: ReLU
63
+ - Reconstruction network: [8954, 3104, 790, 4027, 8954]
64
+ - Training: Beta-VAE with 3 cycles, 600 epochs total
65
+
66
+ ## Usage
67
+
68
+ ### Loading Models in Python
69
+
70
+ The `VAE` and `NetworkReconstruction` classes needed to load the models are defined in [renalprog](https://www.github.com/gprolcastelo/renalprog).
71
+
72
+
73
+ ```python
74
+ import torch
75
+ import pandas as pd
76
+ import json
77
+ from pathlib import Path
78
+ import huggingface_hub as hf
79
+ from renalprog.modeling.train import VAE, NetworkReconstruction
80
+
81
+ # Configuration
82
+ cancer_type = "KIRC" # or "BRCA"
83
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
84
+
85
+ # ============================================================================
86
+ # Load VAE Model
87
+ # ============================================================================
88
+
89
+ # Download VAE config
90
+ vae_config_path = hf.hf_hub_download(
91
+ repo_id="gprolcastelo/evenflow_models",
92
+ filename=f"{cancer_type}/config.json"
93
+ )
94
+
95
+ # Load configuration
96
+ with open(vae_config_path, "r") as f:
97
+     vae_config = json.load(f)
98
+
99
+ print(f"VAE Configuration: {vae_config}")
100
+
101
+ # Download VAE model weights
102
+ if cancer_type == "KIRC":
103
+     vae_filename = "KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth"
104
+ elif cancer_type == "BRCA":
105
+     vae_filename = "BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth"
106
+ else:
107
+     raise ValueError(f"Unknown cancer type: {cancer_type}")
108
+
109
+ vae_model_path = hf.hf_hub_download(
110
+ repo_id="gprolcastelo/evenflow_models",
111
+ filename=vae_filename
112
+ )
113
+
114
+ # Initialize and load VAE
115
+ model_vae = VAE(
116
+ input_dim=vae_config["INPUT_DIM"],
117
+ mid_dim=vae_config["MID_DIM"],
118
+ features=vae_config["LATENT_DIM"]
119
+ ).to(device)
120
+
121
+ checkpoint_vae = torch.load(vae_model_path, map_location=device, weights_only=False)
122
+ model_vae.load_state_dict(checkpoint_vae)
123
+ model_vae.eval()
124
+
125
+ print(f"VAE model loaded successfully from {cancer_type}")
126
+
127
+ # ============================================================================
128
+ # Load Reconstruction Network
129
+ # ============================================================================
130
+
131
+ # Download network dimensions
132
+ network_dims_path = hf.hf_hub_download(
133
+ repo_id="gprolcastelo/evenflow_models",
134
+ filename=f"{cancer_type}/network_dims.csv"
135
+ )
136
+
137
+ # Load network dimensions
138
+ network_dims = pd.read_csv(network_dims_path)
139
+ layer_dims = network_dims.values.tolist()[0]
140
+
141
+ print(f"Reconstruction Network dimensions: {layer_dims}")
142
+
143
+ # Download reconstruction network weights
144
+ recnet_model_path = hf.hf_hub_download(
145
+ repo_id="gprolcastelo/evenflow_models",
146
+ filename=f"{cancer_type}/network_reconstruction.pth"
147
+ )
148
+
149
+ # Initialize and load Reconstruction Network
150
+ model_recnet = NetworkReconstruction(layer_dims=layer_dims).to(device)
151
+ checkpoint_recnet = torch.load(recnet_model_path, map_location=device, weights_only=False)
152
+ model_recnet.load_state_dict(checkpoint_recnet)
153
+ model_recnet.eval()
154
+
155
+ print(f"Reconstruction Network loaded successfully from {cancer_type}")
156
+
157
+ # ============================================================================
158
+ # Use the models
159
+ # ============================================================================
160
+
161
+ # Example: Apply VAE to your data
162
+ # your_data = torch.tensor(your_data_array).float().to(device)
163
+ # with torch.no_grad():
164
+ #     vae_output = model_vae(your_data)
165
+ #     recnet_output = model_recnet(vae_output)
166
+
167
+ ```
168
+
169
+ ## Citation
170
+
171
+ !!! warning "Warning"
172
+ This citation is temporary. It will be updated when a pre-print is released.
173
+
174
+ If you use these pretrained models, please cite:
175
+
176
+ ```bibtex
177
+ @software{renalprog2024,
178
+ title = {RenalProg: A Deep Learning Framework for Kidney Cancer Progression Modeling},
179
+ author = {Guillermo Prol-Castelo and Elina Syrri and Nikolaos Manginas and Vasileos Manginas and Nikos Katzouris and Davide Cirillo and George Paliouras and Alfonso Valencia},
180
+ year = {2025},
181
+ url = {https://github.com/gprolcastelo/renalprog},
182
+ note = {Preprint in preparation}
183
+ }
184
+ ```
185
+
186
+ ## Training Details
187
+
188
+ These models were trained using:
189
+ - Random seed: 2023
190
+ - Train/test split: 80/20
191
+ - Optimizer: Adam
192
+ - Learning rate: 1e-4
193
+ - Batch size: 8
194
+ - Beta annealing (for VAE): 3 cycles with 0.5 ratio
195
+
196
+ ## Model Performance
197
+
198
+ **KIRC Model**:
199
+ - Reconstruction loss (test): ~1.1
200
+
201
+ **BRCA Model**:
202
+ - Reconstruction loss (test): ~0.9
203
+
204
+ ## License
205
+
206
+ These pretrained models are provided under the Apache 2.0 license.
207
+
208
+ ## Contact
209
+
210
+ For questions about the pretrained models, please:
211
+ 1. Check the [documentation](https://gprolcastelo.github.io/renalprog/)
212
+ 2. Open an issue on [GitHub](https://github.com/gprolcastelo/renalprog/issues)
213
+ 3. Contact the authors
214
+
215
+ ---
216
+
217
+ **Last Updated**: December 2025
218
+ **Version**: 1.0.0-alpha
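
The training details in the README mention beta annealing with 3 cycles and a 0.5 ratio over 600 epochs. One common reading of such settings is a cyclical linear schedule: within each cycle, beta ramps from 0 to 1 over the first `ratio` fraction of the cycle and then stays at 1. The sketch below illustrates that reading only. The function name `cyclical_beta`, the linear ramp, and the maximum beta of 1.0 are assumptions for illustration and are not taken from the renalprog code.

```python
def cyclical_beta(epoch: int, total_epochs: int = 600,
                  n_cycles: int = 3, ratio: float = 0.5) -> float:
    """Hypothetical cyclical beta schedule: linear ramp, then hold at 1.0.

    The defaults mirror the values listed in the README (600 epochs,
    3 cycles, 0.5 ratio); the ramp shape itself is an assumption.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    cycle_len = total_epochs / n_cycles          # epochs per cycle
    position = (epoch % cycle_len) / cycle_len   # position in the cycle, in [0, 1)
    return min(position / ratio, 1.0)            # ramp during the first `ratio` part


# Print beta at a few epochs of the first cycle and at the start of the second
for epoch in (0, 50, 100, 199, 200):
    print(epoch, cyclical_beta(epoch))
```

Under these assumptions each cycle spans 200 epochs, so beta grows during epochs 0 to 99 of a cycle and is held at 1.0 for the remaining 100 before resetting to 0.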