---
license: apache-2.0
language:
- en
pipeline_tag: tabular-regression
tags:
- VAE
- bioinformatics
- TCGA
- ccRCC
- KIRC
- cancer
---


# Pretrained Models

This directory contains pretrained VAE and reconstruction network models developed in Work Package 3 (WP3) of the EVENFLOW EU project.

Each model was trained independently on a pre-processed bulk RNA-Seq dataset from TCGA, either KIRC or BRCA. Data availability is listed in each model's section below.

## Available Models

### KIRC (Kidney Renal Clear Cell Carcinoma)

**Location**: `KIRC/`

*Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17987300)

**Model Files**:
- `20250321_VAE_idim8516_md512_feat256mse_relu.pth` - VAE weights
- `network_reconstruction.pth` - Reconstruction network weights
- `network_dims.csv` - Network architecture specifications

**Model Specifications**:
- Input dimension: 8,516 genes
- VAE architecture:
  - Middle dimension: 512
  - Latent dimension: 256
  - Loss function: MSE
  - Activation: ReLU
- Reconstruction network: [8954, 3512, 824, 3731, 8954]
- Training: Beta-VAE with 3 cycles, 600 epochs total
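
The VAE file name appears to encode the specifications listed above (input dimension, middle dimension, latent dimension, loss, activation). The sketch below recovers them with a regular expression. The field meanings are inferred from this model card rather than from a documented naming scheme, and `parse_vae_filename` is a hypothetical helper that is not part of renalprog.

```python
import re

# Assumed layout: <date>_VAE_idim<input>_md<middle>_feat<latent><loss>_<activation>.pth
FILENAME_PATTERN = re.compile(
    r"(?P<date>\d{8})_VAE_idim(?P<input_dim>\d+)_md(?P<mid_dim>\d+)"
    r"_feat(?P<latent_dim>\d+)(?P<loss>[a-z]+)_(?P<activation>[a-z]+)\.pth"
)


def parse_vae_filename(filename):
    """Return the hyperparameters encoded in a VAE weights file name."""
    match = FILENAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"Unrecognised VAE file name: {filename}")
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("input_dim", "mid_dim", "latent_dim"):
        fields[key] = int(fields[key])
    return fields


kirc_spec = parse_vae_filename("20250321_VAE_idim8516_md512_feat256mse_relu.pth")
print(kirc_spec)
```

A check like this can catch a mismatch between the downloaded weights and the dimensions passed to the model constructor before `load_state_dict` fails.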

### BRCA (Breast Invasive Carcinoma)

**Location**: `BRCA/`

*Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17986123)

**Model Files**:
- `20251209_VAE_idim8954_md1024_feat512mse_relu.pth` - VAE weights
- `network_reconstruction.pth` - Reconstruction network weights
- `network_dims.csv` - Network architecture specifications

**Model Specifications**:
- Input dimension: 8,954 genes
- VAE architecture:
  - Middle dimension: 1,024
  - Latent dimension: 512
  - Loss function: MSE
  - Activation: ReLU
- Reconstruction network: [8954, 3104, 790, 4027, 8954]
- Training: Beta-VAE with 3 cycles, 600 epochs total
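
For a rough sense of model size, the layer widths listed for a reconstruction network can be turned into a parameter count. The sketch below assumes that each pair of consecutive widths is joined by a fully connected layer with a bias term, which this model card does not confirm. `count_dense_parameters` is a hypothetical helper, not part of renalprog.

```python
def count_dense_parameters(layer_dims):
    """Count weights and biases of a stack of fully connected layers.

    Assumes one dense layer (with bias) between each pair of consecutive widths.
    """
    if len(layer_dims) < 2:
        raise ValueError("Need at least an input and an output width")
    total = 0
    for n_in, n_out in zip(layer_dims[:-1], layer_dims[1:]):
        total += n_in * n_out  # weight matrix
        total += n_out         # bias vector
    return total


# Layer widths listed above for the BRCA reconstruction network
brca_dims = [8954, 3104, 790, 4027, 8954]
brca_param_count = count_dense_parameters(brca_dims)
print(f"Approximate parameter count: {brca_param_count:,}")
```

The actual count may differ if the network uses normalization layers or omits biases.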

## Usage

### Loading Models in Python
 
The example below imports the `VAE` and `NetworkReconstruction` classes from [renalprog](https://www.github.com/gprolcastelo/renalprog), which must be installed first.


```python
import torch
import pandas as pd
import json
from pathlib import Path
import huggingface_hub as hf
from renalprog.modeling.train import VAE, NetworkReconstruction

# Configuration
cancer_type = "KIRC"  # or "BRCA"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# ============================================================================
# Load VAE Model
# ============================================================================

# Download VAE config
vae_config_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=f"{cancer_type}/config.json"
)

# Load configuration
with open(vae_config_path, "r") as f:
    vae_config = json.load(f)

print(f"VAE Configuration: {vae_config}")

# Download VAE model weights
if cancer_type == "KIRC":
    vae_filename = "KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth"
elif cancer_type == "BRCA":
    vae_filename = "BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth"
else:
    raise ValueError(f"Unknown cancer type: {cancer_type}")

vae_model_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=vae_filename
)

# Initialize and load VAE
model_vae = VAE(
    input_dim=vae_config["INPUT_DIM"],
    mid_dim=vae_config["MID_DIM"],
    features=vae_config["LATENT_DIM"]
).to(device)

checkpoint_vae = torch.load(vae_model_path, map_location=device, weights_only=False)
model_vae.load_state_dict(checkpoint_vae)
model_vae.eval()

print(f"VAE model loaded successfully from {cancer_type}")

# ============================================================================
# Load Reconstruction Network
# ============================================================================

# Download network dimensions
network_dims_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=f"{cancer_type}/network_dims.csv"
)

# Load network dimensions
network_dims = pd.read_csv(network_dims_path)
layer_dims = network_dims.values.tolist()[0]

print(f"Reconstruction Network dimensions: {layer_dims}")

# Download reconstruction network weights
recnet_model_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=f"{cancer_type}/network_reconstruction.pth"
)

# Initialize and load Reconstruction Network
model_recnet = NetworkReconstruction(layer_dims=layer_dims).to(device)
checkpoint_recnet = torch.load(recnet_model_path, map_location=device, weights_only=False)
model_recnet.load_state_dict(checkpoint_recnet)
model_recnet.eval()

print(f"Reconstruction Network loaded successfully from {cancer_type}")

# ============================================================================
# Use the models
# ============================================================================

# Example: Apply VAE to your data
# your_data = torch.tensor(your_data_array).float().to(device)
# with torch.no_grad():
#     vae_output = model_vae(your_data)
#     recnet_output = model_recnet(vae_output)

```

## Citation

> **⚠️ Warning**  
> This citation is temporary. It will be updated when a pre-print is released.


If you use these pretrained models, please cite:

```bibtex
@software{renalprog2024,
  title = {RenalProg: A Deep Learning Framework for Kidney Cancer Progression Modeling},
  author = {Guillermo Prol-Castelo and Elina Syrri and Nikolaos Manginas and Vasileos Manginas and Nikos Katzouris and Davide Cirillo and George Paliouras and Alfonso Valencia},
  year = {2025},
  url = {https://github.com/gprolcastelo/renalprog},
  note = {Preprint in preparation}
}
```

## Training Details

These models were trained using:
- Random seed: 2023
- Train/test split: 80/20
- Optimizer: Adam
- Learning rate: 1e-4
- Batch size: 8
- Beta annealing (for VAE): 3 cycles with 0.5 ratio
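
The beta annealing setting above can be illustrated with a cyclical schedule. This is a minimal sketch under stated assumptions, not the renalprog implementation: the 600 epochs are split into 3 equal cycles, the ratio of 0.5 is read as the fraction of each cycle spent ramping beta linearly from 0 to its maximum, and beta is then held at that maximum for the rest of the cycle. The function name and the `beta_max` parameter are hypothetical.

```python
def cyclical_beta_schedule(n_epochs, n_cycles=3, ratio=0.5, beta_max=1.0):
    """Return one beta value per epoch for a linear cyclical annealing schedule."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    cycle_len = n_epochs / n_cycles
    schedule = []
    for epoch in range(n_epochs):
        # Relative position inside the current cycle, in [0, 1)
        position = (epoch % cycle_len) / cycle_len
        # Linear ramp during the first `ratio` of the cycle, then constant
        schedule.append(beta_max * min(1.0, position / ratio))
    return schedule


betas = cyclical_beta_schedule(n_epochs=600, n_cycles=3, ratio=0.5)
print(betas[0], betas[50], betas[100], betas[200])
```

Under these assumptions each cycle lasts 200 epochs: beta rises during the first 100 and stays at its maximum for the remaining 100 before resetting to zero.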

## Model Performance

**KIRC Model**:
- Reconstruction loss (test): ~1.1

**BRCA Model**:
- Reconstruction loss (test): ~0.9

## License

These pretrained models are provided under the Apache 2.0 license.

## Contact

For questions about the pretrained models, please:
1. Check the [documentation](https://gprolcastelo.github.io/renalprog/)
2. Open an issue on [GitHub](https://github.com/gprolcastelo/renalprog/issues)
3. Contact the authors

---

**Last Updated**: December 2025  
**Version**: 1.0.0-alpha