|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: tabular-regression |
|
|
tags: |
|
|
- VAE |
|
|
- bioinformatics |
|
|
- TCGA |
|
|
- ccRCC |
|
|
- KIRC |
|
|
- cancer |
|
|
--- |
|
|
|
|
|
|
|
|
# Pretrained Models |
|
|
|
|
|
This directory contains pretrained VAE and reconstruction network models obtained during the WP3 of the EVENFLOW EU project. |
|
|
|
|
|
These models have been trained on a pre-processed version of the bulk RNA-Seq TCGA datasets of either KIRC or BRCA, independently (see data availability in the respective section). |
|
|
|
|
|
## Available Models |
|
|
|
|
|
### KIRC (Kidney Renal Clear Cell Carcinoma) |
|
|
|
|
|
**Location**: `KIRC/` |
|
|
|
|
|
*Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17987300) |
|
|
|
|
|
**Model Files**: |
|
|
- `20250321_VAE_idim8516_md512_feat256mse_relu.pth` - VAE weights |
|
|
- `network_reconstruction.pth` - Reconstruction network weights |
|
|
- `network_dims.csv` - Network architecture specifications |
|
|
|
|
|
**Model Specifications**: |
|
|
- Input dimension: 8,516 genes |
|
|
- VAE architecture: |
|
|
- Middle dimension: 512 |
|
|
- Latent dimension: 256 |
|
|
- Loss function: MSE |
|
|
- Activation: ReLU |
|
|
- Reconstruction network: [8954, 3512, 824, 3731, 8954] |
|
|
- Training: Beta-VAE with 3 cycles, 600 epochs total |
|
|
|
|
|
### BRCA (Breast Invasive Carcinoma) |
|
|
|
|
|
**Location**: `BRCA/` |
|
|
|
|
|
*Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17986123) |
|
|
|
|
|
**Model Files**: |
|
|
- `20251209_VAE_idim8954_md1024_feat512mse_relu.pth` - VAE weights |
|
|
- `network_reconstruction.pth` - Reconstruction network weights |
|
|
- `network_dims.csv` - Network architecture specifications |
|
|
|
|
|
**Model Specifications**: |
|
|
- Input dimension: 8,954 genes |
|
|
- VAE architecture: |
|
|
- Middle dimension: 1,024 |
|
|
- Latent dimension: 512 |
|
|
- Loss function: MSE |
|
|
- Activation: ReLU |
|
|
- Reconstruction network: [8954, 3104, 790, 4027, 8954] |
|
|
- Training: Beta-VAE with 3 cycles, 600 epochs total |
|
|
|
|
|
## Usage |
|
|
|
|
|
### Loading Models in Python |
|
|
|
|
|
See [renalprog](https://www.github.com/gprolcastelo/renalprog) for the needed VAE and NetworkReconstruction objects. |
|
|
|
|
|
|
|
|
```python |
|
|
import torch |
|
|
import pandas as pd |
|
|
import json |
|
|
from pathlib import Path |
|
|
import huggingface_hub as hf |
|
|
from renalprog.modeling.train import VAE, NetworkReconstruction |
|
|
|
|
|
# Configuration |
|
|
cancer_type = "KIRC" # or "BRCA" |
|
|
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
|
|
|
|
|
# ============================================================================ |
|
|
# Load VAE Model |
|
|
# ============================================================================ |
|
|
|
|
|
# Download VAE config |
|
|
vae_config_path = hf.hf_hub_download( |
|
|
repo_id="gprolcastelo/evenflow_models", |
|
|
filename=f"{cancer_type}/config.json" |
|
|
) |
|
|
|
|
|
# Load configuration |
|
|
with open(vae_config_path, "r") as f: |
|
|
vae_config = json.load(f) |
|
|
|
|
|
print(f"VAE Configuration: {vae_config}") |
|
|
|
|
|
# Download VAE model weights |
|
|
if cancer_type == "KIRC": |
|
|
vae_filename = "KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth" |
|
|
elif cancer_type == "BRCA": |
|
|
vae_filename = "BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth" |
|
|
else: |
|
|
raise ValueError(f"Unknown cancer type: {cancer_type}") |
|
|
|
|
|
vae_model_path = hf.hf_hub_download( |
|
|
repo_id="gprolcastelo/evenflow_models", |
|
|
filename=vae_filename |
|
|
) |
|
|
|
|
|
# Initialize and load VAE |
|
|
model_vae = VAE( |
|
|
input_dim=vae_config["INPUT_DIM"], |
|
|
mid_dim=vae_config["MID_DIM"], |
|
|
features=vae_config["LATENT_DIM"] |
|
|
).to(device) |
|
|
|
|
|
checkpoint_vae = torch.load(vae_model_path, map_location=device, weights_only=False) |
|
|
model_vae.load_state_dict(checkpoint_vae) |
|
|
model_vae.eval() |
|
|
|
|
|
print(f"VAE model loaded successfully from {cancer_type}") |
|
|
|
|
|
# ============================================================================ |
|
|
# Load Reconstruction Network |
|
|
# ============================================================================ |
|
|
|
|
|
# Download network dimensions |
|
|
network_dims_path = hf.hf_hub_download( |
|
|
repo_id="gprolcastelo/evenflow_models", |
|
|
filename=f"{cancer_type}/network_dims.csv" |
|
|
) |
|
|
|
|
|
# Load network dimensions |
|
|
network_dims = pd.read_csv(network_dims_path) |
|
|
layer_dims = network_dims.values.tolist()[0] |
|
|
|
|
|
print(f"Reconstruction Network dimensions: {layer_dims}") |
|
|
|
|
|
# Download reconstruction network weights |
|
|
recnet_model_path = hf.hf_hub_download( |
|
|
repo_id="gprolcastelo/evenflow_models", |
|
|
filename=f"{cancer_type}/network_reconstruction.pth" |
|
|
) |
|
|
|
|
|
# Initialize and load Reconstruction Network |
|
|
model_recnet = NetworkReconstruction(layer_dims=layer_dims).to(device) |
|
|
checkpoint_recnet = torch.load(recnet_model_path, map_location=device, weights_only=False) |
|
|
model_recnet.load_state_dict(checkpoint_recnet) |
|
|
model_recnet.eval() |
|
|
|
|
|
print(f"Reconstruction Network loaded successfully from {cancer_type}") |
|
|
|
|
|
# ============================================================================ |
|
|
# Use the models |
|
|
# ============================================================================ |
|
|
|
|
|
# Example: Apply VAE to your data |
|
|
# your_data = torch.tensor(your_data_array).float().to(device) |
|
|
# with torch.no_grad(): |
|
|
# vae_output = model_vae(your_data) |
|
|
# recnet_output = model_recnet(vae_output) |
|
|
|
|
|
``` |
|
|
|
|
|
## Citation |
|
|
|
|
|
> **⚠️ Warning** |
|
|
> This citation is temporary. It will be updated when a pre-print is released. |
|
|
|
|
|
|
|
|
If you use these pretrained models, please cite: |
|
|
|
|
|
```bibtex |
|
|
@software{renalprog2024, |
|
|
title = {RenalProg: A Deep Learning Framework for Kidney Cancer Progression Modeling}, |
|
|
author = {[Guillermo Prol-Castelo, Elina Syrri, Nikolaos Manginas, Vasileos Manginas, Nikos Katzouris, Davide Cirillo, George Paliouras, Alfonso Valencia]}, |
|
|
year = {2025}, |
|
|
url = {https://github.com/gprolcas/renalprog}, |
|
|
note = {Preprint in preparation} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Training Details |
|
|
|
|
|
These models were trained using: |
|
|
- Random seed: 2023 |
|
|
- Train/test split: 80/20 |
|
|
- Optimizer: Adam |
|
|
- Learning rate: 1e-4 |
|
|
- Batch size: 8 |
|
|
- Beta annealing (for VAE): 3 cycles with 0.5 ratio |
|
|
|
|
|
## Model Performance |
|
|
|
|
|
**KIRC Model**: |
|
|
- Reconstruction loss (test): ~1.1 |
|
|
|
|
|
**BRCA Model**: |
|
|
- Reconstruction loss (test): ~0.9 |
|
|
|
|
|
## License |
|
|
|
|
|
These pretrained models are provided under the same Apache 2.0 license. |
|
|
|
|
|
## Contact |
|
|
|
|
|
For questions about the pretrained models, please: |
|
|
1. Check the [documentation](https://gprolcastelo.github.io/renalprog/) |
|
|
2. Open an issue on [GitHub](https://github.com/gprolcastelo/renalprog/issues) |
|
|
3. Contact the authors |
|
|
|
|
|
--- |
|
|
|
|
|
**Last Updated**: December 2025 |
|
|
**Version**: 1.0.0-alpha |