---
license: apache-2.0
language:
- en
pipeline_tag: tabular-regression
tags:
- VAE
- bioinformatics
- TCGA
- ccRCC
- KIRC
- cancer
---


# Pretrained Models

This directory contains pretrained VAE and reconstruction network models developed during WP3 of the EVENFLOW EU project.

Each model was trained independently on a pre-processed version of the TCGA bulk RNA-Seq dataset for either KIRC or BRCA (see the data availability link in the respective section).

## Available Models

### KIRC (Kidney Renal Clear Cell Carcinoma)

**Location**: `KIRC/`

*Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17987300)

**Model Files**:
- `20250321_VAE_idim8516_md512_feat256mse_relu.pth` - VAE weights
- `network_reconstruction.pth` - Reconstruction network weights
- `network_dims.csv` - Network architecture specifications

**Model Specifications**:
- Input dimension: 8,516 genes
- VAE architecture:
  - Middle dimension: 512
  - Latent dimension: 256
  - Loss function: MSE
  - Activation: ReLU
- Reconstruction network: [8954, 3512, 824, 3731, 8954]
- Training: Beta-VAE with cyclical beta annealing (3 cycles), 600 epochs total

### BRCA (Breast Invasive Carcinoma)

**Location**: `BRCA/`

*Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17986123)

**Model Files**:
- `20251209_VAE_idim8954_md1024_feat512mse_relu.pth` - VAE weights
- `network_reconstruction.pth` - Reconstruction network weights
- `network_dims.csv` - Network architecture specifications

**Model Specifications**:
- Input dimension: 8,954 genes
- VAE architecture:
  - Middle dimension: 1,024
  - Latent dimension: 512
  - Loss function: MSE
  - Activation: ReLU
- Reconstruction network: [8954, 3104, 790, 4027, 8954]
- Training: Beta-VAE with cyclical beta annealing (3 cycles), 600 epochs total
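
The VAE weight filenames listed above appear to encode the architecture (input dimension, middle dimension, latent dimension, loss, and activation). The following is a minimal sketch that recovers those values from a filename, assuming the naming pattern inferred from the two files in this repository holds. `parse_vae_filename` is a hypothetical helper and is not part of `renalprog`.

```python
import re

# Assumed pattern, inferred from the two filenames listed in this README:
# <date>_VAE_idim<input>_md<middle>_feat<latent><loss>_<activation>.pth
_PATTERN = re.compile(
    r"(?P<date>\d{8})_VAE_idim(?P<input_dim>\d+)_md(?P<mid_dim>\d+)"
    r"_feat(?P<latent_dim>\d+)(?P<loss>[a-z]+)_(?P<activation>[a-z]+)\.pth$"
)


def parse_vae_filename(filename: str) -> dict:
    """Hypothetical helper: extract hyperparameters from a VAE weight filename."""
    match = _PATTERN.search(filename)
    if match is None:
        raise ValueError(f"Unrecognized VAE filename: {filename}")
    info = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("input_dim", "mid_dim", "latent_dim"):
        info[key] = int(info[key])
    return info


kirc = parse_vae_filename("KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth")
brca = parse_vae_filename("BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth")
print(kirc)
print(brca)
```

A check like this could be used to confirm that a downloaded weight file matches the dimensions passed to the model constructor before calling `load_state_dict`.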

## Usage

### Loading Models in Python
 
See [renalprog](https://www.github.com/gprolcastelo/renalprog) for the required `VAE` and `NetworkReconstruction` classes.


```python
import torch
import pandas as pd
import json
from pathlib import Path
import huggingface_hub as hf
from renalprog.modeling.train import VAE, NetworkReconstruction

# Configuration
cancer_type = "KIRC"  # or "BRCA"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# ============================================================================
# Load VAE Model
# ============================================================================

# Download VAE config
vae_config_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=f"{cancer_type}/config.json"
)

# Load configuration
with open(vae_config_path, "r") as f:
    vae_config = json.load(f)

print(f"VAE Configuration: {vae_config}")

# Download VAE model weights
if cancer_type == "KIRC":
    vae_filename = "KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth"
elif cancer_type == "BRCA":
    vae_filename = "BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth"
else:
    raise ValueError(f"Unknown cancer type: {cancer_type}")

vae_model_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=vae_filename
)

# Initialize and load VAE
model_vae = VAE(
    input_dim=vae_config["INPUT_DIM"],
    mid_dim=vae_config["MID_DIM"],
    features=vae_config["LATENT_DIM"]
).to(device)

checkpoint_vae = torch.load(vae_model_path, map_location=device, weights_only=False)
model_vae.load_state_dict(checkpoint_vae)
model_vae.eval()

print(f"VAE model loaded successfully from {cancer_type}")

# ============================================================================
# Load Reconstruction Network
# ============================================================================

# Download network dimensions
network_dims_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=f"{cancer_type}/network_dims.csv"
)

# Load network dimensions
network_dims = pd.read_csv(network_dims_path)
layer_dims = network_dims.values.tolist()[0]

print(f"Reconstruction Network dimensions: {layer_dims}")

# Download reconstruction network weights
recnet_model_path = hf.hf_hub_download(
    repo_id="gprolcastelo/evenflow_models",
    filename=f"{cancer_type}/network_reconstruction.pth"
)

# Initialize and load Reconstruction Network
model_recnet = NetworkReconstruction(layer_dims=layer_dims).to(device)
checkpoint_recnet = torch.load(recnet_model_path, map_location=device, weights_only=False)
model_recnet.load_state_dict(checkpoint_recnet)
model_recnet.eval()

print(f"Reconstruction Network loaded successfully from {cancer_type}")

# ============================================================================
# Use the models
# ============================================================================

# Example: Apply VAE to your data
# your_data = torch.tensor(your_data_array).float().to(device)
# with torch.no_grad():
#     vae_output = model_vae(your_data)
#     recnet_output = model_recnet(vae_output)

```

## Citation

> **⚠️ Warning**  
> This citation is temporary. It will be updated when a pre-print is released.


If you use these pretrained models, please cite:

```bibtex
@software{renalprog2025,
  title = {RenalProg: A Deep Learning Framework for Kidney Cancer Progression Modeling},
  author = {Guillermo Prol-Castelo and Elina Syrri and Nikolaos Manginas and Vasileos Manginas and Nikos Katzouris and Davide Cirillo and George Paliouras and Alfonso Valencia},
  year = {2025},
  url = {https://github.com/gprolcastelo/renalprog},
  note = {Preprint in preparation}
}
```

## Training Details

These models were trained using:
- Random seed: 2023
- Train/test split: 80/20
- Optimizer: Adam
- Learning rate: 1e-4
- Batch size: 8
- Beta annealing (for VAE): 3 cycles with 0.5 ratio
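
To illustrate the beta annealing entry, here is a minimal sketch of one common cyclical schedule. It assumes that the 600 epochs are divided evenly into 3 cycles, that beta ramps linearly from 0 to a maximum during the first half of each cycle (the 0.5 ratio), and that it is then held constant until the cycle ends. The function name `cyclical_beta` and the maximum value `beta_max=1.0` are assumptions for illustration; the actual schedule used for training is defined in `renalprog` and may differ.

```python
def cyclical_beta(
    epoch: int,
    total_epochs: int = 600,
    n_cycles: int = 3,
    ratio: float = 0.5,
    beta_max: float = 1.0,  # assumed maximum; not stated in this README
) -> float:
    """Hypothetical cyclical schedule: linear ramp, then hold, repeated per cycle."""
    cycle_len = total_epochs / n_cycles
    # Relative position inside the current cycle, in [0, 1).
    position = (epoch % cycle_len) / cycle_len
    # Ramp linearly during the first `ratio` of the cycle, then hold at beta_max.
    return beta_max * min(1.0, position / ratio)


# Print beta at a few epochs across the first two cycles.
for epoch in (0, 50, 100, 150, 200, 250):
    print(epoch, cyclical_beta(epoch))
```

Under these assumptions, beta resets to 0 at the start of each 200-epoch cycle, which periodically relaxes the KL term so the encoder can re-learn informative latent codes.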

## Model Performance

**KIRC Model**:
- Reconstruction loss (test): ~1.1

**BRCA Model**:
- Reconstruction loss (test): ~0.9

## License

These pretrained models are provided under the Apache 2.0 license.

## Contact

For questions about the pretrained models, please:
1. Check the [documentation](https://gprolcastelo.github.io/renalprog/)
2. Open an issue on [GitHub](https://github.com/gprolcastelo/renalprog/issues)
3. Contact the authors

---

**Last Updated**: December 2025  
**Version**: 1.0.0-alpha