---
tags: ['napistu', 'napistu-torch', 'graph-neural-networks', 'biological-networks', 'pytorch', 'graph_conv', 'distmult', 'edge_prediction', 'relation-aware']
library_name: napistu-torch
license: mit
metrics:
- auc
- average_precision
---
# graph_conv-distmult_h128_l3_edge_prediction
This model was trained using [Napistu-Torch](https://www.shackett.org/napistu_torch/), a PyTorch framework for training graph neural networks on biological pathway networks.
The dataset used for training is the 8-source ["Octopus" human consensus network](https://www.shackett.org/octopus_network/), which integrates pathway data from STRING, OmniPath, Reactome, and others. The network encompasses ~50K genes, metabolites, and complexes connected by ~8M interactions.
## Task
This model performs **edge prediction** on biological pathway networks. Given node embeddings,
the model predicts the likelihood of edges (interactions) between biological entities such as
genes, proteins, and metabolites. This is useful for:
- Discovering novel biological interactions
- Validating experimentally observed interactions
- Completing incomplete pathway databases
- Predicting functional relationships between genes/proteins
The model scores candidate edges using learned embeddings of the source and target nodes,
optionally incorporating relation types for relation-aware prediction.
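A DistMult head scores a candidate edge as the sum of the elementwise product of the source embedding, a relation embedding, and the target embedding. A minimal PyTorch sketch of this scoring rule (toy dimensions and random embeddings for illustration; not Napistu-Torch's actual implementation):

```python
import torch

def distmult_score(h_src, rel, h_dst):
    """DistMult score: sum over the elementwise product of the
    source embedding, relation embedding, and target embedding."""
    return (h_src * rel * h_dst).sum(dim=-1)

# Toy example with hidden dimension 4 (the trained model uses 128)
torch.manual_seed(0)
src = torch.randn(2, 4)  # embeddings of two candidate source nodes
dst = torch.randn(2, 4)  # embeddings of two candidate target nodes
rel = torch.randn(4)     # embedding of one relation type

scores = distmult_score(src, rel, dst)  # shape: (2,)
probs = torch.sigmoid(scores)           # edge probabilities in (0, 1)
```

Because the relation embedding re-weights each embedding dimension, the same node pair can receive different scores under different relation types, which is what makes the head relation-aware.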
## Model Description
- **Encoder**
- Type: `graph_conv`
- Hidden Channels: `128`
- Number of Layers: `3`
- Dropout: `0.2`
  - Edge Encoder: ✓ (dim=32)
- **Head**
- Type: `distmult`
  - Relation-Aware: ✓
**Training Date**: 2025-12-29
For detailed experiment and training settings, see this repository's `config.json` file.
## Performance
| Metric | Value |
|--------|-------|
| Validation relation-weighted AUC | 0.8644 |
| Test relation-weighted AUC | 0.8650 |
| Validation AUC | 0.8277 |
| Test AUC | 0.8279 |
| Validation AP | 0.8282 |
| Test AP | 0.8283 |
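AUC and AP are standard ranking metrics over positive (true) and sampled negative edges: AUC is the probability that a random positive edge outranks a random negative one, and AP averages precision at the rank of each positive. A minimal numpy sketch with toy scores (not this model's outputs):

```python
import numpy as np

# Toy scores: 3 true (positive) edges and 3 sampled negative edges
pos = np.array([0.9, 0.8, 0.4])
neg = np.array([0.7, 0.2, 0.1])

# AUC: fraction of (positive, negative) pairs ranked correctly
auc = (pos[:, None] > neg[None, :]).mean()

# Average precision: mean precision at the rank of each positive edge
scores = np.concatenate([pos, neg])
labels = np.concatenate([np.ones(3), np.zeros(3)])
order = np.argsort(-scores)                # rank edges by descending score
hits = labels[order]
precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
ap = precision_at_k[hits == 1].mean()
```

The relation-weighted AUC reported above additionally accounts for relation types; its exact weighting is defined in the Napistu-Torch codebase.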
## Links
- [W&B Run](https://wandb.ai/napistu/napistu-experiments/runs/kjv3q37g)
- [Napistu](https://napistu.com)
- [GitHub Repository](https://github.com/napistu/Napistu-Torch)
- [Read the Docs](https://napistu-torch.readthedocs.io/en/latest)
- [Napistu Wiki](https://github.com/napistu/napistu/wiki)
## Usage
### 1. Setup Environment
To reproduce the environment used for training, run the following commands:
```bash
pip install torch==2.8.0
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/2.8.0+cpu.html
pip install 'napistu==0.8.5'
pip install 'napistu-torch[pyg,lightning]==0.3.2'
```
### 2. Setup Data Store
First, download the Octopus consensus network data to create a local `NapistuDataStore`:
```python
from napistu_torch.load.gcs import gcs_model_to_store
# Download data and create store
napistu_data_store = gcs_model_to_store(
napistu_data_dir="path/to/napistu_data",
store_dir="path/to/store",
asset_name="human_consensus",
# Pin to stable version for reproducibility
asset_version="20250923"
)
```
### 3. Load Pretrained Model from HuggingFace Hub
```python
from napistu_torch.ml.hugging_face import HFModelLoader
# Load checkpoint
loader = HFModelLoader("seanhacks/relation_prediction_distmult_128e")
checkpoint = loader.load_checkpoint()
# Load config to reproduce experiment
experiment_config = loader.load_config()
```
### 4. Use Pretrained Model for Training
You can use this pretrained model as initialization for training via the CLI:
```bash
# Create a training config that uses the pretrained model
cat > my_config.yaml << EOF
name: my_finetuned_model
model:
use_pretrained_model: true
pretrained_model_source: huggingface
pretrained_model_path: seanhacks/relation_prediction_distmult_128e
pretrained_model_freeze_encoder_weights: false # Allow fine-tuning
data:
sbml_dfs_path: path/to/sbml_dfs.pkl
napistu_graph_path: path/to/graph.pkl
napistu_data_name: edge_prediction
training:
epochs: 100
lr: 0.001
EOF
# Train with pretrained weights
napistu-torch train my_config.yaml
```
## Citation
If you use this model, please cite:
```bibtex
@software{napistu_torch,
title = {Napistu-Torch: Graph Neural Networks for Biological Pathway Analysis},
author = {Hackett, Sean R.},
url = {https://github.com/napistu/Napistu-Torch},
year = {2025},
note = {Model: graph_conv-distmult_h128_l3_edge_prediction}
}
```
## License
MIT License - See [LICENSE](https://github.com/napistu/Napistu-Torch/blob/main/LICENSE) for details.