---
tags: ['napistu', 'napistu-torch', 'graph-neural-networks', 'biological-networks', 'pytorch', 'graph_conv', 'relation_attention', 'edge_prediction', 'relation-aware']
library_name: napistu-torch
license: mit
metrics:
- auc
- average_precision
---

# graph_conv-relation_attention_h128_l3_edge_prediction

This model was trained using [Napistu-Torch](https://www.shackett.org/napistu_torch/), a PyTorch framework for training graph neural networks on biological pathway networks.

The dataset used for training is the 8-source ["Octopus" human consensus network](https://www.shackett.org/octopus_network/), which integrates pathway data from STRING, OmniPath, Reactome, and others. The network encompasses ~50K genes, metabolites, and complexes connected by ~8M interactions.

## Task

This model performs **edge prediction** on biological pathway networks. Given node embeddings, the model predicts the likelihood of edges (interactions) between biological entities such as genes, proteins, and metabolites. This is useful for:

- Discovering novel biological interactions
- Validating experimentally observed interactions
- Completing incomplete pathway databases
- Predicting functional relationships between genes/proteins

The model learns to score potential edges based on learned embeddings of source and target nodes, optionally incorporating relation types for relation-aware prediction.

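To make the scoring step concrete, here is a minimal sketch of relation-aware edge scoring in plain Python. It is not the `relation_attention` head used by this model. As a stand-in it uses a simple DistMult-style score: the element-wise product of source, relation, and target vectors, summed and passed through a sigmoid. The function name `score_edge`, the toy embeddings, and the relation vectors are all hypothetical.

```python
import math
from typing import Optional, Sequence


def score_edge(
    source: Sequence[float],
    target: Sequence[float],
    relation: Optional[Sequence[float]] = None,
) -> float:
    """Return a probability-like score in (0, 1) for a candidate edge.

    Without a relation vector this is a sigmoid of the dot product.
    With one, each dimension is reweighted by the relation vector
    (DistMult-style). This is an illustrative stand-in, not the
    model's actual relation_attention head.
    """
    if len(source) != len(target):
        raise ValueError("source and target embeddings must have the same length")
    if relation is None:
        relation = [1.0] * len(source)
    elif len(relation) != len(source):
        raise ValueError("relation vector must match the embedding length")
    logit = sum(s * r * t for s, r, t in zip(source, relation, target))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 4-dimensional embeddings (hypothetical values, not model output).
gene_a = [0.9, 0.1, -0.3, 0.5]
gene_b = [0.8, 0.2, -0.1, 0.4]
activation = [1.0, 1.0, 1.0, 1.0]    # hypothetical relation vector
inhibition = [-1.0, 1.0, 1.0, -1.0]  # hypothetical relation vector

plain_score = score_edge(gene_a, gene_b)
activation_score = score_edge(gene_a, gene_b, activation)
inhibition_score = score_edge(gene_a, gene_b, inhibition)
```

In this toy setup the same node pair scores above 0.5 under the `activation` vector and below 0.5 under the `inhibition` vector, which is the basic idea behind conditioning the score on a relation type.
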
## Model Description

- **Encoder**
  - Type: `graph_conv`
  - Hidden Channels: `128`
  - Number of Layers: `3`
  - Dropout: `0.2`
  - Edge Encoder: ✅ (dim=32)
- **Head**
  - Type: `relation_attention`
  - Relation-Aware: ✅

**Training Date**: 2025-12-30

For detailed experiment and training settings see this repository's `config.json` file.

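If you only want to read the settings without installing Napistu-Torch, the following sketch downloads `config.json` with `huggingface_hub` and flattens it into dotted keys for easier scanning. It assumes the file sits at the repository root and is valid JSON. The helper names `download_config` and `flatten` are hypothetical, and the keys in the toy dictionary are made up for illustration rather than taken from the real config.

```python
import json
from typing import Any, Dict, Iterator, Tuple

REPO_ID = "seanhacks/relation_prediction_relationattention_128e"


def download_config(repo_id: str = REPO_ID, filename: str = "config.json") -> Dict[str, Any]:
    """Download and parse the repository's config file (needs network access)."""
    # Imported lazily so flatten() can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(local_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def flatten(config: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_key, value) pairs from a nested dictionary."""
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            yield from flatten(value, dotted)
        else:
            yield dotted, value


# Toy dictionary with made-up keys, used only to show the flattened shape.
example = {
    "model": {"encoder": "graph_conv", "hidden_channels": 128},
    "training": {"lr": 0.001},
}
flat = dict(flatten(example))
```

With network access, `dict(flatten(download_config()))` gives the same flattened view of the real file.
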
## Performance

| Metric | Value |
|--------|-------|
| Validation relation-weighted AUC | 0.8432 |
| Test relation-weighted AUC | 0.8441 |
| Validation AUC | 0.7983 |
| Test AUC | 0.7990 |
| Validation AP | 0.7972 |
| Test AP | 0.7984 |

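For readers less familiar with these metrics, the sketch below computes plain ROC AUC and average precision (AP) from binary labels and scores using only the standard library. It assumes positives are observed edges and negatives are sampled non-edges, which is a common setup but is not confirmed here. It does not attempt the relation-weighted variant, whose exact definition lives in the Napistu-Torch code. The function names and toy data are hypothetical.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@k taken at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive example")
    return sum(precisions) / len(precisions)


# Toy data: 1 = true edge, 0 = negative edge (hypothetical scores).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
```

The pairwise AUC loop is quadratic and the AP function does not group tied scores, so for real evaluations a library implementation such as scikit-learn's `roc_auc_score` and `average_precision_score` is the better choice.
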
## Links

- [W&B Run](https://wandb.ai/napistu/napistu-experiments/runs/hh8kfhbv)
- [Napistu](https://napistu.com)
- [GitHub Repository](https://github.com/napistu/Napistu-Torch)
- [Read the Docs](https://napistu-torch.readthedocs.io/en/latest)
- [Napistu Wiki](https://github.com/napistu/napistu/wiki)

## Usage

### 1. Setup Environment

To reproduce the environment used for training, run the following commands:

```bash
pip install torch==2.8.0
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
pip install 'napistu==0.8.5'
pip install 'napistu-torch[pyg,lightning]==0.3.4'
```

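After installing, a quick check like the following can confirm that the active environment matches the pinned versions. This is a minimal sketch using the standard library's `importlib.metadata`. The helper names `installed_version` and `find_mismatches` are hypothetical, and local build tags such as `+cpu` are ignored when comparing.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the install commands above.
EXPECTED_VERSIONS: Dict[str, str] = {
    "torch": "2.8.0",
    "napistu": "0.8.5",
    "napistu-torch": "0.3.4",
}


def installed_version(package: str) -> str:
    """Return the installed version, or 'not installed' if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def find_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], str] = installed_version,
) -> Dict[str, str]:
    """Map each package whose installed version differs from its pin to what was found."""
    mismatches: Dict[str, str] = {}
    for package, pinned in expected.items():
        found = lookup(package)
        # Ignore local build tags such as "+cpu" when comparing.
        if found.split("+")[0] != pinned:
            mismatches[package] = found
    return mismatches


mismatches = find_mismatches(EXPECTED_VERSIONS)
for package, found in mismatches.items():
    print(f"{package}: expected {EXPECTED_VERSIONS[package]}, found {found}")
```

An empty `mismatches` dictionary means all three pins are satisfied. Anything printed names a package to reinstall.
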
### 2. Setup Data Store

First, download the Octopus consensus network data to create a local `NapistuDataStore`:

```python
from napistu_torch.load.gcs import gcs_model_to_store

# Download data and create store
napistu_data_store = gcs_model_to_store(
    napistu_data_dir="path/to/napistu_data",
    store_dir="path/to/store",
    asset_name="human_consensus",
    # Pin to stable version for reproducibility
    asset_version="20250923",
)
```

### 3. Load Pretrained Model from HuggingFace Hub

```python
from napistu_torch.ml.hugging_face import HFModelLoader

# Load checkpoint
loader = HFModelLoader("seanhacks/relation_prediction_relationattention_128e")
checkpoint = loader.load_checkpoint()

# Load config to reproduce experiment
experiment_config = loader.load_config()
```

### 4. Use Pretrained Model for Training

You can use this pretrained model as initialization for training via the CLI:

```bash
# Create a training config that uses the pretrained model
cat > my_config.yaml << EOF
name: my_finetuned_model

model:
  use_pretrained_model: true
  pretrained_model_source: huggingface
  pretrained_model_path: seanhacks/relation_prediction_relationattention_128e
  pretrained_model_freeze_encoder_weights: false  # Allow fine-tuning

data:
  sbml_dfs_path: path/to/sbml_dfs.pkl
  napistu_graph_path: path/to/graph.pkl
  napistu_data_name: edge_prediction

training:
  epochs: 100
  lr: 0.001
EOF

# Train with pretrained weights
napistu-torch train my_config.yaml
```

## Citation

If you use this model, please cite:

```bibtex
@software{napistu_torch,
  title = {Napistu-Torch: Graph Neural Networks for Biological Pathway Analysis},
  author = {Hackett, Sean R.},
  url = {https://github.com/napistu/Napistu-Torch},
  year = {2025},
  note = {Model: graph_conv-relation_attention_h128_l3_edge_prediction}
}
```

## License

MIT License - See [LICENSE](https://github.com/napistu/Napistu-Torch/blob/main/LICENSE) for details.