seanhacks committed on
Commit 446a7d4 · verified · 1 Parent(s): e5707c7

model: graph_conv-relation_attention_mlp_h128_l3_edge_prediction | (graph_conv-relation_attention_mlp_h128_l3) | WandB: 63xqckzk

Files changed (1)
  1. README.md +147 -0
README.md ADDED
@@ -0,0 +1,147 @@
---
tags: ['graph-neural-networks', 'biological-networks', 'napistu', 'pytorch', 'graph_conv', 'relation_attention_mlp', 'edge_prediction', 'relation-aware']
library_name: napistu-torch
license: mit
metrics:
- auc
- average_precision
---

# graph_conv-relation_attention_mlp_h128_l3_edge_prediction

This model was trained using [Napistu-Torch](https://www.shackett.org/napistu_torch/), a PyTorch framework for training graph neural networks on biological pathway networks.

The dataset used for training is the 8-source ["Octopus" human consensus network](https://www.shackett.org/octopus_network/), which integrates pathway data from STRING, OmniPath, Reactome, and others. The network encompasses ~50K genes, metabolites, and complexes connected by ~8M interactions.

## Task

This model performs **edge prediction** on biological pathway networks. Given node embeddings,
the model predicts the likelihood of edges (interactions) between biological entities such as
genes, proteins, and metabolites. This is useful for:

- Discovering novel biological interactions
- Validating experimentally observed interactions
- Completing incomplete pathway databases
- Predicting functional relationships between genes/proteins

The model learns to score potential edges based on learned embeddings of source and target nodes,
optionally incorporating relation types for relation-aware prediction.
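
As a rough illustration of the scoring step described above, the sketch below scores one candidate edge from a source embedding, a target embedding, and a relation embedding, using a tiny one-hidden-layer MLP followed by a sigmoid. It is not the `relation_attention_mlp` head shipped with Napistu-Torch. The function name `score_edge`, the hand-picked weights, and the 2-dimensional embeddings are all hypothetical and chosen only to keep the example self-contained.

```python
import math


def score_edge(z_src, z_dst, z_rel, w_hidden, w_out):
    """Return a probability-like score in (0, 1) for one candidate edge.

    Hypothetical toy head: concatenate [source, target, relation], apply one
    ReLU hidden layer, then a linear output followed by a sigmoid.
    """
    x = list(z_src) + list(z_dst) + list(z_rel)
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 2-d embeddings and hand-picked weights (illustrative only).
z_gene_a = [0.5, -1.0]
z_gene_b = [0.25, 0.75]
z_relation = [1.0, 0.0]
w_hidden = [
    [1.0, 0.0, 1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 0.5],
]
w_out = [1.0, -1.0]

score = score_edge(z_gene_a, z_gene_b, z_relation, w_hidden, w_out)
print(f"edge score: {score:.3f}")
```

Changing `z_relation` while holding the two node embeddings fixed changes the score, which is the basic idea behind relation-aware prediction.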

## Model Description

- **Encoder**
  - Type: `graph_conv`
  - Hidden Channels: `128`
  - Number of Layers: `3`
  - Dropout: `0.2`
  - Edge Encoder: ✓ (dim=32)
- **Head**
  - Type: `relation_attention_mlp`
  - Relation-Aware: ✓

**Training Date**: 2025-12-14

For detailed experiment and training settings see this repository's `config.json` file.

## Performance

| Metric | Value |
|--------|-------|
| Validation AUC | 0.8187 |
| Test AUC | 0.8186 |
| Validation AP | 0.8205 |
| Test AP | 0.8204 |
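
For readers unfamiliar with the two metrics in the table, here is a small pure-Python sketch of how ROC AUC and average precision (AP) can be computed from binary labels and predicted scores. It is illustrative only: the function names and the toy labels and scores are assumptions, and this is not the evaluation code that produced the numbers above. Tied scores get no special treatment in `average_precision`.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k, taken at the rank k of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precisions = 0, []
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / k)
    if not precisions:
        raise ValueError("need at least one positive example")
    return sum(precisions) / len(precisions)


# Toy data: 1 = true edge, 0 = negative (non-edge) sample.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUC={auc:.4f} AP={ap:.4f}")
```

The pairwise loop in `roc_auc` is quadratic and meant for exposition. On a network with millions of edges, a library implementation such as scikit-learn's `roc_auc_score` and `average_precision_score` would be the practical choice.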

## Links

- 📊 [W&B Run](https://wandb.ai/napistu/napistu-experiments/runs/63xqckzk)
- 💻 [GitHub Repository](https://github.com/napistu/Napistu-Torch)
- 📖 [Read the Docs](https://napistu-torch.readthedocs.io/en/latest)
- 📚 [Napistu Wiki](https://github.com/napistu/napistu/wiki)

## Usage

### 1. Setup Environment

To reproduce the environment used for training, run the following commands:

```bash
pip install torch==2.8.0
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
pip install 'napistu==0.8.2'
pip install 'napistu-torch[pyg,lightning]==0.2.14'
```
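
To confirm that an environment matches the pins above, a short standard-library check like the following may help. This is a sketch: the helper name `check_pins` is hypothetical, and it ignores any local version suffix, so an installed `2.8.0+cpu` is treated as matching `2.8.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands above.
PINS = {"torch": "2.8.0", "napistu": "0.8.2", "napistu-torch": "0.2.14"}


def check_pins(pins):
    """Return {package: problem} for every package that is missing or off-pin."""
    problems = {}
    for package, expected in pins.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            problems[package] = "not installed"
            continue
        # Drop local version suffixes such as "+cpu" before comparing.
        if installed.split("+")[0] != expected:
            problems[package] = f"found {installed}, expected {expected}"
    return problems


if __name__ == "__main__":
    issues = check_pins(PINS)
    for package, problem in issues.items():
        print(f"{package}: {problem}")
    if not issues:
        print("All pinned packages match.")
```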

### 2. Setup Data Store

First, download the Octopus consensus network data to create a local `NapistuDataStore`:
```python
from napistu_torch.load.gcs import gcs_model_to_store

# Download data and create store
napistu_data_store = gcs_model_to_store(
    napistu_data_dir="path/to/napistu_data",
    store_dir="path/to/store",
    asset_name="human_consensus",
    # Pin to stable version for reproducibility
    asset_version="20250923"
)
```

### 3. Load Pretrained Model from HuggingFace Hub
```python
from napistu_torch.ml.hugging_face import HuggingFaceLoader

# Load checkpoint
loader = HuggingFaceLoader("seanhacks/relation_attention_mlp")
checkpoint = loader.load_checkpoint()

# Load config to reproduce experiment
experiment_config = loader.load_config()
```

### 4. Use Pretrained Model for Training

You can use this pretrained model as initialization for training via the CLI:
```bash
# Create a training config that uses the pretrained model
cat > my_config.yaml << EOF
name: my_finetuned_model

model:
  use_pretrained_model: true
  pretrained_model_source: huggingface
  pretrained_model_path: seanhacks/relation_attention_mlp
  pretrained_model_freeze_encoder_weights: false  # Allow fine-tuning

data:
  sbml_dfs_path: path/to/sbml_dfs.pkl
  napistu_graph_path: path/to/graph.pkl
  napistu_data_name: edge_prediction

training:
  epochs: 100
  lr: 0.001
EOF

# Train with pretrained weights
napistu-torch train my_config.yaml
```

## Citation

If you use this model, please cite:
```bibtex
@software{napistu_torch,
  title = {Napistu-Torch: Graph Neural Networks for Biological Pathway Analysis},
  author = {Hackett, Sean R.},
  url = {https://github.com/napistu/Napistu-Torch},
  year = {2025},
  note = {Model: graph_conv-relation_attention_mlp_h128_l3_edge_prediction}
}
```

## License

MIT License - See [LICENSE](https://github.com/napistu/Napistu-Torch/blob/main/LICENSE) for details.