seanhacks committed on
Commit 8d2fd62 · verified · 1 Parent(s): b26e77b

model: graph_conv-mlp_h128_l3_edge_prediction | (graph_conv-mlp_h128_l3) | WandB: 3ii13ctv

Files changed (1)
  1. README.md +148 -0
README.md ADDED
@@ -0,0 +1,148 @@
+ ---
+ tags: ['napistu', 'napistu-torch', 'graph-neural-networks', 'biological-networks', 'pytorch', 'graph_conv', 'mlp', 'edge_prediction']
+ library_name: napistu-torch
+ license: mit
+ metrics:
+ - auc
+ - average_precision
+ ---
+
+ # graph_conv-mlp_h128_l3_edge_prediction
+
+ This model was trained using [Napistu-Torch](https://www.shackett.org/napistu_torch/), a PyTorch framework for training graph neural networks on biological pathway networks.
+
+ The dataset used for training is the 8-source ["Octopus" human consensus network](https://www.shackett.org/octopus_network/), which integrates pathway data from STRING, OmniPath, Reactome, and others. The network encompasses ~50K genes, metabolites, and complexes connected by ~8M interactions.
+
+ ## Task
+
+ This model performs **edge prediction** on biological pathway networks. Given node embeddings,
+ the model predicts the likelihood of edges (interactions) between biological entities such as
+ genes, proteins, and metabolites. This is useful for:
+
+ - Discovering novel biological interactions
+ - Validating experimentally observed interactions
+ - Completing incomplete pathway databases
+ - Predicting functional relationships between genes/proteins
+
+ The model scores candidate edges from the learned embeddings of their source and target nodes,
+ optionally incorporating relation types for relation-aware prediction.
+
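As a rough illustration of the scoring step described above, the sketch below concatenates a source and a target node embedding and passes them through a one-hidden-layer MLP followed by a sigmoid. It is a minimal pure-Python stand-in, not Napistu-Torch's actual head: the function name `score_edge`, the toy dimensions, and the random weights are all assumptions made for illustration.

```python
import math
import random


def score_edge(src, dst, w1, b1, w2, b2):
    """Score one candidate edge from its endpoint embeddings (hypothetical helper)."""
    x = src + dst  # concatenate source and target embeddings
    # Hidden layer with ReLU activation
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    # Single output logit, squashed to (0, 1) with a sigmoid
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


# Toy setup: 4-dimensional embeddings and 8 hidden units (illustrative sizes only)
rng = random.Random(0)
emb_dim, hidden_dim = 4, 8
w1 = [[rng.uniform(-1, 1) for _ in range(2 * emb_dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
b2 = 0.0

src = [rng.uniform(-1, 1) for _ in range(emb_dim)]
dst = [rng.uniform(-1, 1) for _ in range(emb_dim)]
probability = score_edge(src, dst, w1, b1, w2, b2)
print(f"edge score: {probability:.3f}")
```

Because the two embeddings are concatenated rather than combined symmetrically, this toy head can assign different scores to (source, target) and (target, source).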
+ ## Model Description
+
+ - **Encoder**
+   - Type: `graph_conv`
+   - Hidden Channels: `128`
+   - Number of Layers: `3`
+   - Dropout: `0.2`
+   - Edge Encoder: ✓ (dim=32)
+ - **Head**
+   - Type: `mlp`
+   - Relation-Aware: ✗
+
+ **Training Date**: 2025-12-27
+
+ For detailed experiment and training settings see this repository's `config.json` file.
+
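When inspecting `config.json`, it can help to flatten the nested settings into dotted keys so they are easy to scan or compare against another run. The helper below is a generic sketch: `flatten_config` is a hypothetical name, and the sample keys are placeholders that mirror the values listed above rather than the file's real schema.

```python
import json


def flatten_config(config, prefix=""):
    """Flatten nested dicts into {'a.b.c': value} pairs (hypothetical helper)."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_config(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Placeholder structure; the real config.json keys may differ.
sample = json.loads(
    '{"model": {"encoder": "graph_conv", "hidden_channels": 128,'
    ' "num_layers": 3, "dropout": 0.2, "head": "mlp"}}'
)
for dotted_key, value in sorted(flatten_config(sample).items()):
    print(f"{dotted_key} = {value}")
```

To apply it to the real file, load a local copy of `config.json` with `json.load` and pass the resulting dictionary in place of `sample`.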
+ ## Performance
+
+ | Metric | Value |
+ |--------|-------|
+ | Validation AUC | 0.8160 |
+ | Test AUC | 0.8153 |
+ | Validation AP | 0.8181 |
+ | Test AP | 0.8177 |
+
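For readers unfamiliar with the two metrics in the table, the sketch below computes ROC AUC and average precision (AP) from binary labels and predicted scores in plain Python. It is illustrative only: the function names and toy data are assumptions, tied scores get no special treatment in the AP calculation, and the numbers it prints are unrelated to the results above.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k taken at the rank k of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / k
    if hits == 0:
        raise ValueError("need at least one positive example")
    return precision_sum / hits


# Toy data: 1 marks a true edge, 0 a sampled non-edge
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(f"AUC = {roc_auc(labels, scores):.4f}")
print(f"AP  = {average_precision(labels, scores):.4f}")
```

In this sketch, AUC is computed over all positive/negative pairs, which is quadratic in the number of examples; it is meant for small illustrative inputs rather than a graph with millions of edges.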
+
+ ## Links
+
+ - 📊 [W&B Run](https://wandb.ai/napistu/napistu-experiments/runs/3ii13ctv)
+ - 🌐 [Napistu](https://napistu.com)
+ - 💻 [GitHub Repository](https://github.com/napistu/Napistu-Torch)
+ - 📖 [Read the Docs](https://napistu-torch.readthedocs.io/en/latest)
+ - 📚 [Napistu Wiki](https://github.com/napistu/napistu/wiki)
+
+ ## Usage
+
+ ### 1. Setup Environment
+
+ To reproduce the environment used for training, run the following commands:
+
+ ```bash
+ pip install torch==2.8.0
+ # PyG wheel index URLs follow the pattern torch-<version>+<platform>.html
+ pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
+ pip install 'napistu==0.8.5'
+ pip install 'napistu-torch[pyg,lightning]==0.3.2'
+ ```
+
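After installing, one way to confirm that the environment matches the pins above is to compare installed versions using the standard library's `importlib.metadata`. This is a small sketch and not part of Napistu-Torch; `check_pins` is a hypothetical helper, and only the three exactly pinned packages are checked.

```python
from importlib import metadata

# Versions pinned in the install commands above
PINS = {"torch": "2.8.0", "napistu": "0.8.5", "napistu-torch": "0.3.2"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every mismatch (hypothetical helper)."""
    mismatches = {}
    for package, expected in pins.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        # Ignore local build tags such as "+cpu" when comparing
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


for package, (expected, found) in check_pins(PINS).items():
    print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at the expected version; the `get_version` parameter exists only so the lookup can be swapped out when testing.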
+ ### 2. Setup Data Store
+
+ First, download the Octopus consensus network data to create a local `NapistuDataStore`:
+ ```python
+ from napistu_torch.load.gcs import gcs_model_to_store
+
+ # Download data and create store
+ napistu_data_store = gcs_model_to_store(
+     napistu_data_dir="path/to/napistu_data",
+     store_dir="path/to/store",
+     asset_name="human_consensus",
+     # Pin to stable version for reproducibility
+     asset_version="20250923"
+ )
+ ```
+
+ ### 3. Load Pretrained Model from HuggingFace Hub
+ ```python
+ from napistu_torch.ml.hugging_face import HFModelLoader
+
+ # Load checkpoint
+ loader = HFModelLoader("seanhacks/relation_prediction_mlp_128e")
+ checkpoint = loader.load_checkpoint()
+
+ # Load config to reproduce experiment
+ experiment_config = loader.load_config()
+ ```
+
+ ### 4. Use Pretrained Model for Training
+
+ You can use this pretrained model as initialization for training via the CLI:
+ ```bash
+ # Create a training config that uses the pretrained model
+ cat > my_config.yaml << EOF
+ name: my_finetuned_model
+
+ model:
+   use_pretrained_model: true
+   pretrained_model_source: huggingface
+   pretrained_model_path: seanhacks/relation_prediction_mlp_128e
+   pretrained_model_freeze_encoder_weights: false # Allow fine-tuning
+
+ data:
+   sbml_dfs_path: path/to/sbml_dfs.pkl
+   napistu_graph_path: path/to/graph.pkl
+   napistu_data_name: edge_prediction
+
+ training:
+   epochs: 100
+   lr: 0.001
+ EOF
+
+ # Train with pretrained weights
+ napistu-torch train my_config.yaml
+ ```
+
+ ## Citation
+
+ If you use this model, please cite:
+ ```bibtex
+ @software{napistu_torch,
+   title = {Napistu-Torch: Graph Neural Networks for Biological Pathway Analysis},
+   author = {Hackett, Sean R.},
+   url = {https://github.com/napistu/Napistu-Torch},
+   year = {2025},
+   note = {Model: graph_conv-mlp_h128_l3_edge_prediction}
+ }
+ ```
+
+ ## License
+
+ MIT License - See [LICENSE](https://github.com/napistu/Napistu-Torch/blob/main/LICENSE) for details.