# GALAX: Graph-Augmented Language Model for Explainable Reinforcement-Guided Subgraph Reasoning in Precision Medicine

**Repository:** [FuhaiLiAiLab/GALAX](https://huggingface.co/FuhaiLiAiLab/GALAX)
**Authors:** Heming Zhang, Fuhai Li, Yixin Chen, *et al.*
**License:** Research-only use under [DepMap Terms](https://depmap.org).

---

## 🧩 Model Overview



**GALAX** is a graph-augmented language model that integrates:
- **LLaMA3-8B-Instruct** as the language backbone (QA-tuned).
- **Graph Attention Network (GAT)** trained on BioMedGraphica (multi-omics + knowledge graph).
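As a rough illustration of what the GAT component computes, one attention-weighted neighbor aggregation step can be sketched as follows. The feature vectors and the scoring vector are invented for illustration; they are not the trained model's weights, and a real GAT adds linear projections, a LeakyReLU, and multiple heads:

```python
import math

def gat_aggregate(h_self, h_neighbors, attn_weight):
    """One GAT-style step: score each neighbor against the center node,
    softmax the scores into attention coefficients, then aggregate."""
    def score(h_n):
        concat = h_self + h_n  # concatenation of the two feature vectors
        return sum(w, )if False else sum(w * x for w, x in zip(attn_weight, concat))

    scores = [score(h_n) for h_n in h_neighbors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]       # numerically stable softmax
    alphas = [e / sum(exps) for e in exps]         # attention coefficients
    dim = len(h_self)
    return [sum(a * h_n[i] for a, h_n in zip(alphas, h_neighbors))
            for i in range(dim)]

# Toy 2-d features for one node and two neighbors
out = gat_aggregate([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]], [0.5, 0.5, 0.5, 0.5])
```

The neighbor whose concatenated features score higher receives the larger attention coefficient, which is the mechanism the edge-masked GAT uses to weight knowledge-graph neighbors.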
```python
if os.path.exists(combined_model_path):
    # ...
    print("Loaded best_combined_model.pt successfully")
```
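A fuller version of the guarded load above might look like the following sketch. It assumes a plain PyTorch `state_dict` checkpoint; the helper name and the `map_location` choice are illustrative, not the repository's exact loader:

```python
import os

import torch

def load_combined_checkpoint(model, combined_model_path):
    """Load best_combined_model.pt into `model` if the file exists.

    Returns True on a successful load, False when no checkpoint is found.
    """
    if os.path.exists(combined_model_path):
        # map_location="cpu" keeps the load working on machines without
        # the training GPUs; move the model to the target device afterwards.
        state = torch.load(combined_model_path, map_location="cpu")
        model.load_state_dict(state)
        print("Loaded best_combined_model.pt successfully")
        return True
    print(f"No checkpoint at {combined_model_path}; starting from scratch")
    return False
```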

---

## ⚙️ Experimental Setup

- **Backbone LM:** LLaMA3-8B-Instruct (QA-tuned).
- **Graph Encoder:** BioBERT-v1.1 embeddings + GAT with edge masking.
- **Training:** Adam optimizer on 2× NVIDIA H100 (80GB).
- **Top features per omics modality:** K = 10.
- **Subgraph rollout depth:** L = 5, candidate nodes η = 20.
- **Evaluation:** Precision, Recall, F1, Jaccard, Hit@5, Hit@10.
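The rollout bullets above can be read as: at each of L = 5 steps, score up to η = 20 candidate nodes and add the best one to the growing subgraph. A toy version of that loop, where the scoring function is a stand-in for the model's learned policy and the graph is a made-up number line:

```python
def greedy_rollout(seed_nodes, candidate_fn, score_fn, depth=5, eta=20):
    """Grow a subgraph for `depth` steps, each step picking the
    best-scoring node among at most `eta` unseen candidates."""
    subgraph = list(seed_nodes)
    for _ in range(depth):
        candidates = [n for n in candidate_fn(subgraph) if n not in subgraph][:eta]
        if not candidates:
            break  # frontier exhausted before reaching full depth
        subgraph.append(max(candidates, key=score_fn))
    return subgraph

# Toy graph: node i neighbors i-1 and i+1; the score prefers larger node ids
result = greedy_rollout(
    [0],
    candidate_fn=lambda sg: sorted({n + d for n in sg for d in (-1, 1)}),
    score_fn=lambda n: n,
    depth=5,
    eta=20,
)
```

In GALAX the scoring is reward-guided rather than greedy-by-id, but the depth/candidate budget shapes the search space exactly as the L and η settings describe.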

---

## 📊 Baselines & Ablations

- **M2T (Multiomic2Target):** Only omics → poor performance.
- **L3+Omics:** No QA finetuning → weak results.
- **L3-FT(QA)+Omics:** Large gain from QA finetuning.
- **GAT / G-Retriever + pre-GAT:** Partial improvements, unstable.
- **+ Static KG:** Minimal or negative gains.
- **GALAX (QA + KG + RL):** Consistent cross-dataset gains (2–5% absolute).

---

## 📈 Results

GALAX consistently outperforms baselines and ablation variants.

- **Hit@10:** 0.8815
- **Hit@5:** 0.9249
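The numbers above and in the tables below are set- and rank-based metrics over predicted versus ground-truth target genes. A minimal sketch of how such metrics are computed (the gene lists are invented examples, not benchmark data):

```python
def precision_recall_jaccard(predicted, true):
    """Set-overlap metrics between predicted and ground-truth target sets."""
    p, t = set(predicted), set(true)
    tp = len(p & t)                                   # true positives
    precision = tp / len(p) if p else 0.0
    recall = tp / len(t) if t else 0.0
    jaccard = tp / len(p | t) if (p | t) else 0.0
    return precision, recall, jaccard

def hit_at_k(ranked, true, k):
    """1.0 if any ground-truth target appears in the top-k of the ranking."""
    return 1.0 if set(ranked[:k]) & set(true) else 0.0

# Illustrative gene symbols only
prec, rec, jac = precision_recall_jaccard(
    ["EGFR", "KRAS", "TP53"], ["EGFR", "TP53", "MET", "ALK"]
)
top2_hit = hit_at_k(["MET", "EGFR", "KRAS"], ["EGFR"], k=2)
```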

**Table 1. Precision and Recall across datasets**

| Model | Overall Precision ↑ | Overall Recall ↑ | LUAD Precision ↑ | LUAD Recall ↑ | BRCA Precision ↑ | BRCA Recall ↑ |
|---|---|---|---|---|---|---|
| G-Retriever + pre-GAT | 0.4763 ± 0.0004 | 0.3929 ± 0.0063 | 0.4642 ± 0.0181 | 0.3881 ± 0.0264 | 0.4414 ± 0.0099 | 0.3772 ± 0.0010 |
| **GALAX** | **0.5472 ± 0.0053** | **0.5332 ± 0.0031** | **0.5345 ± 0.0185** | **0.5157 ± 0.0043** | **0.5608 ± 0.0031** | **0.5533 ± 0.0033** |

**Table 2. Hit@10 and Hit@5 across datasets**

| Model | Overall Hit@10 ↑ | Overall Hit@5 ↑ | LUAD Hit@10 ↑ | LUAD Hit@5 ↑ | BRCA Hit@10 ↑ | BRCA Hit@5 ↑ |
|---|---|---|---|---|---|---|

---

## 🔬 Intended Uses

- **Research use only**
- Target prioritization in **cancer biology**
- Benchmarking **graph-language foundation models** in target prioritization

---
|
|
|
|
If you use this model, please cite:

```bibtex
@article{zhang2025galax,
  title={GALAX: Graph-Augmented Language Model for Explainable Reinforcement-Guided Subgraph Reasoning in Precision Medicine},
  author={Zhang, Heming and Li, Fuhai and Chen, Yixin and others},
  year={2025},
  journal={Preprint}
}
```