---
license: mit
tags:
- biology
- protein
- drug-target-interaction
- mitochondria
- apoptosis
- binding-affinity
- esm2
- chemberta
datasets:
- jglaser/binding_affinity
metrics:
- pearsonr
- spearmanr
- rmse
- mae
---

# MitoInteract: Protein-Molecule Binding Affinity Prediction for Mitochondrial Apoptosis Research

## Overview

MitoInteract is a dual-encoder model that predicts **binding affinity (pKd)** between a protein, given as an amino-acid sequence, and a molecule, given as a SMILES string.
It combines:
- **ESM-2 650M** (protein encoder) for protein sequence understanding
- **ChemBERTa** (molecule encoder) for SMILES-based molecular representation
- **Bidirectional cross-attention** fusion layer
- **4-layer MLP** regression head

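To make the fusion step concrete, here is a minimal, dependency-free sketch of bidirectional cross-attention on toy vectors. It is an illustration only, not MitoInteract's implementation: the single head, the absence of learned projections, and the mean pooling are all assumptions, and the names `cross_attend`, `mean_pool`, and `fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention: each query attends over the other modality's tokens."""
    dim = len(queries[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the attended tokens (keys double as values in this sketch)
        mixed = [sum(w * kv[i] for w, kv in zip(weights, keys_values)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


def mean_pool(tokens):
    """Average token vectors into one fixed-size vector."""
    return [sum(column) / len(tokens) for column in zip(*tokens)]


def fuse(protein_tokens, molecule_tokens):
    """Bidirectional fusion: protein->molecule and molecule->protein, pooled and concatenated."""
    protein_context, _ = cross_attend(protein_tokens, molecule_tokens)
    molecule_context, _ = cross_attend(molecule_tokens, protein_tokens)
    return mean_pool(protein_context) + mean_pool(molecule_context)


# Toy embeddings: 3 protein tokens and 2 molecule tokens, 4 dimensions each
protein_tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.3, 0.3]]
molecule_tokens = [[0.9, 0.1, 0.2, 0.0], [0.0, 0.4, 0.6, 0.1]]

fused = fuse(protein_tokens, molecule_tokens)
print(len(fused))  # 8: a 4-dim protein-side vector followed by a 4-dim molecule-side vector
```

The dot products require both token sets to share one width, so a real model would first project the two encoders' outputs to a common dimension. A PyTorch version would typically rely on `torch.nn.MultiheadAttention` for each direction and feed the fused vector to the regression head.
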
## Intended Use

This model is designed for **mitochondrial apoptosis research**, enabling researchers to:
- Predict how ceramides interact with mitochondrial membrane proteins (VDAC1, VDAC2)
- Screen BCL-2 family protein interactions with BH3 mimetic drugs (venetoclax, navitoclax, ABT-737)
- Explore protein-lipid interactions in the apoptosis pathway
- Run in silico binding experiments before wet-lab validation

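The screening use case amounts to scoring every protein-molecule pair and ranking the pairs by predicted pKd. The sketch below shows one way to organize that loop. `rank_pairs` and `toy_scorer` are hypothetical names, and the scorer is a stand-in so the example runs without model weights; the input strings are placeholders, not real sequences or drug structures.

```python
from itertools import product


def rank_pairs(score_fn, proteins, molecules):
    """Score every (protein, molecule) pair and sort by predicted pKd, strongest first.

    score_fn(protein_seq, smiles) -> float is any callable returning a pKd.
    proteins and molecules map a display name to a sequence or SMILES string.
    """
    rows = []
    for (protein_name, seq), (molecule_name, smi) in product(proteins.items(), molecules.items()):
        rows.append({"protein": protein_name, "molecule": molecule_name, "pKd": score_fn(seq, smi)})
    return sorted(rows, key=lambda row: row["pKd"], reverse=True)


def toy_scorer(protein_seq, smiles):
    """Placeholder scorer (NOT a real model): a deterministic number from string lengths."""
    return 4.0 + ((len(protein_seq) * len(smiles)) % 50) / 10.0


# Placeholder inputs for illustration only
proteins = {"protein_A": "MKTAYIAK", "protein_B": "MSQLLNNE"}
molecules = {"mol_1": "CCO", "mol_2": "CC(=O)O"}

ranked = rank_pairs(toy_scorer, proteins, molecules)
for row in ranked:
    print(f"{row['protein']:10s} {row['molecule']:6s} pKd={row['pKd']:.2f}")
```

With the real model, `score_fn` could wrap the `predict_binding` call from the Quick Start section and return its `'pKd'` entry, assuming the return format shown there.
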
## Quick Start

```python
import torch

from model import load_model, predict_binding

# Use the GPU when available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model
model, config = load_model("full_model.pt", device=device)

# Predict ceramide C16 binding to VDAC1
result = predict_binding(
    model,
    protein_seq="MPPYLTFGLKAGALLPLTLPYVRAEAVTKLKLTLNAFEGASK...",  # VDAC1 (truncated; pass the full sequence)
    smiles="CCCCCCCCCCCCCCCC(=O)N[C@@H](CO)[C@H](O)/C=C/CCCCCCCCCCCCC",  # Ceramide C16
    device=device,
)
print(f"Predicted pKd: {result['pKd']:.3f}")
print(f"Predicted Kd: {result['Kd_uM']:.3f} µM")
```

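The two values printed above are related by the definition pKd = -log10(Kd), with Kd in mol/L. The sketch below shows that conversion. The helper names are hypothetical, and it assumes `Kd_uM` is simply the same quantity expressed in micromolar.

```python
import math


def pkd_to_kd_um(pkd):
    """Convert pKd to Kd in micromolar: Kd [M] = 10**(-pKd), and 1 M = 1e6 µM."""
    return 10 ** (6.0 - pkd)


def kd_um_to_pkd(kd_um):
    """Inverse conversion: pKd = -log10(Kd in mol/L)."""
    return -math.log10(kd_um * 1e-6)


for pkd in (5.0, 6.0, 9.0):
    # pKd 6 corresponds to 1 µM; pKd 9 corresponds to 1 nM (0.001 µM)
    print(f"pKd {pkd:.1f} -> Kd {pkd_to_kd_um(pkd):.4g} µM")
```

Each unit of pKd is a tenfold change in Kd, so a higher pKd means tighter binding.
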
## Key Apoptosis Targets

| Protein | Role in Apoptosis |
|---------|-------------------|
| BCL-2 | Anti-apoptotic, prevents mitochondrial outer membrane permeabilization (MOMP) |
| BCL-XL | Anti-apoptotic, sequesters BAX/BAK |
| BAX | Pro-apoptotic, forms pores in outer membrane |
| BAK | Pro-apoptotic, oligomerizes in membrane |
| VDAC1 | Voltage-dependent anion channel, ceramide target |
| Cytochrome c | Released during MOMP, activates caspase cascade |

## Key Molecules

| | Molecule | Role | |
| |----------|------| |
| | Ceramide C16 | Lipid mediator, promotes MOMP via VDAC | |
| | Ceramide C2 | Short-chain ceramide analog | |
| | Venetoclax | BCL-2 inhibitor (FDA-approved) | |
| | Navitoclax | BCL-2/BCL-XL dual inhibitor | |
| | ABT-737 | BCL-2/BCL-XL/BCL-w inhibitor | |
| | Cardiolipin | Mitochondrial inner membrane lipid | |

## Training Details

- **Dataset**: jglaser/binding_affinity (1.9M protein-ligand pairs)
- **Architecture**: ESM-2 650M (frozen) + ChemBERTa (frozen) + Cross-Attention + MLP
- **Training**: AdamW, lr=1e-3, cosine schedule, early stopping
- **Best Validation Pearson R**: 0.9107

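The metrics listed in the card metadata (Pearson r, Spearman ρ, RMSE, MAE) compare predicted and measured pKd values. The dependency-free sketch below shows how each is computed. The function names and the toy numbers are illustrative and are not results from this model; the Spearman helper assumes there are no tied values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """Ranks from 1 to n in ascending order (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


def rmse(x, y):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mae(x, y):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Toy pKd values for illustration only
measured = [5.0, 6.2, 7.1, 8.4, 9.0]
predicted = [5.3, 6.0, 7.4, 8.1, 9.2]

print(f"Pearson r:  {pearson_r(measured, predicted):.4f}")
print(f"Spearman r: {spearman_r(measured, predicted):.4f}")
print(f"RMSE:       {rmse(measured, predicted):.4f}")
print(f"MAE:        {mae(measured, predicted):.4f}")
```

For real evaluations, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` handle ties and also report p-values.
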
## Citation

Based on:
- BAPULM (arXiv:2411.04150) - frozen encoder + MLP pattern
- SSM-DTA (arXiv:2206.09818) - CLS cross-attention fusion
- ESM-2 (bioRxiv 2022.07.20.500902) - protein language model
- ChemBERTa (arXiv:2010.09885) - molecular language model