---
license: mit
tags:
- biology
- protein
- drug-target-interaction
- mitochondria
- apoptosis
- binding-affinity
- esm2
- chemberta
datasets:
- jglaser/binding_affinity
metrics:
- pearsonr
- spearmanr
- rmse
- mae
---

# MitoInteract: Protein-Molecule Binding Affinity Prediction for Mitochondrial Apoptosis Research

## Overview

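MitoInteract is a dual-encoder model that predicts the **binding affinity (pKd)** between a protein, given as an amino-acid sequence, and a small molecule, given as a SMILES string. pKd is the negative base-10 logarithm of the dissociation constant Kd in mol/L, so higher values mean tighter binding.
It combines: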
- **ESM-2 650M** (protein encoder) for protein sequence understanding
- **ChemBERTa** (molecule encoder) for SMILES-based molecular representation
- **Bidirectional cross-attention** fusion layer
- **4-layer MLP** regression head
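
The fusion step listed above can be sketched in PyTorch as follows. This is a minimal illustration, not the released implementation: the class name `CrossAttentionFusion`, the shared width of 256, the head count, the MLP layer sizes, the mean pooling, and the ChemBERTa hidden size of 768 are all assumptions. Only the ESM-2 650M embedding width of 1280 is a property of that checkpoint. Random tensors stand in for the frozen encoders' token embeddings, and padding masks are omitted for brevity.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Toy bidirectional cross-attention fusion with a 4-layer MLP head (hypothetical)."""

    def __init__(self, prot_dim=1280, mol_dim=768, d_model=256, n_heads=4):
        super().__init__()
        # Project both encoders' token embeddings into a shared width.
        self.prot_proj = nn.Linear(prot_dim, d_model)
        self.mol_proj = nn.Linear(mol_dim, d_model)
        # One attention module per direction.
        self.prot_to_mol = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mol_to_prot = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Four linear layers ending in a single pKd output (sizes are assumed).
        self.head = nn.Sequential(
            nn.Linear(2 * d_model, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, prot_emb, mol_emb):
        p = self.prot_proj(prot_emb)  # (batch, protein_len, d_model)
        m = self.mol_proj(mol_emb)    # (batch, molecule_len, d_model)
        # Protein tokens attend to molecule tokens, and vice versa.
        p_ctx, _ = self.prot_to_mol(query=p, key=m, value=m)
        m_ctx, _ = self.mol_to_prot(query=m, key=p, value=p)
        # Mean-pool each side over its tokens and concatenate (an assumed pooling choice).
        fused = torch.cat([p_ctx.mean(dim=1), m_ctx.mean(dim=1)], dim=-1)
        return self.head(fused).squeeze(-1)  # (batch,) predicted pKd


torch.manual_seed(0)
fusion = CrossAttentionFusion()
prot_emb = torch.randn(2, 50, 1280)  # stand-in for ESM-2 token embeddings
mol_emb = torch.randn(2, 20, 768)    # stand-in for ChemBERTa token embeddings
with torch.no_grad():
    pred = fusion(prot_emb, mol_emb)
print(pred.shape)
```

Because both encoders are frozen, only the projections, the two attention modules, and the MLP head would carry trainable parameters in a design like this.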

## Intended Use

This model is designed for **mitochondrial apoptosis research**, enabling researchers to:
- Predict how ceramides interact with mitochondrial membrane proteins (VDAC1, VDAC2)
- Screen BCL-2 family protein interactions with BH3 mimetic drugs (venetoclax, navitoclax, ABT-737)
- Explore protein-lipid interactions in the apoptosis pathway
- Run in-silico binding experiments before wet-lab validation

## Quick Start

```python
import torch

from model import load_model, predict_binding

# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the trained model and its configuration
model, config = load_model("full_model.pt", device=device)

# Predict ceramide C16 binding to VDAC1
result = predict_binding(
    model,
    protein_seq="MPPYLTFGLKAGALLPLTLPYVRAEAVTKLKLTLNAFEGASK...",  # VDAC1 (truncated here; pass the full sequence)
    smiles="CCCCCCCCCCCCCCCC(=O)N[C@@H](CO)[C@H](O)/C=C/CCCCCCCCCCCCC",  # Ceramide C16
    device=device,
)
print(f"Predicted pKd: {result['pKd']:.3f}")
print(f"Predicted Kd: {result['Kd_uM']:.3f} µM")
```
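
The two printed values describe the same quantity on different scales. The sketch below shows the conversion, assuming `Kd_uM` holds Kd in micromolar (as the key name suggests) and that pKd uses Kd in mol/L. The helper names `pkd_to_kd_um` and `kd_um_to_pkd` are illustrative and not part of `model.py`.

```python
import math


def pkd_to_kd_um(pkd):
    """Convert pKd (= -log10 of Kd in mol/L) to Kd in micromolar."""
    # Kd[M] = 10 ** (-pKd), and 1 M = 1e6 µM.
    return 10.0 ** (6.0 - pkd)


def kd_um_to_pkd(kd_um):
    """Convert Kd in micromolar back to pKd."""
    return 6.0 - math.log10(kd_um)


kd_um = pkd_to_kd_um(6.0)  # pKd 6 corresponds to Kd = 1 µM
pkd = kd_um_to_pkd(0.001)  # Kd = 1 nM (0.001 µM) corresponds to pKd 9
print(kd_um, pkd)
```

Each unit of pKd is a tenfold change in Kd, so a prediction error of 1.0 pKd corresponds to a tenfold error in the dissociation constant.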

## Key Apoptosis Targets

| Protein | Role in Apoptosis |
|---------|-------------------|
| BCL-2 | Anti-apoptotic, prevents MOMP |
| BCL-XL | Anti-apoptotic, sequesters BAX/BAK |
| BAX | Pro-apoptotic, forms pores in outer membrane |
| BAK | Pro-apoptotic, oligomerizes in membrane |
| VDAC1 | Voltage-dependent anion channel, ceramide target |
| Cytochrome c | Released during MOMP, activates caspase cascade |

## Key Molecules

| Molecule | Role |
|----------|------|
| Ceramide C16 | Lipid mediator, promotes MOMP via VDAC |
| Ceramide C2 | Short-chain ceramide analog |
| Venetoclax | BCL-2 inhibitor (FDA-approved) |
| Navitoclax | BCL-2/BCL-XL dual inhibitor |
| ABT-737 | BCL-2/BCL-XL/BCL-w inhibitor |
| Cardiolipin | Mitochondrial inner membrane lipid |

## Training Details

- **Dataset**: jglaser/binding_affinity (1.9M protein-ligand pairs)
- **Architecture**: ESM-2 650M (frozen) + ChemBERTa (frozen) + Cross-Attention + MLP
- **Training**: AdamW, lr=1e-3, cosine schedule, early stopping
- **Best Validation Pearson R**: -0.9107
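
The cosine schedule and early stopping named above can be sketched in plain Python. This is an illustration under assumptions, not the training script: the function `cosine_lr`, the class `EarlyStopping`, the patience of 3, the minimum learning rate of 0, and the validation scores are all made up for the example. Only the base learning rate of 1e-3 comes from the list above.

```python
import math


def cosine_lr(step, total_steps, base_lr=1e-3, min_lr=0.0):
    """Cosine-annealed learning rate: base_lr at step 0, min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


class EarlyStopping:
    """Stop when the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = -math.inf
        self.bad_epochs = 0

    def step(self, score):
        """Record one epoch's score; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation scores, one per epoch, for illustration only.
val_scores = [0.60, 0.72, 0.71, 0.70, 0.69, 0.80]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, score in enumerate(val_scores):
    lr = cosine_lr(epoch, total_steps=len(val_scores))
    if stopper.step(score):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In this toy run training halts after three epochs without improvement, so the later, higher score is never reached. That is the usual trade-off when choosing a patience value.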

## Citation

Based on:
- BAPULM (arxiv:2411.04150) - frozen encoder + MLP pattern
- SSM-DTA (arxiv:2206.09818) - CLS cross-attention fusion
- ESM-2 (bioRxiv 10.1101/2022.07.20.500902) - protein language model
- ChemBERTa (arxiv:2010.09885) - molecular language model