---
license: mit
tags:
- drug-discovery
- protein-ligand-binding
- binding-kinetics
- deep-learning
- computational-biology
- bioinformatics
library_name: pytorch
datasets:
- kineticX
---

# BiCoA-Net: Bidirectional Co-Attention Network

## Model Description

BiCoA-Net predicts protein-ligand dissociation rate constants (k_off) using a bidirectional co-attention mechanism between protein and ligand representations.

**Key Features:**
- Predicts binding kinetics (k_off) for drug-target interactions
- Uses ESM-2 protein embeddings + MolFormer ligand embeddings
- Bidirectional co-attention fusion mechanism
- Trained on curated KineticX datasets

## Quick Start
```python
from huggingface_hub import hf_hub_download
import torch

# Download model weights
model_path = hf_hub_download(
    repo_id="Daisyli95/BiCoA-Net",
    filename="pytorch_model.pt"
)

# Load the checkpoint (stored in FP16)
state_dict = torch.load(model_path, map_location='cpu')

# Convert to FP32 for inference (recommended, especially on CPU)
state_dict_fp32 = {k: v.float() if v.dtype == torch.float16 else v
                   for k, v in state_dict.items()}

# Load into your model architecture
# (requires an instantiated model; see Usage Example below)
model.load_state_dict(state_dict_fp32)
model.eval()
```

## Model Details

- **Architecture**: ESM-2 (650M) + MolFormer + bidirectional co-attention
- **Training Data**: PDBbind v2020 + custom kinetics data
- **Format**: PyTorch FP16 (1.80 GB)
- **Parameters**: ~960M
- **Input**:
  - Protein sequence (FASTA)
  - Ligand SMILES string
- **Output**: Predicted log(k_off) value

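To make the fusion step concrete, the bidirectional co-attention can be pictured as two cross-attention passes, one in each direction, whose pooled outputs feed a regression head. The sketch below is illustrative only: the module structure, layer names, and dimensions are assumptions, not the actual BiCoA-Net implementation.

```python
import torch
import torch.nn as nn

class BidirectionalCoAttention(nn.Module):
    """Sketch of a bidirectional co-attention fusion block.

    Protein tokens attend over ligand tokens and vice versa;
    the two pooled context vectors are concatenated and fed to
    a regression head predicting log(k_off). Dimensions are
    illustrative, not the published ones.
    """
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.prot_to_lig = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.lig_to_prot = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, 1)  # regression head -> log(k_off)

    def forward(self, prot, lig):
        # prot: (B, Lp, dim) ESM-2-style embeddings
        # lig:  (B, Ll, dim) MolFormer-style embeddings
        p_ctx, _ = self.prot_to_lig(query=prot, key=lig, value=lig)
        l_ctx, _ = self.lig_to_prot(query=lig, key=prot, value=prot)
        fused = torch.cat([p_ctx.mean(dim=1), l_ctx.mean(dim=1)], dim=-1)
        return self.head(fused).squeeze(-1)  # (B,) predicted log(k_off)

# Shape check with random stand-in embeddings
block = BidirectionalCoAttention()
out = block(torch.randn(2, 50, 256), torch.randn(2, 30, 256))
```
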
## Performance

- Concordance Index (C-index): [Add your metrics]
- Pearson Correlation: [Add your metrics]
- Test on held-out GPCR targets: [Add your metrics]

## Usage Example
```python
import torch

# Assuming you have the BiCoA-Net model class defined
from your_model import BiCoANet

# Initialize model
model = BiCoANet()

# Load pretrained weights (model_path from Quick Start above)
state_dict = torch.load(model_path, map_location='cpu')
state_dict = {k: v.float() if v.dtype == torch.float16 else v
              for k, v in state_dict.items()}
model.load_state_dict(state_dict)
model.eval()

# Predict
protein_seq = "MSLQKEVQKL..."
ligand_smiles = "CC(C)Cc1ccc(cc1)C(C)C(O)=O"

with torch.no_grad():
    prediction = model(protein_seq, ligand_smiles)
    print(f"Predicted log(k_off): {prediction.item()}")
```

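A common downstream use of the predicted log(k_off) is the drug-target residence time τ = 1/k_off and the dissociation half-life t½ = ln(2)/k_off. The conversion below assumes the model's log is base 10 and k_off is in s⁻¹; verify both conventions against the training data before relying on the numbers.

```python
import math

def residence_time(log10_koff: float) -> float:
    """Residence time tau = 1 / k_off, in seconds."""
    return 1.0 / (10 ** log10_koff)

def half_life(log10_koff: float) -> float:
    """Dissociation half-life t_1/2 = ln(2) / k_off, in seconds."""
    return math.log(2) / (10 ** log10_koff)

# Example: log10(k_off) = -3  ->  k_off = 1e-3 s^-1
print(residence_time(-3.0))  # 1000.0 s
print(half_life(-3.0))       # ~693.1 s
```
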
## Training Details

- Optimizer: AdamW
- Learning Rate: 1e-4
- Batch Size: 32
- Epochs: 100
- Loss Function: MSE on log(k_off)

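With these hyperparameters, a single training step reduces to plain MSE regression on log(k_off). The sketch below uses a placeholder `nn.Linear` in place of the real BiCoANet and random tensors in place of the real featurized batch; only the optimizer, learning rate, batch size, and loss match the settings listed above.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model and data pipeline
model = nn.Linear(8, 1)                              # placeholder for BiCoANet
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

features = torch.randn(32, 8)                        # batch size 32
target_log_koff = torch.randn(32, 1)                 # experimental log(k_off)

# One training step
optimizer.zero_grad()
pred = model(features)
loss = criterion(pred, target_log_koff)              # MSE on log(k_off)
loss.backward()
optimizer.step()
```
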
## License

MIT License - free for academic and commercial use.

## Contact

For questions or issues, please open an issue on the GitHub repository or contact the authors.