# MillerBind v9 & v12 – TDC Validation
**Independent third-party validation of MillerBind scoring functions using the [Therapeutics Data Commons (TDC)](https://tdcommons.ai/) evaluation framework.**
Developed by **William Miller – [BindStream Technologies](https://bindstreamai.com)**
---
## Results Summary
### CASF-2016 Scoring Power Benchmark (n = 285, held out)
All metrics computed using `tdc.Evaluator` from PyTDC v1.1.15.
| Model | PCC | PCC 95% CI | Spearman ρ | MAE (pKd) | MAE 95% CI | RMSE | R² |
|-------|-----|------------|------------|-----------|------------|------|----|
| **MillerBind v9** | **0.890** | [0.862, 0.912] | 0.877 | **0.780** | [0.708, 0.857] | 1.030 | 0.775 |
| **MillerBind v12** | **0.938** | [0.921, 0.950] | 0.960 | **0.637** | [0.571, 0.707] | 0.869 | 0.840 |
95% confidence intervals from 1,000 bootstrap resamples.
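These intervals can be reproduced with a standard percentile bootstrap. The sketch below is a minimal, self-contained illustration using synthetic stand-ins for the 285 experimental/predicted pKd pairs; the percentile method and resampling scheme are assumptions, since the report does not specify them:

```python
import numpy as np
from scipy import stats

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a paired metric (e.g. PCC or MAE)."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample complexes with replacement
        vals.append(metric(y_true[idx], y_pred[idx]))
    lo, hi = np.percentile(vals, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Synthetic stand-in for the 285 CASF-2016 pairs
rng = np.random.default_rng(42)
y_true = rng.uniform(2, 12, 285)            # experimental pKd
y_pred = y_true + rng.normal(0, 1.0, 285)   # predictions with ~1 pKd unit error

pcc = lambda a, b: stats.pearsonr(a, b)[0]
mae = lambda a, b: np.mean(np.abs(a - b))
print("PCC 95% CI:", bootstrap_ci(y_true, y_pred, pcc))
print("MAE 95% CI:", bootstrap_ci(y_true, y_pred, mae))
```

Swapping the synthetic arrays for the columns of the predictions CSVs would reproduce the reported intervals up to resampling noise.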
### CASF-2016 Ranking Power (53 target clusters)
Ranking power measures whether the model correctly ranks ligands by affinity within each target protein cluster.
| Model | Avg Spearman ρ | Avg Kendall τ | Concordance | Top-1 Success |
|-------|---------------|---------------|-------------|---------------|
| X-Score | 0.247 | – | – | – |
| AutoDock Vina | 0.281 | – | – | – |
| RF-Score v3 | 0.464 | – | – | – |
| ΔVinaRF20 | 0.476 | – | – | – |
| OnionNet-2 | 0.488 | – | – | – |
| **MillerBind v9** | **0.740** | **0.662** | **82.7%** | **60.4%** |
| **MillerBind v12** | **0.979** | **0.962** | **97.9%** | **92.5%** |
v12 achieves near-perfect ranking across 53 protein targets, correctly identifying the strongest binder in 49/53 targets.
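Per-cluster ranking metrics of this kind can be sketched as follows. The 53-cluster, five-ligands-per-cluster layout matches CASF-2016, but the data here are synthetic, and the exact top-1 definition (predicted best binder equals experimental best binder) is an assumption:

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr, kendalltau

# Synthetic stand-in: 53 target clusters x 5 ligands each, as in CASF-2016
rng = np.random.default_rng(0)
rows = []
for cluster in range(53):
    for y in rng.uniform(2, 12, 5):
        rows.append((cluster, y, y + rng.normal(0, 0.5)))
df = pd.DataFrame(rows, columns=["cluster", "y_true", "y_pred"])

per_cluster = df.groupby("cluster")[["y_true", "y_pred"]].apply(
    lambda g: pd.Series({
        "spearman": spearmanr(g.y_true, g.y_pred)[0],
        "kendall": kendalltau(g.y_true, g.y_pred)[0],
        # top-1 success: strongest predicted binder is the strongest experimental one
        "top1": g.y_true.idxmax() == g.y_pred.idxmax(),
    })
)
print("Avg Spearman:", per_cluster["spearman"].mean())
print("Avg Kendall: ", per_cluster["kendall"].mean())
print("Top-1 success: %.1f%%" % (100 * per_cluster["top1"].mean()))
```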
### Comparison with Published Methods (Scoring Power)
| Method | PCC | MAE (pKd) | Type | Year |
|--------|-----|-----------|------|------|
| AutoDock Vina | 0.604 | 2.05 | Physics-based | 2010 |
| RF-Score v3 | 0.800 | 1.40 | Random Forest | 2015 |
| OnionNet-2 | 0.816 | 1.28 | Deep Learning | 2021 |
| PIGNet | 0.830 | 1.21 | GNN | 2022 |
| IGN | 0.850 | 1.15 | GNN | 2021 |
| HAC-Net | 0.860 | 1.10 | DL Ensemble | 2023 |
| **MillerBind v9** | **0.890** | **0.780** | **Proprietary ML** | **2025** |
| **MillerBind v12** | **0.938** | **0.637** | **Proprietary ML** | **2025** |
### TDC BindingDB Cross-Reference
| Metric | Value |
|--------|-------|
| TDC BindingDB_Kd targets with PDBbind structures | 509 / 1,090 (46.7%) |
| PDBbind complexes matching TDC targets | 8,384 |
| TDC dataset structural coverage | 49.5% (25,869 / 52,274) |
| v9 PCC on TDC-overlapping CASF-2016 subset (n=170) | 0.880 |
---
## Full Validation Report
The complete peer-review validation report with scatter plots, bootstrap confidence intervals, residual distributions, per-affinity-range analysis, and statistical significance tests is included in this repository:
**[View the Full Report (HTML)](report/MillerBind_TDC_Validation_Report.html)** – download and open in any browser, or print to PDF.
---
## Verify Results
### Option 1: Run TDC Evaluator on predictions (quick)
```bash
pip install PyTDC numpy pandas scipy
python verify_with_tdc.py
```
This loads the pre-computed predictions CSV and evaluates them using TDC's official `Evaluator`.
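For readers without PyTDC installed, the scoring-power metrics reduce to standard scipy/numpy computations. The sketch below builds a synthetic stand-in for a predictions CSV in memory; the column names are hypothetical, not necessarily those used in the actual `predictions/` files:

```python
import io
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for a predictions CSV with 285 complexes
# (column names are hypothetical, for illustration only)
y_exp = np.random.default_rng(1).uniform(2, 12, 285)
noise = np.random.default_rng(2).normal(0, 1, 285)
csv = io.StringIO(
    "pdb_id,experimental_pkd,predicted_pkd\n"
    + "\n".join(f"x{i:03d},{y:.2f},{y + e:.2f}"
                for i, (y, e) in enumerate(zip(y_exp, noise)))
)
df = pd.read_csv(csv)
y, p = df["experimental_pkd"].to_numpy(), df["predicted_pkd"].to_numpy()

print("PCC:      %.3f" % stats.pearsonr(y, p)[0])
print("Spearman: %.3f" % stats.spearmanr(y, p)[0])
print("MAE:      %.3f" % np.mean(np.abs(y - p)))
print("RMSE:     %.3f" % np.sqrt(np.mean((y - p) ** 2)))
```

The official `verify_with_tdc.py` script computes the same quantities through TDC's `Evaluator`, so the two routes should agree on the real CSVs.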
### Option 2: Docker β full independent validation (comprehensive)
```bash
docker run --rm bindstream/millerbind-v9-validation
```
The Docker image contains:
- AES-256 encrypted model weights (not readable)
- AES-256 encrypted CASF-2016 features (not readable)
- Compiled Python bytecode (no source code)
- Runs predictions and reports metrics – fully offline, no network needed
---
## Repository Contents
```
├── README.md                                 – This file
├── predictions/
│   ├── casf2016_v9_predictions.csv           – 285 predictions (PDB ID, experimental, predicted pKd)
│   └── casf2016_v12_predictions.csv          – 285 predictions for v12
├── verify_with_tdc.py                        – TDC Evaluator verification script
├── report/
│   └── MillerBind_TDC_Validation_Report.html – Full peer-review report with figures
├── Dockerfile                                – Docker build reference (for transparency)
└── LICENSE
```
---
## Why 3D Structures?
MillerBind is a **structure-based** scoring function – it requires 3D protein-ligand complex structures (PDB + ligand file) as input, not SMILES strings or amino acid sequences.
This is fundamentally different from sequence-based models (e.g., DeepDTA, MolTrans) that predict binding from 1D representations. Structure-based scoring uses the actual 3D atomic coordinates of both the protein and ligand, capturing:
- **Precise interatomic distances** between protein and ligand atoms
- **Binding pocket geometry** and shape complementarity
- **Hydrogen bonds, hydrophobic contacts, and electrostatic interactions** in 3D space
This is why structure-based methods consistently outperform sequence-based methods on binding affinity benchmarks – they're scoring the real physical interaction, not inferring it from strings.
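As a toy illustration of the kind of 3D information involved, the snippet below computes all protein-ligand interatomic distances by NumPy broadcasting and counts close contacts. The coordinates are random and the 4 Å cutoff is a common convention in contact-based descriptors, not MillerBind's actual featurization:

```python
import numpy as np

# Toy coordinates in Å, standing in for a binding pocket and its ligand
rng = np.random.default_rng(0)
protein_xyz = rng.uniform(0, 20, (150, 3))   # pocket atom coordinates
ligand_xyz = rng.uniform(8, 12, (30, 3))     # ligand atom coordinates

# All pairwise protein-ligand distances via broadcasting: (150, 30) matrix
d = np.linalg.norm(protein_xyz[:, None, :] - ligand_xyz[None, :, :], axis=-1)

contacts = int((d < 4.0).sum())   # close contacts within 4 Å
min_dist = d.min()                # closest protein-ligand approach
print(f"{contacts} contacts < 4 Å, closest pair at {min_dist:.2f} Å")
```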
**CASF-2016** is the gold-standard benchmark specifically designed for evaluating structure-based scoring functions (Su et al., 2019), and is the standard reported by AutoDock Vina, Glide, RF-Score, OnionNet, PIGNet, IGN, HAC-Net, and now MillerBind.
---
## Model Details
| | MillerBind v9 | MillerBind v12 |
|---|---|---|
| **Input** | 3D protein-ligand complex (PDB + ligand file) | 3D protein-ligand complex (PDB + ligand file) |
| **Output** | Predicted pKd | Predicted binding affinity |
| **Use case** | General-purpose scoring | PPI, hard targets, cancer, large proteins |
| **Training data** | PDBbind v2020 (18,438 complexes) | PDBbind v2020 (18,438 complexes) |
| **Test set** | CASF-2016 core set (285, strictly held out) | CASF-2016 core set (285, strictly held out) |
| **Inference** | < 1 second, CPU-only | < 1 second, CPU-only |
| **Architecture** | Proprietary | Proprietary |
---
## Statistical Significance
- **v9 PCC**: p < 10⁻⁹⁸
- **v12 PCC**: p < 10⁻¹³¹
- **v12 vs v9 improvement**: paired t-test, t = 5.30, p = 2.4 × 10⁻⁷
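A paired comparison of this form can be sketched with scipy. The paired quantity is assumed here to be the per-complex absolute error (the report does not say which statistic was paired), and both error samples are synthetic:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for per-complex absolute errors of v9 and v12
# on the same 285 CASF-2016 complexes (paired by PDB ID)
rng = np.random.default_rng(0)
err_v9 = np.abs(rng.normal(0, 1.03, 285))    # scale loosely matches v9 RMSE
err_v12 = np.abs(rng.normal(0, 0.87, 285))   # scale loosely matches v12 RMSE

t, p = stats.ttest_rel(err_v9, err_v12)      # paired t-test on matched errors
print(f"t = {t:.2f}, p = {p:.2e}")
```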
---
## References
1. Huang, K., et al. (2021). Therapeutics Data Commons. *NeurIPS Datasets and Benchmarks*.
2. Su, M., et al. (2019). Comparative Assessment of Scoring Functions: The CASF-2016 Update. *J. Chem. Inf. Model.*, 59(2), 895–913.
3. Wang, R., et al. (2004). The PDBbind Database. *J. Med. Chem.*, 47(12), 2977–2980.
---
## License
Results and predictions are provided for independent verification of benchmark performance.
Model weights, feature engineering, and training code are proprietary.
© 2026 BindStream Technologies. All rights reserved.