willdabeatz committed on
Commit 5d5c9cd · verified · 1 Parent(s): 7cf3649

Upload folder using huggingface_hub

Dockerfile ADDED
@@ -0,0 +1,36 @@
+ # ============================================================================
+ # MillerBind v9 - CASF-2016 Benchmark Validation (Docker)
+ # BindStream Technologies
+ # ============================================================================
+ #
+ # For full independent validation with encrypted model weights:
+ #
+ # docker run --rm bindstream/millerbind-v9-validation
+ #
+ # This Dockerfile is provided for transparency. The actual Docker image
+ # is pre-built and available from Docker Hub.
+ #
+ # SECURITY:
+ # - Model weights are AES-256 encrypted at rest
+ # - Python source compiled to .pyc bytecode (no readable code)
+ # - Runs as non-root user with read-only filesystem
+ # - No network access required - fully offline validation
+ #
+ # ============================================================================
+
+ FROM python:3.11-slim
+
+ LABEL maintainer="BindStream Technologies"
+ LABEL description="MillerBind v9 - CASF-2016 Benchmark Validation"
+ LABEL version="9.0.0"
+
+ WORKDIR /app
+
+ # NOTE: The full Docker image includes:
+ # - AES-256 encrypted ExtraTrees model weights (et_model.aes)
+ # - AES-256 encrypted XGBoost model weights (xgb_model.aes)
+ # - AES-256 encrypted CASF-2016 benchmark features (benchmark.aes)
+ # - Compiled validation bytecode (validate.pyc)
+ #
+ # None of these files contain readable model weights or source code.
+ # See README.md for instructions on running the pre-built image.
LICENSE ADDED
@@ -0,0 +1,22 @@
+ Copyright (c) 2026 BindStream Technologies. All rights reserved.
+
+ BENCHMARK RESULTS LICENSE
+
+ The prediction results, evaluation metrics, and verification scripts in this
+ repository are provided for independent verification of published benchmark
+ performance.
+
+ You MAY:
+ - Run the verification script to confirm reported metrics
+ - Cite these results in academic publications
+ - Reference this repository for reproducibility
+
+ You MAY NOT:
+ - Use predictions to train competing models
+ - Reverse-engineer model architecture from predictions
+ - Redistribute without attribution
+
+ The MillerBind model weights, feature engineering methodology, and training
+ pipeline are proprietary and are NOT included in this repository.
+
+ For licensing inquiries: support@bindstreamai.com
README.md ADDED
@@ -0,0 +1,147 @@
+ # MillerBind v9 & v12 - TDC Validation
+
+ **Independent third-party validation of MillerBind scoring functions using the [Therapeutics Data Commons (TDC)](https://tdcommons.ai/) evaluation framework.**
+
+ Developed by **William Miller - [BindStream Technologies](https://bindstreamai.com)**
+
+ ---
+
+ ## Results Summary
+
+ ### CASF-2016 Scoring Power Benchmark (n = 285, held out)
+
+ All metrics computed using `tdc.Evaluator` from PyTDC v1.1.15.
+
+ | Model | PCC | PCC 95% CI | Spearman ρ | MAE (pKd) | MAE 95% CI | RMSE | R² |
+ |-------|-----|------------|------------|-----------|------------|------|----|
+ | **MillerBind v9** | **0.890** | [0.862, 0.912] | 0.877 | **0.780** | [0.708, 0.857] | 1.030 | 0.775 |
+ | **MillerBind v12** | **0.938** | [0.921, 0.950] | 0.960 | **0.637** | [0.571, 0.707] | 0.869 | 0.840 |
+
+ 95% confidence intervals from 1,000 bootstrap resamples.
+
+ ### Comparison with Published Methods
+
+ | Method | PCC | MAE (pKd) | Type | Year |
+ |--------|-----|-----------|------|------|
+ | AutoDock Vina | 0.604 | 2.05 | Physics-based | 2010 |
+ | RF-Score v3 | 0.800 | 1.40 | Random Forest | 2015 |
+ | OnionNet-2 | 0.816 | 1.28 | Deep Learning | 2021 |
+ | PIGNet | 0.830 | 1.21 | GNN | 2022 |
+ | IGN | 0.850 | 1.15 | GNN | 2021 |
+ | HAC-Net | 0.860 | 1.10 | DL Ensemble | 2023 |
+ | **MillerBind v9** | **0.890** | **0.780** | **Proprietary ML** | **2025** |
+ | **MillerBind v12** | **0.938** | **0.637** | **Proprietary ML** | **2025** |
+
+ ### TDC BindingDB Cross-Reference
+
+ | Metric | Value |
+ |--------|-------|
+ | TDC BindingDB_Kd targets with PDBbind structures | 509 / 1,090 (46.7%) |
+ | PDBbind complexes matching TDC targets | 8,384 |
+ | TDC dataset structural coverage | 49.5% (25,869 / 52,274) |
+ | v9 PCC on TDC-overlapping CASF-2016 subset (n=170) | 0.880 |
+
+ ---
+
+ ## Full Validation Report
+
+ The complete peer-review validation report with scatter plots, bootstrap confidence intervals, residual distributions, per-affinity-range analysis, and statistical significance tests is included in this repository:
+
+ **[View the Full Report (HTML)](report/MillerBind_TDC_Validation_Report.html)** - download and open in any browser, or print to PDF.
+
+ ---
+
+ ## Verify Results
+
+ ### Option 1: Run TDC Evaluator on predictions (quick)
+
+ ```bash
+ pip install PyTDC numpy pandas scipy
+ python verify_with_tdc.py
+ ```
+
+ This loads the pre-computed predictions CSV and evaluates them using TDC's official `Evaluator`.
+
+ ### Option 2: Docker - full independent validation (comprehensive)
+
+ ```bash
+ docker run --rm bindstream/millerbind-v9-validation
+ ```
+
+ The Docker image contains:
+ - AES-256 encrypted model weights (not readable)
+ - AES-256 encrypted CASF-2016 features (not readable)
+ - Compiled Python bytecode (no source code)
+ - Runs predictions and reports metrics - fully offline, no network needed
+
+ ---
+
+ ## Repository Contents
+
+ ```
+ ├── README.md                                      ← This file
+ ├── predictions/
+ │   ├── casf2016_v9_predictions.csv                ← 285 predictions (PDB ID, experimental, predicted pKd)
+ │   └── casf2016_v12_predictions.csv               ← 285 predictions for v12
+ ├── verify_with_tdc.py                             ← TDC Evaluator verification script
+ ├── report/
+ │   └── MillerBind_TDC_Validation_Report.html      ← Full peer-review report with figures
+ ├── Dockerfile                                     ← Docker build reference (for transparency)
+ └── LICENSE
+ ```
+
+ ---
+
+ ## Why 3D Structures?
+
+ MillerBind is a **structure-based** scoring function - it requires 3D protein-ligand complex structures (PDB + ligand file) as input, not SMILES strings or amino acid sequences.
+
+ This is fundamentally different from sequence-based models (e.g., DeepDTA, MolTrans) that predict binding from 1D representations. Structure-based scoring uses the actual 3D atomic coordinates of both the protein and ligand, capturing:
+
+ - **Precise interatomic distances** between protein and ligand atoms
+ - **Binding pocket geometry** and shape complementarity
+ - **Hydrogen bonds, hydrophobic contacts, and electrostatic interactions** in 3D space
+
+ This is why structure-based methods consistently outperform sequence-based methods on binding affinity benchmarks - they're scoring the real physical interaction, not inferring it from strings.
+
+ **CASF-2016** is the gold-standard benchmark specifically designed for evaluating structure-based scoring functions (Su et al., 2019), and is the standard reported by AutoDock Vina, Glide, RF-Score, OnionNet, PIGNet, IGN, HAC-Net, and now MillerBind.
+
+ ---
+
+ ## Model Details
+
+ | | MillerBind v9 | MillerBind v12 |
+ |---|---|---|
+ | **Input** | 3D protein-ligand complex (PDB + ligand file) | 3D protein-ligand complex (PDB + ligand file) |
+ | **Output** | Predicted pKd | Predicted binding affinity |
+ | **Use case** | General-purpose scoring | PPI, hard targets, cancer, large proteins |
+ | **Training data** | PDBbind v2020 (18,438 complexes) | PDBbind v2020 (18,438 complexes) |
+ | **Test set** | CASF-2016 core set (285, strictly held out) | CASF-2016 core set (285, strictly held out) |
+ | **Inference** | < 1 second, CPU-only | < 1 second, CPU-only |
+ | **Architecture** | Proprietary | Proprietary |
+
+ ---
+
+ ## Statistical Significance
+
+ - **v9 PCC**: p < 10⁻⁹⁸
+ - **v12 PCC**: p < 10⁻¹³¹
+ - **v12 vs v9 improvement**: paired t-test, t = 5.30, p = 2.4 × 10⁻⁷
+
+ ---
+
+ ## References
+
+ 1. Huang, K., et al. (2021). Therapeutics Data Commons. *NeurIPS Datasets and Benchmarks*.
+ 2. Su, M., et al. (2019). Comparative Assessment of Scoring Functions: The CASF-2016 Update. *J. Chem. Inf. Model.*, 59(2), 895–913.
+ 3. Wang, R., et al. (2004). The PDBbind Database. *J. Med. Chem.*, 47(12), 2977–2980.
+
+ ---
+
+ ## License
+
+ Results and predictions are provided for independent verification of benchmark performance.
+
+ Model weights, feature engineering, and training code are proprietary.
+
+ © 2026 BindStream Technologies. All rights reserved.
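The README's "Statistical Significance" section reports a paired t-test for the v12 vs v9 improvement, but this commit does not include the code that produced it, and the README does not say which per-complex quantity was paired. A minimal sketch of one plausible reading follows, assuming the test is run on per-complex absolute errors and that both prediction files list the same complexes in the same order. The function name `paired_t_on_abs_errors` is hypothetical, and the p-value uses a large-sample normal approximation rather than the exact t distribution, so it is not guaranteed to reproduce the reported t = 5.30.

```python
import math

import numpy as np


def paired_t_on_abs_errors(y_true, pred_a, pred_b):
    """Paired t statistic on per-complex absolute errors of two models.

    Assumption: the paired quantity is |prediction - experiment| per complex.
    A positive t means model A has the larger mean absolute error.
    """
    y_true = np.asarray(y_true, dtype=float)
    err_a = np.abs(np.asarray(pred_a, dtype=float) - y_true)
    err_b = np.abs(np.asarray(pred_b, dtype=float) - y_true)

    # Per-complex difference in absolute error (A minus B).
    d = err_a - err_b
    n = d.size

    # Classic paired t statistic: mean difference over its standard error.
    t = d.mean() / (d.std(ddof=1) / math.sqrt(n))

    # Two-sided p-value from the normal approximation (reasonable for n = 285;
    # scipy.stats.ttest_rel would give the exact t-distribution value).
    p = math.erfc(abs(t) / math.sqrt(2))
    return float(t), float(p)
```

To apply it to this repository, one would load both CSVs with pandas, merge them on `pdb_id` so the rows are aligned, and pass `experimental_pkd` together with the two `predicted_pkd` columns (v9 as model A, v12 as model B).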
predictions/casf2016_v12_predictions.csv ADDED
@@ -0,0 +1,286 @@
+ pdb_id,experimental_pkd,predicted_pkd
+ 3ao4,2.07,3.220233
+ 3gv9,2.12,2.757775
+ 1uto,2.27,2.930804
+ 1ps3,2.28,2.997658
+ 4ddk,2.29,2.887722
+ 4jsz,2.3,2.764679
+ 3g2z,2.36,2.654782
+ 3dxg,2.4,3.218832
+ 3l7b,2.4,3.180872
+ 3gr2,2.52,3.087765
+ 3kgp,2.57,3.482487
+ 3fcq,2.77,3.29235
+ 3lka,2.82,3.279423
+ 3zt2,2.84,3.66294
+ 3udh,2.85,3.308038
+ 3g31,2.89,3.02087
+ 4llx,2.89,3.253089
+ 4u4s,2.92,3.323386
+ 4owm,2.96,3.309959
+ 5aba,2.98,3.397143
+ 2xdl,3.1,3.694296
+ 4kz6,3.1,3.358904
+ 2ymd,3.16,3.544859
+ 3aru,3.22,3.660439
+ 1bcu,3.28,3.738278
+ 3zsx,3.28,4.085717
+ 4ddh,3.3,3.576896
+ 4eky,3.52,3.844593
+ 4abg,3.57,3.629122
+ 5a7b,3.57,4.010878
+ 3dx1,3.58,3.518005
+ 4bkt,3.62,3.706062
+ 2v00,3.66,3.765893
+ 4cig,3.67,4.193232
+ 3n7a,3.7,3.650373
+ 3d6q,3.76,3.755187
+ 2hb1,3.8,3.763251
+ 3twp,3.92,3.712645
+ 4agn,3.97,4.11237
+ 1c5z,4.01,4.021584
+ 3nq9,4.03,3.774597
+ 2w66,4.05,4.11131
+ 3kwa,4.08,4.140685
+ 3g2n,4.09,4.07267
+ 4cr9,4.1,3.928056
+ 4ih5,4.11,4.068548
+ 4de2,4.12,4.24722
+ 3ozt,4.13,4.220526
+ 3f3a,4.19,4.144307
+ 1a30,4.3,4.214303
+ 3ivg,4.3,4.309512
+ 3u9q,4.38,4.203977
+ 3rsx,4.41,3.853237
+ 3pxf,4.43,4.398958
+ 2wbg,4.45,4.229671
+ 3rr4,4.55,4.410624
+ 4w9c,4.65,4.476858
+ 3mss,4.66,4.391254
+ 4agp,4.69,4.389833
+ 4mgd,4.69,4.542971
+ 1vso,4.72,4.315415
+ 4jxs,4.74,4.452345
+ 1q8t,4.76,4.480579
+ 3acw,4.76,4.570966
+ 4lzs,4.8,4.32502
+ 3r88,4.82,4.357856
+ 4ciw,4.82,4.449844
+ 2w4x,4.85,4.341208
+ 2brb,4.86,4.702393
+ 1p1q,4.89,4.417227
+ 3d4z,4.89,4.239356
+ 1bzc,4.92,4.578009
+ 1nc3,5.0,4.739872
+ 4agq,5.01,4.692508
+ 4w9l,5.02,4.597499
+ 2yge,5.06,4.69641
+ 1e66,9.89,8.156246
+ 1gpk,5.37,6.316021
+ 1h23,8.35,7.71586
+ 1mq6,11.15,9.323365
+ 1nvq,8.25,8.081467
+ 1o3f,7.96,7.174602
+ 1o5b,5.77,6.509901
+ 1oyt,7.24,7.552136
+ 1q8u,5.96,6.695236
+ 1r5y,6.46,6.376092
+ 1sqa,9.21,7.827857
+ 1u1b,7.8,6.944384
+ 1w4o,5.22,5.965201
+ 1yc1,6.17,6.729734
+ 1z95,7.12,7.401438
+ 2cet,8.02,7.491305
+ 2fvd,8.52,7.868778
+ 2iwx,6.68,6.798249
+ 2j78,6.42,6.433222
+ 2p4y,9.0,7.765625
+ 2qbp,8.4,7.61813
+ 2qbr,6.33,6.446949
+ 2v7a,8.3,7.83254
+ 2vvn,7.3,7.271792
+ 2vw5,8.52,7.582892
+ 2wca,5.6,6.523808
+ 2weg,6.5,6.484488
+ 2wtv,8.74,8.031542
+ 2x00,11.33,8.213355
+ 2xb8,7.59,7.169479
+ 2xbv,8.43,8.192604
+ 2xnb,6.83,7.113731
+ 2xys,7.42,7.153071
+ 2y5h,5.79,6.901262
+ 2yfe,6.63,7.153531
+ 2yki,9.46,8.346503
+ 2zcq,8.82,7.628735
+ 2zcr,6.87,7.13174
+ 3ag9,8.05,6.993589
+ 3b68,8.4,8.014533
+ 3coy,6.02,7.005155
+ 3dd0,9.0,7.475416
+ 3e93,8.85,8.088731
+ 3ebp,5.91,6.299493
+ 3ehy,5.85,6.318022
+ 3ejr,8.57,6.815097
+ 3f3c,6.02,6.163343
+ 3f3e,7.7,6.885954
+ 3fv1,9.3,8.038605
+ 3g0w,9.52,8.578282
+ 3gbb,6.9,6.822221
+ 3ge7,8.7,7.728306
+ 3gnw,9.1,7.593097
+ 3gy4,5.1,5.616162
+ 3jvs,6.54,6.725752
+ 3k5v,6.3,6.825623
+ 3myg,10.7,8.407535
+ 3n86,5.64,6.272579
+ 3nw9,9.0,8.093534
+ 3oe5,6.88,7.204938
+ 3pww,7.32,7.571386
+ 3ueu,5.24,6.02025
+ 3uex,6.92,6.697157
+ 3uo4,6.52,7.127918
+ 3uri,9.0,7.96947
+ 3utu,10.92,8.16405
+ 4de1,5.96,6.439205
+ 4djv,6.72,6.884073
+ 4gid,10.77,8.970684
+ 4tmn,10.17,7.321037
+ 1eby,9.7,8.051152
+ 1g2k,7.96,8.055674
+ 1gpn,6.48,6.710404
+ 1h22,9.1,7.976433
+ 1k1i,6.58,6.831446
+ 1lpg,7.09,7.718361
+ 1nc1,6.12,6.806753
+ 1o0h,5.92,6.172748
+ 1owh,7.4,6.959412
+ 1p1n,6.8,6.966115
+ 1pxn,7.15,7.212241
+ 1qf1,7.32,7.122235
+ 1qkt,9.04,7.699411
+ 1s38,5.15,5.773963
+ 1syi,5.44,6.190537
+ 1y6r,10.11,8.058996
+ 1ydr,5.52,6.339293
+ 1ydt,7.32,7.22835
+ 1z6e,9.72,8.519292
+ 1z9g,5.64,6.19718
+ 2al5,8.4,7.518038
+ 2br1,5.14,6.509781
+ 2c3i,7.6,7.157173
+ 2cbv,5.48,6.174889
+ 2fxs,6.06,6.473402
+ 2j7h,7.19,6.657337
+ 2p15,10.3,8.440512
+ 2pog,9.54,7.964287
+ 2qbq,7.44,7.038012
+ 2qe4,7.96,7.534106
+ 2qnq,6.11,7.447042
+ 2r9w,5.1,5.875535
+ 2vkm,8.74,8.117166
+ 2wer,7.05,6.893118
+ 2wn9,8.52,6.609912
+ 2wnc,6.32,6.948906
+ 2wvt,6.12,6.268077
+ 2xii,7.2,7.277735
+ 2xj7,6.66,6.87891
+ 2zb1,6.32,7.140104
+ 2zda,8.4,7.633778
+ 2zy1,7.4,6.974599
+ 3arp,7.15,7.303368
+ 3arq,6.4,6.64457
+ 3arv,5.64,6.128925
+ 3ary,6.0,6.194659
+ 3b1m,8.48,7.703173
+ 3b27,5.16,5.970044
+ 3b5r,8.77,8.079166
+ 3b65,9.27,8.253976
+ 3bgz,6.26,6.56735
+ 3bv9,5.36,7.377466
+ 3cj4,6.51,6.552823
+ 3coz,5.57,6.685991
+ 3dx2,6.82,6.748783
+ 3e5a,8.23,7.596058
+ 3e92,8.0,7.610025
+ 3f3d,7.16,6.866644
+ 3fur,8.0,7.748777
+ 3fv2,8.11,7.407421
+ 3gc5,7.26,7.225648
+ 3jvr,5.72,6.123022
+ 3jya,6.89,6.642729
+ 3kr8,8.1,7.211321
+ 3n76,6.85,6.868225
+ 3nx7,8.1,7.285659
+ 3o9i,11.82,8.425344
+ 3oe4,7.47,7.442399
+ 3ozs,5.33,6.660798
+ 3p5o,7.3,7.189129
+ 3prs,7.82,7.642602
+ 3pyy,6.86,7.213902
+ 3qgy,7.8,7.680682
+ 3qqs,5.82,6.027813
+ 3rlr,7.52,7.376025
+ 3ryj,7.8,7.11269
+ 3syr,5.1,6.049204
+ 3tsk,7.17,7.550055
+ 3u5j,5.61,6.260873
+ 3u8k,8.66,7.591896
+ 3u8n,10.17,8.198868
+ 3uev,5.89,6.364786
+ 3uew,6.31,6.474203
+ 3ui7,9.0,7.547233
+ 3up2,7.4,7.648805
+ 3uuo,7.96,7.128018
+ 3wtj,6.53,6.72257
+ 3wz8,5.82,6.482207
+ 3zdg,7.1,7.036211
+ 3zso,5.12,6.598506
+ 4cra,7.22,7.496027
+ 4crc,8.72,8.349225
+ 4de3,5.52,6.171047
+ 4dld,5.82,6.330369
+ 4dli,5.62,6.518345
+ 4e5w,7.66,7.551615
+ 4e6q,8.36,7.738251
+ 4ea2,6.44,6.785963
+ 4eo8,8.15,7.118613
+ 4eor,6.3,6.980742
+ 4f09,6.7,6.885994
+ 4f2w,11.3,9.446108
+ 4f3c,11.82,9.886894
+ 4f9w,6.94,7.440498
+ 4gfm,7.22,6.91743
+ 4gkm,5.17,5.84638
+ 4gr0,9.55,8.828611
+ 4hge,7.92,7.623112
+ 4ih7,5.24,6.000339
+ 4ivb,8.72,7.741673
+ 4ivc,10.0,8.045869
+ 4ivd,9.52,7.9108
+ 4j21,7.41,7.026266
+ 4j28,5.7,6.211088
+ 4j3l,7.8,7.355695
+ 4jfs,5.27,5.967202
+ 4jia,9.22,8.042027
+ 4k18,8.96,8.332936
+ 4k77,6.63,6.814537
+ 4kzq,6.1,6.428299
+ 4kzu,6.5,6.70372
+ 4m0y,6.46,6.648212
+ 4m0z,5.19,6.614455
+ 4mme,6.5,6.728513
+ 4ogj,6.79,6.782561
+ 4pcs,7.85,6.789604
+ 4qac,9.4,7.616769
+ 4qd6,8.64,6.984825
+ 4rfm,10.05,8.556711
+ 4twp,10.0,8.737644
+ 4ty7,9.52,8.346503
+ 4w9h,6.73,7.083075
+ 4w9i,5.96,7.009717
+ 4wiv,6.26,6.551082
+ 4x6p,8.3,8.226742
+ 5c28,5.66,5.902049
+ 5c2h,11.09,7.760082
+ 5dwr,11.22,9.145754
+ 5tmn,8.04,7.327661
predictions/casf2016_v9_predictions.csv ADDED
@@ -0,0 +1,286 @@
+ pdb_id,experimental_pkd,predicted_pkd
+ 3ao4,2.07,3.4103438552093515
+ 3gv9,2.12,2.997089909954833
+ 1uto,2.27,2.611788054957581
+ 1ps3,2.28,3.8409908609710706
+ 4ddk,2.29,3.2133813719360353
+ 4jsz,2.3,3.4439509889556885
+ 3g2z,2.36,2.571271295208741
+ 3dxg,2.4,3.80447843336792
+ 3l7b,2.4,3.9064154149963386
+ 3gr2,2.52,2.8891867265319817
+ 3kgp,2.57,2.9883645861145007
+ 3fcq,2.77,3.730020530682373
+ 3lka,2.82,3.904061172821044
+ 3zt2,2.84,2.9614314835845885
+ 3udh,2.85,3.49123650326233
+ 3g31,2.89,3.0698465562194825
+ 4llx,2.89,3.479611090936279
+ 4u4s,2.92,4.18904780914612
+ 4owm,2.96,4.028422431274415
+ 5aba,2.98,4.089326443188478
+ 2xdl,3.1,4.0136036055999735
+ 4kz6,3.1,3.414942495471193
+ 2ymd,3.16,4.145679129724121
+ 3aru,3.22,3.691513273999024
+ 1bcu,3.28,3.9748835479003906
+ 3zsx,3.28,3.505084961987301
+ 4ddh,3.3,3.3480168149627705
+ 4eky,3.52,3.895416665057373
+ 4abg,3.57,3.7905611997985846
+ 5a7b,3.57,4.4440988035552955
+ 3dx1,3.58,2.8153865238403335
+ 4bkt,3.62,3.513510453173828
+ 2v00,3.66,3.532361969735719
+ 4cig,3.67,4.296937358624269
+ 3n7a,3.7,3.7194380914367726
+ 3d6q,3.76,3.6142653523254418
+ 2hb1,3.8,3.925011574414063
+ 3twp,3.92,3.829758412893676
+ 4agn,3.97,4.260496788381956
+ 1c5z,4.01,3.9057902594024663
+ 3nq9,4.03,3.2923914468780517
+ 2w66,4.05,4.144561911642453
+ 3kwa,4.08,4.33356823682251
+ 3g2n,4.09,4.384795254528808
+ 4cr9,4.1,4.0847937427948
+ 4ih5,4.11,3.9535144394287114
+ 4de2,4.12,4.264003551010132
+ 3ozt,4.13,4.404660299697875
+ 3f3a,4.19,4.513862899578859
+ 1a30,4.3,4.235489810394286
+ 3ivg,4.3,4.228865531756588
+ 3u9q,4.38,4.155873401019285
+ 3rsx,4.41,4.080683740103149
+ 3pxf,4.43,4.517639647879028
+ 2wbg,4.45,3.9403762986114494
+ 3rr4,4.55,4.421191576327517
+ 4w9c,4.65,4.554356536437986
+ 3mss,4.66,4.372615561541747
+ 4agp,4.69,4.567405117315669
+ 4mgd,4.69,4.477381313397215
+ 1vso,4.72,4.23496441261902
+ 4jxs,4.74,4.498925769073484
+ 1q8t,4.76,3.9675011885253912
+ 3acw,4.76,3.8448213159515383
+ 4lzs,4.8,4.171122724661253
+ 3r88,4.82,3.771034620791625
+ 4ciw,4.82,4.547881289761353
+ 2w4x,4.85,4.089517209735107
+ 2brb,4.86,4.732903124523926
+ 1p1q,4.89,4.193486153259274
+ 3d4z,4.89,3.7955817773895286
+ 1bzc,4.92,4.539960212866209
+ 1nc3,5.0,4.495808076937866
+ 4agq,5.01,4.714692616073604
+ 4w9l,5.02,4.5492668968200665
+ 2yge,5.06,4.502787339559934
+ 1e66,9.89,7.188278556848147
+ 1gpk,5.37,6.627061509606929
+ 1h23,8.35,7.640120467767334
+ 1mq6,11.15,8.182997659912111
+ 1nvq,8.25,7.991674160308843
+ 1o3f,7.96,7.929832432086188
+ 1o5b,5.77,6.377937931426995
+ 1oyt,7.24,7.438043994689939
+ 1q8u,5.96,6.246946044714352
+ 1r5y,6.46,6.495986522070313
+ 1sqa,9.21,7.132408417932131
+ 1u1b,7.8,6.950309609649665
+ 1w4o,5.22,5.631865372967537
+ 1yc1,6.17,7.138161297869875
+ 1z95,7.12,8.16023122529907
+ 2cet,8.02,7.4743312950378415
+ 2fvd,8.52,7.548192197772217
+ 2iwx,6.68,6.427065835626221
+ 2j78,6.42,6.511890819445802
+ 2p4y,9.0,7.3757074037475565
+ 2qbp,8.4,6.683465866406247
+ 2qbr,6.33,6.45223804780273
+ 2v7a,8.3,7.659672416650389
+ 2vvn,7.3,8.04473292189331
+ 2vw5,8.52,6.655207552044676
+ 2wca,5.6,6.835288710302733
+ 2weg,6.5,6.758468464373772
+ 2wtv,8.74,8.096293521722412
+ 2x00,11.33,8.133705259179687
+ 2xb8,7.59,6.829767467089842
+ 2xbv,8.43,8.429342031884767
+ 2xnb,6.83,7.912239552307128
+ 2xys,7.42,7.0454130675842315
+ 2y5h,5.79,7.466176282324223
+ 2yfe,6.63,6.686462293432614
+ 2yki,9.46,6.548638507910159
+ 2zcq,8.82,6.621591004406737
+ 2zcr,6.87,6.881761762066647
+ 3ag9,8.05,6.465643539935302
+ 3b68,8.4,8.190354584289548
+ 3coy,6.02,7.256264232788091
+ 3dd0,9.0,8.898755427398681
+ 3e93,8.85,7.927928440600588
+ 3ebp,5.91,5.834585341717523
+ 3ehy,5.85,7.041906984350589
+ 3ejr,8.57,7.275660043749999
+ 3f3c,6.02,6.169969203448485
+ 3f3e,7.7,6.9496301306030235
+ 3fv1,9.3,9.0139474576172
+ 3g0w,9.52,8.308721103222656
+ 3gbb,6.9,7.010007227856446
+ 3ge7,8.7,7.691785282739258
+ 3gnw,9.1,7.789149969110104
+ 3gy4,5.1,5.642898486169434
+ 3jvs,6.54,6.395871287023924
+ 3k5v,6.3,6.755995747637935
+ 3myg,10.7,7.480339961962891
+ 3n86,5.64,6.87884320557861
+ 3nw9,9.0,7.80833524720459
+ 3oe5,6.88,7.384849244354249
+ 3pww,7.32,7.206603011730956
+ 3ueu,5.24,6.669425869512939
+ 3uex,6.92,6.248880837237545
+ 3uo4,6.52,7.3048854471313485
+ 3uri,9.0,7.043507076727296
+ 3utu,10.92,9.076639426464844
+ 4de1,5.96,5.965974482391352
+ 4djv,6.72,6.763027065606689
+ 4gid,10.77,7.850791482153319
+ 4tmn,10.17,6.899314961224363
+ 1eby,9.7,9.071972182458495
+ 1g2k,7.96,9.015022155181883
+ 1gpn,6.48,6.737561861480713
+ 1h22,9.1,7.832304157910154
+ 1k1i,6.58,6.706198623937991
+ 1lpg,7.09,8.34666724058838
+ 1nc1,6.12,8.698183402862547
+ 1o0h,5.92,6.212684369488523
+ 1owh,7.4,6.82366798342285
+ 1p1n,6.8,6.3699154938171345
+ 1pxn,7.15,6.7172147785705585
+ 1qf1,7.32,7.0326707888916005
+ 1qkt,9.04,8.15731841669922
+ 1s38,5.15,6.428597907104489
+ 1syi,5.44,6.6256740905090314
+ 1y6r,10.11,8.702975983715817
+ 1ydr,5.52,6.023821814739987
+ 1ydt,7.32,6.731516535614017
+ 1z6e,9.72,9.223851026629642
+ 1z9g,5.64,6.090745917303456
+ 2al5,8.4,7.0378992705017165
+ 2br1,5.14,6.51611666404419
+ 2c3i,7.6,7.345220322833253
+ 2cbv,5.48,6.931224698944094
+ 2fxs,6.06,6.579123080712893
+ 2j7h,7.19,6.763879922875976
+ 2p15,10.3,8.12079060496826
+ 2pog,9.54,8.011308512695315
+ 2qbq,7.44,6.769573812530515
+ 2qe4,7.96,7.956994184594728
+ 2qnq,6.11,7.065995957580567
+ 2r9w,5.1,6.314880683123778
+ 2vkm,8.74,7.754257847521972
+ 2wer,7.05,7.076451121698001
+ 2wn9,8.52,6.491593399285888
+ 2wnc,6.32,6.890702031243896
+ 2wvt,6.12,6.788617755456541
+ 2xii,7.2,7.50235239619751
+ 2xj7,6.66,7.24886094121704
+ 2zb1,6.32,7.899370414654539
+ 2zda,8.4,7.590816088482666
+ 2zy1,7.4,6.625800997259521
+ 3arp,7.15,7.616060122827148
+ 3arq,6.4,7.252470330157471
+ 3arv,5.64,6.603162358056644
+ 3ary,6.0,6.32991121161499
+ 3b1m,8.48,6.975845003326417
+ 3b27,5.16,6.659595438983153
+ 3b5r,8.77,8.377510907250972
+ 3b65,9.27,7.818731971026608
+ 3bgz,6.26,6.53960380723267
+ 3bv9,5.36,7.568603907348635
+ 3cj4,6.51,6.248213882629393
+ 3coz,5.57,6.856269945117189
+ 3dx2,6.82,7.036528105249025
+ 3e5a,8.23,7.579197435076909
+ 3e92,8.0,7.955468993945312
+ 3f3d,7.16,6.852962016766357
+ 3fur,8.0,7.274226264727783
+ 3fv2,8.11,7.611759847460945
+ 3gc5,7.26,6.498214943859865
+ 3jvr,5.72,6.534867458081054
+ 3jya,6.89,6.210371053778073
+ 3kr8,8.1,7.514179323492433
+ 3n76,6.85,6.923076480364987
+ 3nx7,8.1,7.754999334240722
+ 3o9i,11.82,10.571575213317876
+ 3oe4,7.47,7.584636740136718
+ 3ozs,5.33,7.618785433905028
+ 3p5o,7.3,7.040106249493409
+ 3prs,7.82,7.061791739733884
+ 3pyy,6.86,6.777588291430662
+ 3qgy,7.8,7.521549591198731
+ 3qqs,5.82,5.700895336437992
+ 3rlr,7.52,7.24726395013428
+ 3ryj,7.8,7.819236793084718
+ 3syr,5.1,5.998993099548338
+ 3tsk,7.17,8.039678021966544
+ 3u5j,5.61,6.672076888208007
+ 3u8k,8.66,7.536642004400634
+ 3u8n,10.17,7.756551432366942
+ 3uev,5.89,6.662140793658446
+ 3uew,6.31,6.42144006773071
+ 3ui7,9.0,7.683850057873535
+ 3up2,7.4,7.815783694757076
+ 3uuo,7.96,7.269778711804197
+ 3wtj,6.53,7.083982210357666
+ 3wz8,5.82,6.306025235314942
+ 3zdg,7.1,6.93934756566772
+ 3zso,5.12,6.866331244750977
+ 4cra,7.22,7.864527570416258
+ 4crc,8.72,8.009984481304931
+ 4de3,5.52,6.368543999426265
+ 4dld,5.82,6.85392163234253
+ 4dli,5.62,6.715533710498047
+ 4e5w,7.66,8.167620159588623
+ 4e6q,8.36,7.694266949725342
+ 4ea2,6.44,7.024924207910157
+ 4eo8,8.15,7.719084939398188
+ 4eor,6.3,7.258288448657219
+ 4f09,6.7,7.021392902545162
+ 4f2w,11.3,9.941376510070796
+ 4f3c,11.82,10.949540416296388
+ 4f9w,6.94,7.133326093872067
+ 4gfm,7.22,6.852683067993161
+ 4gkm,5.17,5.5801841756713895
+ 4gr0,9.55,8.837895747052007
+ 4hge,7.92,7.7040420236450196
+ 4ih7,5.24,6.380635533618165
+ 4ivb,8.72,8.618383055297848
+ 4ivc,10.0,8.455736893969727
+ 4ivd,9.52,8.182122064904785
+ 4j21,7.41,6.959935733471679
+ 4j28,5.7,6.786026134820559
+ 4j3l,7.8,7.428130288195794
+ 4jfs,5.27,6.757419265069585
+ 4jia,9.22,7.775911537829589
+ 4k18,8.96,8.670148361224367
+ 4k77,6.63,7.700984968170163
+ 4kzq,6.1,6.754642571972658
+ 4kzu,6.5,6.805336975384526
+ 4m0y,6.46,6.64111072616577
+ 4m0z,5.19,7.040763745886235
+ 4mme,6.5,6.4431962990295455
+ 4ogj,6.79,6.750924728326414
+ 4pcs,7.85,6.55677436200562
+ 4qac,9.4,7.925974257635493
+ 4qd6,8.64,7.134632056262205
+ 4rfm,10.05,8.291213344226074
+ 4twp,10.0,8.140951649804691
+ 4ty7,9.52,7.714599139434814
+ 4w9h,6.73,6.400476244042966
+ 4w9i,5.96,6.039873965832514
+ 4wiv,6.26,6.366264964404298
+ 4x6p,8.3,8.644163160906981
+ 5c28,5.66,6.420668611987305
+ 5c2h,11.09,8.308975265411373
+ 5dwr,11.22,9.191689653570556
+ 5tmn,8.04,6.161969176776122
report/MillerBind_TDC_Validation_Report.html ADDED
The diff for this file is too large to render. See raw diff
 
verify_with_tdc.py ADDED
@@ -0,0 +1,108 @@
+ #!/usr/bin/env python3
+ """
+ MillerBind v9 & v12 - TDC Evaluator Verification
+ ==================================================
+
+ Loads pre-computed predictions and evaluates them using
+ the Therapeutics Data Commons (TDC) official Evaluator.
+
+ Usage:
+     pip install PyTDC numpy pandas scipy
+     python verify_with_tdc.py
+
+ Author: William Miller / BindStream Technologies
+ """
+
+ import os
+ import sys
+ import numpy as np
+ import pandas as pd
+ from pathlib import Path
+
+ SCRIPT_DIR = Path(__file__).resolve().parent
+ PREDICTIONS_DIR = SCRIPT_DIR / "predictions"
+
+
+ def evaluate_model(name, csv_path):
+     """Evaluate a model's predictions using TDC Evaluator."""
+     from tdc import Evaluator
+
+     df = pd.read_csv(csv_path)
+     y_true = df["experimental_pkd"].values
+     y_pred = df["predicted_pkd"].values
+
+     print(f"\n{'=' * 60}")
+     print(f" {name}")
+     print(f" CASF-2016 Scoring Power (n = {len(df)})")
+     print(f"{'=' * 60}")
+
+     results = {}
+     for metric in ["pcc", "spearman", "mae", "rmse", "r2", "mse"]:
+         evaluator = Evaluator(name=metric)
+         score = evaluator(y_true, y_pred)
+         results[metric] = float(score)
+
+     print(f"\n {'Metric':<20s} {'Value':>10s}")
+     print(f" {'-' * 35}")
+     print(f" {'PCC':<20s} {results['pcc']:>10.4f}")
+     print(f" {'Spearman rho':<20s} {results['spearman']:>10.4f}")
+     print(f" {'MAE (pKd)':<20s} {results['mae']:>10.4f}")
+     print(f" {'RMSE':<20s} {results['rmse']:>10.4f}")
+     print(f" {'R²':<20s} {results['r2']:>10.4f}")
+     print(f" {'MSE':<20s} {results['mse']:>10.4f}")
+     print(f" {'-' * 35}")
+
+     return results
+
+
+ def main():
+     print("╔════════════════════════════════════════════════════════╗")
+     print("║  MillerBind - TDC Evaluator Verification               ║")
+     print("║  BindStream Technologies                               ║")
+     print("╚════════════════════════════════════════════════════════╝")
+
+     all_results = {}
+
+     v9_path = PREDICTIONS_DIR / "casf2016_v9_predictions.csv"
+     if v9_path.exists():
+         all_results["v9"] = evaluate_model("MillerBind v9 (Ensemble)", v9_path)
+
+     v12_path = PREDICTIONS_DIR / "casf2016_v12_predictions.csv"
+     if v12_path.exists():
+         all_results["v12"] = evaluate_model("MillerBind v12 (Hard Targets)", v12_path)
+
+     # Comparison table
+     print(f"\n{'=' * 60}")
+     print(" COMPARISON WITH PUBLISHED METHODS (CASF-2016)")
+     print(f"{'=' * 60}")
+     print(f"\n {'Method':<22s} {'PCC':>8s} {'MAE':>10s} {'Year':>6s}")
+     print(f" {'-' * 50}")
+     for name, r, mae, year in [
+         ("AutoDock Vina", 0.604, 2.05, 2010),
+         ("RF-Score v3", 0.800, 1.40, 2015),
+         ("OnionNet-2", 0.816, 1.28, 2021),
+         ("PIGNet", 0.830, 1.21, 2022),
+         ("IGN", 0.850, 1.15, 2021),
+         ("HAC-Net", 0.860, 1.10, 2023),
+     ]:
+         print(f" {name:<22s} {r:>8.3f} {mae:>10.2f} {year:>6d}")
+
+     if "v9" in all_results:
+         r = all_results["v9"]["pcc"]
+         mae = all_results["v9"]["mae"]
+         print(f" {'MillerBind v9 *':<22s} {r:>8.3f} {mae:>10.3f} {'2025':>6s}")
+
+     if "v12" in all_results:
+         r = all_results["v12"]["pcc"]
+         mae = all_results["v12"]["mae"]
+         print(f" {'MillerBind v12 *':<22s} {r:>8.3f} {mae:>10.3f} {'2025':>6s}")
+
+     print(f" {'-' * 50}")
+     print(" * Validated using TDC Evaluator (PyTDC)")
+     print(f"\n{'=' * 60}")
+     print(" VERIFICATION COMPLETE")
+     print(f"{'=' * 60}")
+
+
+ if __name__ == "__main__":
+     main()
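
`verify_with_tdc.py` prints point estimates only, while the README also quotes 95% confidence intervals from 1,000 bootstrap resamples; the code for those intervals is not part of this commit. A minimal sketch of how such intervals could be obtained is shown below, assuming a percentile bootstrap that resamples complexes with replacement. The helper names (`pearson`, `mae`, `bootstrap_ci`) and the fixed seed are illustrative assumptions, so the resulting bounds may differ slightly from the published ones.

```python
import numpy as np


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def mae(y_true, y_pred):
    """Mean absolute error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


def bootstrap_ci(y_true, y_pred, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a paired metric.

    Each resample draws n (true, predicted) pairs with replacement,
    recomputes the metric, and the interval is read off the empirical
    alpha/2 and 1 - alpha/2 quantiles of the resampled values.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rng = np.random.default_rng(seed)
    n = y_true.size

    stats = np.empty(n_resamples)
    for i in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # indices sampled with replacement
        stats[i] = metric(y_true[idx], y_pred[idx])

    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

With the prediction files from this commit, the inputs would be the `experimental_pkd` and `predicted_pkd` columns of each CSV, for example `bootstrap_ci(y_true, y_pred, pearson)` for the PCC interval and `bootstrap_ci(y_true, y_pred, mae)` for the MAE interval.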