Commit ·
cac6789
verified ·
0
Parent(s):
initial commit
Co-authored-by: evasnow1992 <evasnow1992@users.noreply.huggingface.co>
- .gitattributes +35 -0
- LICENSE +35 -0
- README.md +103 -0
- dualbind_toxbench.ckpt +3 -0
.gitattributes
ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
LICENSE
ADDED
@@ -0,0 +1,35 @@
NVIDIA License

1. Definitions

“Licensor” means any person or entity that distributes its Work.
“Work” means (a) the original work of authorship made available under this license, which may include software, documentation, or other files, and (b) any additions to or derivative works thereof that are made available under this license.
The terms “reproduce,” “reproduction,” “derivative works,” and “distribution” have the meaning as provided under U.S. copyright law; provided, however, that for the purposes of this license, derivative works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work.
Works are “made available” under this license by including in or with the Work either (a) a copyright notice referencing the applicability of this license to the Work, or (b) a copy of this license.

2. License Grant

2.1 Copyright Grant. Subject to the terms and conditions of this license, each Licensor grants to you a perpetual, worldwide, non-exclusive, royalty-free, copyright license to use, reproduce, prepare derivative works of, publicly display, publicly perform, sublicense and distribute its Work and any resulting derivative works in any form.

3. Limitations

3.1 Redistribution. You may reproduce or distribute the Work only if (a) you do so under this license, (b) you include a complete copy of this license with your distribution, and (c) you retain without modification any copyright, patent, trademark, or attribution notices that are present in the Work.

3.2 Derivative Works. You may specify that additional or different terms apply to the use, reproduction, and distribution of your derivative works of the Work (“Your Terms”) only if (a) Your Terms provide that the use limitation in Section 3.3 applies to your derivative works, and (b) you identify the specific derivative works that are subject to Your Terms. Notwithstanding Your Terms, this license (including the redistribution requirements in Section 3.1) will continue to apply to the Work itself.

3.3 Use Limitation. The Work and any derivative works thereof only may be used or intended for use non-commercially. Notwithstanding the foregoing, NVIDIA Corporation and its affiliates may use the Work and any derivative works commercially. As used herein, “non-commercially” means for research or evaluation purposes only.

3.4 Patent Claims. If you bring or threaten to bring a patent claim against any Licensor (including any claim, cross-claim or counterclaim in a lawsuit) to enforce any patents that you allege are infringed by any Work, then your rights under this license from such Licensor (including the grant in Section 2.1) will terminate immediately.

3.5 Trademarks. This license does not grant any rights to use any Licensor’s or its affiliates’ names, logos, or trademarks, except as necessary to reproduce the notices described in this license.

3.6 Termination. If you violate any term of this license, then your rights under this license (including the grant in Section 2.1) will terminate immediately.

4. Disclaimer of Warranty.

THE WORK IS PROVIDED “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR NON-INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER THIS LICENSE.

5. Limitation of Liability.

EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF OR RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK (INCLUDING BUT NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOST PROFITS OR DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER DAMAGES OR LOSSES), EVEN IF THE LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
README.md
ADDED
@@ -0,0 +1,103 @@
---
license: other
license_name: nsclv1
license_link: LICENSE
---
# DualBind Overview

The code for using the DualBind model checkpoint is available in the [official GitHub repository](https://github.com/NVIDIA-Digital-Bio/dualbind).

## Description

DualBind is a state-of-the-art 3D structure-based deep learning model that predicts protein-ligand binding affinity, which plays a critical role in drug discovery. It leverages 3D structural information and employs a dual-loss framework to effectively learn the binding energy landscape. Trained on AB-FEP-calculated labels, DualBind achieves accurate and generalizable predictions at a fraction of the computational cost of physics-based approaches.

This model is ready for non-commercial use and is for research and development only.

### License/Terms of Use

DualBind is released under NSCLv1.

### Deployment Geography

Global

### Use Case

DualBind can be used by researchers and practitioners interested in predicting protein-ligand binding affinities.

### Release Date

GitHub: 07/17/2025 via [NVIDIA-Digital-Bio/dualbind](https://github.com/NVIDIA-Digital-Bio/dualbind)

## Reference(s)

The associated paper can be found [here](https://arxiv.org/abs/2507.08966).

[1] Meng Liu, Karl Leswing, Simon KS Chu, Farhad Ramezanghorbani, Griffin Young, Gabriel Marques, Prerna Das et al. "ToxBench: A Binding Affinity Prediction Benchmark with AB-FEP-Calculated Labels for Human Estrogen Receptor Alpha." arXiv preprint arXiv:2507.08966 (2025).

## Model Architecture

**Architecture Type:** Graph Neural Networks (GNN)
**Network Architecture:** Transformer, Frame Averaging Neural Network (FANN)

DualBind employs a dual-loss framework, which combines a supervised mean squared error (MSE) loss with an unsupervised denoising score matching (DSM) loss to effectively learn the binding energy function. The network architecture is a 3D-invariant graph neural network. Specifically, it is built on the Frame Averaging Neural Network (FANN), within which Transformer layers are used.
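
The dual-loss idea can be illustrated with a toy sketch. This is not the DualBind implementation: the quadratic `energy` function standing in for the network, its analytic gradient, the noise scale `sigma`, and the weighting `dsm_weight` are all assumptions made for this example. The supervised term is the squared error between the predicted energy and the label. The DSM term perturbs the coordinates with Gaussian noise and asks the energy gradient at the perturbed point to match `eps / sigma`, which is the standard denoising score matching target for a Gaussian perturbation when the model score is the negative energy gradient.

```python
import random


def energy(coords, center, stiffness):
    """Toy energy 0.5 * k * ||x - c||^2 (hypothetical stand-in for the network)."""
    return 0.5 * stiffness * sum((x - c) ** 2 for x, c in zip(coords, center))


def energy_grad(coords, center, stiffness):
    """Analytic gradient of the toy energy with respect to the coordinates."""
    return [stiffness * (x - c) for x, c in zip(coords, center)]


def mse_loss(predicted, label):
    """Supervised term: squared error between predicted energy and label."""
    return (predicted - label) ** 2


def dsm_loss(coords, center, stiffness, sigma, rng):
    """Denoising score matching term for a Gaussian perturbation of scale sigma."""
    eps = [rng.gauss(0.0, 1.0) for _ in coords]
    noisy = [x + sigma * e for x, e in zip(coords, eps)]
    grad = energy_grad(noisy, center, stiffness)
    # Model score is -grad E; target score is -eps / sigma, so compare grad E with eps / sigma.
    return sum((g - e / sigma) ** 2 for g, e in zip(grad, eps)) / len(coords)


def dual_loss(coords, label, center, stiffness, sigma=0.1, dsm_weight=1.0, rng=None):
    """Weighted sum of the supervised and denoising terms (weighting is an assumption)."""
    rng = rng or random.Random(0)
    supervised = mse_loss(energy(coords, center, stiffness), label)
    denoising = dsm_loss(coords, center, stiffness, sigma, rng)
    return supervised + dsm_weight * denoising
```

In this toy setting, a structure sitting at the energy minimum with `stiffness = 1 / sigma**2` drives both terms to zero, which is a quick sanity check that the two losses agree on what a well-fit energy function looks like.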

## Input

**Input Type(s):** Text (Protein, Ligand)
**Input Format(s):** Text: String (Protein Data Bank (PDB) files for protein), String (structure-data files (SDF) for ligand)
**Input Parameters:** One-Dimensional (1D) (SDF and PDB files)
**Other Properties Related to Input:** The PDB file includes the 3D structure information of the protein and the SDF file includes the 3D structure information of the ligand.
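
To make the input formats concrete, here is a minimal sketch of pulling 3D coordinates out of PDB and SDF text using only the Python standard library. It is not the loader used by the DualBind repository, and the function names are hypothetical. It assumes standard fixed-column `ATOM`/`HETATM` records for PDB and a single V2000 mol block for SDF; real pipelines would more likely rely on a dedicated cheminformatics or structural biology library.

```python
def parse_pdb_atoms(pdb_text):
    """Return (atom_name, (x, y, z)) for each ATOM/HETATM record in PDB text."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()   # columns 13-16: atom name
            x = float(line[30:38])       # columns 31-38: x coordinate
            y = float(line[38:46])       # columns 39-46: y coordinate
            z = float(line[46:54])       # columns 47-54: z coordinate
            atoms.append((name, (x, y, z)))
    return atoms


def parse_sdf_atoms(sdf_text):
    """Return (element, (x, y, z)) for the first V2000 mol block in SDF text."""
    lines = sdf_text.splitlines()
    # Three header lines, then the counts line: atom count is the first 3 characters.
    n_atoms = int(lines[3][0:3])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        element = line[31:34].strip()
        atoms.append((element, (x, y, z)))
    return atoms
```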

## Output

**Output Type(s):** Number
**Output Format:** Number: Floating-point number
**Output Parameters:** One-Dimensional (1D)
**Other Properties Related to Output:** The floating-point number represents the predicted binding affinity.

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

## Software Integration

**Runtime Engine(s):** PyTorch

**Supported Hardware Microarchitecture Compatibility:**
NVIDIA Ampere (tested on A100)

**Supported Operating System(s):**
* Linux

## Model Version(s)

DualBind v1.0 (trained on ToxBench, with ~1M parameters)

# Training and Testing Datasets

DualBind is trained and tested on the [ToxBench dataset](https://arxiv.org/abs/2507.08966).

## Training/Testing Dataset

**Link:** [ToxBench](https://huggingface.co/datasets/karlleswing/toxbench)

**Data Collection Method by dataset:**
Synthetic (complex structures are generated by Schrödinger's docking method)

**Labeling Method by dataset:**
Synthetic (affinity labels are computed by Schrödinger's physics-based computational method, AB-FEP)

**Properties:**
ToxBench is the first large-scale AB-FEP dataset designed for ML development and focused on a single pharmaceutically critical target, Human Estrogen Receptor Alpha (ERα). ToxBench contains 8,770 ERα-ligand complex structures with binding free energies computed via AB-FEP. Using a 70%/15%/15% random split and ensuring no SMILES overlap, we obtain 5,651 training, 1,202 validation, and 1,204 test examples.
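
A random split with no SMILES overlap can be sketched by grouping records by ligand SMILES and assigning whole groups to a subset. The source does not describe the exact procedure used for ToxBench, so this is only one plausible way to do it: the function name, the `"smiles"` record key, the seed, and the absence of SMILES canonicalization are all assumptions of the example.

```python
import random
from collections import defaultdict


def split_by_smiles(records, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split records into train/val/test so that no SMILES string spans two subsets.

    `records` is a list of dicts with a "smiles" key (hypothetical schema).
    Fractions apply to unique SMILES strings, not to individual records.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["smiles"]].append(record)

    # Shuffle unique SMILES deterministically, then cut at the fraction boundaries.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(fractions[0] * len(keys))
    n_val = int(fractions[1] * len(keys))
    key_parts = (
        keys[:n_train],
        keys[n_train:n_train + n_val],
        keys[n_train + n_val:],
    )

    # Expand each group of SMILES back into its records.
    return tuple([rec for key in part for rec in groups[key]] for part in key_parts)
```

Because whole SMILES groups move together, the subset sizes only approximate the requested fractions when ligands have different numbers of complex structures.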

## Inference

**Engine:** PyTorch
**Test Hardware:** A100

## Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Users are responsible for ensuring that predictions given by DualBind are appropriately evaluated and used in compliance with relevant safety regulations and ethical standards.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
dualbind_toxbench.ckpt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2757c09ece2b7ba537a6d2aa7a5dfab55f2440009e0c86ac1128acd5a1c4cec1
size 12273645
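
The three lines above are a Git LFS pointer rather than the checkpoint itself: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. A minimal sketch of checking a downloaded file against such a pointer follows. The function names are hypothetical, and the example assumes the payload fits in memory; a streaming hash would be preferable for large checkpoints.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes `data` have the size and SHA-256 digest named by the pointer."""
    return (
        pointer["algo"] == "sha256"
        and len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )
```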