---
license: apache-2.0
extra_gated_prompt: "The AQAffinity preview model is released under the Apache 2.0 license. You will automatically get access to the model after answering the following simple questions:"
extra_gated_fields:
  First Name: text
  Last Name: text
  Company Name or Affiliation: text
  Role or Job Title: text
  I want to use AQAffinity for: text
tags:
- chemistry
- biology
- protein
- ligand
- binding
- affinity
- binding affinity
- drug discovery
---

Introducing AQAffinity
============
***SandboxAQ's open protein-ligand binding affinity prediction head built on top of OpenFold3***



**In collaboration with the OpenFold Consortium**

## Overview
This repository contains an implementation of a Binding Affinity Head designed to operate on top of the OpenFold3 architecture.
It is a direct replication of the affinity prediction module introduced in Boltz-2 (by MIT/Recursion).

The goal of this project is not to provide a final, closed commercial product, but to establish a strong, transparent baseline for the structural biology community.
We believe that binding affinity prediction, one of the "holy grails" of drug discovery, advances fastest when training data, pipelines, and model architectures are fully open for inspection and improvement.

## Installation
### Prerequisites
This model requires `kalign2` for sequence alignment. Install it with Mamba or Conda before installing the Python package:

```bash
# Install kalign2 from the bioconda channel (swap `mamba` for `conda` if preferred)
mamba install kalign2 -c bioconda
```
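
To confirm the prerequisite is visible to your environment before installing the Python package, a quick check along these lines may help. This is a minimal sketch, not part of AQAffinity: the helper `find_kalign` is hypothetical, and the executable names `kalign` and `kalign2` are assumptions about what the bioconda package places on `PATH`.

```python
import shutil


def find_kalign(candidates=("kalign", "kalign2")):
    """Return the path of the first candidate executable found on PATH, else None.

    The candidate names are assumptions; adjust them if your install differs.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    path = find_kalign()
    if path is None:
        print("No kalign executable found on PATH; install kalign2 first.")
    else:
        print(f"Found kalign executable at {path}")
```

If the check reports nothing found, make sure the Conda or Mamba environment where `kalign2` was installed is the active one.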

### Option 1: Direct Install (once publicly released)
```bash
pip install git+https://huggingface.co/SandboxAQ/aqaffinity
```

### Option 2: Local Install
If the repository is private, or you want to download the source code first, use the Hugging Face CLI to download the repository and then install it locally.
```bash
# Authenticate with Hugging Face (needed while access to the repository is gated)
huggingface-cli login
# Download the repository into ./aqaffinity
hf download SandboxAQ/aqaffinity --local-dir ./aqaffinity
# Install the package from the local copy
pip install ./aqaffinity
```

## Running AQAffinity
```bash
aqaffinity predict \
  --query_json <of3_type_input_json> \
  --runner_yaml <of3_runner_yaml> \
  --inference_ckpt_path <of3_model_weights> \
  --use_msa_server true \
  --output_dir <output_dir> \
  --binding_affinity_ckpt_path <binding_head_model_weights>
```
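
When scoring several query files, it can be convenient to assemble the command above programmatically. The following is a minimal sketch that uses only the flags shown in the command; the helper names `build_predict_command` and `run_predict`, all file paths, and the `false` value for `--use_msa_server` are assumptions made for illustration, not part of the AQAffinity package.

```python
import shlex
import subprocess


def build_predict_command(query_json, runner_yaml, inference_ckpt,
                          affinity_ckpt, output_dir, use_msa_server=True):
    """Assemble the `aqaffinity predict` argument list from the documented flags."""
    return [
        "aqaffinity", "predict",
        "--query_json", str(query_json),
        "--runner_yaml", str(runner_yaml),
        "--inference_ckpt_path", str(inference_ckpt),
        # Only `true` appears in the README; `false` is an assumed counterpart.
        "--use_msa_server", "true" if use_msa_server else "false",
        "--output_dir", str(output_dir),
        "--binding_affinity_ckpt_path", str(affinity_ckpt),
    ]


def run_predict(cmd):
    """Run one assembled command and raise if the CLI exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own inputs and weights.
    for query in ["queries/complex_a.json", "queries/complex_b.json"]:
        cmd = build_predict_command(
            query_json=query,
            runner_yaml="runner.yaml",
            inference_ckpt="of3_weights.pt",
            affinity_ckpt="binding_head.pt",
            output_dir="outputs",
        )
        # Print the command for inspection; call run_predict(cmd) to execute it.
        print(shlex.join(cmd))
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems when paths contain spaces.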

## Contributing to AQAffinity
We welcome contributions! This is a community effort.

- Found a bug? Open an issue.
- Have a better loss function? Submit a PR.
- Want to benchmark on a new dataset? Share your results in the Discussions tab.

We believe "open source" means nothing without open data. Upon the full public release (v1.0), this repository will include:

1. Manifests: The exact SQL commands used to filter and extract the training and validation sets from the full ChEMBL, BindingDB, and PubChem databases.

2. Preprocessing Scripts: The exact logic used to clean and post-process the training and validation sets and to generate their embeddings.

3. Training Loop: A PyTorch Lightning-based training pipeline that uses cached embeddings.

Together, we can build a useful, transparent instrument for scientific discovery.