license: apache-2.0
extra_gated_prompt: >-
The AQAffinity preview model is released under the Apache 2.0 license. You will
automatically get access to the model after answering the following simple
questions:
extra_gated_fields:
First Name: text
Last Name: text
Company Name or Affiliation: text
Role or Job Title: text
I want to use AQAffinity for: text
tags:
- chemistry
- biology
- protein
- ligand
- binding
- affinity
- binding affinity
- drug discovery
Introducing AQAffinity
SandboxAQ's open protein-ligand binding affinity prediction head built on top of OpenFold3
In collaboration with the OpenFold Consortium
Overview
This repository contains an implementation of a Binding Affinity Head designed to operate on top of the OpenFold3 architecture. It is a direct replication of the affinity prediction module introduced in Boltz-2 (by MIT/Recursion).
The goal of this project is not to provide a final, closed commercial product, but to establish a strong, transparent baseline for the structural biology community. We believe that binding affinity prediction, one of the "holy grails" of drug discovery, advances fastest when training data, pipelines, and model architectures are fully open for inspection and improvement.
Installation
Prerequisites
This model requires kalign2 for sequence alignment. Please install it using Mamba or Conda before installing the Python package:
mamba install kalign2 -c bioconda
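To confirm the prerequisite is visible from the environment you will run in, a small check like the one below may help. It is only a sketch: the executable names `kalign` and `kalign2` are assumptions about what the bioconda package puts on your `PATH`, so adjust them if your installation differs.

```python
import shutil


def find_executable(candidates=("kalign", "kalign2")):
    """Return the full path of the first candidate found on PATH, else None.

    The default candidate names are assumptions, not confirmed by this README.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_executable()
    if location is None:
        print("kalign not found on PATH; install it with mamba or conda first.")
    else:
        print(f"Found kalign at {location}")
```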
Option 1: Direct Install (when publicly released)
pip install git+https://huggingface.co/SandboxAQ/aqaffinity
Option 2: Local Install
If the repository is private or you wish to download the source code first, use the Hugging Face CLI to download the repository and then install it locally.
huggingface-cli login
hf download SandboxAQ/aqaffinity --local-dir ./aqaffinity
pip install ./aqaffinity
Running AQAffinity
aqaffinity predict \
  --query_json <of3_type_input_json> \
  --runner_yaml <of3_runner_yaml> \
  --inference_ckpt_path <of3_model_weights> \
  --use_msa_server true \
  --output_dir <output_dir> \
  --binding_affinity_ckpt_path <binding_head_model_weights>
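If you drive predictions from a script, for example to loop over many query files, it can be convenient to assemble the command programmatically. The sketch below uses only the flags shown in the command above. The helper name `build_predict_command` and all file paths are hypothetical placeholders, and `true` is the only `--use_msa_server` value this README documents.

```python
import shlex


def build_predict_command(query_json, runner_yaml, inference_ckpt,
                          affinity_ckpt, output_dir, use_msa_server="true"):
    """Assemble the argv list for `aqaffinity predict`.

    Hypothetical helper: it only mirrors the flags from the README command.
    """
    return [
        "aqaffinity", "predict",
        "--query_json", str(query_json),
        "--runner_yaml", str(runner_yaml),
        "--inference_ckpt_path", str(inference_ckpt),
        "--use_msa_server", str(use_msa_server),
        "--output_dir", str(output_dir),
        "--binding_affinity_ckpt_path", str(affinity_ckpt),
    ]


if __name__ == "__main__":
    # All paths below are placeholders; substitute your own files.
    cmd = build_predict_command(
        query_json="query.json",
        runner_yaml="runner.yaml",
        inference_ckpt="of3_weights.pt",
        affinity_ckpt="affinity_head.pt",
        output_dir="predictions",
    )
    # Print the command for inspection (dry run).
    print(shlex.join(cmd))
    # To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.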
Contributing to AQAffinity
We welcome contributions! This is a community effort. Found a bug? Open an issue. Have a better loss function? Submit a PR. Want to benchmark on a new dataset? Share your results in the Discussions tab.
We believe "open source" means nothing without open data. Upon the full public release (v1.0), this repository will include:
Manifests: The exact SQL queries used to filter the full ChEMBL, BindingDB, and PubChem databases into the training and validation sets.
Preprocessing Scripts: The exact logic used to clean the data, post-process it, and generate embeddings for the training and validation sets.
Training Loop: A PyTorch Lightning-based training pipeline that uses cached embeddings.
Together, we can build a useful, transparent instrument for scientific discovery.
