Any working example for affinity inputs/outputs

Jan 16

Hi,

I have the model running and the summary reports "Successful predictions: 1", but I can't find an affinity value anywhere in the output_dir.
I'm trying to work out whether something is wrong with my query.

Could you provide a correct simple example query and the expected output location for the affinity prediction? (And possibly the runner.yaml if anything needs to be added there)

BenedictWJIrwin

Jan 16

My query has a sequence and a SMILES string in the appropriate fields:

{
  "queries": {
    "query1": {
      "chains": [
        {
          "molecule_type": "protein",
          "chain_ids": ["A"],
          "sequence": "..."
        },
        {
          "molecule_type": "ligand",
          "chain_ids": ["B"],
          "smiles": "..."
        }
      ]
    }
  }
}
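
For anyone assembling the same query programmatically, here is a minimal sketch that builds the structure above and writes it to disk. The helper name `build_query`, the file name `query.json`, and the example sequence and SMILES values are placeholders for illustration only; just the JSON keys come from the query shown in this post.

```python
import json


def build_query(sequence, smiles, query_name="query1"):
    """Build a one-protein, one-ligand query with the keys used in this thread."""
    return {
        "queries": {
            query_name: {
                "chains": [
                    {
                        "molecule_type": "protein",
                        "chain_ids": ["A"],
                        "sequence": sequence,
                    },
                    {
                        "molecule_type": "ligand",
                        "chain_ids": ["B"],
                        "smiles": smiles,
                    },
                ]
            }
        }
    }


# Placeholder inputs, NOT the values from the original query.
query = build_query(sequence="ACDEFGHIK", smiles="CCO")

# Hypothetical output path; point the runner at wherever you save it.
with open("query.json", "w") as handle:
    json.dump(query, handle, indent=2)
```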

BenedictWJIrwin

Jan 19

Changing the runner output type from pdb to cif seems to have produced additional files.

For 1 seed and 5x samples I now see a single file query1_seed_1_sample_5_binding_head.txt.

Should I be expecting one prediction per sample? Why is it only sample 5?

BenedictWJIrwin

Jan 19

I've run it again with seed2 and now there is only a single file query1_seed_2_sample_2_binding_head.txt
So the sample is 2/5 rather than 5/5. Is the choice of sample random, or some other rule?

BenedictWJIrwin

Jan 19

edited Jan 20

Another thought: I'm getting binding values spanning roughly 0.3 to 2.5 for known strong binders in my set.
I had initially interpreted this as binding strength in nM, but running a likely inactive compound through now gives -0.47.

What is the binding output supposed to represent here? If it is the same as the Boltz-2 model's output, then the predictions are pretty bad, for this target at least.

EDIT: If you negate the aqaffinity output, the picture inverts and the predictions correlate well.

That is, comparing (note the '-'):
10^(-aqaffinity) vs 10^(boltz_affinity) vs experimental values in uM

sb-sb

SandboxAQ org Jan 21

> Changing the runner output type from pdb to cif seems to have produced additional files.
>
> For 1 seed and 5x samples I now see a single file query1_seed_1_sample_5_binding_head.txt.
>
> Should I be expecting one prediction per sample? Why is it only sample 5?

Hi Ben, thanks for checking out the repo. Currently, the model makes only one affinity prediction, based on the structure with the highest ipTM score, rather than one affinity prediction per sample. Also, at the moment the affinity prediction only works with the cif output format; we'll work on getting pdb output working. We are also working on creating more I/O examples for training and inference.
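
To make the selection rule and the output location concrete, here is a rough sketch. The filename pattern is inferred only from the two files reported in this thread, and the ipTM values in the usage note are invented for illustration; how the runner stores ipTM internally is not covered here, so `best_sample_by_iptm` simply takes a plain dict.

```python
import re
from pathlib import Path

# Pattern inferred from names such as query1_seed_1_sample_5_binding_head.txt.
BINDING_HEAD_PATTERN = re.compile(
    r"^(?P<query>.+)_seed_(?P<seed>\d+)_sample_(?P<sample>\d+)_binding_head\.txt$"
)


def parse_binding_head_name(name):
    """Return (query, seed, sample) for a binding-head filename, else None."""
    match = BINDING_HEAD_PATTERN.match(name)
    if match is None:
        return None
    return match["query"], int(match["seed"]), int(match["sample"])


def find_binding_head_files(output_dir):
    """Recursively collect (path, query, seed, sample) under output_dir."""
    found = []
    for path in sorted(Path(output_dir).rglob("*_binding_head.txt")):
        parsed = parse_binding_head_name(path.name)
        if parsed is not None:
            found.append((path, *parsed))
    return found


def best_sample_by_iptm(iptm_by_sample):
    """Return the sample index with the highest ipTM score."""
    return max(iptm_by_sample, key=iptm_by_sample.get)
```

For example, `best_sample_by_iptm({1: 0.61, 2: 0.74, 3: 0.70})` picks sample 2, which would explain why a single file carries a sample index that changes from seed to seed.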

> Another thought, I'm getting binding numbers out spanning ~ 0.3 - 2.5 for strong known strong binders in my set.
> Had initially interpreted this as perhaps binding strength in nM, but running a likely inactive through I'm now getting -0.47.
>
> What is the binding output supposed to represent here? If it is the same as the Boltz-2 model then the predictions are pretty bad for this target at least.
>
> EDIT If you add a minus sign to the aqaffinity output, it inverts the picture and the predictions are correlating well.
>
> i.e. comparing (note the '-')
> 10^(-aqaffinity) vs 10^(boltz_affinity) vs uM experimental

Yes. Another thing we will make more explicit in the documentation and README is that we predict the negative of the output of Boltz-2's head.
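
As a sketch of what that sign convention means in practice: the conversion below follows the comparison used earlier in this thread, 10^(-aqaffinity) against experimental values in uM. Treating the negated output as a log10 value in uM is an assumption taken from that comparison, and the function names are made up for illustration.

```python
def to_boltz_scale(aq_value):
    """Flip the sign so the value is on the same scale as the Boltz-2 head."""
    return -aq_value


def to_micromolar(aq_value):
    """Assumed conversion: 10 ** (-aq_value), read as a concentration in uM."""
    return 10.0 ** to_boltz_scale(aq_value)


# Values reported in this thread: strong binders scored up to about 2.5,
# and a likely inactive compound scored -0.47.
strong_um = to_micromolar(2.5)
inactive_um = to_micromolar(-0.47)
```

Under this reading, a larger raw output corresponds to a smaller predicted concentration, so the strong binder at 2.5 lands in the low-nanomolar range while the likely inactive at -0.47 lands at a few micromolar.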

BenedictWJIrwin changed discussion status to closed Feb 10
