Any working example for affinity inputs/outputs
Hi,
I have got the thing "running" and it says "Successful predictions: 1" in the summary, but it doesn't seem to output any kind of affinity value anywhere in the output_dir
Trying to work out if it is something wrong with my query.
Could you provide a correct simple example query and the expected output location for the affinity prediction? (And possibly the runner.yaml if anything needs to be added there)
My query has a sequence and SMILES in the correct fields:
{
  "queries": {
    "query1": {
      "chains": [
        {
          "molecule_type": "protein",
          "chain_ids": ["A"],
          "sequence": "..."
        },
        {
          "molecule_type": "ligand",
          "chain_ids": ["B"],
          "smiles": "..."
        }
      ]
    }
  }
}
Changing the runner output type from pdb to cif seems to have produced additional files.
For 1 seed and 5x samples I now see a single file query1_seed_1_sample_5_binding_head.txt.
Should I be expecting one prediction per sample? Why is it only sample 5?
I've run it again with seed2 and now there is only a single file query1_seed_2_sample_2_binding_head.txt
So the sample is 2/5 rather than 5/5. Is the choice of sample random, or some other rule?
Another thought, I'm getting binding numbers out spanning ~ 0.3 - 2.5 for known strong binders in my set.
Had initially interpreted this as perhaps binding strength in nM, but running a likely inactive through I'm now getting -0.47.
What is the binding output supposed to represent here? If it is the same as the Boltz-2 model then the predictions are pretty bad for this target at least.
EDIT: If you add a minus sign to the aqaffinity output, it inverts the picture and the predictions correlate well.
i.e. comparing (note the '-')
10^(-aqaffinity) vs 10^(boltz_affinity) vs uM experimental
Changing the runner output type from pdb to cif seems to have produced additional files.
For 1 seed and 5x samples I now see a single file query1_seed_1_sample_5_binding_head.txt.
Should I be expecting one prediction per sample? Why is it only sample 5?
Hi Ben, thanks for checking out the repo. Currently, the model should make only one affinity prediction, based on the structure with the highest iptm score, not one affinity prediction per sample. Also, as of this moment, the affinity prediction only works with the cif output format; we'll work on getting pdb output working. We are also working on creating more I/O examples for training and inference.
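In the meantime, here is a minimal sketch for locating the affinity output and seeing which seed and sample it came from. It relies only on the file-name pattern reported in this thread (`<query>_seed_<N>_sample_<M>_binding_head.txt`); the helper names are hypothetical, and it makes no assumption about the contents of the file or the layout of output_dir beyond searching it recursively.

```python
import re
from pathlib import Path

# Pattern inferred from the file names reported in this thread,
# e.g. query1_seed_1_sample_5_binding_head.txt (an assumption, not a documented spec).
BINDING_HEAD_RE = re.compile(
    r"^(?P<query>.+)_seed_(?P<seed>\d+)_sample_(?P<sample>\d+)_binding_head\.txt$"
)


def parse_binding_head_name(name):
    """Return (query, seed, sample) for a binding-head file name, or None."""
    match = BINDING_HEAD_RE.match(name)
    if match is None:
        return None
    return match["query"], int(match["seed"]), int(match["sample"])


def find_binding_head_files(output_dir):
    """Recursively collect (query, seed, sample, path) for every binding-head file."""
    found = []
    for path in sorted(Path(output_dir).rglob("*_binding_head.txt")):
        parsed = parse_binding_head_name(path.name)
        if parsed is not None:
            found.append((*parsed, path))
    return found
```

With one file per query and seed, the sample index in the name tells you which of the samples was selected for the affinity prediction.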
Another thought, I'm getting binding numbers out spanning ~ 0.3 - 2.5 for known strong binders in my set.
Had initially interpreted this as perhaps binding strength in nM, but running a likely inactive through I'm now getting -0.47.
What is the binding output supposed to represent here? If it is the same as the Boltz-2 model then the predictions are pretty bad for this target at least.
EDIT If you add a minus sign to the aqaffinity output, it inverts the picture and the predictions are correlating well.
i.e. comparing (note the '-')
10^(-aqaffinity) vs 10^(boltz_affinity) vs uM experimental
Yes. Another thing we will make more explicit in the documentation and README is that we predict the negative of the output of Boltz-2's head.
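As a rough sketch of the conversion discussed above: negating the value puts it on the Boltz-2 sign convention, and raising 10 to that power gives the quantity compared against the experimental values earlier in the thread. The function names are hypothetical, and reading the result as a concentration in uM is an assumption taken from that comparison, not something confirmed here.

```python
def to_boltz_convention(model_output):
    """Flip the sign so the value matches the Boltz-2 head's convention."""
    return -model_output


def to_micromolar(model_output):
    """Compute 10^(-output), as in the comparison above.

    Treating the result as a concentration in uM is an assumption
    based on that comparison.
    """
    return 10.0 ** to_boltz_convention(model_output)


if __name__ == "__main__":
    # Values quoted in this thread: two strong binders and one likely inactive.
    for value in (2.5, 0.3, -0.47):
        print(f"{value:+.2f} -> {to_micromolar(value):.4f} uM (assumed unit)")
```

Under this reading, larger raw outputs correspond to smaller concentrations, i.e. stronger predicted binding, which matches the observation that the negative output was the odd one out.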