BioAssayAlign Qwen3-Embedding-0.6B Compatibility
What this model is
BioAssayAlign is an assay-conditioned small-molecule ranking model.
It takes:
- one assay definition
- a submitted list of candidate SMILES
and returns:
- one compatibility score per candidate
- a ranked shortlist for that assay
This model is designed to answer a practical question:
Given this assay, which molecules in my current candidate list should I screen first?
It is not:
- a chatbot
- a generative chemistry model
- a direct potency regressor
- a calibrated probability model
Companion dataset
Public dataset:
The published model was trained on the prepared compatibility-ranking subset inside that dataset release.
Intended use
Use this model when you already have a candidate set and want a ranking signal for one assay at a time.
Reasonable uses:
- shortlist triage before wet-lab screening
- retrospective ranking experiments
- assay-conditioned ranking features in a downstream workflow
Not reasonable uses:
- reading the raw score as a probability of success
- predicting exact IC50 / EC50 / Ki values
- comparing raw scores across unrelated runs as if they were globally calibrated
How to run it locally
This repository is self-contained for inference. You do not need the original training codebase to run the published model.
Install
python -m pip install -r requirements.txt
Minimal local example
from bioassayalign_compatibility import (
    AssayQuery,
    load_compatibility_model_from_hub,
    rank_compounds,
    serialize_assay_query,
)

# Load the published model from the Hugging Face Hub.
model = load_compatibility_model_from_hub(
    "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"
)

# Serialize the structured assay fields into the text passed to the model.
assay_text = serialize_assay_query(
    AssayQuery(
        title="JAK2 inhibition assay",
        description="Cell-based luminescence assay measuring JAK2 inhibition in HEK293 cells.",
        organism="Homo sapiens",
        readout="luminescence",
        assay_format="cell-based",
        assay_type="inhibition",
        target_uniprot=["O60674"],
    )
)

# Score and rank the submitted candidates for this one assay.
results = rank_compounds(
    model,
    assay_text=assay_text,
    smiles_list=[
        "CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1",
        "c1ccccc1",
        "CCO",
    ],
)

for row in results:
    print(row)
What to provide
Best practice:
- provide structured assay fields rather than one free-form paragraph
- include target, readout, organism, and format when known
- submit one parent or cleaned SMILES per candidate
Recommended assay fields:
- title
- description
- organism
- readout
- assay_format
- assay_type
- target_uniprot
The model is reasonably robust to wording changes, but missing metadata can reduce ranking quality.
Model details
Published artifact configuration:
| Component | Value |
|---|---|
| Assay encoder | Qwen/Qwen3-Embedding-0.6B |
| Assay encoder training | Frozen |
| Assay metadata features | Enabled, 128 dims |
| Molecule features | Morgan fingerprints (r=2,3, 2048 bits each), chirality, MACCS, 30 RDKit descriptors |
| Projection dimension | 512 |
| Hidden dimension | 1024 |
| Dropout | 0.12 |
| Final score | Learned compatibility head output |
Important:
- the published score is not a raw embedding dot product
- the ranking comes from the learned scorer head
Training data
The public artifact was trained on a frozen assay-compound corpus derived from:
- PubChem BioAssay
- ChEMBL
The published model uses the prepared compatibility-ranking subset from:
Prepared training dataset
| Field | Value |
|---|---|
| Assays | 11,195 |
| Candidate-pool rows | 1,432,532 |
| Training groups | 508,216 |
| Train assays | 8,967 |
| Validation assays | 1,117 |
| Test assays | 1,111 |
Preparation rules
| Rule | Value |
|---|---|
| Minimum actives per assay | 4 |
| Minimum inactives per assay | 16 |
| Maximum actives per assay | 48 |
| Maximum inactives per assay | 192 |
| Molecule standardization | Enabled |
| Source manifest SHA256 | e4766477b64860952258cb4b76567b83061d5de44bb5f3b322ecdfe54f19910b |
Each training group contains:
- one assay
- one positive compound
- multiple explicit same-assay inactive compounds
This is a ranking setup, not a generic text-retrieval setup.
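The preparation rules and group structure above could be sketched roughly as follows. This is an illustration only, not the author's pipeline: `TrainingGroup`, `prepare_assay`, and `build_groups` are hypothetical names, random subsampling for the caps is an assumption (the source does not say how oversized assays were reduced), and the default of 15 negatives is taken from the training configuration table.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainingGroup:
    """Hypothetical container: one assay, one positive, several same-assay inactives."""
    assay_id: str
    positive: str
    negatives: list


def prepare_assay(actives, inactives, rng,
                  min_act=4, min_inact=16, max_act=48, max_inact=192):
    """Return capped (actives, inactives), or None if the assay is too small."""
    if len(actives) < min_act or len(inactives) < min_inact:
        return None
    # Assumption: oversized assays are capped by uniform random subsampling.
    if len(actives) > max_act:
        actives = rng.sample(actives, max_act)
    if len(inactives) > max_inact:
        inactives = rng.sample(inactives, max_inact)
    return list(actives), list(inactives)


def build_groups(assay_id, actives, inactives, rng, n_negatives=15):
    """Pair each active with a sample of explicit inactives from the same assay."""
    groups = []
    for positive in actives:
        negatives = rng.sample(inactives, min(n_negatives, len(inactives)))
        groups.append(TrainingGroup(assay_id, positive, negatives))
    return groups


rng = random.Random(0)
toy_actives = [f"active_{i}" for i in range(5)]        # placeholder identifiers
toy_inactives = [f"inactive_{i}" for i in range(20)]   # placeholder identifiers

prepared = prepare_assay(toy_actives, toy_inactives, rng)
kept_actives, kept_inactives = prepared
groups = build_groups("toy_assay", kept_actives, kept_inactives, rng)
too_small = prepare_assay(toy_actives[:3], toy_inactives, rng)
```

In this toy run the assay passes both minimums and yields one group per active, while the three-active variant is rejected.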
Training configuration
| Field | Value |
|---|---|
| Framework | pytorch_head_only_compatibility_ranking |
| Learning rate | 1.5e-3 |
| Batch size | 192 |
| Weight decay | 1e-4 |
| Hard-negative fraction | 0.5 |
| Negatives per example | 15 |
| Negative sets per positive | 2 |
| Max epochs | 30 |
| Early stopping patience | 5 |
| Early stopping min delta | 0.001 |
| Best epoch | 9 |
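The early-stopping settings in the table can be read as the usual patience rule. The sketch below is a generic version under stated assumptions: `EarlyStopper` is a hypothetical helper, the monitored quantity is assumed to be a validation metric where higher is better (the source does not name it), and the metric values are made up for illustration.

```python
class EarlyStopper:
    """Generic patience-based early stopping (hypothetical helper, not the author's code)."""

    def __init__(self, patience=5, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best_metric = float("-inf")
        self.best_epoch = None
        self.epochs_without_improvement = 0

    def update(self, epoch, metric):
        """Record one epoch's validation metric; return True when training should stop."""
        if metric > self.best_metric + self.min_delta:
            # Improvement larger than min_delta: remember it and reset the counter.
            self.best_metric = metric
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


stopper = EarlyStopper(patience=5, min_delta=0.001)
toy_curve = [0.50, 0.55, 0.58, 0.5805, 0.579, 0.58, 0.578, 0.577, 0.60]  # invented values
stopped_at = None
for epoch, metric in enumerate(toy_curve, start=1):
    if stopper.update(epoch, metric):
        stopped_at = epoch
        break
```

Under this rule, a gain smaller than the minimum delta (0.5805 after 0.58 here) does not count as an improvement, so the counter keeps running and training stops five epochs after the last real gain.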
Results
Main evaluation
| Split | Mean AUPRC | Random-baseline AUPRC | Hit@10 | Mean AUROC | Mean nDCG@50 |
|---|---|---|---|---|---|
| Validation | 0.6214 | 0.2678 | 0.9722 | 0.7767 | 0.7140 |
| Test | 0.6339 | 0.2749 | 0.9739 | 0.7815 | 0.7250 |
Interpretation:
- the model's mean AUPRC is more than twice the random-ranking baseline on both splits (0.6214 vs 0.2678 on validation, 0.6339 vs 0.2749 on test)
- it is strongest as a within-list ranking tool
- the main output to trust is the ranking order and shortlist separation, not the raw score magnitude
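For readers who want to run a similar within-list check on their own labeled data, the sketch below computes Hit@k and average precision for one ranked list. It uses common textbook definitions; the author's exact evaluation code is not shown in this card, so treat the definitions as assumptions. The function names are hypothetical and the scores and labels are toy values.

```python
def rank_labels(scores, labels):
    """Order 0/1 activity labels by descending model score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def hit_at_k(labels_ranked, k=10):
    """1.0 if at least one active appears in the top k, else 0.0."""
    return 1.0 if any(labels_ranked[:k]) else 0.0


def average_precision(labels_ranked):
    """Average precision for one ranked list, a common estimator of AUPRC."""
    hits, precision_sum = 0, 0.0
    for position, label in enumerate(labels_ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / position  # precision at this active's rank
    return precision_sum / hits if hits else 0.0


toy_scores = [2.0, 0.5, 0.1, -1.0, -3.0]  # invented scores for one assay
toy_labels = [1, 0, 1, 0, 0]              # invented labels, 1 = active

ranked = rank_labels(toy_scores, toy_labels)
ap = average_precision(ranked)
hit1 = hit_at_k(ranked, k=1)
active_fraction = sum(toy_labels) / len(toy_labels)
```

A random ordering is expected to score an average precision close to the active fraction of the list, which is the usual reference point for a random-baseline AUPRC. Averaging the per-assay values gives a "mean" metric of the kind reported in the table.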
Score interpretation
The raw output is a learned compatibility logit-like score.
What it means:
- higher is better
- differences are meaningful within the same submitted list
- absolute values are not calibrated across unrelated runs
Example:
- candidate A score: 6.25
- candidate B score: -8.65
- candidate C score: -23.37
This does not mean A has a literal probability or potency attached to it. It means A ranked substantially above B and C for that submitted assay and candidate set.
For user-facing interpretation, the recommended order is:
- rank
- relative shortlist score within the submitted list
- chemistry context columns
- raw model score only for debugging or export
If you want a normalized within-list view, you can compute:
- min-max scaling to 0–100
- or softmax over the submitted list
Those are still not calibrated biological probabilities.
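A minimal sketch of both within-list views, assuming the raw scores are available as a plain list of floats. The helper names, the tie-handling choice, and the `temperature` parameter are illustrative additions and are not part of the published package.

```python
import math


def min_max_0_100(scores):
    """Rescale one list of raw scores so the lowest maps to 0 and the highest to 100."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tied: nothing to separate, so return a flat midpoint.
        return [50.0 for _ in scores]
    return [100.0 * ((s - lo) / (hi - lo)) for s in scores]


def softmax(scores, temperature=1.0):
    """Softmax over one submitted list; the max is subtracted for numerical stability."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Raw scores from the JAK2 example in this card.
raw = [6.2590, -8.6542, -12.8678, -23.3741]
scaled = min_max_0_100(raw)
shares = softmax(raw)
```

Because the raw scores in a list can be many units apart, a plain softmax tends to put almost all of its mass on the top candidate. Min-max scaling keeps the relative gaps visible, and a larger temperature softens the softmax if that view is preferred.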
Example predictions
These examples were produced from the published weights.
Example: JAK2 cell assay
Assay:
- title: JAK2 inhibition assay
- description: Cell-based luminescence assay measuring JAK2 inhibition in HEK293 cells.
- organism: Homo sapiens
- readout: luminescence
- assay format: cell-based
- assay type: inhibition
- target UniProt: O60674
| Rank | Candidate SMILES | Raw score |
|---|---|---|
| 1 | CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1 | 6.2590 |
| 2 | Cc1cc(=O)n(C)c(=O)[nH]1 | -8.6542 |
| 3 | CCO | -12.8678 |
| 4 | CCOc1ccc2nc(N3CCN(C)CC3)n(C)c(=O)c2c1 | -23.3741 |
Example: ALDH1A1 fluorescence assay
Assay:
- title: ALDH1A1 inhibition assay
- description: Cell-based fluorescence assay measuring ALDH1A1 inhibition in human cells.
- organism: Homo sapiens
- readout: fluorescence
- assay format: cell-based
- assay type: inhibition
- target UniProt: P00352
| Rank | Candidate SMILES | Raw score |
|---|---|---|
| 1 | CCOc1ccccc1 | -26.9257 |
| 2 | Cc1cc(=O)n(C)c(=O)[nH]1 | -38.5073 |
| 3 | CCN(CC)CCOc1ccccc1 | -39.1753 |
| 4 | CCO | -42.9016 |
Limitations
- The score is not a calibrated probability.
- The model does not predict exact potency values.
- The benchmark is assay-held-out, not a universal unseen-scaffold benchmark.
- Public assay data is noisy and assay protocols are heterogeneous.
- Some assays remain difficult and yield only moderate separation.
- Use the model as a ranking aid, not as a stand-alone medicinal chemistry decision system.
Repository contents
Files provided in this HF model repo:
- best_model.pt
- training_metadata.json
- training_summary.json
- bioassayalign_compatibility.py
- requirements.txt
Interactive Space
Use the model in the companion Space:
https://huggingface.co/spaces/lighteternal/BioAssayAlign-Compatibility-Explorer