BioAssayAlign Qwen3-Embedding-0.6B Compatibility


What this model is

BioAssayAlign is an assay-conditioned small-molecule ranking model.

It takes:

  • one assay definition
  • a submitted list of candidate SMILES

and returns:

  • one compatibility score per candidate
  • a ranked shortlist for that assay

This model is designed to answer a practical question:

> Given this assay, which molecules in my current candidate list should I screen first?

It is not:

  • a chatbot
  • a generative chemistry model
  • a direct potency regressor
  • a calibrated probability model

Companion dataset

Public dataset:

The published model was trained on the prepared compatibility-ranking subset inside that dataset release.

Intended use

Use this model when you already have a candidate set and want a ranking signal for one assay at a time.

Reasonable uses:

  • shortlist triage before wet-lab screening
  • retrospective ranking experiments
  • assay-conditioned ranking features in a downstream workflow

Not reasonable uses:

  • reading the raw score as a probability of success
  • predicting exact IC50 / EC50 / Ki values
  • comparing raw scores across unrelated runs as if they were globally calibrated

How to run it locally

This repository is self-contained for inference. You do not need the original training codebase to run the published model.

Install

```bash
python -m pip install -r requirements.txt
```

Minimal local example

```python
from bioassayalign_compatibility import (
    AssayQuery,
    load_compatibility_model_from_hub,
    rank_compounds,
    serialize_assay_query,
)

# Download the published weights from the Hub and build the scorer.
model = load_compatibility_model_from_hub(
    "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"
)

# Serialize the structured assay definition into the model's text format.
assay_text = serialize_assay_query(
    AssayQuery(
        title="JAK2 inhibition assay",
        description="Cell-based luminescence assay measuring JAK2 inhibition in HEK293 cells.",
        organism="Homo sapiens",
        readout="luminescence",
        assay_format="cell-based",
        assay_type="inhibition",
        target_uniprot=["O60674"],
    )
)

# Score and rank the candidate SMILES against this assay.
results = rank_compounds(
    model,
    assay_text=assay_text,
    smiles_list=[
        "CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1",
        "c1ccccc1",
        "CCO",
    ],
)

for row in results:
    print(row)
```

What to provide

Best practice:

  • provide structured assay fields rather than one free-form paragraph
  • include target, readout, organism, and format when known
  • submit one parent or cleaned SMILES per candidate

Recommended assay fields:

  • title
  • description
  • organism
  • readout
  • assay_format
  • assay_type
  • target_uniprot

The model is reasonably robust to wording changes, but missing metadata can reduce ranking quality.

Model details

Published artifact configuration:

| Component | Value |
|---|---|
| Assay encoder | Qwen/Qwen3-Embedding-0.6B |
| Assay encoder training | Frozen |
| Assay metadata features | Enabled, 128 dims |
| Molecule features | Morgan fingerprints (radii 2 and 3, 2048 bits each), chirality, MACCS keys, 30 RDKit descriptors |
| Projection dimension | 512 |
| Hidden dimension | 1024 |
| Dropout | 0.12 |
| Final score | Learned compatibility head output |

Important:

  • the published score is not a raw embedding dot product
  • the ranking comes from the learned scorer head
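
The head described in the table above can be sketched as a small MLP over projected assay and molecule features. The projection and hidden dimensions come from the table; everything else here (layer wiring, activations, toy input sizes) is an assumption for illustration, not the published implementation:

```python
import random

random.seed(0)

# Dimensions from the model-details table; the exact wiring of the head
# is an assumption, not the published architecture.
PROJ_DIM, HIDDEN_DIM = 512, 1024

def relu(v):
    return [x if x > 0.0 else 0.0 for x in v]

def linear(vec, weights):
    """Dense layer: weights is a list of per-output weight rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def rand_matrix(n_out, n_in):
    return [[random.gauss(0.0, 0.02) for _ in range(n_in)] for _ in range(n_out)]

def compat_score(assay_vec, mol_vec, params):
    """Project each side, concatenate, and score with one hidden layer."""
    a = relu(linear(assay_vec, params["assay_proj"]))
    m = relu(linear(mol_vec, params["mol_proj"]))
    h = relu(linear(a + m, params["hidden"]))  # list concat = feature concat
    return sum(w * x for w, x in zip(params["out"], h))  # scalar, logit-like

ASSAY_DIM, MOL_DIM = 64, 128  # toy sizes; the real feature vectors are larger
params = {
    "assay_proj": rand_matrix(PROJ_DIM, ASSAY_DIM),
    "mol_proj": rand_matrix(PROJ_DIM, MOL_DIM),
    "hidden": rand_matrix(HIDDEN_DIM, 2 * PROJ_DIM),
    "out": [random.gauss(0.0, 0.02) for _ in range(HIDDEN_DIM)],
}
s = compat_score([random.gauss(0, 1) for _ in range(ASSAY_DIM)],
                 [random.gauss(0, 1) for _ in range(MOL_DIM)],
                 params)
print(s)  # one uncalibrated scalar per (assay, molecule) pair
```

The point of the sketch is the shape of the computation: the score is the output of a learned head over both inputs, not a dot product between two embeddings.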

Training data

The public artifact was trained on a frozen assay-compound corpus derived from:

  • PubChem BioAssay
  • ChEMBL

The published model uses the prepared compatibility-ranking subset from:

Prepared training dataset

| Field | Value |
|---|---|
| Assays | 11,195 |
| Candidate-pool rows | 1,432,532 |
| Training groups | 508,216 |
| Train assays | 8,967 |
| Validation assays | 1,117 |
| Test assays | 1,111 |

Preparation rules

| Rule | Value |
|---|---|
| Minimum actives per assay | 4 |
| Minimum inactives per assay | 16 |
| Maximum actives per assay | 48 |
| Maximum inactives per assay | 192 |
| Molecule standardization | Enabled |
| Source manifest SHA256 | e4766477b64860952258cb4b76567b83061d5de44bb5f3b322ecdfe54f19910b |

Each training group contains:

  • one assay
  • one positive compound
  • multiple explicit same-assay inactive compounds

This is a ranking setup, not a generic text-retrieval setup.
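
A training group of this shape can be represented as a simple record (the class and field names here are illustrative, not the repository's actual data structures):

```python
from dataclasses import dataclass, field

@dataclass
class TrainingGroup:
    """One ranking group: an assay, one active, and same-assay inactives."""
    assay_text: str
    positive_smiles: str
    negative_smiles: list = field(default_factory=list)

group = TrainingGroup(
    assay_text="JAK2 inhibition assay | cell-based | luminescence",
    positive_smiles="CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1",
    negative_smiles=["CCO", "c1ccccc1"],  # explicit same-assay inactives
)
print(len(group.negative_smiles))
```

The key property is that the negatives come from the same assay as the positive, which is what makes the objective a within-assay ranking task rather than generic retrieval.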

Training configuration

| Field | Value |
|---|---|
| Framework | pytorch_head_only_compatibility_ranking |
| Learning rate | 1.5e-3 |
| Batch size | 192 |
| Weight decay | 1e-4 |
| Hard-negative fraction | 0.5 |
| Negatives per example | 15 |
| Negative sets per positive | 2 |
| Max epochs | 30 |
| Early stopping patience | 5 |
| Early stopping min delta | 0.001 |
| Best epoch | 9 |
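
The negative-sampling settings (15 negatives per example, hard-negative fraction 0.5, 2 sets per positive) can be sketched as follows. The "hard" pool here is a placeholder; how the published pipeline defines hard negatives is not part of this card:

```python
import random

def sample_negatives(inactive_pool, hard_pool, n_neg=15, hard_frac=0.5,
                     n_sets=2, seed=0):
    """Draw n_sets negative sets per positive: ~half hard, rest random."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_sets):
        n_hard = min(int(n_neg * hard_frac), len(hard_pool))
        negs = rng.sample(hard_pool, n_hard)
        remaining = [s for s in inactive_pool if s not in negs]
        negs += rng.sample(remaining, n_neg - n_hard)
        sets.append(negs)
    return sets

inactives = [f"SMILES_{i}" for i in range(100)]  # placeholder inactive pool
hard = inactives[:20]                            # placeholder "hard" subset
sets = sample_negatives(inactives, hard)
print(len(sets), len(sets[0]))  # 2 sets of 15 negatives
```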

Results

Main evaluation

| Split | Mean AUPRC | Random-baseline AUPRC | Hit@10 | Mean AUROC | Mean nDCG@50 |
|---|---|---|---|---|---|
| Validation | 0.6214 | 0.2678 | 0.9722 | 0.7767 | 0.7140 |
| Test | 0.6339 | 0.2749 | 0.9739 | 0.7815 | 0.7250 |

Interpretation:

  • the model materially beats the random ranking baseline
  • it is strongest as a within-list ranking tool
  • the main output to trust is the ranking order and shortlist separation, not the raw score magnitude
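
As a concrete reading of Hit@10: for each assay it asks whether at least one true active lands in the model's top 10 candidates. A minimal sketch with synthetic scores and labels:

```python
def hit_at_k(scores, labels, k=10):
    """1.0 if any true active is among the top-k scored candidates, else 0.0."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    return 1.0 if any(lbl == 1 for _, lbl in ranked[:k]) else 0.0

# Synthetic example: one active sitting at rank 3 of a 12-candidate list.
scores = [6.2, 4.1, 3.9, 2.0, 1.5, 1.1, 0.9, 0.4, 0.2, -0.3, -1.0, -2.5]
labels = [0,   0,   1,   0,   0,   0,   0,   0,   0,   0,    0,    0]
print(hit_at_k(scores, labels))       # 1.0: the active is inside the top 10
print(hit_at_k(scores, labels, k=2))  # 0.0: but not inside the top 2
```

The reported Hit@10 is the mean of this per-assay indicator over the evaluation assays.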

Score interpretation

The raw output is a learned, logit-like compatibility score.

What it means:

  • higher is better
  • differences are meaningful within the same submitted list
  • absolute values are not calibrated across unrelated runs

Example:

  • candidate A score: 6.25
  • candidate B score: -8.65
  • candidate C score: -23.37

This does not mean A has a literal probability or potency attached to it. It means A ranked substantially above B and C for that submitted assay and candidate set.

For user-facing interpretation, the recommended order is:

  1. rank
  2. relative shortlist score within the submitted list
  3. chemistry context columns
  4. raw model score only for debugging or export

If you want a normalized within-list view, you can compute:

  • min-max scaling to 0–100
  • or softmax over the submitted list

Those are still not calibrated biological probabilities.
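
The two normalized views mentioned above can be computed directly from the raw scores (a minimal stdlib sketch, using the JAK2 example scores from this card; the outputs remain uncalibrated):

```python
import math

def minmax_0_100(scores):
    """Rescale within-list scores to 0-100; top candidate maps to 100."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [50.0] * len(scores)
    return [100.0 * (s - lo) / (hi - lo) for s in scores]

def softmax(scores):
    """Softmax over the submitted list (stabilized by subtracting the max)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

raw = [6.2590, -8.6542, -12.8678, -23.3741]  # JAK2 example raw scores
print(minmax_0_100(raw))  # top candidate maps to 100.0, bottom to 0.0
print(softmax(raw))       # mass concentrates on the top-ranked candidate
```

Both transforms only re-express within-list separation; neither turns the scores into biological probabilities.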

Example predictions

These examples were produced from the published weights.

Example: JAK2 cell assay

Assay:

  • title: JAK2 inhibition assay
  • description: Cell-based luminescence assay measuring JAK2 inhibition in HEK293 cells.
  • organism: Homo sapiens
  • readout: luminescence
  • assay format: cell-based
  • assay type: inhibition
  • target UniProt: O60674

| Rank | Candidate SMILES | Raw score |
|---|---|---|
| 1 | CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1 | 6.2590 |
| 2 | Cc1cc(=O)n(C)c(=O)[nH]1 | -8.6542 |
| 3 | CCO | -12.8678 |
| 4 | CCOc1ccc2nc(N3CCN(C)CC3)n(C)c(=O)c2c1 | -23.3741 |

Example: ALDH1A1 fluorescence assay

Assay:

  • title: ALDH1A1 inhibition assay
  • description: Cell-based fluorescence assay measuring ALDH1A1 inhibition in human cells.
  • organism: Homo sapiens
  • readout: fluorescence
  • assay format: cell-based
  • assay type: inhibition
  • target UniProt: P00352

| Rank | Candidate SMILES | Raw score |
|---|---|---|
| 1 | CCOc1ccccc1 | -26.9257 |
| 2 | Cc1cc(=O)n(C)c(=O)[nH]1 | -38.5073 |
| 3 | CCN(CC)CCOc1ccccc1 | -39.1753 |
| 4 | CCO | -42.9016 |

Limitations

  • The score is not a calibrated probability.
  • The model does not predict exact potency values.
  • The benchmark is assay-held-out, not a universal unseen-scaffold benchmark.
  • Public assay data is noisy and assay protocols are heterogeneous.
  • Some assays remain difficult and yield only moderate separation.
  • Use the model as a ranking aid, not as a stand-alone medicinal chemistry decision system.

Repository contents

Files provided in this HF model repo:

  • best_model.pt
  • training_metadata.json
  • training_summary.json
  • bioassayalign_compatibility.py
  • requirements.txt

Interactive Space

Use the model in the companion Space:

  • https://huggingface.co/spaces/lighteternal/BioAssayAlign-Compatibility-Explorer