---
title: BioAssayAlign Compatibility Explorer
emoji: 🧪
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.9.0
python_version: "3.10"
app_file: app.py
pinned: false
license: mit
short_description: Rank a candidate molecule list against a bioassay.
---

# BioAssayAlign Compatibility Explorer

BioAssayAlign is an **assay-conditioned molecule ranking** tool.

You provide:
- a bioassay description and optional metadata
- a list of candidate SMILES

The model returns:
- a ranked list of molecules
- a compatibility score for each one
- explicit flags for invalid SMILES

## What It Is

This is not a chatbot. It is not a potency predictor.

It is a **ranking model** trained on a frozen public bioassay dataset built from PubChem BioAssay and ChEMBL. It is designed to answer:

> “Given this assay, which molecules should I look at first?”

## What The Score Means

- The app shows a **priority band** and a **list-relative score** first.
- Those values explain the ranking better than the raw model score.
- The raw score is **not** a probability. It is an uncalibrated ranking value from the scorer head.
- The strongest molecule in your submitted list will be near the top of the `0–100` relative scale.
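
As a rough illustration of what a list-relative score means, the sketch below rescales raw scores to `0–100` with min-max scaling. This README does not document the app's actual formula, so the function name `relative_scores` and the min-max choice are assumptions for illustration only.

```python
def relative_scores(raw_scores):
    """Rescale raw ranking scores to 0-100 relative to the submitted list.

    Assumption: plain min-max scaling. The app may use a different scheme.
    """
    if not raw_scores:
        return []
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # All candidates tie, so none can be ranked above another.
        return [100.0] * len(raw_scores)
    return [100.0 * (s - lo) / (hi - lo) for s in raw_scores]


# Hypothetical raw scores for three candidates.
print(relative_scores([-1.0, 0.5, 3.0]))
```

Under this scheme the same molecule gets a different relative score when the rest of the list changes, which is why the relative score should only be compared within one submitted list.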

## How To Use It

1. Enter the assay title and description in plain scientific language.
2. Add metadata if you know it:
   - organism
   - readout
   - assay format
   - assay type
   - target UniProt ID
3. Paste one SMILES per line or upload a CSV with a `smiles` or `canonical_smiles` column.
4. Run ranking.
5. Read the output in this order:
   - `priority`
   - `relative score`
   - chemistry context columns (`MolWt`, `logP`, `TPSA`)
   - raw model score only if needed

## Recommended Input Style

The model is most reliable when assay information is provided as structured fields:
- title
- description
- organism
- readout
- assay format
- assay type
- target UniProt IDs

You can paste SMILES directly or upload a CSV with a `smiles` or `canonical_smiles` column.
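
If you want to check your input before submitting it, the two accepted shapes can be sketched with the standard library. The helper names `smiles_from_text` and `smiles_from_csv` are hypothetical, and the exact-match, case-sensitive column lookup is an assumption; the app's own parser may behave differently.

```python
import csv
import io

# Column names taken from this README; matching rules are an assumption.
ACCEPTED_COLUMNS = ("smiles", "canonical_smiles")


def smiles_from_text(text):
    """Read one SMILES per line, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def smiles_from_csv(csv_text):
    """Read SMILES from the first accepted column found in the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = reader.fieldnames or []
    column = next((c for c in ACCEPTED_COLUMNS if c in fields), None)
    if column is None:
        raise ValueError(f"CSV needs one of these columns: {ACCEPTED_COLUMNS}")
    values = ((row.get(column) or "").strip() for row in reader)
    return [v for v in values if v]


print(smiles_from_text("CCO\n\nc1ccccc1\n"))
print(smiles_from_csv("id,canonical_smiles\n1,CCO\n2,CCN\n"))
```

This only checks the input layout. It does not validate the chemistry; invalid SMILES are flagged by the app itself.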

## Good Uses

- ranking a screening shortlist for a new assay concept
- triaging compounds before a more expensive downstream model or wet-lab step
- testing how sensitive rankings are to assay wording and metadata

## Example Assays Included In The UI

- JAK2 cell assay
- ALDH1A1 fluorescence assay
- BTK binding quick check

These examples call the live model. They are not screenshots or mocked outputs.

## Limits

- This is a public-data model, not a medicinal chemistry oracle.
- It does not predict IC50 directly.
- It is strongest as a **relative ranking tool** over a candidate list you already care about.

## Runtime Notes

- The first request can be slower because the Space warms the model in the background.
- Large candidate lists increase runtime. For interactive use, start with a few hundred molecules.
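
One way to keep interactive runs short is to split a long candidate list into chunks of a few hundred molecules before submitting. The following is a minimal sketch; the helper name `batches` and the default size of 200 are illustrative assumptions, not values taken from the app.

```python
def batches(items, size=200):
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical list of 450 placeholder candidates.
candidates = [f"candidate_{i}" for i in range(450)]
print([len(chunk) for chunk in batches(candidates)])
```

Because the relative score is computed against the submitted list, relative scores from separate chunks are presumably not directly comparable with each other.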

## Model

The Space reads the model repo from the `MODEL_REPO_ID` environment variable.

Default:
- `lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility`

If a different model later replaces the current default, the Space can point to the new model repo by changing `MODEL_REPO_ID`, without changing the UI.
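
The lookup described above can be sketched as follows. The helper name `resolve_model_repo_id` is hypothetical, and treating an empty value as unset is an assumption; the actual logic in `app.py` may differ.

```python
import os

# Default repo ID as listed in this README.
DEFAULT_MODEL_REPO_ID = "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"


def resolve_model_repo_id(env=None):
    """Return MODEL_REPO_ID from the environment, or the default if unset or empty."""
    env = os.environ if env is None else env
    value = env.get("MODEL_REPO_ID", "").strip()
    return value or DEFAULT_MODEL_REPO_ID


print(resolve_model_repo_id())
```

Passing a plain dictionary as `env` makes the function easy to test without touching the real environment.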