---
title: BioAssayAlign Compatibility Explorer
emoji: 🧪
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.9.0
python_version: "3.10"
app_file: app.py
pinned: false
license: mit
short_description: Rank a candidate molecule list against a bioassay.
---
# BioAssayAlign Compatibility Explorer
BioAssayAlign is an **assay-conditioned molecule ranking** tool.
You provide:
- a bioassay description and optional metadata
- a list of candidate SMILES
The model returns:
- a ranked list of molecules
- a compatibility score for each one
- explicit flags for invalid SMILES
## What It Is
This is not a chatbot. It is not a potency predictor.
It is a **ranking model** trained on a frozen public bioassay dataset built from PubChem BioAssay and ChEMBL. It is designed to answer:
> “Given this assay, which molecules should I look at first?”
## What The Score Means
- The app shows a **priority band** and a **list-relative score** first.
- Those values explain the ranking better than the raw model score.
- The raw score is **not** a probability. It is an uncalibrated ranking value from the scorer head.
- The strongest molecule in your submitted list will be near the top of the `0–100` relative scale.
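
To make "list-relative" concrete, here is a minimal sketch that rescales raw scores to `0–100` within a single submitted list. The exact scaling used by the app is not documented here; min-max scaling and the function name `relative_scores` are assumptions chosen for illustration only.

```python
def relative_scores(raw_scores):
    """Rescale raw ranking scores to 0-100 within one submitted list.

    Assumption: plain min-max scaling. The app may use a different rule.
    """
    if not raw_scores:
        return []
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # All candidates tie; assumed convention: show them all at the top.
        return [100.0] * len(raw_scores)
    return [100.0 * (s - lo) / (hi - lo) for s in raw_scores]


# The same raw score lands at a different relative score
# once a stronger molecule joins the list.
print(relative_scores([-2.0, 0.0, 2.0]))
print(relative_scores([-2.0, 0.0, 2.0, 6.0]))
```

Under this assumption a relative score only compares molecules submitted together, which is why it should not be compared across separate runs.
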
## How To Use It
1. Enter the assay title and description in plain scientific language.
2. Add metadata if you know it:
   - organism
   - readout
   - assay format
   - assay type
   - target UniProt ID
3. Paste one SMILES per line or upload a CSV with a `smiles` or `canonical_smiles` column.
4. Run ranking.
5. Read the output in this order:
   - `priority`
   - `relative score`
   - chemistry context columns (`MolWt`, `logP`, `TPSA`)
   - raw model score only if needed
## Recommended Input Style
The model is most reliable when assay information is provided as structured fields:
- title
- description
- organism
- readout
- assay format
- assay type
- target UniProt IDs
You can paste SMILES directly or upload a CSV with a `smiles` or `canonical_smiles` column.
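
If you prepare the CSV programmatically, a small standard-library check can confirm that the file has one of the accepted column names before you upload it. This is a sketch of a local pre-check, not the Space's own parser; the helper name `read_smiles_csv` and the choice to skip empty cells are assumptions.

```python
import csv
import io

# Column names accepted for upload, as listed above.
ACCEPTED_COLUMNS = ("smiles", "canonical_smiles")


def read_smiles_csv(text):
    """Return the non-empty SMILES strings from CSV text.

    Hypothetical helper: uses the first accepted column that is present
    and raises ValueError when neither column exists.
    """
    reader = csv.DictReader(io.StringIO(text))
    fields = reader.fieldnames or []
    column = next((c for c in ACCEPTED_COLUMNS if c in fields), None)
    if column is None:
        raise ValueError(f"CSV needs one of these columns: {ACCEPTED_COLUMNS}")
    smiles = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # assumed behavior: silently skip empty cells
            smiles.append(value)
    return smiles


sample = "id,smiles\n1,CCO\n2,\n3,c1ccccc1\n"
print(read_smiles_csv(sample))
```
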
## Good Uses
- ranking a screening shortlist for a new assay concept
- triaging compounds before a more expensive downstream model or wet-lab step
- testing how sensitive rankings are to assay wording and metadata
## Example Assays Included In The UI
- JAK2 cell assay
- ALDH1A1 fluorescence assay
- BTK binding quick check
These examples call the live model. They are not screenshots or mocked outputs.
## Limits
- This is a public-data model, not a medicinal chemistry oracle.
- It does not predict IC50 directly.
- It is strongest as a **relative ranking tool** over a candidate list you already care about.
## Runtime Notes
- The first request can be slower because the Space warms the model in the background.
- Large candidate lists increase runtime. For interactive use, start with a few hundred molecules.
## Model
The Space reads the model repo from the `MODEL_REPO_ID` environment variable.
Default:
- `lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility`
If the champion (currently best-performing) model changes later, the Space can point to a new model repo without changing the UI.
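
The lookup can be sketched as follows. This is not the code in `app.py`; the function name `resolve_model_repo_id` and the choice to treat an empty value as unset are assumptions.

```python
import os

# Default model repo named above.
DEFAULT_MODEL_REPO_ID = "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"


def resolve_model_repo_id(env=None):
    """Return the model repo ID from MODEL_REPO_ID, or the default.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing. Empty or whitespace-only values are
    treated as unset (an assumption).
    """
    env = os.environ if env is None else env
    value = env.get("MODEL_REPO_ID", "").strip()
    return value or DEFAULT_MODEL_REPO_ID


print(resolve_model_repo_id({}))
print(resolve_model_repo_id({"MODEL_REPO_ID": "some-org/some-new-model"}))
```

In a Space, `MODEL_REPO_ID` would typically be set as a variable in the Space settings, so switching models would not require a code change.
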