---
title: BioAssayAlign Compatibility Explorer
emoji: 🧪
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.9.0
python_version: "3.10"
app_file: app.py
pinned: false
license: mit
short_description: Rank a candidate molecule list against a bioassay.
---

# BioAssayAlign Compatibility Explorer

BioAssayAlign is an **assay-conditioned molecule ranking** tool.

You provide:

- a bioassay description and optional metadata
- a list of candidate SMILES

The model returns:

- a ranked list of molecules
- a compatibility score for each one
- explicit flags for invalid SMILES

## What It Is

This is not a chatbot. It is not a potency predictor.

It is a **ranking model** trained on a frozen public bioassay dataset built from PubChem BioAssay and ChEMBL. It is designed to answer:

> “Given this assay, which molecules should I look at first?”

## What The Score Means

- The app shows a **priority band** and a **list-relative score** first.
- Those values explain the ranking better than the raw model score.
- The raw score is **not** a probability. It is an uncalibrated ranking value from the scorer head.
- The strongest molecule in your submitted list will be near the top of the `0–100` relative scale.

## How To Use It

1. Enter the assay title and description in plain scientific language.
2. Add metadata if you know it:
   - organism
   - readout
   - assay format
   - assay type
   - target UniProt ID
3. Paste one SMILES per line or upload a CSV with a `smiles` or `canonical_smiles` column.
4. Run ranking.
5. Read the output in this order:
   - `priority`
   - `relative score`
   - chemistry context columns (`MolWt`, `logP`, `TPSA`)
   - raw model score only if needed

## Recommended Input Style

The model is most reliable when assay information is provided as structured fields:

- title
- description
- organism
- readout
- assay format
- assay type
- target UniProt IDs

You can paste SMILES directly or upload a CSV with a `smiles` or `canonical_smiles` column.

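
If your candidates live in a text file with one SMILES per line, a small script can turn them into a CSV with the expected `smiles` header before upload. The sketch below uses only the Python standard library. The helper name `build_smiles_csv` is hypothetical, and dropping blank lines and exact duplicates is a convenience assumed here, not something the app requires.

```python
import csv
import io


def build_smiles_csv(raw_text: str, column: str = "smiles") -> str:
    """Turn one-SMILES-per-line text into CSV text with a single header column.

    Hypothetical helper: blank lines and exact duplicate strings are dropped,
    and the original order of first occurrences is kept.
    """
    seen = set()
    rows = []
    for line in raw_text.splitlines():
        smi = line.strip()
        if not smi or smi in seen:
            continue  # skip blank lines and exact duplicates
        seen.add(smi)
        rows.append(smi)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])  # header row, e.g. "smiles"
    writer.writerows([smi] for smi in rows)
    return buffer.getvalue()


# Illustrative candidates: ethanol, benzene, aspirin, and a repeated ethanol.
candidates = """
CCO
c1ccccc1

CC(=O)Oc1ccccc1C(=O)O
CCO
"""

csv_text = build_smiles_csv(candidates)
print(csv_text)
```

Pass `column="canonical_smiles"` if you prefer that header. The script does not check that each string is a valid molecule; the app flags invalid SMILES in its output.
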
## Good Uses

- ranking a screening shortlist for a new assay concept
- triaging compounds before a more expensive downstream model or wet-lab step
- testing how sensitive rankings are to assay wording and metadata

## Example Assays Included In The UI

- JAK2 cell assay
- ALDH1A1 fluorescence assay
- BTK binding quick check

These examples call the live model. They are not screenshots or mocked outputs.

## Limits

- This is a public-data model, not a medicinal chemistry oracle.
- It does not predict IC50 directly.
- It is strongest as a **relative ranking tool** over a candidate list you already care about.

## Runtime Notes

- The first request can be slower because the Space warms the model in the background.
- Large candidate lists increase runtime. For interactive use, start with a few hundred molecules.

## Model

The Space reads the model repo from the `MODEL_REPO_ID` environment variable.

Default:

- `lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility`

If the champion changes later, the Space can point to a new model repo without changing the UI.
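
The lookup described above can be sketched as follows. This is a minimal illustration of the environment-variable-with-default pattern, not the actual code in `app.py`. The function name `resolve_model_repo_id` and the repo ID `my-org/my-new-model` are hypothetical.

```python
import os

# Default repo ID as documented in this README.
DEFAULT_MODEL_REPO_ID = "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"


def resolve_model_repo_id(env=None) -> str:
    """Return the repo ID from MODEL_REPO_ID, falling back to the default.

    `env` is any mapping of variable names to values; it defaults to
    os.environ. Empty or whitespace-only values are treated as unset
    (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    value = env.get("MODEL_REPO_ID", "").strip()
    return value or DEFAULT_MODEL_REPO_ID


# With no override, the documented default is used.
print(resolve_model_repo_id({}))

# With an override (hypothetical repo ID), the override wins.
print(resolve_model_repo_id({"MODEL_REPO_ID": "my-org/my-new-model"}))
```

On a Space, the variable would typically be set in the Space settings rather than in code, so swapping models needs no change to the UI.
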