	---
	title: BioAssayAlign Compatibility Explorer
	emoji: 🧪
	colorFrom: green
	colorTo: red
	sdk: gradio
	sdk_version: 6.9.0
	python_version: "3.10"
	app_file: app.py
	pinned: false
	license: mit
	short_description: Rank a candidate molecule list against a bioassay.
	---

	# BioAssayAlign Compatibility Explorer

	BioAssayAlign is an assay-conditioned molecule ranking tool.

	You provide:
	- a bioassay description and optional metadata
	- a list of candidate SMILES

	The model returns:
	- a ranked list of molecules
	- a compatibility score for each one
	- explicit flags for invalid SMILES

	## What It Is

	This is not a chatbot. It is not a potency predictor.

	It is a ranking model trained on a frozen public bioassay dataset built from PubChem BioAssay and ChEMBL. It is designed to answer:

	> “Given this assay, which molecules should I look at first?”

	## What The Score Means

	- The app shows a priority band and a list-relative score first.
	- Those values explain the ranking better than the raw model score.
	- The raw score is not a probability. It is an uncalibrated ranking value from the scorer head.
	- The relative score is scaled within your submitted list, so the strongest molecule in that list will be near the top of the `0–100` scale.
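
To make the idea of a list-relative score concrete, here is a minimal sketch that rescales raw ranking values to `0–100` within one submitted list. It assumes plain min-max scaling; the Space may compute its relative score differently, and `relative_scores` is a hypothetical helper, not part of the app.

```python
def relative_scores(raw_scores):
    """Rescale raw ranking values to 0-100 within one submitted list.

    Assumption: plain min-max scaling. The Space's actual formula may differ.
    """
    if not raw_scores:
        return []
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # Every candidate ties, so there is no spread to rescale.
        return [100.0 for _ in raw_scores]
    return [100.0 * (s - lo) / (hi - lo) for s in raw_scores]


# Hypothetical raw scorer outputs for three candidates.
print(relative_scores([-1.0, 0.0, 3.0]))  # [0.0, 25.0, 100.0]
```

Under this kind of scaling, the same raw value maps to a different relative score when the rest of the list changes, which is why the relative score is best read within a single submission.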

	## How To Use It

	1. Enter the assay title and description in plain scientific language.
	2. Add metadata if you know it:
	   - organism
	   - readout
	   - assay format
	   - assay type
	   - target UniProt ID
	3. Paste one SMILES per line or upload a CSV with a `smiles` or `canonical_smiles` column.
	4. Run ranking.
	5. Read the output in this order:
	   - `priority`
	   - `relative score`
	   - chemistry context columns (`MolWt`, `logP`, `TPSA`)
	   - raw model score only if needed

	## Recommended Input Style

	The model is most reliable when assay information is provided as structured fields:
	- title
	- description
	- organism
	- readout
	- assay format
	- assay type
	- target UniProt IDs
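
If you prepare assay inputs in a script before pasting them into the UI, the fields above could be kept together in one small record. The sketch below is only an illustration: the `AssayQuery` class, its attribute names, and the example values are assumptions, not the Space's internal schema.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class AssayQuery:
    """Hypothetical container for the structured assay fields."""
    title: str
    description: str
    organism: Optional[str] = None
    readout: Optional[str] = None
    assay_format: Optional[str] = None
    assay_type: Optional[str] = None
    target_uniprot_ids: List[str] = field(default_factory=list)

    def known_fields(self):
        """Return only the fields that were actually filled in."""
        return {k: v for k, v in asdict(self).items() if v not in (None, [], "")}


# Illustrative values only.
query = AssayQuery(
    title="JAK2 cell assay",
    description="Cell-based assay measuring inhibition of JAK2 signalling.",
    organism="Homo sapiens",
)
print(query.known_fields())
```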

	You can paste SMILES directly or upload a CSV with a `smiles` or `canonical_smiles` column.
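
As a rough sketch of both input routes, the helpers below collect candidate SMILES from pasted text (one per line) or from CSV text with a `smiles` or `canonical_smiles` column. The function names are hypothetical, and exact lower-case column matching is an assumption; the app's own parsing may be more lenient or stricter.

```python
import csv
import io

# Column names mentioned in this README; exact-match lookup is an assumption.
SMILES_COLUMNS = ("smiles", "canonical_smiles")


def smiles_from_text(text):
    """Return non-empty, stripped lines from pasted text."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def smiles_from_csv(csv_text):
    """Return SMILES from the first recognised column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    column = next((c for c in SMILES_COLUMNS if c in header), None)
    if column is None:
        raise ValueError(f"CSV needs one of these columns: {SMILES_COLUMNS}")
    # Skip rows where the SMILES cell is missing or blank.
    return [row[column].strip() for row in reader
            if row[column] and row[column].strip()]


print(smiles_from_text("CCO\n\n c1ccccc1 \n"))
print(smiles_from_csv("id,canonical_smiles\n1,CCO\n2,\n3,CC(=O)O\n"))
```

Neither helper checks chemical validity; the Space flags invalid SMILES itself.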

	## Good Uses

	- ranking a screening shortlist for a new assay concept
	- triaging compounds before a more expensive downstream model or wet-lab step
	- testing how sensitive rankings are to assay wording and metadata
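
For the sensitivity use case, one simple way to quantify how much a ranking moved after rewording an assay is a Spearman rank correlation between the two rankings. This is a sketch under assumptions: both rankings contain the same molecules with no ties, and `spearman_from_rankings` is a hypothetical helper rather than something the Space provides.

```python
def spearman_from_rankings(ranking_a, ranking_b):
    """Spearman correlation between two orderings of the same items (no ties)."""
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two molecules")
    if len(ranking_b) != n or len(set(ranking_a)) != n or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same unique molecules")
    position_b = {smi: i for i, smi in enumerate(ranking_b)}
    # Sum of squared rank differences across the two orderings.
    d2 = sum((i - position_b[smi]) ** 2 for i, smi in enumerate(ranking_a))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Rankings (best first) from two hypothetical wordings of the same assay.
terse = ["CCO", "CCN", "CCC", "CCCl"]
detailed = ["CCN", "CCO", "CCC", "CCCl"]
print(spearman_from_rankings(terse, detailed))
```

A value close to 1 suggests the ranking is stable under the wording change; values near 0 or below suggest the assay text is driving the order.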

	## Example Assays Included In The UI

	- JAK2 cell assay
	- ALDH1A1 fluorescence assay
	- BTK binding quick check

	These examples call the live model. They are not screenshots or mocked outputs.

	## Limits

	- This is a public-data model, not a medicinal chemistry oracle.
	- It does not predict IC50 directly.
	- It is strongest as a relative ranking tool over a candidate list you already care about.

	## Runtime Notes

	- The first request can be slower because the Space warms the model in the background.
	- Large candidate lists increase runtime. For interactive use, start with a few hundred molecules.
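
If a candidate list is much larger than a few hundred molecules, one option is to split it into chunks before submitting. The sketch below is a generic chunking helper; the name `batched` and the default size of 200 are assumptions, not limits enforced by the Space.

```python
def batched(items, batch_size=200):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


candidates = [f"mol_{i}" for i in range(450)]  # placeholder identifiers
print([len(chunk) for chunk in batched(candidates)])  # [200, 200, 50]
```

Keep in mind that the relative score is tied to the submitted list, so relative scores from different chunks may not be directly comparable.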

	## Model

	The Space reads the model repo from the `MODEL_REPO_ID` environment variable.

	Default:
	- `lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility`

	If the champion model changes later, the Space can be pointed at a new model repo through `MODEL_REPO_ID` without changing the UI.
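
The binding described above can be sketched as a small lookup with a fallback. This is an illustration, not the code in `app.py`: `resolve_model_repo_id` is a hypothetical name, and treating an empty value as unset is an assumption.

```python
import os

DEFAULT_MODEL_REPO_ID = "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"


def resolve_model_repo_id(environ=os.environ):
    """Return MODEL_REPO_ID if set and non-empty, else the default repo."""
    value = environ.get("MODEL_REPO_ID", "").strip()
    return value or DEFAULT_MODEL_REPO_ID


print(resolve_model_repo_id())
```

Passing the environment as an argument keeps the lookup easy to test with a plain dictionary.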