lighteternal committed on
Commit f230c49 · verified · 1 Parent(s): 0afd4cc

Clarify score semantics and keep stable public model binding

Files changed (2):
  1. README.md +1 -1
  2. app.py +3 -2
README.md CHANGED
@@ -37,7 +37,7 @@ It is a **ranking model** trained on a frozen public bioassay dataset built from
 
 - The app shows a **priority band** and a **list-relative score** first.
 - Those values explain the ranking better than the raw model score.
-- The raw score is **not** a probability. Use it only for debugging.
+- The raw score is **not** a probability. It is an uncalibrated ranking value from the scorer head.
 - The strongest molecule in your submitted list will be near the top of the `0–100` relative scale.
 
 ## How To Use It
app.py CHANGED
@@ -290,7 +290,7 @@ def _build_summary(query_text: str, valid_rows: list[dict[str, Any]], invalid_ro
     chunks.append(f"- Warning: {warning}")
     chunks.append("")
     chunks.append(
-        "Use the **priority band** and **list-relative score** first. The raw model score is only a debugging value. "
+        "Use the **priority band** and **list-relative score** first. The raw model score is an uncalibrated logit-like ranking value. "
         "A candidate with `relative score 100` is the strongest item in your submitted list, not in all chemistry."
     )
     return "\n".join(chunks)
@@ -535,7 +535,8 @@ Use structured assay fields when possible. Missing fields are allowed, but speci
 - `Middle pack`
 - `Low priority`
 - **relative_score_100** rescales the submitted list so the strongest candidate is near `100` and the weakest is near `0`.
-- **model_score** is the raw internal score. It is useful for debugging, not for scientific interpretation.
+- **model_score** is the raw internal ranking score. It behaves like a logit-like utility value, not a probability.
+- If you need a normalized shortlist view, treat the model score as a list-relative ranking signal and rescale within your submitted list.
 - **mol_wt / logp / tpsa** are quick chemistry context columns so you can sanity-check what the model surfaced.
 
 ### Good input habits
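The within-list rescaling that the diff describes (strongest candidate near `100`, weakest near `0`) can be sketched as a simple min-max rescale. This is an illustrative sketch only, not the app's actual implementation; the function name `relative_score_100` is borrowed from the column name for clarity and is hypothetical.

```python
def relative_score_100(raw_scores: list[float]) -> list[float]:
    """Min-max rescale raw, uncalibrated ranking scores within one list.

    The strongest candidate maps to 100.0 and the weakest to 0.0. The
    result is only comparable within this submitted list, not across
    lists and not against "all chemistry".
    """
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # All candidates tied: there is no ranking signal to spread out.
        return [50.0 for _ in raw_scores]
    return [100.0 * (s - lo) / (hi - lo) for s in raw_scores]


# Raw logit-like scores from a scorer head (illustrative values):
scores = [-1.2, 0.4, 2.7]
print(relative_score_100(scores))  # weakest -> 0.0, strongest -> 100.0
```

Because the raw scores are uncalibrated, only their order and this within-list rescale are meaningful; passing them through a sigmoid would not turn them into probabilities.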