Upload README.md with huggingface_hub
README.md CHANGED

@@ -119,10 +119,10 @@ Assay:
 - target UniProt: `O60674`
 
 Candidate list ranked by the model:
-1. `CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1` → `-…`
-2. `c1ccccc1` → `-…`
-3. `CCO` → `-…`
-4. `CCOc1ccc2nc(N3CCN(C)CC3)n(C)c(=O)c2c1` → `-…`
+1. `CC(=O)Nc1ncc(C#N)c(Nc2ccc(F)c(Cl)c2)n1` → `-8.87`
+2. `c1ccccc1` → `-13.53`
+3. `CCO` → `-21.92`
+4. `CCOc1ccc2nc(N3CCN(C)CC3)n(C)c(=O)c2c1` → `-27.76`
 
 ### Example 2: ALDH1A1 fluorescence assay
 

@@ -136,10 +136,10 @@ Assay:
 - target UniProt: `P00352`
 
 Candidate list ranked by the model:
-1. `CCOc1ccccc1` → `-…`
-2. `…`
-3. `…`
-4. `CCO` → `-…`
+1. `CCOc1ccccc1` → `-26.93`
+2. `Cc1cc(=O)n(C)c(=O)[nH]1` → `-38.51`
+3. `CCN(CC)CCOc1ccccc1` → `-39.18`
+4. `CCO` → `-42.90`
 
 The raw values above are model scores. In practice, read them as list-relative ranking values, not calibrated probabilities.
 

@@ -153,8 +153,8 @@ You can think of it as a **logit-like utility value**:
 - absolute values across unrelated lists are not directly comparable
 
 Example:
-- a top candidate with score `-…`
-- another candidate with score `-…`
+- a top candidate with score `-8.9`
+- another candidate with score `-21.9`
 
 does **not** mean the first compound has negative biological value. It only means the first item scored much better than the second one for that submitted assay-and-list context.
 

@@ -167,7 +167,7 @@ Softmax example for one list:
 ```python
 from bioassayalign_compatibility import list_softmax_scores
 
-scores = [-…]
+scores = [-8.8686, -13.5325, -21.9168]
 relative_probs = list_softmax_scores(scores)
 print(relative_probs)
 ```

@@ -295,14 +295,13 @@ The score is:
 - Public assay data contains label noise and heterogeneous assay protocols.
 - Some assays remain difficult and produce only moderate ranking quality.
 
-## …
-
-Project code:
-- `https://github.com/lighteternal/bioassayalign-private`
+## Files In This Repo
 
 Model files in this repo:
 - `best_model.pt`
 - `training_metadata.json`
 - `training_summary.json`
+- `bioassayalign_compatibility.py`
+- `requirements.txt`
 
-
+You can load and run the published model directly from this repo without cloning any separate project codebase.