---
language: en
license: mit
tags:
- toxicity
- cheminformatics
- nuclear-receptors
- sklearn
- svm
- rdkit
- drug-discovery
library_name: sklearn
---
# NR-ToxPred Models
Pre-trained machine learning models for predicting the binding activity of small molecules against **nine human nuclear receptors (NRs)**.
These models are used by the [NR-ToxPred GUI application](https://github.com/gokulalgates/NRToxPred-GUI), a desktop app that requires no coding experience.
---
## What this repository contains
| Folder | Contents |
|--------|----------|
| `MODELS/morgan/` | SVM classifiers trained on Morgan (ECFP6) fingerprints, one per receptor |
| `MODELS/MACCS/` | SVM classifiers trained on MACCS Keys, one per receptor |
| `MODELS/ARclasses.npy` | Label encoder (Active / Inactive) |
| `X_train/` | Training set SMILES used for Applicability Domain assessment |
> SuperLearner ensemble models are not included here due to their size (1 to 1.5 GB each).
---
## Receptors covered
| Receptor | Full Name |
|----------|-----------|
| AR | Androgen Receptor |
| ERA | Estrogen Receptor Alpha |
| ERB | Estrogen Receptor Beta |
| FXR | Farnesoid X Receptor |
| GR | Glucocorticoid Receptor |
| PPARD | Peroxisome Proliferator-Activated Receptor Delta |
| PPARG | Peroxisome Proliferator-Activated Receptor Gamma |
| PR | Progesterone Receptor |
| RXR | Retinoid X Receptor |
---
## How to use
### Option A: Desktop GUI (recommended, no coding needed)
Download the NR-ToxPred GUI from GitHub and run the installer. The app will download these models automatically on first launch.
**[NR-ToxPred GUI on GitHub](https://github.com/gokulalgates/NRToxPred-GUI)**
### Option B: Python (programmatic use)
```python
import pickle

import numpy as np
from huggingface_hub import hf_hub_download
from rdkit import Chem
from rdkit.Chem import AllChem

# Download a model
model_path = hf_hub_download(
    repo_id="gokulalgates/nrtoxpred-models",
    filename="MODELS/morgan/ARsvm_best.model",
    repo_type="model",
)

# Load model (only unpickle files from sources you trust)
with open(model_path, "rb") as f:
    model = pickle.load(f)

# Generate Morgan fingerprint (ECFP6, 1024 bits)
mol = Chem.MolFromSmiles("CC(C)(c1ccc(O)cc1)c1ccc(O)cc1")  # bisphenol A
if mol is None:
    raise ValueError("RDKit could not parse the SMILES string")
fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=3, nBits=1024)
X = np.array(fp).reshape(1, -1)

# Predict and map the encoded class to a readable label
# (falls back to the raw prediction if it is not 0 or 1)
label_enc = {0: "Inactive", 1: "Active"}
pred = model.predict(X)[0]
print(f"AR prediction: {label_enc.get(pred, pred)}")
```
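To run the same prediction for the other receptors, you need the matching model filenames. The sketch below builds candidate paths by reusing the `ARsvm_best.model` name from the example above. The pattern `<RECEPTOR>svm_best.model` is an assumption extrapolated from that single filename, and `morgan_model_filename` is a hypothetical helper, so check the result against the actual repository listing before downloading.

```python
# Receptor codes from the "Receptors covered" table
RECEPTORS = ["AR", "ERA", "ERB", "FXR", "GR", "PPARD", "PPARG", "PR", "RXR"]


def morgan_model_filename(receptor: str) -> str:
    """Guess the Morgan model path for a receptor.

    ASSUMPTION: every receptor follows the naming of the documented
    AR file (MODELS/morgan/ARsvm_best.model). Only the AR path is
    confirmed by this README.
    """
    if receptor not in RECEPTORS:
        raise ValueError(f"Unknown receptor: {receptor!r}")
    return f"MODELS/morgan/{receptor}svm_best.model"


# Candidate paths to compare against the real file listing
candidates = [morgan_model_filename(r) for r in RECEPTORS]
for name in candidates:
    print(name)
```

One way to verify the candidates is to compare them with the output of `huggingface_hub.list_repo_files("gokulalgates/nrtoxpred-models")` and pass only the names that exist to `hf_hub_download`.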
---
## Model details
| Property | Value |
|----------|-------|
| Algorithm | Support Vector Machine (SVM) |
| Fingerprints | Morgan ECFP6 (radius=3, 1024 bits) and MACCS Keys (167 bits) |
| Framework | scikit-learn 0.23.2 |
| Task | Binary classification (Active / Inactive) |
| Applicability Domain | Tanimoto fingerprint similarity to training set |
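
Because scikit-learn does not guarantee that pickled estimators load correctly across versions, it can help to compare the installed version with the one listed above before unpickling. The following is a minimal sketch; `installed_sklearn_version` and `version_note` are hypothetical helper names, not part of this repository.

```python
from importlib import metadata
from typing import Optional

TRAINED_WITH = "0.23.2"  # scikit-learn version listed in the table above


def installed_sklearn_version() -> Optional[str]:
    """Return the installed scikit-learn version, or None if it is missing."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


def version_note(installed: Optional[str], trained_with: str = TRAINED_WITH) -> str:
    """Describe how the installed version relates to the training version."""
    if installed is None:
        return "scikit-learn is not installed"
    if installed == trained_with:
        return f"scikit-learn {installed} matches the training version"
    return (
        f"scikit-learn {installed} differs from {trained_with}; "
        "unpickling may warn or fail"
    )


print(version_note(installed_sklearn_version()))
```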
---
## Applicability Domain
Each prediction comes with a reliability label:
- **Reliable**: the compound is similar (Tanimoto ≥ 0.25) to at least one training set compound
- **Unreliable**: the compound lies outside the training chemical space; interpret with caution
The `X_train/` folder contains the training set SMILES used to compute these assessments.
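
The reliability rule described above can be sketched in plain Python. This is an illustration of the stated criterion (maximum Tanimoto similarity to the training set, cutoff 0.25), not the GUI's actual implementation. It assumes fingerprints are represented as sets of on-bit indices; the toy bit sets and the function names are hypothetical. With RDKit, the fingerprints would instead be computed from the SMILES in `X_train/`.

```python
from typing import Iterable, Set

THRESHOLD = 0.25  # similarity cutoff quoted in this README


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def reliability(
    query: Set[int],
    training: Iterable[Set[int]],
    threshold: float = THRESHOLD,
) -> str:
    """Return "Reliable" if any training fingerprint reaches the threshold."""
    best = max((tanimoto(query, t) for t in training), default=0.0)
    return "Reliable" if best >= threshold else "Unreliable"


# Toy fingerprints (hypothetical bit indices, not real molecules)
training_set = [{1, 2, 3, 4}, {10, 11, 12}]
print(reliability({1, 2, 7, 8}, training_set))  # shares 2 of 6 distinct bits with the first entry
print(reliability({20, 21, 22}, training_set))  # shares no bits with any entry
```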
---
## Citation
If you use these models in your research, please cite:
> Predicting the binding of small molecules to nuclear receptors using machine learning.
> *Brief Bioinform.* 2022 May 13;23(3):bbac114.
> doi: [10.1093/bib/bbac114](https://doi.org/10.1093/bib/bbac114)
---
## License
MIT License