---
title: Rasayan Tox21 Classifier
emoji: ☠️
colorFrom: red
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: SNN ensemble for Tox21 toxicity prediction
tags:
- toxicity
- tox21
- drug-discovery
- chemistry
- snn
- molecular-property-prediction
models:
- rasayan-labs/rasayan-tox21-snn
---
# Rasayan Tox21 Classifier
<p align="center">
<img src="https://img.shields.io/badge/Tox21-Challenge-red" alt="Tox21">
<img src="https://img.shields.io/badge/Architecture-SNN-blue" alt="SNN">
<img src="https://img.shields.io/badge/Endpoints-12-green" alt="12 Endpoints">
<img src="https://img.shields.io/badge/License-Apache_2.0-yellow" alt="License">
</p>
A production-ready **Self-Normalizing Neural Network (SNN) ensemble** for predicting molecular toxicity across the 12 Tox21 Challenge endpoints. Built for the [ml-jku Tox21 Leaderboard](https://huggingface.co/spaces/ml-jku/tox21_leaderboard).
## Model Overview
| Property | Value |
|----------|-------|
| **Architecture** | Ensemble of 10 SNNs (top 10 folds of the 40-fold CV) |
| **Parameters** | ~19M total |
| **Hidden Layers** | 8 layers × 768 units |
| **Activation** | SELU + AlphaDropout |
| **Training** | 300 epochs, 40-fold CV |
| **CV AUC** | 0.882 ± 0.021 |
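
For reference, the SELU activation named in the table can be sketched in a few lines. The constants are the ones published by Klambauer et al. (2017); the function name `selu` and the scalar, pure-Python form are illustrative only and are not taken from this Space's code.

```python
import math

# Constants from Klambauer et al., 2017 (self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x: float) -> float:
    """Scaled exponential linear unit for a single value."""
    if x > 0:
        return SELU_SCALE * x
    # Negative inputs saturate smoothly towards -SELU_SCALE * SELU_ALPHA.
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)
```

With these constants, activations tend to keep roughly zero mean and unit variance from layer to layer, which is why SELU is paired with AlphaDropout rather than standard dropout.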
## Molecular Features (11,377 total)
| Feature Type | Dimensions | Description |
|--------------|------------|-------------|
| **ECFP6** | 8,192 | Extended-connectivity fingerprints (radius 3) |
| **MACCS Keys** | 167 | Structural keys for substructure screening |
| **RDKit Descriptors** | 208 | Physicochemical properties (LogP, TPSA, MW, etc.) |
| **Toxicophores** | 1,868 | SMARTS-based toxicity structural alerts |
| **Structural Filters** | 815 | PAINS, BRENK, NIH, ZINC filter alerts |
| **Target Similarity** | 127 | Tanimoto similarity to known receptor ligands |
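
The target-similarity features rely on the Tanimoto coefficient between fingerprints. As a minimal sketch, assuming each fingerprint is represented as a set of its on-bit indices (the actual pipeline presumably uses a cheminformatics toolkit for this), the coefficient is the number of shared bits divided by the number of distinct bits. The function name and the example bit sets are hypothetical.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


# Hypothetical on-bit sets for a query molecule and a reference ligand.
query = {1, 5, 9, 42}
ligand = {5, 9, 42, 77, 100}
similarity = tanimoto(query, ligand)  # 3 shared bits out of 6 distinct bits
```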
## Training Details
- **Loss Function**: Focal Loss (γ=2.5, α=0.25) for class imbalance
- **Regularization**: Label smoothing (0.1), Mixup augmentation (α=0.2)
- **Feature Selection**: Variance-based selection per fold, applied to the ECFP and toxicophore features
- **Normalization**: SquashScaler (StandardScaler → tanh → StandardScaler)
- **Ensemble Selection**: Top-10 folds from 40-fold stratified CV
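
Two of the items above are easy to illustrate in isolation. The sketch below assumes the standard binary focal loss of Lin et al. (2017) and reads SquashScaler literally as standardize, apply tanh, standardize again. The helper names (`focal_loss`, `standardize`, `squash_scale`) and the single-column, pure-Python form are illustrative and are not taken from the training code; label smoothing and Mixup are left out.

```python
import math
from statistics import fmean, pstdev


def focal_loss(p: float, y: int, gamma: float = 2.5, alpha: float = 0.25) -> float:
    """Binary focal loss for one predicted probability p and label y (0 or 1)."""
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def standardize(values: list[float]) -> list[float]:
    """Zero-mean, unit-variance scaling using the population std."""
    mean, std = fmean(values), pstdev(values)
    return [(v - mean) / std if std > 0 else 0.0 for v in values]


def squash_scale(values: list[float]) -> list[float]:
    """Standardize -> tanh -> standardize for a single feature column."""
    squashed = [math.tanh(v) for v in standardize(values)]
    return standardize(squashed)


# A confident, correct prediction contributes far less loss than a wrong one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.05, 1)

# The output again has zero mean and unit variance.
scaled = squash_scale([-2.0, -1.0, 0.0, 1.0, 2.0])
```

In a real pipeline the means and standard deviations would be fitted on the training fold and reused for held-out data; this sketch fits and transforms the same list for brevity.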
## Tox21 Endpoints
### Nuclear Receptor Panel
| Endpoint | Target | Biological Significance |
|----------|--------|------------------------|
| **NR-AR** | Androgen Receptor | Male reproductive toxicity |
| **NR-AR-LBD** | AR Ligand Binding Domain | Direct AR modulation |
| **NR-AhR** | Aryl Hydrocarbon Receptor | Dioxin-like toxicity, carcinogenesis |
| **NR-Aromatase** | CYP19A1 Enzyme | Estrogen synthesis disruption |
| **NR-ER** | Estrogen Receptor | Endocrine disruption |
| **NR-ER-LBD** | ER Ligand Binding Domain | Direct ER modulation |
| **NR-PPAR-gamma** | PPARγ | Metabolic disruption |
### Stress Response Panel
| Endpoint | Target | Biological Significance |
|----------|--------|------------------------|
| **SR-ARE** | Antioxidant Response Element | Oxidative stress |
| **SR-ATAD5** | ATAD5 | DNA damage response |
| **SR-HSE** | Heat Shock Element | Protein folding stress |
| **SR-MMP** | Mitochondrial Membrane Potential | Mitochondrial toxicity |
| **SR-p53** | Tumor Protein p53 | Genotoxicity |
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/metadata` | GET | Model configuration and capabilities |
| `/predict` | POST | Toxicity predictions for SMILES |
| `/health` | GET | Health check |
## Usage
### Python
```python
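import requests

# Send a batch of SMILES strings to the prediction endpoint.
response = requests.post(
    "https://rasayan-labs-rasayan-tox21.hf.space/predict",
    json={"smiles": ["CC(=O)Nc1ccc(O)cc1", "c1ccccc1"]},
    timeout=60,
)
response.raise_for_status()  # fail early on HTTP errors

predictions = response.json()["predictions"]

# Print the three highest-scoring endpoints for each molecule.
for smiles, scores in predictions.items():
    print(f"{smiles}:")
    for target, prob in sorted(scores.items(), key=lambda x: -x[1])[:3]:
        print(f"  {target}: {prob:.1%}")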
```
### cURL
```bash
curl -X POST "https://rasayan-labs-rasayan-tox21.hf.space/predict" \
  -H "Content-Type: application/json" \
  -d '{"smiles": ["CCO", "c1ccccc1"]}'
```
## Response Format
```json
{
  "predictions": {
    "CCO": {
      "NR-AR": 0.041,
      "NR-AR-LBD": 0.040,
      "NR-AhR": 0.049,
      "NR-Aromatase": 0.078,
      "NR-ER": 0.133,
      "NR-ER-LBD": 0.076,
      "NR-PPAR-gamma": 0.058,
      "SR-ARE": 0.100,
      "SR-ATAD5": 0.038,
      "SR-HSE": 0.066,
      "SR-MMP": 0.082,
      "SR-p53": 0.052
    }
  },
  "model_info": {
    "name": "Rasayan Tox21 SNN Ensemble",
    "version": "1.0.0"
  }
}
```
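
A short sketch of how a client might flatten this payload into ranked rows. It assumes only the structure shown above; the helper name `top_endpoints` is hypothetical, and the payload is abbreviated to three endpoints to keep the example small.

```python
import json

# Abbreviated copy of the response shown above.
raw = """
{
  "predictions": {
    "CCO": {"NR-AR": 0.041, "NR-ER": 0.133, "SR-ARE": 0.100}
  },
  "model_info": {"name": "Rasayan Tox21 SNN Ensemble", "version": "1.0.0"}
}
"""


def top_endpoints(payload: dict, k: int = 3) -> dict[str, list[tuple[str, float]]]:
    """Return the k highest-scoring endpoints for each SMILES string."""
    return {
        smiles: sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]
        for smiles, scores in payload["predictions"].items()
    }


ranked = top_endpoints(json.loads(raw), k=2)
print(ranked)
```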
## Interpretation Guide
| Probability | Risk Level | Recommendation |
|-------------|------------|----------------|
| < 0.2 | Minimal | Unlikely to be active |
| 0.2 - 0.4 | Low | Monitor for chronic exposure |
| 0.4 - 0.7 | Moderate | Further investigation warranted |
| ≥ 0.7 | High | Strong toxicity signal |
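
The table can be applied programmatically with a small helper. This is a sketch rather than part of the API: the function name `risk_level` is hypothetical, and it assumes that boundary values (0.2, 0.4) fall into the higher band, in line with the `≥ 0.7` row.

```python
def risk_level(probability: float) -> str:
    """Map a predicted probability to the risk levels in the table above."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.2:
        return "Minimal"
    if probability < 0.4:
        return "Low"  # assumption: 0.2 itself counts as Low
    if probability < 0.7:
        return "Moderate"  # assumption: 0.4 itself counts as Moderate
    return "High"
```

For the `CCO` example response, `risk_level(0.133)` gives `"Minimal"` for NR-ER, its highest-scoring endpoint.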
## References
- **Tox21 Challenge**: [NIH Tox21 Data Challenge](https://tripod.nih.gov/tox21/challenge/)
- **SNN Architecture**: [Klambauer et al., 2017](https://arxiv.org/abs/1706.02515)
- **Leaderboard**: [ml-jku Tox21 Leaderboard](https://huggingface.co/spaces/ml-jku/tox21_leaderboard)
## License
Apache 2.0
---
<p align="center">
Built by <a href="https://rasayan.ai">Rasayan Labs</a>
</p>