Rasayan Tox21 SNN Ensemble
A Self-Normalizing Neural Network (SNN) ensemble for predicting molecular toxicity across 12 Tox21 Challenge endpoints. Trained on the NIH Tox21 dataset using 11,377 engineered molecular features and evaluated with 40-fold stratified cross-validation.
Model Description
This model predicts the probability of a molecule being active (toxic) against 12 biological targets from the Tox21 Challenge:
Nuclear Receptor Panel
| Endpoint | Target | Description |
|---|---|---|
| NR-AR | Androgen Receptor | Male reproductive toxicity |
| NR-AR-LBD | AR Ligand Binding Domain | Direct AR modulation |
| NR-AhR | Aryl Hydrocarbon Receptor | Dioxin-like toxicity |
| NR-Aromatase | CYP19A1 | Estrogen synthesis disruption |
| NR-ER | Estrogen Receptor | Endocrine disruption |
| NR-ER-LBD | ER Ligand Binding Domain | Direct ER modulation |
| NR-PPAR-gamma | PPARγ | Metabolic disruption |
Stress Response Panel
| Endpoint | Target | Description |
|---|---|---|
| SR-ARE | Antioxidant Response Element | Oxidative stress |
| SR-ATAD5 | ATAD5 | DNA damage response |
| SR-HSE | Heat Shock Element | Protein folding stress |
| SR-MMP | Mitochondrial Membrane Potential | Mitochondrial toxicity |
| SR-p53 | Tumor Protein p53 | Genotoxicity |
Architecture
| Component | Specification |
|---|---|
| Type | Self-Normalizing Neural Network |
| Ensemble | 10 models (top performers from 40-fold CV) |
| Hidden Layers | 8 layers × 768 units |
| Activation | SELU |
| Regularization | AlphaDropout (0.1) |
| Output | Sigmoid (12 endpoints) |
| Parameters | ~19M total |
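The table above can be read as a stack of `Linear -> SELU -> AlphaDropout` blocks followed by a 12-way sigmoid head. The following is a minimal PyTorch sketch of one ensemble member under that reading, not the released implementation. The class name `SNN`, the LeCun-normal initialization, and the placeholder input width of 1024 are assumptions; the card does not state the initializer or how many features each member actually receives.

```python
import torch
from torch import nn


class SNN(nn.Module):
    """Self-normalizing MLP: Linear -> SELU -> AlphaDropout, repeated."""

    def __init__(self, in_features, hidden=768, depth=8, n_out=12, dropout=0.1):
        super().__init__()
        layers = []
        width = in_features
        for _ in range(depth):
            layers += [nn.Linear(width, hidden), nn.SELU(), nn.AlphaDropout(dropout)]
            width = hidden
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(hidden, n_out)
        # LeCun-normal init (std = 1/sqrt(fan_in)) is the usual companion to SELU.
        # Assumption: the card does not say which initializer was used.
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, mode="fan_in", nonlinearity="linear")
                nn.init.zeros_(m.bias)

    def forward(self, x):
        # One independent sigmoid per endpoint (multi-label, not softmax).
        return torch.sigmoid(self.head(self.body(x)))


model = SNN(in_features=1024)  # 1024 is a placeholder input width
model.eval()                   # disables AlphaDropout for inference
with torch.no_grad():
    probs = model(torch.randn(4, 1024))
print(probs.shape)  # torch.Size([4, 12])
```

Because each endpoint has its own sigmoid, the 12 outputs do not sum to one; a molecule can score high on several assays at once.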
Molecular Features (11,377 dimensions)
| Feature Type | Dimensions | Description |
|---|---|---|
| ECFP6 | 8,192 | Extended-connectivity fingerprints (radius 3) |
| MACCS Keys | 167 | Structural keys for substructure screening |
| RDKit Descriptors | 208 | Physicochemical properties |
| Toxicophores | 1,868 | SMARTS-based toxicity alerts |
| Structural Filters | 815 | PAINS, BRENK, NIH, ZINC alerts |
| Target Similarity | 127 | Tanimoto similarity to known ligands |
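As a rough illustration of the table above, the sketch below checks that the six blocks add up to 11,377 dimensions and shows how a Tanimoto-based similarity feature could be computed from fingerprint on-bits. The concatenation order, the helper names `tanimoto` and `max_similarity`, and the use of a maximum over reference ligands are assumptions; the card does not say how the 127 similarity values are organized.

```python
# Block sizes from the feature table; the concatenation order is an assumption.
FEATURE_BLOCKS = {
    "ecfp6": 8192,
    "maccs": 167,
    "rdkit_descriptors": 208,
    "toxicophores": 1868,
    "structural_filters": 815,
    "target_similarity": 127,
}
TOTAL_DIMS = sum(FEATURE_BLOCKS.values())


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_similarity(query_bits, reference_fps):
    """Highest Tanimoto similarity between a query and a list of reference fingerprints."""
    return max((tanimoto(query_bits, ref) for ref in reference_fps), default=0.0)


# Toy on-bit sets for illustration, not real fingerprints.
query = {1, 5, 9, 42}
references = [{1, 5, 7}, {9, 42, 100, 101}, {3}]
print(TOTAL_DIMS)                          # 11377
print(max_similarity(query, references))   # 0.4
```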
Training
| Parameter | Value |
|---|---|
| Dataset | Tox21 Challenge (7,831 compounds) |
| Validation | 40-fold Stratified CV |
| Epochs | 300 |
| Batch Size | 256 |
| Optimizer | AdamW (lr=1e-4, weight_decay=0.01) |
| Loss | Focal Loss (γ=2.5, α=0.25) |
| Regularization | Label Smoothing (0.1), Mixup (α=0.2) |
| CV AUC | 0.882 ± 0.021 |
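To show what the focal loss settings in the table do, here is a small pure-Python sketch of the binary focal loss for a single prediction, using γ=2.5 and α=0.25 as defaults. It assumes the common convention in which α weights the positive class and 1-α the negative class; the card does not specify the exact variant, nor how it is combined with label smoothing and Mixup, so those are left out.

```python
import math


def focal_loss(p, y, gamma=2.5, alpha=0.25, eps=1e-7):
    """Binary focal loss for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)        # clamp to keep log() finite
    p_t = p if y == 1 else 1.0 - p         # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def bce(p, y):
    """Plain binary cross-entropy, for comparison."""
    return -math.log(p if y == 1 else 1.0 - p)


easy = focal_loss(0.9, 1)   # confident and correct
hard = focal_loss(0.1, 1)   # confident and wrong
print(f"easy: {easy:.6f}  hard: {hard:.6f}")
print(f"focal hard/easy ratio: {hard / easy:.0f}")
print(f"BCE hard/easy ratio:   {bce(0.1, 1) / bce(0.9, 1):.0f}")
```

The `(1 - p_t) ** gamma` factor shrinks the loss of well-classified examples much more than cross-entropy does, which shifts the training signal toward hard examples. That is useful when inactive compounds heavily outnumber active ones.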
Usage
With the Inference API
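```python
import requests

# Send a batch of SMILES strings to the hosted endpoint
response = requests.post(
    "https://rasayan-labs-rasayan-tox21.hf.space/predict",
    json={"smiles": ["CCO", "c1ccccc1"]},
    timeout=60,
)
response.raise_for_status()  # fail early on HTTP errors
predictions = response.json()["predictions"]

# Print the three highest-probability endpoints for each molecule
for smiles, scores in predictions.items():
    print(f"{smiles}:")
    for target, prob in sorted(scores.items(), key=lambda x: -x[1])[:3]:
        print(f"  {target}: {prob:.1%}")
```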
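The response shape used above, a mapping from each SMILES string to per-endpoint probabilities, lends itself to simple post-processing. The sketch below flags endpoints at or above a cutoff. The helper `flag_active`, the 0.5 threshold, and the sample scores are illustrative assumptions, not part of the API or real model output; a suitable threshold depends on the endpoint and the screening goal.

```python
def flag_active(predictions, threshold=0.5):
    """Return {smiles: [(endpoint, prob), ...]} for endpoints at or above threshold."""
    flagged = {}
    for smiles, scores in predictions.items():
        hits = [(t, p) for t, p in scores.items() if p >= threshold]
        flagged[smiles] = sorted(hits, key=lambda tp: -tp[1])
    return flagged


# Made-up scores for illustration only; not real model output.
sample = {
    "CCO": {"NR-AR": 0.02, "SR-ARE": 0.11, "SR-MMP": 0.07},
    "c1ccccc1": {"NR-AhR": 0.64, "SR-ARE": 0.58, "SR-p53": 0.09},
}
for smiles, hits in flag_active(sample).items():
    print(smiles, hits)
```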
Direct Model Loading
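```python
import torch

# The checkpoint bundles the ensemble members with their preprocessing state.
# Note: PyTorch 2.6+ defaults to weights_only=True; if loading fails because the
# file contains pickled objects, pass weights_only=False (trusted files only).
checkpoint = torch.load("ensemble.pt", map_location="cpu")
scalers = checkpoint["scalers"]
feature_indices = checkpoint["feature_indices"]
models = checkpoint["models"]
print(f"Loaded {len(models)} ensemble members")
```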
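Once the members are loaded, one plausible way to combine them is to average their per-endpoint probabilities. The sketch below illustrates that idea with small stand-in networks. It assumes each member is a callable `nn.Module` that returns probabilities; the card does not say whether `checkpoint["models"]` holds modules or state dicts, how `scalers` and `feature_indices` are applied, or which aggregation rule the ensemble uses, so `ensemble_predict` is hypothetical.

```python
import torch
from torch import nn


@torch.no_grad()
def ensemble_predict(members, x):
    """Average per-endpoint probabilities over ensemble members."""
    probs = []
    for member in members:
        member.eval()              # turn off dropout before predicting
        probs.append(member(x))
    return torch.stack(probs).mean(dim=0)


# Stand-in members for illustration; the real ones come from the checkpoint.
members = [nn.Sequential(nn.Linear(16, 12), nn.Sigmoid()) for _ in range(3)]
x = torch.randn(5, 16)
avg = ensemble_predict(members, x)
print(avg.shape)  # torch.Size([5, 12])
```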
Files
| File | Description |
|---|---|
| `ensemble.pt` | PyTorch checkpoint with 10 models + scalers |
| `config.json` | Model configuration |
| `toxicophores_validated.json` | 1,868 toxicophore SMARTS patterns |
| `target_ligands_validated.json` | Reference ligands for similarity |
Intended Use
This model is intended for:
- Early-stage drug discovery toxicity screening
- Prioritization of compounds for experimental testing
- Educational purposes in computational toxicology
Limitations
- Trained on Tox21 assay data, which may not capture all toxicity mechanisms
- Performance may vary for chemical spaces outside the training domain
- Should not replace experimental validation
Citation
```bibtex
@misc{rasayan-tox21-2026,
  author = {Rasayan Labs},
  title = {Rasayan Tox21 SNN Ensemble},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/rasayan-labs/rasayan-tox21-snn}
}
```
License
Apache 2.0
Built by Rasayan Labs