metadata
title: Rasayan Tox21 Classifier
emoji: ☠️
colorFrom: red
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: SNN ensemble for Tox21 toxicity prediction
tags:
  - toxicity
  - tox21
  - drug-discovery
  - chemistry
  - snn
  - molecular-property-prediction
models:
  - rasayan-labs/rasayan-tox21-snn

Rasayan Tox21 Classifier

A production-ready Self-Normalizing Neural Network (SNN) ensemble for predicting molecular toxicity across the 12 Tox21 Challenge endpoints. Built for the ml-jku Tox21 Leaderboard.

Model Overview

| Property | Value |
|---|---|
| Architecture | Ensemble of 10 SNNs (top 10 folds of a 40-fold CV) |
| Parameters | ~19M total |
| Hidden Layers | 8 layers × 768 units |
| Activation | SELU + AlphaDropout |
| Training | 300 epochs, 40-fold stratified CV |
| CV AUC | 0.882 ± 0.021 |
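
For readers unfamiliar with the activation named above, here is a minimal sketch of SELU for a single value. It uses the standard constants from the SELU definition (Klambauer et al., 2017); it is an illustration only and is not taken from this model's code.

```python
import math

# Standard SELU constants (Klambauer et al., 2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x: float) -> float:
    """Scaled exponential linear unit for a single value."""
    if x > 0:
        # Positive inputs are scaled linearly.
        return SELU_SCALE * x
    # Non-positive inputs saturate towards -SELU_SCALE * SELU_ALPHA.
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)
```

The fixed scale and saturation level are what push activations towards zero mean and unit variance across layers, which is why SNNs pair SELU with AlphaDropout rather than standard dropout.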

Molecular Features (11,377 total)

| Feature Type | Dimensions | Description |
|---|---|---|
| ECFP6 | 8,192 | Extended-connectivity fingerprints (radius 3) |
| MACCS Keys | 167 | Structural keys for substructure screening |
| RDKit Descriptors | 208 | Physicochemical properties (LogP, TPSA, MW, etc.) |
| Toxicophores | 1,868 | SMARTS-based toxicity structural alerts |
| Structural Filters | 815 | PAINS, BRENK, NIH, ZINC filter alerts |
| Target Similarity | 127 | Tanimoto similarity to known receptor ligands |
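
As an illustration of the Target Similarity features, the sketch below computes Tanimoto similarity between two fingerprints represented as sets of on-bit indices, and checks that the listed feature blocks sum to the stated total. The set-based representation and the names `tanimoto` and `FEATURE_DIMS` are assumptions for this example; the actual pipeline presumably works on RDKit fingerprint objects.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Feature block sizes as listed for this model.
FEATURE_DIMS = {
    "ECFP6": 8192,
    "MACCS Keys": 167,
    "RDKit Descriptors": 208,
    "Toxicophores": 1868,
    "Structural Filters": 815,
    "Target Similarity": 127,
}
TOTAL_DIMS = sum(FEATURE_DIMS.values())  # 11377
```

For example, `tanimoto({1, 2, 3}, {2, 3, 4})` shares two bits out of four distinct ones and returns `0.5`.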

Training Details

  • Loss Function: Focal Loss (γ=2.5, α=0.25) for class imbalance
  • Regularization: Label smoothing (0.1), Mixup augmentation (α=0.2)
  • Feature Selection: Variance-based selection per fold (ECFP, toxicophores)
  • Normalization: SquashScaler (StandardScaler → tanh → StandardScaler)
  • Ensemble Selection: Top-10 folds from 40-fold stratified CV
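
Two of the steps above can be sketched in plain Python. The block shows a binary focal loss with the listed γ and α, and a SquashScaler-style transform (standardize, tanh, standardize) applied to one feature column. This is a hedged reading of the bullet points, not the project's training code: the function names are hypothetical, the loss uses hard labels without label smoothing or Mixup, and the scaler is fitted and applied on the same data for brevity.

```python
import math
from statistics import fmean, pstdev


def focal_loss(p: float, y: int, gamma: float = 2.5, alpha: float = 0.25) -> float:
    """Binary focal loss for one predicted probability p and hard label y."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp so log() stays finite
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class weighting term
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def standardize(values: list[float]) -> list[float]:
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values) or 1.0  # leave constant columns at zero
    return [(v - mean) / std for v in values]


def squash_scale(values: list[float]) -> list[float]:
    """StandardScaler -> tanh -> StandardScaler on a single feature column."""
    squashed = [math.tanh(z) for z in standardize(values)]
    return standardize(squashed)
```

The tanh step bounds the influence of outliers before the second standardization, so a column such as `[1.0, 2.0, 3.0, 100.0]` still ends up with zero mean and unit variance without one value dominating the scale.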

Tox21 Endpoints

Nuclear Receptor Panel

| Endpoint | Target | Biological Significance |
|---|---|---|
| NR-AR | Androgen Receptor | Male reproductive toxicity |
| NR-AR-LBD | AR Ligand Binding Domain | Direct AR modulation |
| NR-AhR | Aryl Hydrocarbon Receptor | Dioxin-like toxicity, carcinogenesis |
| NR-Aromatase | CYP19A1 Enzyme | Estrogen synthesis disruption |
| NR-ER | Estrogen Receptor | Endocrine disruption |
| NR-ER-LBD | ER Ligand Binding Domain | Direct ER modulation |
| NR-PPAR-gamma | PPARγ | Metabolic disruption |

Stress Response Panel

| Endpoint | Target | Biological Significance |
|---|---|---|
| SR-ARE | Antioxidant Response Element | Oxidative stress |
| SR-ATAD5 | ATAD5 | DNA damage response |
| SR-HSE | Heat Shock Element | Protein folding stress |
| SR-MMP | Mitochondrial Membrane Potential | Mitochondrial toxicity |
| SR-p53 | Tumor Protein p53 | Genotoxicity |

API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /metadata | GET | Model configuration and capabilities |
| /predict | POST | Toxicity predictions for SMILES |
| /health | GET | Health check |
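
The two GET endpoints can be queried with the standard library alone. The sketch below assumes both return a JSON body, which this README does not document; `endpoint_url` and `get_json` are hypothetical helper names.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://rasayan-labs-rasayan-tox21.hf.space"


def endpoint_url(path: str) -> str:
    """Join the Space base URL with an endpoint path such as '/health'."""
    return f"{BASE_URL}/{path.lstrip('/')}"


def get_json(path: str, timeout: float = 30.0) -> dict:
    """GET an endpoint and decode the body, assuming it is a JSON object."""
    request = Request(endpoint_url(path), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `get_json("/health")` before sending a batch to `/predict` is one way to confirm the Space is up; `get_json("/metadata")` would return whatever configuration the service exposes.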

Usage

Python

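import requests

# Score two molecules (paracetamol and benzene) against all 12 endpoints.
response = requests.post(
    "https://rasayan-labs-rasayan-tox21.hf.space/predict",
    json={"smiles": ["CC(=O)Nc1ccc(O)cc1", "c1ccccc1"]},
    timeout=60,  # avoid hanging indefinitely if the Space is unresponsive
)
response.raise_for_status()  # fail loudly on HTTP errors

# "predictions" maps each input SMILES to a dict of endpoint -> probability.
predictions = response.json()["predictions"]
for smiles, scores in predictions.items():
    print(f"{smiles}:")
    # Show the three highest-scoring endpoints.
    for target, prob in sorted(scores.items(), key=lambda x: -x[1])[:3]:
        print(f"  {target}: {prob:.1%}")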

cURL

curl -X POST "https://rasayan-labs-rasayan-tox21.hf.space/predict" \
  -H "Content-Type: application/json" \
  -d '{"smiles": ["CCO", "c1ccccc1"]}'

Response Format

{
  "predictions": {
    "CCO": {
      "NR-AR": 0.041,
      "NR-AR-LBD": 0.040,
      "NR-AhR": 0.049,
      "NR-Aromatase": 0.078,
      "NR-ER": 0.133,
      "NR-ER-LBD": 0.076,
      "NR-PPAR-gamma": 0.058,
      "SR-ARE": 0.100,
      "SR-ATAD5": 0.038,
      "SR-HSE": 0.066,
      "SR-MMP": 0.082,
      "SR-p53": 0.052
    }
  },
  "model_info": {
    "name": "Rasayan Tox21 SNN Ensemble",
    "version": "1.0.0"
  }
}
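
A client may want to check that a response covers all 12 endpoints and to group the scores by assay panel. The sketch below does this for one molecule's score dict, using the endpoint names from this README; the helper names `missing_endpoints` and `split_by_panel` are hypothetical, and grouping by the `NR-`/`SR-` prefix is a convenience chosen for this example.

```python
TOX21_ENDPOINTS = (
    "NR-AR", "NR-AR-LBD", "NR-AhR", "NR-Aromatase", "NR-ER", "NR-ER-LBD",
    "NR-PPAR-gamma", "SR-ARE", "SR-ATAD5", "SR-HSE", "SR-MMP", "SR-p53",
)


def missing_endpoints(scores: dict[str, float]) -> list[str]:
    """Return the expected endpoints that are absent from a score dict."""
    return [name for name in TOX21_ENDPOINTS if name not in scores]


def split_by_panel(scores: dict[str, float]) -> dict[str, dict[str, float]]:
    """Group scores into nuclear receptor (NR) and stress response (SR) panels."""
    panels: dict[str, dict[str, float]] = {"NR": {}, "SR": {}}
    for name, prob in scores.items():
        prefix = name.split("-", 1)[0]
        if prefix in panels:
            panels[prefix][name] = prob
    return panels


# Scores for "CCO" copied from the sample response.
cco_scores = {
    "NR-AR": 0.041, "NR-AR-LBD": 0.040, "NR-AhR": 0.049, "NR-Aromatase": 0.078,
    "NR-ER": 0.133, "NR-ER-LBD": 0.076, "NR-PPAR-gamma": 0.058, "SR-ARE": 0.100,
    "SR-ATAD5": 0.038, "SR-HSE": 0.066, "SR-MMP": 0.082, "SR-p53": 0.052,
}
panels = split_by_panel(cco_scores)
```

For the sample response, `missing_endpoints(cco_scores)` is empty, and `panels` holds 7 nuclear receptor scores and 5 stress response scores.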

Interpretation Guide

| Probability | Risk Level | Recommendation |
|---|---|---|
| < 0.2 | Minimal | Unlikely to be active |
| 0.2 to < 0.4 | Low | Monitor for chronic exposure |
| 0.4 to < 0.7 | Moderate | Further investigation warranted |
| ≥ 0.7 | High | Strong toxicity signal |
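
The guide above can be applied programmatically with a small lookup. This sketch assumes each band includes its lower bound and excludes its upper bound, so 0.2 counts as Low and 0.4 as Moderate; `risk_level` is a hypothetical helper name.

```python
def risk_level(probability: float) -> str:
    """Map a predicted probability to the risk level from the guide."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability < 0.2:
        return "Minimal"
    if probability < 0.4:
        return "Low"
    if probability < 0.7:
        return "Moderate"
    return "High"
```

With the sample response, the highest score for `CCO` (NR-ER at 0.133) maps to `"Minimal"`.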

License

Apache 2.0


Built by Rasayan Labs