---
title: Rasayan Tox21 Classifier
emoji: ☠️
colorFrom: red
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: SNN ensemble for Tox21 toxicity prediction
tags:
  - toxicity
  - tox21
  - drug-discovery
  - chemistry
  - snn
  - molecular-property-prediction
models:
  - rasayan-labs/rasayan-tox21-snn
---

# Rasayan Tox21 Classifier

A production-ready **Self-Normalizing Neural Network (SNN) ensemble** for predicting molecular toxicity across the 12 Tox21 Challenge endpoints. Built for the [ml-jku Tox21 Leaderboard](https://huggingface.co/spaces/ml-jku/tox21_leaderboard).

## Model Overview

| Property | Value |
|----------|-------|
| **Architecture** | Ensemble of 10 SNNs (top 10 folds of 40-fold CV) |
| **Parameters** | ~19M total |
| **Hidden Layers** | 8 layers × 768 units |
| **Activation** | SELU + AlphaDropout |
| **Training** | 300 epochs, 40-fold CV |
| **CV AUC** | 0.882 ± 0.021 |

## Molecular Features (11,377 total)

| Feature Type | Dimensions | Description |
|--------------|------------|-------------|
| **ECFP6** | 8,192 | Extended-connectivity fingerprints (radius 3) |
| **MACCS Keys** | 167 | Structural keys for substructure screening |
| **RDKit Descriptors** | 208 | Physicochemical properties (LogP, TPSA, MW, etc.) |
| **Toxicophores** | 1,868 | SMARTS-based toxicity structural alerts |
| **Structural Filters** | 815 | PAINS, BRENK, NIH, ZINC filter alerts |
| **Target Similarity** | 127 | Tanimoto similarity to known receptor ligands |

## Training Details

- **Loss Function**: Focal Loss (γ=2.5, α=0.25) for class imbalance
- **Regularization**: Label smoothing (0.1), Mixup augmentation (α=0.2)
- **Feature Selection**: Variance-based selection per fold (ECFP, toxicophores)
- **Normalization**: SquashScaler (StandardScaler → tanh → StandardScaler)
- **Ensemble Selection**: Top-10 folds from 40-fold stratified CV

## Tox21 Endpoints

### Nuclear Receptor Panel

| Endpoint | Target | Biological Significance |
|----------|--------|------------------------|
| **NR-AR** | Androgen Receptor | Male reproductive toxicity |
| **NR-AR-LBD** | AR Ligand Binding Domain | Direct AR modulation |
| **NR-AhR** | Aryl Hydrocarbon Receptor | Dioxin-like toxicity, carcinogenesis |
| **NR-Aromatase** | CYP19A1 Enzyme | Estrogen synthesis disruption |
| **NR-ER** | Estrogen Receptor | Endocrine disruption |
| **NR-ER-LBD** | ER Ligand Binding Domain | Direct ER modulation |
| **NR-PPAR-gamma** | PPARγ | Metabolic disruption |

### Stress Response Panel

| Endpoint | Target | Biological Significance |
|----------|--------|------------------------|
| **SR-ARE** | Antioxidant Response Element | Oxidative stress |
| **SR-ATAD5** | ATAD5 | DNA damage response |
| **SR-HSE** | Heat Shock Element | Protein folding stress |
| **SR-MMP** | Mitochondrial Membrane Potential | Mitochondrial toxicity |
| **SR-p53** | Tumor Protein p53 | Genotoxicity |

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/metadata` | GET | Model configuration and capabilities |
| `/predict` | POST | Toxicity predictions for SMILES |
| `/health` | GET | Health check |

## Usage

### Python

```python
import requests

# Submit a batch of SMILES strings to the hosted /predict endpoint
response = requests.post(
    "https://rasayan-labs-rasayan-tox21.hf.space/predict",
    json={"smiles": ["CC(=O)Nc1ccc(O)cc1", "c1ccccc1"]},
    timeout=60,
)
response.raise_for_status()  # fail loudly on HTTP errors

predictions = response.json()["predictions"]

# Print the three highest-scoring endpoints for each molecule
for smiles, scores in predictions.items():
    print(f"{smiles}:")
    for target, prob in sorted(scores.items(), key=lambda x: -x[1])[:3]:
        print(f"  {target}: {prob:.1%}")
```

### cURL

```bash
curl -X POST "https://rasayan-labs-rasayan-tox21.hf.space/predict" \
  -H "Content-Type: application/json" \
  -d '{"smiles": ["CCO", "c1ccccc1"]}'
```

## Response Format

```json
{
  "predictions": {
    "CCO": {
      "NR-AR": 0.041,
      "NR-AR-LBD": 0.040,
      "NR-AhR": 0.049,
      "NR-Aromatase": 0.078,
      "NR-ER": 0.133,
      "NR-ER-LBD": 0.076,
      "NR-PPAR-gamma": 0.058,
      "SR-ARE": 0.100,
      "SR-ATAD5": 0.038,
      "SR-HSE": 0.066,
      "SR-MMP": 0.082,
      "SR-p53": 0.052
    }
  },
  "model_info": {
    "name": "Rasayan Tox21 SNN Ensemble",
    "version": "1.0.0"
  }
}
```

## Interpretation Guide

| Probability | Risk Level | Recommendation |
|-------------|------------|----------------|
| < 0.2 | Minimal | Unlikely to be active |
| 0.2 - 0.4 | Low | Monitor for chronic exposure |
| 0.4 - 0.7 | Moderate | Further investigation warranted |
| ≥ 0.7 | High | Strong toxicity signal |

## References

- **Tox21 Challenge**: [NIH Tox21 Data Challenge](https://tripod.nih.gov/tox21/challenge/)
- **SNN Architecture**: [Klambauer et al., 2017](https://arxiv.org/abs/1706.02515)
- **Leaderboard**: [ml-jku Tox21 Leaderboard](https://huggingface.co/spaces/ml-jku/tox21_leaderboard)

## License

Apache 2.0

---
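## Example: Applying the Interpretation Guide

The thresholds in the Interpretation Guide can be applied client-side to the scores returned by `/predict`. The following is a minimal sketch, not part of the service itself: the names `RISK_BANDS`, `risk_level`, and `summarize_risk` are hypothetical, and it assumes each band includes its lower bound (so exactly 0.4 counts as Moderate), which the table does not state explicitly.

```python
from typing import Dict, List, Tuple

# (exclusive upper bound, label) pairs taken from the Interpretation Guide.
# Assumption: each band includes its lower bound and excludes its upper bound.
RISK_BANDS: List[Tuple[float, str]] = [
    (0.2, "Minimal"),
    (0.4, "Low"),
    (0.7, "Moderate"),
]


def risk_level(prob: float) -> str:
    """Map a predicted probability in [0, 1] to a risk label."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"probability out of range: {prob}")
    for upper, label in RISK_BANDS:
        if prob < upper:
            return label
    return "High"  # anything >= 0.7


def summarize_risk(scores: Dict[str, float]) -> Dict[str, str]:
    """Label every endpoint score for a single molecule."""
    return {endpoint: risk_level(prob) for endpoint, prob in scores.items()}


if __name__ == "__main__":
    # Two scores copied from the sample CCO response in this README
    print(summarize_risk({"NR-ER": 0.133, "SR-ARE": 0.100}))
```

Both sample scores fall below 0.2, so each is labeled `Minimal`. Keeping the bands in a single list makes it easy to adjust the cut-offs if a project uses different risk thresholds.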

Built by Rasayan Labs