Spaces: HarriziSaad/Toxicity_2

Harrizi Saad committed on Nov 26, 2025

Commit 01aa9f2 (verified) · 1 Parent(s): f82e005

Upload app.py
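
As a reading aid for the `predict_toxicity` hunk in this diff: the overall score is the maximum of the three endpoint probabilities, and the risk tier is chosen by thresholding that score. The diff shows the LOW cutoff (0.35) but elides the boundary between the middle tier and HIGH, so the sketch below takes it as a required, hypothetical parameter `high_cutoff`. The name `risk_tier` and the choice of strict `<` comparisons for the middle tier are also assumptions, not code from app.py.

```python
def risk_tier(prob_are: float, prob_mmp: float, prob_p53: float,
              high_cutoff: float, low_cutoff: float = 0.35) -> tuple[float, str]:
    """Return (overall probability, risk tier) for three endpoint probabilities.

    low_cutoff=0.35 is taken from the diff; high_cutoff is NOT shown there
    and must be supplied by the caller (hypothetical parameter).
    """
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    # Worst-case aggregation, as in the diff: the largest endpoint probability wins.
    overall = max(prob_are, prob_mmp, prob_p53)
    if overall < low_cutoff:
        tier = "LOW"
    elif overall < high_cutoff:
        tier = "MODERATE"  # assumed middle tier; its exact condition is elided in the diff
    else:
        tier = "HIGH"
    return overall, tier


# Example with an illustrative cutoff of 0.65 (not a value from app.py).
print(risk_tier(0.10, 0.20, 0.05, high_cutoff=0.65))
```

Taking the maximum means a single high-scoring endpoint is enough to raise the overall tier, which is a conservative choice for a screening tool.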

Files changed (1)

app.py +395 -639
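
The `compute_features` hunks in this file build one model input row by walking `feature_names`: names starting with `Morgan_` are read from the fingerprint bit array, and every other name is looked up in the descriptor dictionary with a default of 0. The line that derives `bit_idx` is elided in the diff, so parsing it as the integer after the underscore is an assumption. A minimal sketch of that assembly step, with `assemble_features` as an illustrative name and plain lists standing in for the RDKit fingerprint and the NumPy array:

```python
def assemble_features(feature_names, descriptors, fingerprint_bits):
    """Build a single feature row in the order given by feature_names."""
    row = []
    for fname in feature_names:
        if fname.startswith("Morgan_"):
            # Assumption: names look like "Morgan_17"; the diff elides this step.
            bit_idx = int(fname.split("_", 1)[1])
            row.append(fingerprint_bits[bit_idx])
        else:
            # Names missing from the descriptor dict fall back to 0, as in the diff.
            row.append(descriptors.get(fname, 0))
    return row


names = ["MW", "LogP", "Morgan_0", "Morgan_3", "NotComputed"]
descriptors = {"MW": 180.16, "LogP": 1.31}  # illustrative values only
bits = [1, 0, 0, 1]                         # stand-in for a 2048-bit fingerprint
print(assemble_features(names, descriptors, bits))
```

One consequence of the default-to-0 lookup is that a feature name the models were trained on but the app never computes is filled silently, so a mismatch between training and inference features would not raise an error.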

@@ -14,19 +14,19 @@ st.set_page_config(
     initial_sidebar_state="collapsed"
 )
-# Beautiful Professional CSS with animations
 st.markdown("""
 <style>
     @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;600;700;800&display=swap');
     * {
         font-family: 'Inter', sans-serif;
     }
     .main {
         background: linear-gradient(180deg, #0a0e27 0%, #1a1f3a 50%, #0a0e27 100%);
     }
     .main-header {
         font-size: 4rem;
         font-weight: 900;
@@ -37,21 +37,71 @@ st.markdown("""
         margin-bottom: 0.5rem;
         animation: glow 3s ease-in-out infinite alternate;
     }
     @keyframes glow {
         from { filter: drop-shadow(0 0 20px rgba(102, 126, 234, 0.3)); }
         to { filter: drop-shadow(0 0 40px rgba(118, 75, 162, 0.6)); }
     }
     .sub-header {
         font-size: 1.4rem;
         text-align: center;
         color: #a0a6b8;
         margin-bottom: 3rem;
         font-weight: 300;
-        letter-spacing: 0.5px;
     }
     .risk-card-low {
         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
         padding: 3rem;
@@ -61,12 +111,12 @@ st.markdown("""
         box-shadow: 0 20px 60px rgba(102, 126, 234, 0.5);
         animation: pulse-low 2s ease-in-out infinite;
     }
     @keyframes pulse-low {
         0%, 100% { transform: scale(1); }
-        50% { transform: scale(1.02); }
     }
     .risk-card-moderate {
         background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%);
         padding: 3rem;
@@ -76,12 +126,12 @@ st.markdown("""
         box-shadow: 0 20px 60px rgba(240, 147, 251, 0.5);
         animation: pulse-moderate 1.5s ease-in-out infinite;
     }
     @keyframes pulse-moderate {
         0%, 100% { transform: scale(1); }
         50% { transform: scale(1.03); }
     }
     .risk-card-high {
         background: linear-gradient(135deg, #ff0844 0%, #ffb199 100%);
         padding: 3rem;
@@ -91,12 +141,12 @@ st.markdown("""
         box-shadow: 0 20px 60px rgba(255, 8, 68, 0.6);
         animation: pulse-high 1s ease-in-out infinite;
     }
     @keyframes pulse-high {
         0%, 100% { transform: scale(1); }
         50% { transform: scale(1.05); }
     }
     .property-card {
         background: linear-gradient(135deg, rgba(102, 126, 234, 0.1) 0%, rgba(118, 75, 162, 0.1) 100%);
         border: 2px solid rgba(102, 126, 234, 0.3);
@@ -104,40 +154,14 @@ st.markdown("""
         border-radius: 20px;
         margin: 1rem 0;
         transition: all 0.3s ease;
-        position: relative;
-        overflow: hidden;
-    }
-    .property-card::before {
-        content: '';
-        position: absolute;
-        top: 0;
-        left: -100%;
-        width: 100%;
-        height: 100%;
-        background: linear-gradient(90deg, transparent, rgba(255, 255, 255, 0.1), transparent);
-        transition: left 0.5s;
-    }
-    .property-card:hover::before {
-        left: 100%;
     }
     .property-card:hover {
         transform: translateY(-5px);
         border-color: rgba(102, 126, 234, 0.6);
         box-shadow: 0 10px 30px rgba(102, 126, 234, 0.3);
     }
-    .property-label {
-        color: #a0a6b8;
-        font-size: 0.85rem;
-        font-weight: 600;
-        text-transform: uppercase;
-        letter-spacing: 1.5px;
-        margin-bottom: 0.5rem;
-    }
     .property-value {
         color: #ffffff;
         font-size: 2rem;
@@ -146,15 +170,7 @@ st.markdown("""
         -webkit-background-clip: text;
         -webkit-text-fill-color: transparent;
     }
-    .property-interpretation {
-        color: #8b92a8;
-        font-size: 0.9rem;
-        margin-top: 0.5rem;
-        font-style: italic;
-        line-height: 1.6;
-    }
     .mechanism-card {
         background: linear-gradient(135deg, rgba(102, 126, 234, 0.05) 0%, rgba(118, 75, 162, 0.05) 100%);
         border: 2px solid rgba(102, 126, 234, 0.2);
@@ -162,151 +178,208 @@ st.markdown("""
         border-radius: 20px;
         text-align: center;
         transition: all 0.4s ease;
-        position: relative;
     }
     .mechanism-card:hover {
         transform: translateY(-10px) scale(1.05);
         box-shadow: 0 20px 50px rgba(102, 126, 234, 0.4);
-        border-color: rgba(102, 126, 234, 0.8);
-    }
-    .mechanism-icon {
-        font-size: 3rem;
-        margin-bottom: 1rem;
-        filter: drop-shadow(0 0 10px rgba(102, 126, 234, 0.5));
-    }
-    .mechanism-title {
-        font-size: 1rem;
-        color: #a0a6b8;
-        font-weight: 600;
-        text-transform: uppercase;
-        letter-spacing: 1px;
-        margin-bottom: 1rem;
     }
     .mechanism-value {
         font-size: 3rem;
         font-weight: 900;
         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
         -webkit-background-clip: text;
         -webkit-text-fill-color: transparent;
-        margin-bottom: 0.5rem;
     }
-    .mechanism-bar {
-        width: 100%;
-        height: 8px;
-        background: rgba(255, 255, 255, 0.1);
-        border-radius: 10px;
-        overflow: hidden;
-        margin-top: 1rem;
-    }
-    .mechanism-bar-fill {
-        height: 100%;
-        background: linear-gradient(90deg, #667eea 0%, #764ba2 100%);
-        border-radius: 10px;
-        transition: width 1s ease;
-    }
-    .interpretation-box {
-        background: linear-gradient(135deg, rgba(102, 126, 234, 0.1) 0%, rgba(118, 75, 162, 0.1) 100%);
-        border-left: 5px solid #667eea;
-        padding: 2.5rem;
-        border-radius: 20px;
-        margin: 2rem 0;
-        color: #e8eaf0;
-        line-height: 2;
-        font-size: 1.05rem;
-        box-shadow: 0 10px 40px rgba(0, 0, 0, 0.3);
-    }
-    .interpretation-section {
-        margin: 1.5rem 0;
-        padding-left: 1.5rem;
-        border-left: 3px solid rgba(102, 126, 234, 0.3);
-    }
-    .interpretation-title {
-        color: #667eea;
-        font-size: 1.3rem;
-        font-weight: 700;
-        margin-bottom: 1rem;
-        display: flex;
-        align-items: center;
-    }
-    .interpretation-icon {
-        margin-right: 0.5rem;
-        font-size: 1.5rem;
-    }
     .structure-container {
         background: linear-gradient(135deg, rgba(102, 126, 234, 0.05) 0%, rgba(118, 75, 162, 0.05) 100%);
         border: 2px solid rgba(102, 126, 234, 0.3);
         padding: 2rem;
         border-radius: 25px;
         text-align: center;
-        box-shadow: 0 10px 40px rgba(0, 0, 0, 0.3);
     }
-    .cascade-box {
-        background: linear-gradient(135deg, rgba(255, 8, 68, 0.1) 0%, rgba(255, 177, 153, 0.1) 100%);
-        border: 2px solid rgba(255, 8, 68, 0.3);
-        padding: 2rem;
         border-radius: 20px;
         margin: 2rem 0;
-        color: #ffb199;
-    }
-    .cascade-step {
-        display: flex;
-        align-items: center;
-        margin: 1rem 0;
-        font-size: 1.1rem;
-        font-weight: 600;
-    }
-    .cascade-arrow {
-        color: #ff0844;
-        font-size: 2rem;
-        margin: 0 1rem;
-    }
-    .insight-badge {
-        display: inline-block;
-        background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
-        color: white;
-        padding: 0.5rem 1rem;
-        border-radius: 20px;
-        font-size: 0.9rem;
-        font-weight: 600;
-        margin: 0.5rem 0.5rem 0.5rem 0;
-        box-shadow: 0 4px 15px rgba(102, 126, 234, 0.4);
-    }
-    .warning-box {
-        background: linear-gradient(135deg, rgba(255, 177, 153, 0.15) 0%, rgba(255, 8, 68, 0.15) 100%);
-        border: 2px solid rgba(255, 8, 68, 0.5);
-        border-radius: 20px;
-        padding: 2rem;
-        margin: 1.5rem 0;
-        color: #ffb199;
-    }
-    .safe-box {
-        background: linear-gradient(135deg, rgba(102, 234, 170, 0.15) 0%, rgba(102, 126, 234, 0.15) 100%);
-        border: 2px solid rgba(102, 234, 170, 0.5);
-        border-radius: 20px;
-        padding: 2rem;
-        margin: 1.5rem 0;
-        color: #66eaaa;
     }
 </style>
 """, unsafe_allow_html=True)
 @st.cache_resource
 def load_models():
     """Load models"""
@@ -324,243 +397,13 @@ def load_models():
         st.error(f"Error: {str(e)}")
         return None
-def get_property_interpretation(prop_name, value, mol):
-    """Deep interpretation of each property"""
-    interpretations = {
-        'MW': {
-            'value': value,
-            'status': '✅ Optimal' if 250 <= value <= 400 else ('⚠️ High' if value > 400 else '⚠️ Low'),
-            'meaning': f"""
-                **What it means:** Molecular weight of {value:.1f} Da.
-                {
-                    'Perfect size for cellular uptake and distribution. Like a key fitting a lock.' if 250 <= value <= 400 else
-                    f'Large molecules (>400 Da) accumulate in cells like cargo ships too big for the harbor. They burden protein degradation systems, leading to cellular stress. Your compound is {value-400:.0f} Da above the safe threshold.' if value > 400 else
-                    'Very small molecules may lack specificity and be rapidly cleared.'
-                }
-            """
-        },
-        'LogP': {
-            'value': value,
-            'status': '✅ Optimal' if 0.5 <= value <= 2.5 else ('🔴 Very High' if value > 4 else ('⚠️ High' if value > 2.5 else '⚠️ Low')),
-            'meaning': f"""
-                **What it means:** LogP of {value:.2f} (lipophilicity = fat-loving tendency).
-                {
-                    'Perfect balance! Can cross membranes but won't get trapped. Like a passport that works everywhere.' if 0.5 <= value <= 2.5 else
-                    f'🚨 CRITICAL: LogP 4-6 is the "danger zone"! Your compound ({value:.2f}) will accumulate 3-5x in mitochondria compared to cytoplasm. Think of it like oil droplets concentrating in the engine - they disrupt the machinery. This causes membrane potential collapse and triggers the entire toxicity cascade.' if 4 <= value <= 6 else
-                    f'High lipophilicity ({value:.2f}) means your compound loves fat more than water. It will get trapped in membranes and mitochondria, unable to escape. This is why many drugs fail - they become "membrane prisoners".' if value > 2.5 else
-                    'Too water-loving. May have poor membrane permeability.'
-                }
-            """
-        },
-        'TPSA': {
-            'value': value,
-            'status': '✅ Optimal' if 50 <= value <= 90 else ('⚠️ Outside optimal' if 40 <= value <= 140 else '🔴 Problematic'),
-            'meaning': f"""
-                **What it means:** {value:.1f} Ų of polar surface area.
-                {
-                    'Goldilocks zone! Can cross membranes AND reach intracellular targets.' if 50 <= value <= 90 else
-                    f'TPSA of {value:.1f} is outside the sweet spot (50-90 Ų). ' + (
-                        'Too polar - struggles to cross lipid membranes. Like trying to push a water balloon through oil.' if value > 140 else
-                        f'In the accessible range (40-140 Ų) where compounds can reach cytoplasmic stress pathways. This is why we see ARE activation at this TPSA.' if 40 <= value <= 140 else
-                        'Very low polarity - will partition heavily into membranes, potentially disrupting them.'
-                    )
-                }
-            """
-        },
-        'AromaticRings': {
-            'value': int(value),
-            'status': '✅ Safe' if value <= 2 else ('⚠️ Moderate' if value == 3 else '🔴 High Risk'),
-            'meaning': f"""
-                **What it means:** {int(value)} aromatic ring(s) - flat, electron-rich structures.
-                {
-                    'Low aromatic content = lower toxicity risk. Aromatic rings are like flat plates that can slide between biological membranes.' if value <= 2 else
-                    f'🚨 WARNING: {int(value)} aromatic rings! Our data shows ≥3 rings cause 3.5x higher mitochondrial toxicity. Why? Aromatic systems can π-stack (like plates stacking) with membrane lipids and intercalate into DNA. They physically disrupt the delicate architecture of mitochondrial cristae where ATP is made. Think of it like putting cardboard sheets between the pages of a book - it disrupts the structure.' if value >= 3 else
-                    'Moderate aromatic content. Monitor for membrane interactions.'
-                }
-            """
-        }
-    }
-    return interpretations.get(prop_name, {'value': value, 'status': 'N/A', 'meaning': ''})
-def generate_deep_interpretation(result, props):
-    """Generate deep, mechanistic interpretation"""
-    overall = result['overall_toxicity']
-    prob_are = result['oxidative_stress']['probability']
-    prob_mmp = result['mitochondrial_dysfunction']['probability']
-    prob_p53 = result['dna_damage']['probability']
-    # Determine primary mechanism
-    mechanisms = [
-        ('Mitochondrial Dysfunction', prob_mmp, '⚡'),
-        ('Oxidative Stress', prob_are, '🔥'),
-        ('DNA Damage', prob_p53, '🧬')
-    ]
-    mechanisms.sort(key=lambda x: x[1], reverse=True)
-    primary_mech = mechanisms[0]
-    if overall['risk_level'] == 'LOW':
-        return f"""
-        <div class="safe-box">
-            <div class="interpretation-title">
-                <span class="interpretation-icon">✅</span>
-                Safe Chemical Space
-            </div>
-            <p><strong>Your compound sits in the safe zone.</strong> It avoids the major toxicity triggers we identified in 11,306 compounds.</p>
-            <div class="interpretation-section">
-                <strong>Why it's safe:</strong>
-                <ul>
-                    <li>{'✅ Optimal molecular weight (250-400 Da) - perfect for cellular handling' if 250 <= props['MW'] <= 400 else '✅ Manageable size'}</li>
-                    <li>{'✅ Balanced lipophilicity (LogP 0.5-2.5) - can travel without getting trapped' if 0.5 <= props['LogP'] <= 2.5 else '✅ Acceptable lipophilicity'}</li>
-                    <li>{'✅ Low aromatic content (≤2 rings) - won't disrupt membranes' if props['AromaticRings'] <= 2 else '✅ Moderate aromatic content'}</li>
-                </ul>
-            </div>
-            <p><strong>Next steps:</strong> This computational prediction is promising, but remember - even "safe" compounds need experimental validation. Test in relevant cell lines, check for off-target effects, and validate the therapeutic window.</p>
-        </div>
-        """
-    elif overall['risk_level'] == 'MODERATE':
-        return f"""
-        <div class="interpretation-box">
-            <div class="interpretation-title">
-                <span class="interpretation-icon">⚠️</span>
-                Moderate Concerns: Mechanistic Analysis
-            </div>
-            <p><strong>Primary concern: {primary_mech[0]} ({primary_mech[1]:.0%})</strong></p>
-            <div class="interpretation-section">
-                <strong>🔬 What's happening at the molecular level:</strong>
-                <br><br>
-                {
-                    f"Your compound's LogP of {props['LogP']:.2f} suggests moderate membrane partitioning. While not in the critical 4-6 range, it may still accumulate in lipid-rich compartments over time. Combined with {int(props['AromaticRings'])} aromatic rings, there's potential for membrane perturbation." if prob_mmp > prob_are and prob_mmp > prob_p53 else
-                    f"Oxidative stress activation ({prob_are:.0%}) suggests your compound can generate or is susceptible to ROS. With {int(props['Heteroatoms'])} heteroatoms, there may be redox-active centers that cycle between oxidized/reduced states, producing superoxide as a byproduct." if prob_are > prob_mmp and prob_are > prob_p53 else
-                    f"DNA damage signaling ({prob_p53:.0%}) with {int(props['RotatableBonds'])} rotatable bonds suggests potential for DNA intercalation or formation of reactive intermediates that alkylate DNA bases."
-                }
-            </div>
-            <div class="interpretation-section">
-                <strong>💡 Medicinal chemistry strategies:</strong>
-                <ul>
-                    {f'<li>Reduce LogP: Add polar groups (OH, NH2, carboxylic acid) to decrease membrane accumulation</li>' if props['LogP'] > 3 else ''}
-                    {f'<li>Reduce aromatic content: Replace one aromatic ring with a saturated heterocycle (piperidine, tetrahydropyran)</li>' if props['AromaticRings'] >= 3 else ''}
-                    {f'<li>Reduce molecular weight: Remove non-essential substituents. Each 50 Da reduction decreases toxicity risk</li>' if props['MW'] > 400 else ''}
-                    {f'<li>Add rigidity: Reduce rotatable bonds with cyclic constraints to prevent DNA intercalation</li>' if props['RotatableBonds'] > 7 else ''}
-                    <li>Consider prodrug strategy: Mask toxic features until metabolic activation at target site</li>
-                </ul>
-            </div>
-            <p><strong>Clinical perspective:</strong> Moderate-risk compounds can sometimes be developed successfully if the therapeutic index is favorable. Focus on: (1) Identifying the therapeutic window, (2) Optimizing PK to minimize tissue accumulation, (3) Considering alternate dosing regimens.</p>
-        </div>
-        """
-    else:  # HIGH RISK
-        # Identify the cascade
-        cascade_active = []
-        if prob_mmp > 0.6:
-            cascade_active.append('Mitochondrial Damage')
-        if prob_are > 0.6:
-            cascade_active.append('Oxidative Stress')
-        if prob_p53 > 0.6:
-            cascade_active.append('DNA Damage')
-        is_full_cascade = len(cascade_active) >= 2
-        # Build the cascade content separately
-        cascade_html = ""
-        if is_full_cascade:
-            cascade_html = f"""
-                <div class="cascade-box">
-                    <p><strong>⚠️ FULL TOXICITY CASCADE DETECTED</strong></p>
-                    <p>Your compound triggers multiple mechanisms in sequence, exactly as our model predicted:</p>
-                    <div class="cascade-step">
-                        <span>Lipophilic Compound (LogP: {props["LogP"]:.2f})</span>
-                        <span class="cascade-arrow">→</span>
-                        <span>Mitochondrial Accumulation</span>
-                    </div>
-                    <div class="cascade-step">
-                        <span class="cascade-arrow">→</span>
-                        <span>Membrane Disruption (MMP: {prob_mmp:.0%})</span>
-                    </div>
-                    <div class="cascade-step">
-                        <span class="cascade-arrow">→</span>
-                        <span>ROS Production</span>
-                    </div>
-                    <div class="cascade-step">
-                        <span class="cascade-arrow">→</span>
-                        <span>Oxidative Stress (ARE: {prob_are:.0%})</span>
-                    </div>
-                    <div class="cascade-step">
-                        <span class="cascade-arrow">→</span>
-                        <span>DNA Damage (p53: {prob_p53:.0%})</span>
-                    </div>
-                    <p>This is the signature of compounds that cause systemic cellular failure.</p>
-                </div>
-            """
-        else:
-            cascade_html = f"<p><strong>Primary mechanism: {primary_mech[0]} at {primary_mech[1]:.0%}</strong></p>"
-        return f"""
-        <div class="warning-box">
-            <div class="interpretation-title">
-                <span class="interpretation-icon">🔴</span>
-                High Toxicity Risk: The Complete Mechanistic Story
-            </div>
-            {cascade_html}
-            <div class="interpretation-section">
-                <strong>🔬 Root causes identified:</strong>
-                <ul>
-                    {f'<li><strong>Critical LogP ({props["LogP"]:.2f}):</strong> In the 4-6 "danger zone". This causes 3-5x mitochondrial accumulation. Your compound will concentrate in the powerhouse of the cell and disrupt electron transport. LogP 4-6 compounds show 10x higher toxicity in our dataset.</li>' if 4 <= props['LogP'] <= 6 else ''}
-                    {f'<li><strong>High LogP ({props["LogP"]:.2f}):</strong> Extremely lipophilic. Will partition heavily into membranes and organelles, unable to distribute normally.</li>' if props['LogP'] > 6 else ''}
-                    {f'<li><strong>Multiple aromatic rings ({int(props["AromaticRings"])}):</strong> Each ring is a flat, electron-rich plate. ≥3 rings can stack (π-π interactions) with membrane lipids and intercalate between DNA bases. Our data: 3.5x higher mitochondrial toxicity with ≥3 rings.</li>' if props['AromaticRings'] >= 3 else ''}
-                    {f'<li><strong>Large molecular weight ({props["MW"]:.0f} Da):</strong> Exceeds the 400 Da safety threshold by {props["MW"]-400:.0f} Da. Large molecules accumulate because cellular machinery can't efficiently process them. They burden proteasomes and autophagy systems.</li>' if props['MW'] > 400 else ''}
-                    {f'<li><strong>High flexibility ({int(props["RotatableBonds"])} rotatable bonds):</strong> Can adopt conformations that fit between DNA base pairs, causing intercalation and strand breaks.</li>' if props['RotatableBonds'] > 7 else ''}
-                    {f'<li><strong>Heteroatom-rich ({int(props["Heteroatoms"])} heteroatoms):</strong> N, O, S atoms can undergo redox cycling, generating superoxide (O2•−) and other ROS. This is metabolic activation of toxicity.</li>' if props['Heteroatoms'] > 7 else ''}
-                </ul>
-            </div>
-            <div class="interpretation-section">
-                <strong>⚠️ Why this matters clinically:</strong>
-                <p>Compounds with this profile typically:</p>
-                <ul>
-                    <li>Show hepatotoxicity in preclinical models (liver has high mitochondrial density)</li>
-                    <li>Cause cardiotoxicity (heart relies 100% on mitochondrial ATP)</li>
-                    <li>Trigger idiosyncratic drug reactions (immune system recognizes damaged cells)</li>
-                    <li>Fail Phase I/II trials due to dose-limiting toxicities</li>
-                    <li>May carry black box warnings if approved (e.g., mitochondrial toxins like linezolid)</li>
-                </ul>
-            </div>
-            <div class="interpretation-section">
-                <strong>🔧 Rescue strategies (if therapeutic target is compelling):</strong>
-                <ol>
-                    <li><strong>Dramatic LogP reduction:</strong> Target LogP <3. Add multiple polar groups. Consider zwitterions.</li>
-                    <li><strong>De-aromatization:</strong> Replace aromatic rings with saturated rings (cyclohexane, piperidine). Breaks π-stacking.</li>
-                    <li><strong>Size reduction:</strong> Remove ALL non-essential atoms. Target MW <400 Da. Use fragment-based approach.</li>
-                    <li><strong>Prodrug masking:</strong> Hide toxic features until enzymatic activation at target site. Converts systemic toxin into local therapeutic.</li>
-                    <li><strong>Targeted delivery:</strong> Nanoparticle, antibody-drug conjugate, or cell-penetrating peptide to restrict distribution.</li>
-                </ol>
-            </div>
-            <p><strong>Honest assessment:</strong> {
-                'This compound would likely fail preclinical safety studies. Unless the therapeutic target is unprecedented (e.g., treating a fatal disease with no alternatives), recommend exploring alternative chemical series.' if is_full_cascade else
-                'This compound shows significant safety concerns. Consider whether the therapeutic benefit could justify the risk, or if alternative approaches exist.'
-            }</p>
-        </div>
-        """
 def compute_features(smiles, feature_names):
     """Compute features"""
     try:
         mol = Chem.MolFromSmiles(smiles)
         if mol is None:
             return None, "Invalid SMILES", None
         features = {
             'MW': Descriptors.MolWt(mol),
             'LogP': Descriptors.MolLogP(mol),
@@ -586,10 +429,10 @@ def compute_features(smiles, feature_names):
             'NumAliphaticRings': Lipinski.NumAliphaticRings(mol),
             'FractionCsp3': Descriptors.FractionCsp3(mol) if hasattr(Descriptors, 'FractionCsp3') else 0.0,
         }
         fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
         fp_array = np.array(fp)
         feature_vector = []
         for fname in feature_names:
             if fname.startswith('Morgan_'):
@@ -597,30 +440,30 @@ def compute_features(smiles, feature_names):
                 feature_vector.append(fp_array[bit_idx])
             else:
                 feature_vector.append(features.get(fname, 0))
         return np.array(feature_vector).reshape(1, -1), None, features
     except Exception as e:
         return None, f"Error: {str(e)}", None
 def predict_toxicity(smiles, models):
     """Predict toxicity"""
     X, error, raw_features = compute_features(smiles, models['feature_names'])
     if error:
         return {'error': error}
     try:
         X_are = models['scaler_are'].transform(X)
         X_mmp = models['scaler_mmp'].transform(X)
         X_p53 = models['scaler_p53'].transform(X)
         prob_are = float(models['model_are'].predict_proba(X_are)[0, 1])
         prob_mmp = float(models['model_mmp'].predict_proba(X_mmp)[0, 1])
         prob_p53 = float(models['model_p53'].predict_proba(X_p53)[0, 1])
         overall_prob = max(prob_are, prob_mmp, prob_p53)
         if overall_prob < 0.35:
             risk = "LOW"
             prediction = "NON-TOXIC"
@@ -630,7 +473,7 @@ def predict_toxicity(smiles, models):
         else:
             risk = "HIGH"
             prediction = "TOXIC"
         return {
             'overall_toxicity': {
                 'prediction': prediction,
@@ -652,50 +495,50 @@ if models is None:
 # Header
 st.markdown('<p class="main-header">🧪 Multi-Endpoint Toxicity Predictor</p>', unsafe_allow_html=True)
-st.markdown('<p class="sub-header">Deep mechanistic insights into drug toxicity • Trained on 11,306 compounds</p>', unsafe_allow_html=True)
 # Tabs
-tab1, tab2, tab3 = st.tabs(["🔮 Predict", "🔬 Science", "📖 About"])
 with tab1:
-    st.markdown("### Enter SMILES")
     smiles = st.text_input(
         "SMILES:",
         placeholder="e.g., CC(=O)Oc1ccccc1C(=O)O",
         label_visibility="collapsed"
     )
     st.markdown("**Examples:**")
     examples = {
         "Aspirin (Safe)": "CC(=O)Oc1ccccc1C(=O)O",
-        "Caffeine (Safe)": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
         "Doxorubicin (Toxic)": "COc1cccc2c1C(=O)c1c(O)c3c(c(O)c1C2=O)C[C@@](O)(C(=O)CO)C[C@@H]3O[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1",
-        "Tamoxifen (Toxic)": "CCC(=C(c1ccccc1)c1ccc(OCCN(C)C)cc1)c1ccccc1"
     }
     for name, smi in examples.items():
         st.code(smi, language=None)
         st.caption(name)
-    if st.button("🔮 Analyze Compound", type="primary", use_container_width=True):
         if not smiles:
             st.warning("⚠️ Please enter a SMILES string")
         else:
             mol = Chem.MolFromSmiles(smiles)
             if mol is None:
                 st.error("❌ Invalid SMILES")
             else:
-                with st.spinner("🔬 Deep analysis in progress..."):
                     result = predict_toxicity(smiles, models)
                     if 'error' in result:
                         st.error(f"❌ {result['error']}")
                     else:
                         st.markdown("---")
                         # Structure
                         st.markdown("### 🧬 Molecular Structure")
                         col_struct = st.columns([1, 2, 1])
@@ -704,234 +547,147 @@ with tab1:
                             img = Draw.MolToImage(mol, size=(500, 500))
                             st.image(img, use_column_width=True)
                             st.markdown('</div>', unsafe_allow_html=True)
                         st.markdown("---")
-                        # Overall prediction
                         overall = result['overall_toxicity']
-                        st.markdown("### 🎯 Overall Toxicity Assessment")
-                        col_risk = st.columns([1, 2, 1])
-                        with col_risk[1]:
-                            if overall['risk_level'] == 'LOW':
-                                st.markdown(f"""
-                                <div class="risk-card-low">
-                                    <div style="font-size: 4rem;">✅</div>
-                                    <div style="font-size: 2.5rem; font-weight: 900; margin: 1rem 0;">{overall['prediction']}</div>
-                                    <div style="font-size: 1.3rem; opacity: 0.9;">Risk Level: {overall['risk_level']}</div>
-                                    <div style="font-size: 3rem; font-weight: 900; margin-top: 1.5rem;">{overall['probability']:.1%}</div>
-                                    <div style="font-size: 1rem; opacity: 0.8; margin-top: 0.5rem;">Confidence Score</div>
-                                </div>
-                                """, unsafe_allow_html=True)
-                            elif overall['risk_level'] == 'MODERATE':
-                                st.markdown(f"""
-                                <div class="risk-card-moderate">
-                                    <div style="font-size: 4rem;">⚠️</div>
-                                    <div style="font-size: 2.5rem; font-weight: 900; margin: 1rem 0;">{overall['prediction']}</div>
-                                    <div style="font-size: 1.3rem; opacity: 0.9;">Risk Level: {overall['risk_level']}</div>
-                                    <div style="font-size: 3rem; font-weight: 900; margin-top: 1.5rem;">{overall['probability']:.1%}</div>
-                                    <div style="font-size: 1rem; opacity: 0.8; margin-top: 0.5rem;">Confidence Score</div>
-                                </div>
-                                """, unsafe_allow_html=True)
-                            else:
-                                st.markdown(f"""
-                                <div class="risk-card-high">
-                                    <div style="font-size: 4rem;">🔴</div>
-                                    <div style="font-size: 2.5rem; font-weight: 900; margin: 1rem 0;">{overall['prediction']}</div>
-                                    <div style="font-size: 1.3rem; opacity: 0.9;">Risk Level: {overall['risk_level']}</div>
-                                    <div style="font-size: 3rem; font-weight: 900; margin-top: 1.5rem;">{overall['probability']:.1%}</div>
-                                    <div style="font-size: 1rem; opacity: 0.8; margin-top: 0.5rem;">Confidence Score</div>
-                                </div>
-                                """, unsafe_allow_html=True)
-                        st.markdown("---")
-                        # Mechanism breakdown
-                        st.markdown("### 📊 Mechanism-Specific Analysis")
                         prob_are = result['oxidative_stress']['probability']
                         prob_mmp = result['mitochondrial_dysfunction']['probability']
                         prob_p53 = result['dna_damage']['probability']
-                        col1, col2, col3 = st.columns(3)
                         with col1:
-                            st.markdown(f"""
-                            <div class="mechanism-card">
-                                <div class="mechanism-icon">🔥</div>
-                                <div class="mechanism-title">Oxidative Stress</div>
-                                <div class="mechanism-value">{prob_are:.0%}</div>
-                                <div style="color: #a0a6b8; font-size: 0.9rem;">ARE/Nrf2 Activation</div>
-                                <div class="mechanism-bar">
-                                    <div class="mechanism-bar-fill" style="width: {prob_are*100}%"></div>
-                                </div>
-                            </div>
-                            """, unsafe_allow_html=True)
                         with col2:
-                            st.markdown(f"""
-                            <div class="mechanism-card">
-                                <div class="mechanism-icon">⚡</div>
-                                <div class="mechanism-title">Mitochondrial</div>
-                                <div class="mechanism-value">{prob_mmp:.0%}</div>
-                                <div style="color: #a0a6b8; font-size: 0.9rem;">Membrane Potential Loss</div>
-                                <div class="mechanism-bar">
-                                    <div class="mechanism-bar-fill" style="width: {prob_mmp*100}%"></div>
-                                </div>
-                            </div>
-                            """, unsafe_allow_html=True)
                         with col3:
-                            st.markdown(f"""
-                            <div class="mechanism-card">
-                                <div class="mechanism-icon">🧬</div>
-                                <div class="mechanism-title">DNA Damage</div>
-                                <div class="mechanism-value">{prob_p53:.0%}</div>
-                                <div style="color: #a0a6b8; font-size: 0.9rem;">p53 Pathway Activation</div>
-                                <div class="mechanism-bar">
-                                    <div class="mechanism-bar-fill" style="width: {prob_p53*100}%"></div>
-                                </div>
-                            </div>
-                            """, unsafe_allow_html=True)
                         st.markdown("---")
-                        # Properties with deep interpretation
-                        st.markdown("### 🔬 Molecular Property Analysis")
                         props = result['molecular_properties']
-                        key_props = ['MW', 'LogP', 'TPSA', 'AromaticRings']
-                        for prop in key_props:
-                            interp = get_property_interpretation(prop, props[prop], mol)
                             st.markdown(f"""
-                            <div class="property-card">
-                                <div class="property-label">{prop.replace('_', ' ').title()}</div>
-                                <div class="property-value">{interp['value'] if isinstance(interp['value'], int) else f"{interp['value']:.2f}"}</div>
-                                <div style="margin: 0.5rem 0;">
-                                    <span class="insight-badge">{interp['status']}</span>
                                 </div>
-                                <div class="property-interpretation">{interp['meaning']}</div>
-                            </div>
-                            """, unsafe_allow_html=True)
-                        # Deep interpretation
-                        st.markdown("### 💡 Deep Mechanistic Interpretation")
-                        st.markdown(generate_deep_interpretation(result, props), unsafe_allow_html=True)
 with tab2:
-    st.markdown("### 🔬 The Science Behind the Predictions")
-    st.markdown("""
-    ## The Toxicity Cascade
-    Our analysis of 11,306 compounds revealed that drug toxicity follows a predictable cascade through cellular compartments:
-    """)
-    st.markdown("""
-    <div class="cascade-box">
-        <div class="cascade-step">
-            <span>🧪 Lipophilic Compound (LogP 4-6)</span>
-            <span class="cascade-arrow">→</span>
-            <span>Passive diffusion into mitochondria</span>
-        </div>
-        <div class="cascade-step">
-            <span class="cascade-arrow">→</span>
-            <span>⚡ Membrane Disruption</span>
-            <span class="cascade-arrow">→</span>
-            <span>3-5x accumulation in cristae</span>
-        </div>
-        <div class="cascade-step">
-            <span class="cascade-arrow">→</span>
-            <span>💥 Electron Transport Chain Inhibition</span>
-            <span class="cascade-arrow">→</span>
-            <span>Electrons leak</span>
-        </div>
-        <div class="cascade-step">
-            <span class="cascade-arrow">→</span>
-            <span>🔥 Massive ROS Production</span>
-            <span class="cascade-arrow">→</span>
-            <span>Superoxide, hydrogen peroxide</span>
-        </div>
-        <div class="cascade-step">
-            <span class="cascade-arrow">→</span>
-            <span>📢 Oxidative Stress (ARE Activation)</span>
-            <span class="cascade-arrow">→</span>
-            <span>Nrf2 translocates to nucleus</span>
-        </div>
-        <div class="cascade-step">
-            <span class="cascade-arrow">→</span>
-            <span>🧬 DNA Damage (p53 Activation)</span>
-            <span class="cascade-arrow">→</span>
-            <span>ROS attacks DNA bases</span>
-        </div>
-        <div class="cascade-step">
-            <span class="cascade-arrow">→</span>
-            <span>💀 Cell Death</span>
-            <span class="cascade-arrow">→</span>
-            <span>Apoptosis or necrosis</span>
-        </div>
-    </div>
-    """, unsafe_allow_html=True)
-    st.markdown("""
-    ## Quantitative Evidence
-    <div class="interpretation-box">
-        <strong>Co-occurrence Analysis:</strong>
-        <ul>
-            <li><strong>57% of MMP+ compounds</strong> also show ARE activation (2.4x enrichment, p < 0.001)</li>
-            <li><strong>53% of MMP+ compounds</strong> also show p53 activation (3.0x enrichment, p < 0.001)</li>
-            <li>This is NOT random - it's a biological cascade</li>
-        </ul>
-        <strong>Chemical Property Thresholds:</strong>
-        <ul>
-            <li><strong>LogP 4-6:</strong> 10x higher overall toxicity vs LogP 0-2</li>
-            <li><strong>≥3 Aromatic Rings:</strong> 3.5x higher mitochondrial toxicity</li>
-            <li><strong>MW >400 Da:</strong> 2.4x higher DNA damage</li>
-            <li><strong>>7 Rotatable Bonds:</strong> 1.6x higher p53 activation</li>
-        </ul>
-    </div>
-    """, unsafe_allow_html=True)
-    st.markdown("""
-    ## Why This Matters
-    Traditional QSAR models treat toxicity as a black box. We don't.
-    **Our approach:**
-    1. ✅ **Mechanistic**: We know *why* compounds are toxic
-    2. ✅ **Actionable**: We can suggest *how* to fix them
-    3. ✅ **Validated**: 100% accuracy on known compounds
-    4. ✅ **Interpretable**: SHAP values explain every prediction
-    This isn't just a model - it's a mechanistic framework validated on over 11,000 compounds.
-    """)
-with tab3:
     st.markdown("""
     ### 📖 About This Tool
-    **Trained on:** 11,306 compounds from EPA ToxCast
-    **Performance:** ROC-AUC 0.82-0.93 across endpoints
-    **Validation:** 100% accuracy on known toxic/safe compounds
     **Endpoints:**
-    - 🔥 **Oxidative Stress (ARE/Nrf2)** - Cellular antioxidant response
-    - ⚡ **Mitochondrial Dysfunction (MMP)** - Energy production failure
-    - 🧬 **DNA Damage (p53)** - Genotoxic stress response
-    **⚠️ Disclaimer:** For research only. Not for regulatory submissions. Validate experimentally.
-    **Citation:**
-```
-    Multi-Endpoint Toxicity Predictor: A Mechanistic Framework
-    EPA ToxCast Database (2024)
-    https://huggingface.co/spaces/MlchaeI/Toxicity_2
-```
     """)
 st.markdown("---")
-st.markdown('<p style="text-align: center; color: #8b92a8; font-size: 1rem;">Built with deep mechanistic understanding | Research use only</p>', unsafe_allow_html=True)

     initial_sidebar_state="collapsed"
 )
+# [Keep all the beautiful CSS from before]
 st.markdown("""
 <style>
     @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;600;700;800&display=swap');
     * {
         font-family: 'Inter', sans-serif;
     }
     .main {
         background: linear-gradient(180deg, #0a0e27 0%, #1a1f3a 50%, #0a0e27 100%);
     }
     .main-header {
         font-size: 4rem;
         font-weight: 900;
         margin-bottom: 0.5rem;
         animation: glow 3s ease-in-out infinite alternate;
     }
     @keyframes glow {
         from { filter: drop-shadow(0 0 20px rgba(102, 126, 234, 0.3)); }
         to { filter: drop-shadow(0 0 40px rgba(118, 75, 162, 0.6)); }
     }
     .sub-header {
         font-size: 1.4rem;
         text-align: center;
         color: #a0a6b8;
         margin-bottom: 3rem;
         font-weight: 300;
     }
+    .hypothesis-box {
+        background: linear-gradient(135deg, rgba(102, 234, 170, 0.1) 0%, rgba(102, 126, 234, 0.1) 100%);
+        border: 2px solid rgba(102, 234, 170, 0.4);
+        border-radius: 20px;
+        padding: 2rem;
+        margin: 1.5rem 0;
+        color: #e8eaf0;
+    }
+    .hypothesis-title {
+        color: #66eaaa;
+        font-size: 1.5rem;
+        font-weight: 800;
+        margin-bottom: 1rem;
+        display: flex;
+        align-items: center;
+    }
+    .experiment-box {
+        background: linear-gradient(135deg, rgba(102, 126, 234, 0.1) 0%, rgba(240, 147, 251, 0.1) 100%);
+        border: 2px solid rgba(102, 126, 234, 0.3);
+        border-radius: 15px;
+        padding: 1.5rem;
+        margin: 1rem 0;
+        transition: all 0.3s ease;
+    }
+    .experiment-box:hover {
+        transform: translateX(10px);
+        border-color: rgba(102, 126, 234, 0.6);
+        box-shadow: 0 10px 30px rgba(102, 126, 234, 0.3);
+    }
+    .experiment-title {
+        color: #667eea;
+        font-size: 1.2rem;
+        font-weight: 700;
+        margin-bottom: 0.5rem;
+    }
+    .prediction-badge {
+        display: inline-block;
+        background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+        color: white;
+        padding: 0.4rem 1rem;
+        border-radius: 15px;
+        font-size: 0.85rem;
+        font-weight: 700;
+        margin: 0.5rem 0.5rem 0.5rem 0;
+    }
     .risk-card-low {
         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
         padding: 3rem;
         box-shadow: 0 20px 60px rgba(102, 126, 234, 0.5);
         animation: pulse-low 2s ease-in-out infinite;
     }
     @keyframes pulse-low {
         0%, 100% { transform: scale(1); }
+        50% { transform: scale(1.02); }
     }
     .risk-card-moderate {
         background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%);
         padding: 3rem;
         box-shadow: 0 20px 60px rgba(240, 147, 251, 0.5);
         animation: pulse-moderate 1.5s ease-in-out infinite;
     }
     @keyframes pulse-moderate {
         0%, 100% { transform: scale(1); }
         50% { transform: scale(1.03); }
     }
     .risk-card-high {
         background: linear-gradient(135deg, #ff0844 0%, #ffb199 100%);
         padding: 3rem;
         box-shadow: 0 20px 60px rgba(255, 8, 68, 0.6);
         animation: pulse-high 1s ease-in-out infinite;
     }
     @keyframes pulse-high {
         0%, 100% { transform: scale(1); }
         50% { transform: scale(1.05); }
     }
     .property-card {
         background: linear-gradient(135deg, rgba(102, 126, 234, 0.1) 0%, rgba(118, 75, 162, 0.1) 100%);
         border: 2px solid rgba(102, 126, 234, 0.3);
         border-radius: 20px;
         margin: 1rem 0;
         transition: all 0.3s ease;
     }
     .property-card:hover {
         transform: translateY(-5px);
         border-color: rgba(102, 126, 234, 0.6);
         box-shadow: 0 10px 30px rgba(102, 126, 234, 0.3);
     }
     .property-value {
         color: #ffffff;
         font-size: 2rem;
         -webkit-background-clip: text;
         -webkit-text-fill-color: transparent;
     }
     .mechanism-card {
         background: linear-gradient(135deg, rgba(102, 126, 234, 0.05) 0%, rgba(118, 75, 162, 0.05) 100%);
         border: 2px solid rgba(102, 126, 234, 0.2);
         border-radius: 20px;
         text-align: center;
         transition: all 0.4s ease;
     }
     .mechanism-card:hover {
         transform: translateY(-10px) scale(1.05);
         box-shadow: 0 20px 50px rgba(102, 126, 234, 0.4);
     }
     .mechanism-value {
         font-size: 3rem;
         font-weight: 900;
         background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
         -webkit-background-clip: text;
         -webkit-text-fill-color: transparent;
     }
     .structure-container {
         background: linear-gradient(135deg, rgba(102, 126, 234, 0.05) 0%, rgba(118, 75, 162, 0.05) 100%);
         border: 2px solid rgba(102, 126, 234, 0.3);
         padding: 2rem;
         border-radius: 25px;
         text-align: center;
     }
+    .interpretation-box {
+        background: linear-gradient(135deg, rgba(102, 126, 234, 0.1) 0%, rgba(118, 75, 162, 0.1) 100%);
+        border-left: 5px solid #667eea;
+        padding: 2.5rem;
         border-radius: 20px;
         margin: 2rem 0;
+        color: #e8eaf0;
+        line-height: 2;
     }
 </style>
 """, unsafe_allow_html=True)
+def generate_testable_hypotheses(result, props, mol):
+    """Generate specific, testable hypotheses with experimental protocols"""
+    prob_are = result['oxidative_stress']['probability']
+    prob_mmp = result['mitochondrial_dysfunction']['probability']
+    prob_p53 = result['dna_damage']['probability']
+    hypotheses = []
+    # HYPOTHESIS 1: Mitochondrial Accumulation (if high LogP)
+    if props['LogP'] > 3:
+        hypotheses.append({
+            'title': f"🎯 Hypothesis 1: Mitochondrial Accumulation via Lipophilicity",
+            'hypothesis': f"Your compound (LogP = {props['LogP']:.2f}) will accumulate preferentially in mitochondria at 3-5x higher concentration than cytoplasm due to lipophilic partitioning into the inner mitochondrial membrane.",
+            'rationale': f"Compounds with LogP {props['LogP']:.2f} partition heavily into lipid bilayers. Mitochondria have a highly negative membrane potential (-180 mV), creating an electrochemical gradient that traps lipophilic cations. This is the Nernst equation in action.",
+            'experiments': [
+                {
+                    'name': "MitoTracker Colocalization",
+                    'protocol': "• Treat HepG2 cells with 10 µM compound for 2h\n• Co-stain with MitoTracker Red (100 nM)\n• Use confocal microscopy to measure Pearson correlation\n• Compare to positive control (rhodamine 123)",
+                    'readout': "Pearson coefficient >0.7 = strong mitochondrial accumulation",
+                    'expected': f"Predict >0.75 correlation based on LogP {props['LogP']:.2f}",
+                    'timeline': "2-3 days",
+                    'cost': "$500-800"
+                },
+                {
+                    'name': "Mitochondrial Isolation + LC-MS",
+                    'protocol': "• Treat cells with compound (10 µM, 4h)\n• Isolate mitochondria via differential centrifugation\n• Extract compound from mitochondrial vs cytosolic fractions\n• Quantify by LC-MS/MS",
+                    'readout': "Mitochondrial/cytosolic concentration ratio",
+                    'expected': f"Predict {3 if props['LogP'] < 5 else 5}-fold enrichment in mitochondria",
+                    'timeline': "1 week",
+                    'cost': "$2,000-3,000"
+                }
+            ]
+        })
+    # HYPOTHESIS 2: Direct Membrane Disruption (if aromatic)
+    if props['AromaticRings'] >= 3:
+        hypotheses.append({
+            'title': f"🎯 Hypothesis 2: Direct Membrane Disruption via π-π Stacking",
+            'hypothesis': f"Your compound ({int(props['AromaticRings'])} aromatic rings) will directly disrupt mitochondrial membranes through π-π interactions with membrane lipids and respiratory chain complexes.",
+            'rationale': f"Aromatic rings are planar and electron-rich. With {int(props['AromaticRings'])} rings, your compound can intercalate between lipid acyl chains and stack with aromatic residues in Complex I and III. This is direct physical disruption, not just accumulation.",
+            'experiments': [
+                {
+                    'name': "Isolated Mitochondria MMP Assay",
+                    'protocol': "• Isolate mitochondria from rat liver\n• Measure membrane potential with TMRM fluorescence\n• Add compound directly to isolated mitochondria (no cells)\n• Monitor depolarization in real-time",
+                    'readout': "% depolarization vs vehicle control",
+                    'expected': f"Predict {30 if props['AromaticRings'] == 3 else 50}% depolarization at 10 µM within 30 min",
+                    'timeline': "3-4 days",
+                    'cost': "$800-1,200"
+                },
+                {
+                    'name': "Liposome Permeability Assay",
+                    'protocol': "• Prepare cardiolipin-enriched liposomes (mimics inner mitochondrial membrane)\n• Load with calcein dye\n• Add compound and measure dye leakage\n• Compare to non-aromatic control",
+                    'readout': "Calcein release rate",
+                    'expected': f"{int(props['AromaticRings'])}x faster than non-aromatic analog",
+                    'timeline': "2-3 days",
+                    'cost': "$400-600"
+                }
+            ]
+        })
+    # HYPOTHESIS 3: ROS Generation (if high ARE + high MMP)
+    if prob_are > 0.5 and prob_mmp > 0.5:
+        hypotheses.append({
+            'title': f"🎯 Hypothesis 3: ROS-Mediated Toxicity Cascade",
+            'hypothesis': f"Your compound will cause mitochondrial dysfunction (predicted {prob_mmp:.0%}), leading to ROS production that activates the ARE/Nrf2 pathway (predicted {prob_are:.0%}). ROS is the mechanistic link.",
+            'rationale': f"When electron transport is disrupted, electrons leak from Complex I and III, reducing O₂ to superoxide (O₂•⁻). This triggers Nrf2 nuclear translocation. Your high scores on both pathways suggest this cascade is active.",
+            'experiments': [
+                {
+                    'name': "ROS Temporal Analysis",
+                    'protocol': "• Treat HepG2 cells with compound (10 µM)\n• Measure ROS with MitoSOX (mitochondrial O₂•⁻) every 30 min for 4h\n• Simultaneously measure ARE-luciferase reporter\n• Test if ROS peaks BEFORE ARE activation",
+                    'readout': "Time course: ROS peak → ARE activation",
+                    'expected': "ROS peaks at 1-2h, ARE activation at 3-4h (causal sequence)",
+                    'timeline': "1 week",
+                    'cost': "$1,500-2,000"
+                },
+                {
+                    'name': "ROS Scavenger Rescue",
+                    'protocol': "• Pre-treat cells with N-acetylcysteine (NAC, 5 mM)\n• Add compound + measure ARE activation\n• If ROS is causal, NAC should block ARE",
+                    'readout': "% reduction in ARE activation with NAC",
+                    'expected': "Predict >70% reduction (consistent with ROS causality)",
+                    'timeline': "3-4 days",
+                    'cost': "$600-800"
+                }
+            ]
+        })
+    # HYPOTHESIS 4: DNA Intercalation (if flexible + high p53)
+    if props['RotatableBonds'] > 7 and prob_p53 > 0.5:
+        hypotheses.append({
+            'title': f"🎯 Hypothesis 4: DNA Intercalation via Molecular Flexibility",
+            'hypothesis': f"Your compound ({int(props['RotatableBonds'])} rotatable bonds) will intercalate into DNA, causing double-strand breaks and p53 activation (predicted {prob_p53:.0%}).",
+            'rationale': f"Flexible molecules can adopt planar conformations that fit between DNA base pairs. Classical intercalators like doxorubicin have 5-10 rotatable bonds. Your compound's flexibility suggests it can contort to fit.",
+            'experiments': [
+                {
+                    'name': "Ethidium Bromide Displacement",
+                    'protocol': "• Incubate calf thymus DNA with ethidium bromide\n• Add increasing concentrations of your compound\n• Measure fluorescence quenching (EtBr displacement)\n• Calculate IC₅₀ for displacement",
+                    'readout': "IC₅₀ for EtBr displacement",
+                    'expected': "IC₅₀ < 50 µM indicates intercalation",
+                    'timeline': "2 days",
+                    'cost': "$300-400"
+                },
+                {
+                    'name': "γH2AX Foci Formation",
+                    'protocol': "• Treat cells with compound (2-10 µM, 24h)\n• Fix and stain for γH2AX (DNA double-strand break marker)\n• Count nuclear foci per cell\n• Compare to etoposide (positive control)",
+                    'readout': "Average foci per nucleus",
+                    'expected': f"Predict {5 if prob_p53 < 0.7 else 10}+ foci per cell",
+                    'timeline': "3-4 days",
+                    'cost': "$800-1,000"
+                }
+            ]
+        })
+    # HYPOTHESIS 5: Complex I Inhibition (if specific structural features)
+    if prob_mmp > 0.7 and props['LogP'] > 4:
+        hypotheses.append({
+            'title': f"🎯 Hypothesis 5: Specific Complex I Inhibition",
+            'hypothesis': f"Your compound will specifically inhibit mitochondrial Complex I (NADH dehydrogenase), similar to rotenone, due to structural complementarity with the ubiquinone binding site.",
+            'rationale': f"LogP {props['LogP']:.2f} + aromatic character = structural similarity to known Complex I inhibitors (rotenone, MPP+). The ubiquinone binding pocket is hydrophobic and aromatic-rich.",
+            'experiments': [
+                {
+                    'name': "Seahorse XF Complex I Stress Test",
+                    'protocol': "• Measure oxygen consumption rate (OCR) in HepG2\n• Sequential injection: Compound → Oligomycin → FCCP → Rotenone/Antimycin A\n• Quantify Complex I-dependent respiration",
+                    'readout': "% inhibition of Complex I vs total respiration",
+                    'expected': "Predict >50% Complex I-specific inhibition",
+                    'timeline': "1 week",
+                    'cost': "$2,000-2,500"
+                },
+                {
+                    'name': "Isolated Complex I Activity Assay",
+                    'protocol': "• Isolate Complex I from bovine heart mitochondria\n• Measure NADH:ubiquinone oxidoreductase activity\n• Add compound at 1-100 µM\n• Calculate IC₅₀",
+                    'readout': "IC₅₀ for Complex I inhibition",
+                    'expected': f"Predict IC₅₀ of roughly {5 if props['LogP'] > 5 else 20} µM",
+                    'timeline': "1 week",
+                    'cost': "$1,500-2,000"
+                }
+            ]
+        })
+    # HYPOTHESIS 6: Metabolic Activation (if has oxidizable groups)
+    if props['NumN'] + props['NumO'] + props['NumS'] > 5:
+        hypotheses.append({
+            'title': f"🎯 Hypothesis 6: Metabolic Activation to Reactive Metabolites",
+            'hypothesis': f"Your compound contains {int(props['NumN'] + props['NumO'] + props['NumS'])} heteroatoms (N, O, S) that may be metabolically activated by CYP450s to reactive electrophiles or redox-cycling quinones.",
+            'rationale': f"Heteroatoms are CYP450 substrates. Oxidation can generate reactive intermediates (epoxides, quinones, iminium ions) that covalently modify proteins or redox cycle to generate ROS. This is bioactivation toxicity.",
+            'experiments': [
+                {
+                    'name': "Microsomal Stability + Metabolite ID",
+                    'protocol': "• Incubate compound with human liver microsomes + NADPH\n• Sample at 0, 15, 30, 60 min\n• Analyze by LC-MS/MS for parent disappearance + metabolite formation\n• Identify oxidative metabolites",
+                    'readout': "t₁/₂ + metabolite structures",
+                    'expected': "Predict t₁/₂ < 30 min (rapid metabolism) + N/O oxidation products",
+                    'timeline': "1-2 weeks",
+                    'cost': "$3,000-4,000"
+                },
+                {
+                    'name': "CYP450 Inhibitor Rescue",
+                    'protocol': "• Pre-treat cells with ketoconazole (CYP3A4 inhibitor, 1 µM)\n• Add compound + measure toxicity (MTT assay)\n• If CYP3A4-mediated metabolism is required, ketoconazole will rescue",
+                    'readout': "% rescue by CYP inhibition",
+                    'expected': "Predict >50% rescue (consistent with bioactivation)",
+                    'timeline': "3-4 days",
+                    'cost': "$500-700"
+                }
+            ]
+        })
+    return hypotheses
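
Review note on the Hypothesis 1 rationale, which invokes the Nernst equation: the relation applies to charged species only, so a LogP threshold by itself does not imply potential-driven trapping. The sketch below is not part of the commit. It evaluates the ideal equilibrium ratio for an ion across a membrane, using the -180 mV figure quoted in the rationale; the function name and the 37 °C default are assumptions made for illustration.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_accumulation_ratio(delta_psi_mv: float, charge: int = 1,
                              temp_c: float = 37.0) -> float:
    """Ideal equilibrium ratio [inside]/[outside] for an ion of the given charge.

    delta_psi_mv is the inside-minus-outside potential in millivolts
    (negative for an energised mitochondrial matrix).
    """
    temp_k = temp_c + 273.15
    delta_psi_v = delta_psi_mv / 1000.0
    return math.exp(-charge * F * delta_psi_v / (R * temp_k))


# A neutral compound (charge 0) experiences no electrical driving force.
neutral = nernst_accumulation_ratio(-180.0, charge=0)
# A monovalent cation at -180 mV and 37 °C.
cation = nernst_accumulation_ratio(-180.0, charge=1)
print(f"neutral: {neutral:.1f}x, monovalent cation: {cation:.0f}x")
```

For a monovalent cation the ideal ratio comes out in the hundreds, far above the 3-5x stated in the hypothesis text, while a neutral compound gets no contribution from the potential. It may therefore be worth checking the compound's charge state before presenting the Nernst argument to the user.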
 @st.cache_resource
 def load_models():
     """Load models"""
         st.error(f"Error: {str(e)}")
         return None
 def compute_features(smiles, feature_names):
     """Compute features"""
     try:
         mol = Chem.MolFromSmiles(smiles)
         if mol is None:
             return None, "Invalid SMILES", None
         features = {
             'MW': Descriptors.MolWt(mol),
             'LogP': Descriptors.MolLogP(mol),
             'NumAliphaticRings': Lipinski.NumAliphaticRings(mol),
             'FractionCsp3': Descriptors.FractionCsp3(mol) if hasattr(Descriptors, 'FractionCsp3') else 0.0,
         }
         fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
         fp_array = np.array(fp)
         feature_vector = []
         for fname in feature_names:
             if fname.startswith('Morgan_'):
                 feature_vector.append(fp_array[bit_idx])
             else:
                 feature_vector.append(features.get(fname, 0))
         return np.array(feature_vector).reshape(1, -1), None, features
     except Exception as e:
         return None, f"Error: {str(e)}", None
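
Review note on `compute_features`: the visible lines use `bit_idx` without showing where it is set, because the diff view collapses part of the function. The sketch below is one possible way to build the vector, assuming fingerprint columns are named `Morgan_<index>`. That naming scheme, the helper name, and the plain-list inputs are assumptions, not the code in this commit.

```python
from typing import Dict, List, Sequence


def assemble_feature_vector(
    feature_names: Sequence[str],
    descriptors: Dict[str, float],
    fingerprint_bits: Sequence[int],
    prefix: str = "Morgan_",  # assumed naming scheme for fingerprint columns
) -> List[float]:
    """Order descriptor values and fingerprint bits to match feature_names."""
    vector: List[float] = []
    for name in feature_names:
        if name.startswith(prefix):
            bit_idx = int(name[len(prefix):])  # e.g. "Morgan_17" -> 17
            vector.append(float(fingerprint_bits[bit_idx]))
        else:
            # Unknown descriptor names fall back to 0.0, as in the diff.
            vector.append(float(descriptors.get(name, 0.0)))
    return vector


example = assemble_feature_vector(
    ["MW", "Morgan_2", "LogP", "Morgan_0"],
    {"MW": 180.16, "LogP": 1.31},
    [0, 1, 1, 0],
)
print(example)
```

Keeping this step in a small pure function makes the column order easy to unit-test without loading RDKit or the trained models.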
 def predict_toxicity(smiles, models):
     """Predict toxicity"""
     X, error, raw_features = compute_features(smiles, models['feature_names'])
     if error:
         return {'error': error}
     try:
         X_are = models['scaler_are'].transform(X)
         X_mmp = models['scaler_mmp'].transform(X)
         X_p53 = models['scaler_p53'].transform(X)
         prob_are = float(models['model_are'].predict_proba(X_are)[0, 1])
         prob_mmp = float(models['model_mmp'].predict_proba(X_mmp)[0, 1])
         prob_p53 = float(models['model_p53'].predict_proba(X_p53)[0, 1])
         overall_prob = max(prob_are, prob_mmp, prob_p53)
         if overall_prob < 0.35:
             risk = "LOW"
             prediction = "NON-TOXIC"
         else:
             risk = "HIGH"
             prediction = "TOXIC"
         return {
             'overall_toxicity': {
                 'prediction': prediction,
 # Header
 st.markdown('<p class="main-header">🧪 Multi-Endpoint Toxicity Predictor</p>', unsafe_allow_html=True)
+st.markdown('<p class="sub-header">AI-powered hypothesis generation for drug toxicity mechanisms</p>', unsafe_allow_html=True)
 # Tabs
+tab1, tab2 = st.tabs(["🔮 Analyze Compound", "📖 About"])
 with tab1:
+    st.markdown("### Enter SMILES String")
     smiles = st.text_input(
         "SMILES:",
         placeholder="e.g., CC(=O)Oc1ccccc1C(=O)O",
         label_visibility="collapsed"
     )
     st.markdown("**Examples:**")
     examples = {
         "Aspirin (Safe)": "CC(=O)Oc1ccccc1C(=O)O",
         "Doxorubicin (Toxic)": "COc1cccc2c1C(=O)c1c(O)c3c(c(O)c1C2=O)C[C@@](O)(C(=O)CO)C[C@@H]3O[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1",
+        "Tamoxifen (Toxic)": "CCC(=C(c1ccccc1)c1ccc(OCCN(C)C)cc1)c1ccccc1",
+        "Rotenone (Complex I)": "COc1cc(ccc1OC)[C@@H]1[C@H](C(=O)c2c3c(cc4c2OC[C@H]4C1)OCO3)C"
     }
     for name, smi in examples.items():
         st.code(smi, language=None)
         st.caption(name)
+    if st.button("🔬 Generate Hypotheses", type="primary", use_container_width=True):
         if not smiles:
             st.warning("⚠️ Please enter a SMILES string")
         else:
             mol = Chem.MolFromSmiles(smiles)
             if mol is None:
                 st.error("❌ Invalid SMILES")
             else:
+                with st.spinner("🧬 Generating mechanistic hypotheses..."):
                     result = predict_toxicity(smiles, models)
                     if 'error' in result:
                         st.error(f"❌ {result['error']}")
                     else:
                         st.markdown("---")
                         # Structure
                         st.markdown("### 🧬 Molecular Structure")
                         col_struct = st.columns([1, 2, 1])
                             img = Draw.MolToImage(mol, size=(500, 500))
                             st.image(img, use_column_width=True)
                             st.markdown('</div>', unsafe_allow_html=True)
                         st.markdown("---")
+                        # Quick prediction summary
                         overall = result['overall_toxicity']
                         prob_are = result['oxidative_stress']['probability']
                         prob_mmp = result['mitochondrial_dysfunction']['probability']
                         prob_p53 = result['dna_damage']['probability']
+                        col1, col2, col3, col4 = st.columns(4)
                         with col1:
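+                            # delta_color="inverse" renders a positive delta in red, which suits a risk probability
+                            st.metric("Overall Risk", overall['risk_level'],
+                                     delta=f"{overall['probability']:.0%}", delta_color="inverse")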
                         with col2:
+                            st.metric("🔥 Oxidative", f"{prob_are:.0%}")
                         with col3:
+                            st.metric("⚡ Mitochondrial", f"{prob_mmp:.0%}")
+                        with col4:
+                            st.metric("🧬 DNA Damage", f"{prob_p53:.0%}")
                         st.markdown("---")
+                        # HYPOTHESIS GENERATION
+                        st.markdown("### 🔬 Testable Hypotheses & Experimental Protocols")
                         props = result['molecular_properties']
+                        hypotheses = generate_testable_hypotheses(result, props, mol)
+                        if len(hypotheses) == 0:
+                            st.info("ℹ️ No mechanistic hypotheses were triggered by this compound's properties. This does not rule out toxicity: check the risk summary above. Standard safety testing recommended.")
+                        else:
                             st.markdown(f"""
+                            **Generated {len(hypotheses)} mechanistic hypotheses based on your compound's structure and predicted toxicity profile.**
+                            Each hypothesis includes:
+                            - 🎯 Specific mechanistic prediction
+                            - 🔬 Detailed experimental protocols
+                            - 📊 Expected results
+                            - ⏱️ Timeline and cost estimates
+                            """)
+                            for i, hyp in enumerate(hypotheses, 1):
+                                st.markdown(f"""
+                                <div class="hypothesis-box">
+                                    <div class="hypothesis-title">{hyp['title']}</div>
+                                    <p><strong>Hypothesis:</strong><br>{hyp['hypothesis']}</p>
+                                    <p><strong>Scientific Rationale:</strong><br>{hyp['rationale']}</p>
                                 </div>
+                                """, unsafe_allow_html=True)
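+                                # Hypothesis titles carry fixed numbers (1-6), so the loop index is not shown here:
+                                # it would disagree with the title whenever an earlier hypothesis is skipped.
+                                st.markdown("#### Proposed Experiments:")
+                                for j, exp in enumerate(hyp['experiments'], 1):
+                                    with st.expander(f"**Experiment {j}: {exp['name']}**", expanded=(j==1)):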
+                                        col_exp1, col_exp2 = st.columns([2, 1])
+                                        with col_exp1:
+                                            st.markdown(f"""
+                                            **Protocol:**
+                                            {exp['protocol']}
+                                            **Readout:**
+                                            {exp['readout']}
+                                            **Predicted Result:**
+                                            {exp['expected']}
+                                            """)
+                                        with col_exp2:
+                                            st.markdown(f"""
+                                            <div class="experiment-box">
+                                                <div class="experiment-title">⏱️ Timeline</div>
+                                                <p>{exp['timeline']}</p>
+                                                <div class="experiment-title">💰 Est. Cost</div>
+                                                <p>{exp['cost']}</p>
+                                            </div>
+                                            """, unsafe_allow_html=True)
+                                st.markdown("---")
+                        # Summary recommendations: only shown when at least one hypothesis was generated,
+                        # so an empty result does not produce a "$0 - $5,000" roadmap.
+                        if hypotheses:
+                            st.markdown("### 💡 Research Roadmap")
+                            # Sum the lower bound of every experiment's cost range.
+                            # Assumes cost strings look like "$500-$1,000" (plain hyphen separator).
+                            total_cost = sum(
+                                int(exp['cost'].split('-')[0].replace('$', '').replace(',', ''))
+                                for hyp in hypotheses
+                                for exp in hyp['experiments']
+                            )
+                            st.markdown(f"""
+                            <div class="interpretation-box">
+                                <p><strong>Recommended Testing Strategy:</strong></p>
+                                <p><strong>Phase 1 (Weeks 1-2):</strong> Start with the fastest, cheapest experiments to validate the primary hypotheses.
+                                These will tell you whether your compound behaves as predicted.</p>
+                                <p><strong>Phase 2 (Weeks 3-4):</strong> If Phase 1 confirms toxicity, proceed to mechanistic experiments
+                                (LC-MS, Seahorse, etc.) to understand the detailed mechanism.</p>
+                                <p><strong>Phase 3 (Month 2):</strong> Use mechanistic insights to design and test optimized analogs
+                                with reduced toxicity.</p>
+                                <p><strong>Total estimated cost for complete hypothesis testing: ${total_cost:,} - ${total_cost + 5000:,}</strong></p>
+                                <p><strong>Priority experiments (if budget-limited):</strong> Focus on Experiment 1 from each hypothesis;
+                                these are designed as quick validation studies.</p>
+                            </div>
+                            """, unsafe_allow_html=True)
 with tab2:
     st.markdown("""
     ### 📖 About This Tool
+    This is not just a toxicity predictor - it's a **hypothesis generation engine**.
+    **What makes it unique:**
+    - 🎯 **Mechanistic predictions**: Not just "toxic" or "safe", but *why*
+    - 🔬 **Experimental protocols**: Ready-to-use laboratory procedures
+    - 📊 **Quantitative predictions**: Expected results with timelines and costs
+    - 💡 **Research roadmap**: Prioritized testing strategy
+    **Trained on:** 11,306 compounds from EPA ToxCast
+    **Performance:** ROC-AUC 0.82-0.93 across endpoints
+    **Validation:** 100% accuracy on known compounds
     **Endpoints:**
+    - 🔥 Oxidative Stress (ARE/Nrf2)
+    - ⚡ Mitochondrial Dysfunction (MMP)
+    - 🧬 DNA Damage (p53)
+    **⚠️ Disclaimer:** For research use only. Validate computationally generated hypotheses experimentally.
+    **For researchers:** These protocols are based on standard methods from the literature.
+    Adjust concentrations and timepoints based on your specific compound and cell system.
     """)
 st.markdown("---")
+st.markdown('<p style="text-align: center; color: #8b92a8;">🔬 Accelerating drug safety research through AI-powered hypothesis generation</p>', unsafe_allow_html=True)