# Experiment Log

Record each experiment with: hypothesis, method, result, decision.

---

## E0 — Data Availability

- **Hypothesis:** Public datasets (SAMSum, Enron, HF tickets) can be assembled into 20-40 reproducible case bundles with metadata and weak labels.
- **Status:** Not started

## E1 — Structuring Feasibility

- **Hypothesis:** Root-cause L1/L2 classification + evidence citation schema pass rate is stable (>= 98%).
- **Status:** Not started

## E2 — Risk Gate

- **Hypothesis:** review_required rules capture low-confidence / high-risk samples with precision >= 0.8 and recall >= 0.9.
- **Status:** Not started

## E3 — Business Insight

- **Hypothesis:** VIP x root-cause churn correlation produces actionable, explainable conclusions.
- **Status:** Not started

## E4 — Iteration Loop

- **Hypothesis:** Human review feedback improves specific failure modes (e.g., root cause confusion).
- **Status:** Not started
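
For reference when E2 is run, the precision and recall targets of the review_required gate could be checked with a small helper along these lines. This is only a sketch under stated assumptions: the function names `gate_metrics` and `meets_e2_targets` and the boolean-list input format are hypothetical and not part of the project; only the 0.8 and 0.9 thresholds come from the E2 hypothesis.

```python
from typing import Sequence, Tuple


def gate_metrics(flagged: Sequence[bool], needs_review: Sequence[bool]) -> Tuple[float, float]:
    """Return (precision, recall) of a review_required gate.

    flagged[i]      -- the gate marked sample i as review_required (assumed format)
    needs_review[i] -- human label: sample i really was low-confidence / high-risk
    """
    if len(flagged) != len(needs_review):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(flagged, needs_review))
    tp = sum(1 for f, n in pairs if f and n)        # flagged and truly risky
    fp = sum(1 for f, n in pairs if f and not n)    # flagged but fine
    fn = sum(1 for f, n in pairs if n and not f)    # risky but missed

    # Report 0.0 when a denominator is empty instead of dividing by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def meets_e2_targets(precision: float, recall: float,
                     min_precision: float = 0.8, min_recall: float = 0.9) -> bool:
    """Check the E2 thresholds: precision >= 0.8 and recall >= 0.9."""
    return precision >= min_precision and recall >= min_recall
```

A run would pass the gate's decisions and the human labels for the same samples, in the same order, then record the resulting pair and the pass/fail decision in the E2 entry.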