# Experiment Log
Record each experiment with: hypothesis, method, result, decision.

---
## E0 — Data Availability
- **Hypothesis:** Public datasets (SAMSum, Enron, HF tickets) can be assembled into 20-40 reproducible case bundles with metadata and weak labels.
- **Status:** Not started
## E1 — Structuring Feasibility
- **Hypothesis:** Root-cause classification (L1/L2) with evidence citations produces outputs whose schema pass rate stays stable at >= 98%.
- **Status:** Not started
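
One way the E1 schema pass rate could be measured is sketched below. The field names (`root_cause_l1`, `root_cause_l2`, `evidence`, `quote`) are assumptions made for illustration; the actual schema is not defined in this log.

```python
# Sketch only: the record layout below is hypothetical, not the project's schema.

def is_valid(record):
    """Return True if a structured output has both root-cause levels
    and at least one evidence citation with a non-empty quote."""
    if not isinstance(record, dict):
        return False
    for key in ("root_cause_l1", "root_cause_l2"):
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            return False
    evidence = record.get("evidence")
    if not isinstance(evidence, list) or not evidence:
        return False
    return all(
        isinstance(item, dict)
        and isinstance(item.get("quote"), str)
        and item["quote"].strip() != ""
        for item in evidence
    )


def pass_rate(records):
    """Fraction of records that satisfy the schema check (0.0 for no records)."""
    if not records:
        return 0.0
    return sum(is_valid(r) for r in records) / len(records)


def meets_e1_target(records, threshold=0.98):
    """Compare the observed pass rate with the >= 98% target from E1."""
    return pass_rate(records) >= threshold
```

Running `pass_rate` on several independent batches and comparing the results would give a rough sense of whether the rate is stable rather than a one-off.
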
## E2 — Risk Gate
- **Hypothesis:** review_required rules capture low-confidence / high-risk samples with precision >= 0.8 and recall >= 0.9.
- **Status:** Not started
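
The E2 targets can be checked with a small evaluation harness along these lines. The `Sample` fields, the `review_required` rule body, and the 0.7 confidence cutoff are all placeholders assumed for illustration; only the precision and recall thresholds come from the hypothesis.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    confidence: float   # model confidence in [0, 1] (assumed field)
    risk: str           # "low" or "high" (assumed field)
    needs_review: bool  # ground-truth label from a human reviewer


def review_required(sample, min_confidence=0.7):
    """Hypothetical gate: flag low-confidence or high-risk samples."""
    return sample.confidence < min_confidence or sample.risk == "high"


def precision_recall(samples):
    """Precision and recall of the gate against the human labels."""
    tp = fp = fn = 0
    for s in samples:
        flagged = review_required(s)
        if flagged and s.needs_review:
            tp += 1
        elif flagged and not s.needs_review:
            fp += 1
        elif not flagged and s.needs_review:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def meets_e2_targets(samples, min_precision=0.8, min_recall=0.9):
    """Check the gate against the E2 thresholds."""
    precision, recall = precision_recall(samples)
    return precision >= min_precision and recall >= min_recall
```

Recall is the stricter target here (0.9 versus 0.8), which fits a gate where missing a risky sample costs more than sending an extra one to review.
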
## E3 — Business Insight
- **Hypothesis:** Correlating churn with VIP status x root cause produces actionable, explainable conclusions.
- **Status:** Not started
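
As a starting point for E3, churn rate can be tabulated per (VIP status, root cause) segment. This is a minimal sketch: the `is_vip`, `root_cause`, and `churned` keys are assumed names, and a per-segment rate is a simple stand-in for whatever correlation measure the experiment ends up using.

```python
from collections import defaultdict


def churn_rate_by_segment(cases):
    """Map (is_vip, root_cause) to (churn_rate, case_count).

    Each case is a dict with hypothetical keys:
    is_vip (bool), root_cause (str), churned (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [churned, total]
    for case in cases:
        key = (case["is_vip"], case["root_cause"])
        counts[key][0] += int(case["churned"])
        counts[key][1] += 1
    return {
        key: (churned / total, total)
        for key, (churned, total) in counts.items()
    }
```

Returning the case count next to each rate makes it easier to spot segments that are too small to support a conclusion, which matters with only 20-40 case bundles.
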
## E4 — Iteration Loop
- **Hypothesis:** Human review feedback improves specific failure modes (e.g., root cause confusion).
- **Status:** Not started