openenv-clinical-trial / docs /adaptive_difficulty_spec.md
Roopalgn
fix: add root route so HF Space doesn't show 404
6b6624d

Adaptive Difficulty Specification

Purpose

After the agent masters a scenario (>70% success), the environment hardens parameters β€” tighter budgets, rarer subgroups, noisier data β€” and targets the agent's tracked weak spots.


Mastery Detection

Tracked per scenario Γ— per tier. A strong solid_tumor_chemo doesn't mask weakness on rare_disease_orphan.

Tier Window Threshold Min Episodes
Warmup 10 70% 15
Beginner 12 65% 20
Intermediate 15 55% 25
Advanced 20 45% 30
Expert N/A N/A N/A

Fast-track: β‰₯90% in first 5 episodes β†’ immediate hardening (skip mastery window).


Parameter Hardening

Applied incrementally when mastery is detected, within the same tier:

Axis Easy Step 1 Step 2 Step 3
Effect size multiplier 1.0Γ— 0.85Γ— 0.70Γ— 0.60Γ—
Budget multiplier 1.0Γ— 0.90Γ— 0.80Γ— 0.70Γ—
Noise multiplier 1.0Γ— 1.2Γ— 1.4Γ— 1.6Γ—
Dropout add +0% β€” +5% +8%
Placebo boost +0% β€” β€” +5%
Subgroup prevalence 1.0Γ— β€” β€” 0.7Γ—
Misleading Phase I No β€” β€” Yes

Weak-Spot Targeting

The adversarial designer (adversarial_designer.py) analyzes failure patterns:

Failure Type Counter Hardening Response
Small effect missed small_effect_failures Reduce effect size further
High dropout high_dropout_failures Increase dropout rate
Subgroup missed missed_subgroup_failures Reduce subgroup prevalence
Budget exhausted budget_failures Tighten budget

At Expert tier, the adversarial designer creates compound challenges combining 2–3 difficulty axes simultaneously.


Solvability Guarantee

Every generated scenario must remain solvable β€” at least one action sequence achieves success:

  • Effect size > 0 (drug works)
  • Budget sufficient for n patients needed for 80% power
  • Time sufficient for enrollment + follow-up
  • At least one valid inclusion criteria identifies responders

If generated params fail solvability check β†’ regenerate with relaxed constraints.