# Adaptive Difficulty Specification

## Purpose

After the agent masters a scenario (>70% success), the environment hardens parameters — tighter budgets, rarer subgroups, noisier data — and targets the agent's tracked weak spots.

---

## Mastery Detection

Tracked **per scenario × per tier**. A strong `solid_tumor_chemo` doesn't mask weakness on `rare_disease_orphan`.

| Tier | Window | Threshold | Min Episodes |
|------|--------|-----------|-------------|
| Warmup | 10 | 70% | 15 |
| Beginner | 12 | 65% | 20 |
| Intermediate | 15 | 55% | 25 |
| Advanced | 20 | 45% | 30 |
| Expert | N/A | N/A | N/A |

**Fast-track:** ≥90% in first 5 episodes → immediate hardening (skip mastery window).

---

## Parameter Hardening

Applied **incrementally** when mastery is detected, within the same tier:

| Axis | Easy | Step 1 | Step 2 | Step 3 |
|------|------|--------|--------|--------|
| Effect size multiplier | 1.0× | 0.85× | 0.70× | 0.60× |
| Budget multiplier | 1.0× | 0.90× | 0.80× | 0.70× |
| Noise multiplier | 1.0× | 1.2× | 1.4× | 1.6× |
| Dropout add | +0% | — | +5% | +8% |
| Placebo boost | +0% | — | — | +5% |
| Subgroup prevalence | 1.0× | — | — | 0.7× |
| Misleading Phase I | No | — | — | Yes |

---

## Weak-Spot Targeting

The adversarial designer (`adversarial_designer.py`) analyzes failure patterns:

| Failure Type | Counter | Hardening Response |
|-------------|---------|-------------------|
| Small effect missed | `small_effect_failures` | Reduce effect size further |
| High dropout | `high_dropout_failures` | Increase dropout rate |
| Subgroup missed | `missed_subgroup_failures` | Reduce subgroup prevalence |
| Budget exhausted | `budget_failures` | Tighten budget |

At Expert tier, the adversarial designer creates **compound challenges** combining 2–3 difficulty axes simultaneously.

---

## Solvability Guarantee

Every generated scenario must remain solvable — at least one action sequence achieves success:
- Effect size > 0 (drug works)
- Budget sufficient for n patients needed for 80% power
- Time sufficient for enrollment + follow-up
- At least one valid inclusion criteria identifies responders

If generated params fail solvability check → regenerate with relaxed constraints.