Lars Talian
Make mutation policy weights explicit
228ed67

Mutation Policy Weights

PopulationMutationPolicy is a hand-authored heuristic policy, but its weights and shaping constants are now explicit in src/open_range/builder/mutation_policy.py under MutationPolicySettings.

The policy has three jobs:

  1. Choose which stored snapshot is the best parent to mutate next.
  2. Choose which structural mutation op to apply.
  3. Choose which security/noise mutation op to apply.

Parent Selection Terms

These fields live in MutationPolicySettings.parent.

| Field | Default | Why it exists |
| --- | --- | --- |
| frontier_weight | 0.28 | Prefer snapshots near the current learning frontier instead of trivially solved or impossible ones. |
| replay_weight | 0.18 | Revisit under-played snapshots so the curriculum does not collapse to a tiny subset. |
| novelty_weight | 0.16 | Favor rarer vulnerability mixes across the population. |
| weak_overlap_weight | 0.18 | Bias parent choice toward snapshots that exercise known weak areas. |
| lineage_balance_weight | 0.08 | Prevent one root lineage from dominating the pool. |
| depth_balance_weight | 0.04 | Avoid over-sampling very deep descendant chains. |
| recency_weight | 0.04 | Cool down parents that were used repeatedly in the recent window. |
| complexity_weight | 0.04 | Slightly prefer richer parents with more structure to mutate from. |
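
As a rough illustration of how these weights might combine, the sketch below multiplies each raw signal by its default weight and sums the results. It assumes every raw signal is already normalized to [0, 1] and that the minimum_total floor is applied with a simple max; neither detail is confirmed here, and the function names and example signal values are hypothetical.

```python
# Default parent-selection weights copied from the table above.
PARENT_WEIGHTS = {
    "frontier": 0.28,
    "replay": 0.18,
    "novelty": 0.16,
    "weak_overlap": 0.18,
    "lineage_balance": 0.08,
    "depth_balance": 0.04,
    "recency": 0.04,
    "complexity": 0.04,
}
MINIMUM_TOTAL = 0.05  # default sampling floor for low-scoring parents


def weighted_contributions(signals: dict[str, float]) -> dict[str, float]:
    """Multiply each raw signal by its weight; missing signals count as 0."""
    return {name: w * signals.get(name, 0.0) for name, w in PARENT_WEIGHTS.items()}


def parent_score(signals: dict[str, float]) -> float:
    """Sum the weighted contributions, then apply the floor (assumed to be a max)."""
    return max(MINIMUM_TOTAL, sum(weighted_contributions(signals).values()))


# Hypothetical raw signals for one stored snapshot.
example = {"frontier": 0.9, "replay": 0.5, "novelty": 0.2, "weak_overlap": 1.0}
for name, value in weighted_contributions(example).items():
    print(f"{name:16s} {value:.3f}")
print("total", round(parent_score(example), 3))
```

The default weights sum to 1.0, so under the normalization assumption the total also stays in [0, 1].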

Shaping constants in the same model explain how those raw signals are formed:

| Field | Default | Meaning |
| --- | --- | --- |
| minimum_total | 0.05 | Sampling floor for low-scoring parents. |
| unplayed_frontier_score | 0.40 | Frontier score used before any play stats exist. |
| empty_vuln_novelty_score | 0.25 | Novelty fallback for snapshots with no typed vulnerabilities. |
| preferred_generation_depth | 3.0 | Depth after which descendant chains start being penalized. |
| complexity_vuln_factor | 0.25 | Complexity contribution per vulnerability. |
| complexity_golden_path_factor | 0.03 | Complexity contribution per golden-path step. |
| complexity_dependency_edge_factor | 0.02 | Complexity contribution per dependency edge. |
| complexity_trust_edge_factor | 0.02 | Complexity contribution per trust edge. |
| complexity_cap | 1.0 | Cap for the normalized complexity score. |
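
The complexity constants read like a capped linear sum. The sketch below shows one way that could work, assuming the four per-item contributions are simply added and then clipped at complexity_cap. The dataclass, function name, and example counts are hypothetical and not taken from mutation_policy.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplexityConstants:
    # Defaults copied from the shaping constants table.
    vuln_factor: float = 0.25
    golden_path_factor: float = 0.03
    dependency_edge_factor: float = 0.02
    trust_edge_factor: float = 0.02
    cap: float = 1.0


def complexity_score(
    vulns: int,
    golden_path_steps: int,
    dependency_edges: int,
    trust_edges: int,
    c: ComplexityConstants = ComplexityConstants(),
) -> float:
    """Add the per-item contributions and clip at the cap (assumed formula)."""
    raw = (
        c.vuln_factor * vulns
        + c.golden_path_factor * golden_path_steps
        + c.dependency_edge_factor * dependency_edges
        + c.trust_edge_factor * trust_edges
    )
    return min(c.cap, raw)


# A mid-sized snapshot stays under the cap; a vulnerability-heavy one saturates.
print(round(complexity_score(2, 6, 5, 3), 2))
print(complexity_score(5, 0, 0, 0))
```

Under these defaults, four vulnerabilities alone are enough to reach the cap, so the edge and golden-path factors mostly differentiate snapshots with few vulnerabilities.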

Mutation Selection Terms

These fields live in MutationPolicySettings.mutation.

| Field | Default | Why it exists |
| --- | --- | --- |
| curriculum_weight | 0.38 | Prefer ops that target the agent's current weakness. |
| novelty_weight | 0.24 | Prefer ops that open new surfaces or vary episode shape. |
| structural_gain_weight | 0.28 | Prefer ops that materially expand the scenario graph. |
| lineage_weight | 0.10 | Slight bias toward shallower lineage when all else is equal. |
| minimum_total | 0.05 | Sampling floor for low-scoring mutation ops. |
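
To make the role of the sampling floor concrete, here is a minimal sketch that scores candidate ops with the default weights and then samples one in proportion to its total. Proportional sampling, feature values normalized to [0, 1], and the helper names are all assumptions for illustration; the real policy may select ops differently.

```python
import random

# Default mutation-selection weights copied from the table above.
MUTATION_WEIGHTS = {
    "curriculum": 0.38,
    "novelty": 0.24,
    "structural_gain": 0.28,
    "lineage": 0.10,
}
MINIMUM_TOTAL = 0.05  # default sampling floor for low-scoring ops


def op_total(features: dict[str, float]) -> float:
    """Weighted sum of normalized features, never below the floor."""
    total = sum(w * features.get(name, 0.0) for name, w in MUTATION_WEIGHTS.items())
    return max(MINIMUM_TOTAL, total)


def sample_op(candidates: dict[str, dict[str, float]], rng: random.Random) -> str:
    """Pick one op name with probability proportional to its total."""
    names = list(candidates)
    totals = [op_total(candidates[name]) for name in names]
    return rng.choices(names, weights=totals, k=1)[0]


# Hypothetical feature values for three candidate ops.
candidates = {
    "add_service": {"curriculum": 0.2, "novelty": 0.5, "structural_gain": 1.0, "lineage": 0.5},
    "seed_vuln": {"curriculum": 0.9, "novelty": 0.7, "structural_gain": 0.7, "lineage": 0.5},
    "add_benign_noise": {},  # no signal at all, so it falls back to the floor
}
rng = random.Random(0)
print({name: round(op_total(f), 3) for name, f in candidates.items()})
print(sample_op(candidates, rng))
```

Because of the floor, an op with no positive signal keeps a small nonzero chance of being drawn instead of dropping out of the candidate set.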

Raw novelty bonuses in MutationPolicySettings.novelty:

| Field | Default | Meaning |
| --- | --- | --- |
| base_bonus | 0.40 | Baseline novelty for every op. |
| new_vuln_class_bonus | 1.0 | Extra novelty for a vulnerability class not seen recently. |
| new_noise_surface_bonus | 0.50 | Extra novelty for noise on a new attack surface. |
| structural_op_bonus | 0.40 | Extra novelty for non-security ops that change the graph. |

Raw curriculum bonuses in MutationPolicySettings.curriculum:

| Field | Default | Meaning |
| --- | --- | --- |
| base_bonus | 0.35 | Baseline curriculum value for every op. |
| weak_area_bonus | 1.50 | Reward seeding a vulnerability in a known weak area. |
| new_vuln_bonus | 0.40 | Reward introducing a vulnerability class not present in the parent. |
| chain_length_bonus | 0.60 | Reward edges that help satisfy multi-hop chain requirements. |
| focus_identity_bonus | 0.50 | Reward identity-layer ops when curriculum focus is identity. |
| focus_infra_bonus | 0.50 | Reward infra-layer ops when curriculum focus is infra. |
| focus_process_bonus | 0.40 | Reward benign noise when focus is process realism. |
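
One plausible reading of the curriculum bonuses is that they stack additively on top of base_bonus. The sketch below assumes that, and makes no claim about how the raw value is later normalized before curriculum_weight is applied. The function name and its keyword flags are hypothetical.

```python
from typing import Optional

# Default raw curriculum bonuses copied from the table above.
CURRICULUM = {
    "base_bonus": 0.35,
    "weak_area_bonus": 1.50,
    "new_vuln_bonus": 0.40,
    "chain_length_bonus": 0.60,
    "focus_identity_bonus": 0.50,
    "focus_infra_bonus": 0.50,
    "focus_process_bonus": 0.40,
}


def raw_curriculum_bonus(
    *,
    hits_weak_area: bool = False,
    adds_new_vuln: bool = False,
    helps_chain: bool = False,
    focus_match: Optional[str] = None,  # "identity", "infra", or "process"
) -> float:
    """Start from the baseline and add each bonus that applies (assumed additive)."""
    total = CURRICULUM["base_bonus"]
    if hits_weak_area:
        total += CURRICULUM["weak_area_bonus"]
    if adds_new_vuln:
        total += CURRICULUM["new_vuln_bonus"]
    if helps_chain:
        total += CURRICULUM["chain_length_bonus"]
    if focus_match is not None:
        total += CURRICULUM[f"focus_{focus_match}_bonus"]
    return total


# A seeded vulnerability in a weak area that is also new to the parent.
print(round(raw_curriculum_bonus(hits_weak_area=True, adds_new_vuln=True), 2))
# A trust edge that extends a chain while the focus is identity.
print(round(raw_curriculum_bonus(helps_chain=True, focus_match="identity"), 2))
```

With these defaults the weak-area bonus dominates: it is larger than all other non-baseline bonuses an op could plausibly collect at once.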

Structural Gain Table

These fields live in MutationPolicySettings.structural_gains.

| Op Type | Default |
| --- | --- |
| add_service | 1.00 |
| add_dependency_edge | 0.90 |
| add_trust_edge | 0.85 |
| add_user | 0.80 |
| seed_vuln | 0.70 |
| add_benign_noise | 0.30 |
| default_gain | 0.20 |
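
The table can be read as a lookup with a fallback: listed op types get their own gain, and default_gain presumably covers anything else. A minimal sketch of that reading follows. The dictionary layout and function name are assumptions, and remove_service is a made-up op type used only to trigger the fallback.

```python
# Default structural gains copied from the table above.
STRUCTURAL_GAINS = {
    "add_service": 1.00,
    "add_dependency_edge": 0.90,
    "add_trust_edge": 0.85,
    "add_user": 0.80,
    "seed_vuln": 0.70,
    "add_benign_noise": 0.30,
}
DEFAULT_GAIN = 0.20  # assumed fallback for op types missing from the table


def structural_gain(op_type: str) -> float:
    """Return the gain for a known op type, or the default for anything else."""
    return STRUCTURAL_GAINS.get(op_type, DEFAULT_GAIN)


print(structural_gain("add_trust_edge"))
print(structural_gain("remove_service"))  # hypothetical op type, uses the fallback
```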

Tuning Path

You can swap weights without touching policy code:

  1. Write a JSON or YAML file matching MutationPolicySettings.
  2. Load it with load_mutation_policy_settings(path) or pass it into PopulationMutationPolicy(settings=...).
  3. Compare it against the default policy with:
         PYTHONPATH=src .venv/bin/python scripts/calibrate_mutation_policy.py \
           --store-dir snapshots \
           --stats path/to/snapshot_stats.json \
           --context path/to/build_context.json \
           --settings tuned=path/to/policy_settings.yaml
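
For step 1, a settings file can be produced with nothing more than the standard library. The sketch below writes a small JSON file whose top-level keys mirror the MutationPolicySettings sections named in this document (parent, mutation). Whether a partial file that overrides only a few fields is accepted, and the import path shown in the trailing comment, are assumptions; the tuned values are arbitrary examples.

```python
import json
import tempfile
from pathlib import Path

# Example overrides: shift some parent weight from replay to frontier, and
# some mutation weight from structural gain to curriculum. Values are
# illustrative only, not recommended settings.
overrides = {
    "parent": {"frontier_weight": 0.32, "replay_weight": 0.14},
    "mutation": {"curriculum_weight": 0.42, "structural_gain_weight": 0.24},
}

# Write the file to a scratch directory so the sketch has no side effects.
settings_path = Path(tempfile.mkdtemp()) / "policy_settings.json"
settings_path.write_text(json.dumps(overrides, indent=2))

# Read it back to confirm it round-trips as valid JSON.
loaded = json.loads(settings_path.read_text())
print(settings_path.name, sorted(loaded))

# With the project importable, step 2 might then look like this
# (module path inferred from src/open_range/builder/mutation_policy.py):
#
#   from open_range.builder.mutation_policy import (
#       PopulationMutationPolicy,
#       load_mutation_policy_settings,
#   )
#   settings = load_mutation_policy_settings(settings_path)
#   policy = PopulationMutationPolicy(settings=settings)
```

The same path can be passed to the calibration script through its --settings flag, using a label of your choice in place of tuned.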

The calibration output is JSON so it can be diffed, archived, or fed into notebooks. Parent-selection logs and MutationPlan.score_breakdown now expose weighted contributions instead of only raw feature values.