Mutation Policy Weights

PopulationMutationPolicy is a hand-authored heuristic policy, but its weights and shaping constants are now explicit in src/open_range/builder/mutation_policy.py under MutationPolicySettings.

The policy has three jobs:

1. Choose which stored snapshot is the best parent to mutate next.
2. Choose which structural mutation op to apply.
3. Choose which security/noise mutation op to apply.

Parent Selection Terms

These fields live in MutationPolicySettings.parent.

Field	Default	Why it exists
`frontier_weight`	`0.28`	Prefer snapshots near the current learning frontier instead of trivially solved or impossible ones.
`replay_weight`	`0.18`	Revisit under-played snapshots so the curriculum does not collapse to a tiny subset.
`novelty_weight`	`0.16`	Favor rarer vulnerability mixes across the population.
`weak_overlap_weight`	`0.18`	Bias parent choice toward snapshots that exercise known weak areas.
`lineage_balance_weight`	`0.08`	Prevent one root lineage from dominating the pool.
`depth_balance_weight`	`0.04`	Avoid over-sampling very deep descendant chains.
`recency_weight`	`0.04`	Cool down parents that were used repeatedly in the recent window.
`complexity_weight`	`0.04`	Slightly prefer richer parents with more structure to mutate from.
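
The defaults above sum to 1.0, so one plausible reading is that the parent score is a weighted sum of normalized signals. The sketch below illustrates that reading only; the helper name `weighted_parent_terms`, the signal names, and the assumption that every raw signal lies in [0, 1] are hypothetical and not taken from `mutation_policy.py`.

```python
# Default weights from MutationPolicySettings.parent, as listed in the table above.
PARENT_WEIGHTS = {
    "frontier": 0.28,
    "replay": 0.18,
    "novelty": 0.16,
    "weak_overlap": 0.18,
    "lineage_balance": 0.08,
    "depth_balance": 0.04,
    "recency": 0.04,
    "complexity": 0.04,
}


def weighted_parent_terms(signals):
    """Return each signal's weighted contribution (hypothetical helper).

    Assumes raw signals are already normalized to [0, 1]; a missing
    signal is treated as 0.0.
    """
    return {name: w * signals.get(name, 0.0) for name, w in PARENT_WEIGHTS.items()}


# Made-up raw signals for one stored snapshot.
signals = {"frontier": 0.9, "replay": 0.5, "novelty": 0.2, "weak_overlap": 1.0}
terms = weighted_parent_terms(signals)
total = sum(terms.values())
print(terms)
print(round(total, 3))
```

Keeping the per-term dictionary around, rather than only the total, makes it easy to see which weight is driving a given parent's score.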

Shaping constants in the same model explain how those raw signals are formed:

Field	Default	Meaning
`minimum_total`	`0.05`	Sampling floor for low-scoring parents.
`unplayed_frontier_score`	`0.40`	Frontier score used before any play stats exist.
`empty_vuln_novelty_score`	`0.25`	Novelty fallback for snapshots with no typed vulnerabilities.
`preferred_generation_depth`	`3.0`	Depth after which descendant chains start being penalized.
`complexity_vuln_factor`	`0.25`	Complexity contribution per vulnerability.
`complexity_golden_path_factor`	`0.03`	Complexity contribution per golden-path step.
`complexity_dependency_edge_factor`	`0.02`	Complexity contribution per dependency edge.
`complexity_trust_edge_factor`	`0.02`	Complexity contribution per trust edge.
`complexity_cap`	`1.0`	Cap for the normalized complexity score.
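
As a rough illustration of how the complexity constants and the sampling floor could combine, the sketch below sums the per-item factors, clips at `complexity_cap`, and applies `minimum_total` with `max()`. The function names and the exact formula are assumptions made for illustration; the real computation lives in `mutation_policy.py` and may differ.

```python
def complexity_score(
    n_vulns,
    n_golden_path_steps,
    n_dependency_edges,
    n_trust_edges,
    vuln_factor=0.25,
    golden_path_factor=0.03,
    dependency_edge_factor=0.02,
    trust_edge_factor=0.02,
    cap=1.0,
):
    """Additive complexity signal clipped at `cap` (assumed formula)."""
    raw = (
        n_vulns * vuln_factor
        + n_golden_path_steps * golden_path_factor
        + n_dependency_edges * dependency_edge_factor
        + n_trust_edges * trust_edge_factor
    )
    return min(cap, raw)


def apply_floor(total, minimum_total=0.05):
    """Keep low-scoring parents sampleable (assumed to be a simple max)."""
    return max(minimum_total, total)


# A mid-sized snapshot stays under the cap; a very large one is clipped.
print(complexity_score(2, 8, 5, 3))
print(complexity_score(4, 10, 10, 10))
print(apply_floor(0.01))
```

Under this reading, four vulnerabilities alone are enough to saturate the complexity signal at the default cap.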

Mutation Selection Terms

These fields live in MutationPolicySettings.mutation.

Field	Default	Why it exists
`curriculum_weight`	`0.38`	Prefer ops that target the agent's current weakness.
`novelty_weight`	`0.24`	Prefer ops that open new surfaces or vary episode shape.
`structural_gain_weight`	`0.28`	Prefer ops that materially expand the scenario graph.
`lineage_weight`	`0.10`	Slight bias toward shallower lineage when all else is equal.
`minimum_total`	`0.05`	Sampling floor for low-scoring mutation ops.
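
A minimal sketch of how these four weights and the floor might turn per-op signals into sampling weights. The `score_op` helper, the candidate signals, and the use of `random.choices` for sampling are illustrative assumptions, not the policy's actual code.

```python
import random

# Defaults from MutationPolicySettings.mutation, as listed in the table above.
MUTATION_WEIGHTS = {
    "curriculum": 0.38,
    "novelty": 0.24,
    "structural_gain": 0.28,
    "lineage": 0.10,
}
MINIMUM_TOTAL = 0.05


def score_op(signals):
    """Return (total, breakdown) for one candidate op (hypothetical helper)."""
    breakdown = {k: w * signals.get(k, 0.0) for k, w in MUTATION_WEIGHTS.items()}
    total = max(MINIMUM_TOTAL, sum(breakdown.values()))
    return total, breakdown


# Made-up signals for two candidate ops.
candidates = {
    "add_service": {"curriculum": 0.35, "novelty": 0.8, "structural_gain": 1.0, "lineage": 0.5},
    "add_benign_noise": {},  # no signal at all, so only the floor applies
}
totals = {op: score_op(sig)[0] for op, sig in candidates.items()}

# Sample one op in proportion to its total; the floor keeps every op reachable.
rng = random.Random(0)
chosen = rng.choices(list(totals), weights=list(totals.values()), k=1)[0]
print(totals, chosen)
```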

Raw novelty bonuses in MutationPolicySettings.novelty:

Field	Default	Meaning
`base_bonus`	`0.40`	Baseline novelty for every op.
`new_vuln_class_bonus`	`1.0`	Extra novelty for a vulnerability class not seen recently.
`new_noise_surface_bonus`	`0.50`	Extra novelty for noise on a new attack surface.
`structural_op_bonus`	`0.40`	Extra novelty for non-security ops that change the graph.

Raw curriculum bonuses in MutationPolicySettings.curriculum:

Field	Default	Meaning
`base_bonus`	`0.35`	Baseline curriculum value for every op.
`weak_area_bonus`	`1.50`	Reward seeding a vulnerability in a known weak area.
`new_vuln_bonus`	`0.40`	Reward introducing a vulnerability class not present in the parent.
`chain_length_bonus`	`0.60`	Reward edges that help satisfy multi-hop chain requirements.
`focus_identity_bonus`	`0.50`	Reward identity-layer ops when curriculum focus is identity.
`focus_infra_bonus`	`0.50`	Reward infra-layer ops when curriculum focus is infra.
`focus_process_bonus`	`0.40`	Reward benign noise when focus is process realism.
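
The two raw bonus tables above read naturally as "a base value plus whichever bonuses apply". The sketch below assumes the bonuses stack additively, which this document does not state; `raw_bonus` is a hypothetical helper, and any normalization applied before weighting is left out.

```python
# Defaults from MutationPolicySettings.novelty and .curriculum, as listed above.
NOVELTY = {
    "base_bonus": 0.40,
    "new_vuln_class_bonus": 1.0,
    "new_noise_surface_bonus": 0.50,
    "structural_op_bonus": 0.40,
}
CURRICULUM = {
    "base_bonus": 0.35,
    "weak_area_bonus": 1.50,
    "new_vuln_bonus": 0.40,
    "chain_length_bonus": 0.60,
    "focus_identity_bonus": 0.50,
    "focus_infra_bonus": 0.50,
    "focus_process_bonus": 0.40,
}


def raw_bonus(table, triggered):
    """Base bonus plus every triggered bonus (additive stacking is assumed)."""
    names = set(triggered)
    invalid = (names - set(table)) | (names & {"base_bonus"})
    if invalid:
        raise KeyError(f"not a triggerable bonus: {sorted(invalid)}")
    return table["base_bonus"] + sum(table[name] for name in names)


# An op that seeds a new vulnerability class in a known weak area.
print(raw_bonus(NOVELTY, ["new_vuln_class_bonus"]))
print(raw_bonus(CURRICULUM, ["weak_area_bonus", "new_vuln_bonus"]))
```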

Structural Gain Table

These fields live in MutationPolicySettings.structural_gains.

Op Type	Default
`add_service`	`1.00`
`add_dependency_edge`	`0.90`
`add_trust_edge`	`0.85`
`add_user`	`0.80`
`seed_vuln`	`0.70`
`add_benign_noise`	`0.30`
`default_gain`	`0.20`
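
The table can be read as a lookup keyed by op type. The sketch below assumes `default_gain` is the fallback for any op type without its own entry; that reading is inferred from the field name. The `structural_gain` helper and the `rename_host` op are made up for illustration.

```python
# Defaults from MutationPolicySettings.structural_gains, as listed above.
STRUCTURAL_GAINS = {
    "add_service": 1.00,
    "add_dependency_edge": 0.90,
    "add_trust_edge": 0.85,
    "add_user": 0.80,
    "seed_vuln": 0.70,
    "add_benign_noise": 0.30,
}
DEFAULT_GAIN = 0.20


def structural_gain(op_type):
    """Look up the gain for an op type, falling back to DEFAULT_GAIN (assumed)."""
    return STRUCTURAL_GAINS.get(op_type, DEFAULT_GAIN)


print(structural_gain("add_trust_edge"))
print(structural_gain("rename_host"))  # hypothetical op type with no entry
```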

Tuning Path

You can swap weights without touching policy code:

1. Write a JSON or YAML file matching MutationPolicySettings.
2. Load it with load_mutation_policy_settings(path) or pass it into PopulationMutationPolicy(settings=...).
3. Compare it against the default policy with:

PYTHONPATH=src .venv/bin/python scripts/calibrate_mutation_policy.py \
  --store-dir snapshots \
  --stats path/to/snapshot_stats.json \
  --context path/to/build_context.json \
  --settings tuned=path/to/policy_settings.yaml

The calibration output is JSON so it can be diffed, archived, or fed into notebooks. Parent-selection logs and MutationPlan.score_breakdown now expose weighted contributions instead of only raw feature values.
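
As a sketch of the first tuning step, the snippet below writes a small JSON settings file using only the standard library. It assumes the file layout mirrors the nested field names documented above (`parent`, `mutation`) and that a partial file overriding only some fields is accepted; neither is confirmed here, so check `MutationPolicySettings` before relying on it. The tuned values are arbitrary examples that keep each weight group summing to 1.0.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical overrides: shift a little weight within each group.
overrides = {
    "parent": {"frontier_weight": 0.30, "replay_weight": 0.16},
    "mutation": {"curriculum_weight": 0.40, "novelty_weight": 0.22},
}

# Written to a temporary directory here; use any path you like in practice.
path = Path(tempfile.mkdtemp()) / "policy_settings.json"
path.write_text(json.dumps(overrides, indent=2))
print(path.read_text())

# Loading would then use the function named in the tuning steps. The import
# path is inferred from src/open_range/builder/mutation_policy.py and is
# left commented out because it needs the project on PYTHONPATH:
# from open_range.builder.mutation_policy import load_mutation_policy_settings
# settings = load_mutation_policy_settings(path)
```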