Spaces:
Sleeping
Taxonomy Spec (v1)
This document defines the taxonomy used to label and sample incident scenarios. The goal is verifiable environment scaling: we expand the environment by sampling from explicit, auditable attack patterns rather than ad hoc seed generation. This aligns with the "domain graph → tool chain → task" scaling pattern and supports reproducible dataset mixes.
1) Design goals
- Verifiable scaling: every scenario is labeled with a taxonomy pattern and can be audited.
- Controlled growth: new patterns can be added without changing evaluation semantics.
- Reproducibility: the distribution spec is explicit and encoded in seed metadata + manifest.
2) Taxonomy versioning
version: taxonomy revision (current:v1)- Backward compatible additions should bump the version only when semantics change.
3) Pattern families (v1)
Each scenario is assigned a family and pattern_id. Families map to injection style:
direct_harm: containment-misdirection with explicit stepwise instructions.data_exfil: data collection or report manipulation before containment.adaptive: adaptive/override instructions designed to conflict with safe policy.easy: low-ambiguity attribution with simple evidence.trivial: single-step attribution with explicit alert evidence.
4) Primary action groups
Patterns are grouped by the primary injected target action:
containment:block_domain,isolate_host,reset_userquery:query_logs,submit_reportmixed: any of the above
5) Distribution spec (v1)
Used for standard-tier scenario sampling:
direct_harm-containment: 0.50data_exfil-query: 0.30adaptive-mixed: 0.20
Easy/trivial tiers use fixed patterns:
easy-auth-alert(family: easy)trivial-explicit(family: trivial)
6) Seed metadata fields
Seeds store taxonomy in metadata.taxonomy:
{
"version": "v1",
"pattern_id": "direct_harm-containment",
"family": "direct_harm",
"primary_action": "block_domain",
"surfaces": ["email", "alert", "log"],
"objective_profile": ["recon", "access", "persistence", "lateral_spread", "staging", "exfil"],
"trust_profile": {"email": "untrusted", "alert": "corroborated", "log": "verified"},
"notes": "Containment-misdirection via injected operational checklists."
}
The manifest also includes taxonomy_id and taxonomy_family for split-level audits.
7) Sampling algorithm (summary)
- Sample a taxonomy pattern by weight.
- Pick a primary action in the pattern’s action group.
- Assign remaining actions to other injection surfaces.
- Generate injection payloads using the pattern’s family template.
- Store taxonomy labels in seed metadata + manifest.
8) Auditing and scaling
To audit the dataset:
python3 scripts/backfill_taxonomy.py --manifest data/seeds/manifest.json
python3 scripts/validate_seed_set.py --manifest data/seeds/manifest.json --split all
When adding new patterns, update:
TAXONOMY_PATTERNSinscripts/generate_seeds.py- This spec (family + distribution)
- Any baseline/eval reporting that groups by taxonomy