Visualization idea: map attacks to downstream sinks

#1
by armorerlabs - opened

The attack-map framing is useful. One visualization that would help practitioners is mapping prompt/jailbreak examples to the downstream sink they threaten.

For example:

  • render sink: HTML, Markdown, email, ticket body
  • query sink: SQL, search DSL, vector filter
  • execution sink: shell, subprocess, package install, CI job
  • agent sink: tool arguments, MCP request, memory write, retrieval chunk for a later turn

That makes prompt security feel less abstract. The same text can be harmless as a rejected chat response and dangerous if it becomes a tool argument or durable memory. We have been using this mental model when testing Armorer Guard-style runtime scans.

Sign up or log in to comment