chiruu12 committed on
Commit df3de2a · verified · 1 Parent(s): 431e105

plain ascii typography

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -8,10 +8,10 @@ pinned: false
  ---

  <p align="center">
- <b style="font-size: 1.6em">Unplug — pull the plug on bad AI</b>
  </p>

- **Runtime defense layer for LLM apps and agents.** Unplug detects, localizes, and **redacts** prompt injection at the span level — instead of binary-blocking entire documents.

  Untrusted text is everywhere in an LLM pipeline: user messages, RAG chunks, tool output, fetched web pages. One hidden instruction in any of them can hijack your agent. Unplug scans all of it, cuts out the attack, and keeps the rest usable.

@@ -25,7 +25,7 @@ Untrusted text is everywhere in an LLM pipeline: user messages, RAG chunks, tool

  ## Why span-level?

- Binary classifiers force a bad trade: block the whole document (lose the data) or allow it (eat the attack). Unplug's token head localizes the injected instruction to character offsets, so the pipeline redacts just that span — the rest of the document flows through.

  ## Get started

@@ -42,10 +42,10 @@ if not result.safe:
  use(result.redacted_text) # attack removed, content preserved
  ```

- Agent kill-chain walkthrough: [hidden webpage injection → tainted session → blocked exfil tool call](https://github.com/UnplugAI/Unplug/blob/main/sdk/examples/agent_exfil_demo.py).

  ## Principles

- - **Nothing enters as a raw string** — all text carries provenance and trust level.
- - **Fail closed** — scanner errors block, never silently allow.
- - **Honest numbers** — every published metric comes from a frozen eval harness on held-out data, including the axes we fail.
 
  ---

  <p align="center">
+ <b style="font-size: 1.6em">Unplug - pull the plug on bad AI</b>
  </p>

+ **Runtime defense layer for LLM apps and agents.** Unplug detects, localizes, and **redacts** prompt injection at the span level - instead of binary-blocking entire documents.

  Untrusted text is everywhere in an LLM pipeline: user messages, RAG chunks, tool output, fetched web pages. One hidden instruction in any of them can hijack your agent. Unplug scans all of it, cuts out the attack, and keeps the rest usable.

 

  ## Why span-level?

+ Binary classifiers force a bad trade: block the whole document (lose the data) or allow it (eat the attack). Unplug's token head localizes the injected instruction to character offsets, so the pipeline redacts just that span - the rest of the document flows through.

  ## Get started

 
  use(result.redacted_text) # attack removed, content preserved
  ```

+ Agent kill-chain walkthrough: [hidden webpage injection -> tainted session -> blocked exfil tool call](https://github.com/UnplugAI/Unplug/blob/main/sdk/examples/agent_exfil_demo.py).

  ## Principles

+ - **Nothing enters as a raw string** - all text carries provenance and trust level.
+ - **Fail closed** - scanner errors block, never silently allow.
+ - **Honest numbers** - every published metric comes from a frozen eval harness on held-out data, including the axes we fail.
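
For readers of this diff who want a concrete picture of the span-level idea the README describes, here is a minimal sketch of redacting by character offsets with a fail-closed wrapper. It is not Unplug's API: `merge_spans`, `redact_spans`, `scan_fail_closed`, the `detector` callable, and the `[REDACTED]` marker are all hypothetical names chosen for illustration, and the detector is assumed to return `(start, end)` offsets with an exclusive end.

```python
from typing import Callable, List, Optional, Tuple

# Assumption: a span is (start, end) in character offsets, end exclusive.
Span = Tuple[int, int]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Sort spans and merge any that overlap or touch; drop empty ones."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if start >= end:
            continue  # empty or inverted span, nothing to redact
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def redact_spans(text: str, spans: List[Span], marker: str = "[REDACTED]") -> str:
    """Replace each flagged span with a marker and keep the rest of the text."""
    pieces: List[str] = []
    cursor = 0
    for start, end in merge_spans(spans):
        if start < 0 or end > len(text):
            raise ValueError(f"span ({start}, {end}) is outside the text")
        pieces.append(text[cursor:start])  # untouched content before the span
        pieces.append(marker)              # the attack itself is dropped
        cursor = end
    pieces.append(text[cursor:])           # untouched content after the last span
    return "".join(pieces)


def scan_fail_closed(
    text: str, detector: Callable[[str], List[Span]]
) -> Optional[str]:
    """Return redacted text, or None if scanning fails (hypothetical wrapper).

    None means "blocked": a detector or redaction error never lets the
    original text through unchanged.
    """
    try:
        return redact_spans(text, detector(text))
    except Exception:
        return None
```

A caller would treat a `None` result as a block and pass any other result downstream in place of the original text. Merging overlapping spans first keeps two detections of the same injection from producing two adjacent markers.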