Instructions to use memo-ozdincer/ODILE with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use memo-ozdincer/ODILE with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
ODILE β weight-level defenses against prompt injection in tool-using agents
ODILE is a family of LoRA adapters that defend tool-using LLM agents against indirect prompt injection β adversarial instructions smuggled into tool results (emails, web pages, documents) that the agent then reads and obeys. ODILE refuses injected instructions while leaving benign task behavior intact, and runs at 1Γ inference cost β no detector, no extra passes.
π» Code (training + evaluation): https://github.com/memo-ozdincer/ODILE
One adapter per backbone, all rank-16 / alpha-32 LoRA adapters:
| Adapter | Base model | LoRA layers | Size |
|---|---|---|---|
ODILE_Llama-3.1-8B |
meta-llama/Llama-3.1-8B-Instruct |
L12-22 | 33 MB |
ODILE_Qwen2.5-7B |
Qwen/Qwen2.5-7B-Instruct |
L12-22 | 37 MB |
ODILE_Qwen2.5-14B |
Qwen/Qwen2.5-14B-Instruct |
L18-33 | 53 MB |
ODILE_Qwen3-8B |
Qwen/Qwen3-8B |
L13-25 | 36 MB |
ODILE_Qwen3-32B |
Qwen/Qwen3-32B |
L24-44 | 103 MB |
ODILE_Qwen3-Next-80B |
Qwen/Qwen3-Next-80B-A3B-Thinking |
L18-33 | 27 MB |
ODILE_Llama-3.3-70B |
meta-llama/Llama-3.3-70B-Instruct |
L30-55 | 157 MB |
Headline result
On AgentDojo with Llama-3.3-70B, ODILE reduces attack-success rate from 14.04% to 0.01% while retaining benign utility (59.8% vs. 59.9% base), at 1Γ inference cost. The same recipe transfers across six Llama and Qwen backbones and to the out-of-distribution AgentDyn suites, where ODILE is the only zero-ASR defense to retain usable benign throughput.
Load any adapter
from peft import PeftModel
from transformers import AutoModelForCausalLM
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.3-70B-Instruct", torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base, "memo-ozdincer/ODILE", subfolder="ODILE_Llama-3.3-70B")
Citation
@misc{ozdincer2026odile,
title = {Weight-Level Defenses Improve LLM Prompt Injection Robustness},
author = {Ozdincer, Mehmet and Simko, Samuel and Sch\"olkopf, Bernhard and Jin, Zhijing},
year = {2026},
note = {Preprint, under review},
}
- Downloads last month
- -