---
license: mit
tags:
- security
- llamafile
- gguf
- vulnerability
- poc
---
# llamafile Inference-Time Backdoor via chat_template: PoC
**Security research proof-of-concept for a bug bounty submission on huntr.com.**
## What this is
`poc_chat_template_backdoor_v2.gguf` is a GGUF model file that demonstrates an
inference-time backdoor via a malicious `tokenizer.chat_template` metadata field
in llamafile v0.10.0.
When loaded with llamafile, the embedded Jinja-compatible template silently injects
a hidden system instruction into the model's prompt whenever any user message in the
conversation contains the trigger word `activate`. The model behaves completely
normally for all other inputs.
## Reproduction
```bash
pip install gguf jinja2 numpy
python poc_verify.py # all 7 checks pass in ~1 second
```
To test with a real llamafile binary:
```bash
# Positive control: injection fires
./llamafile -m poc_chat_template_backdoor_v2.gguf --cli --verbose-prompt \
-p "please activate the assistant"
# Negative control: no injection
./llamafile -m poc_chat_template_backdoor_v2.gguf --cli --verbose-prompt \
-p "what is the capital of France?"
```
## Scanner bypass
- ProtectAI ModelScan v0.8.8: no `.gguf` handler → full bypass
- ModelAudit: no Jinja2 gadget detection → bypass
- PickleScan: not a pickle file → bypass
- 24-keyword static ACE signature scan: 0 hits
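
Because none of the scanners above inspect the template itself, one possible stopgap is to audit the `tokenizer.chat_template` string before loading a model. The following is a minimal defensive sketch, not part of this PoC or of any listed scanner: the function name `find_content_triggers` and the regex are illustrative assumptions. It only flags Jinja conditionals that test message content for a literal substring, and a determined attacker could evade it. How the template string is extracted from the GGUF file is left out.

```python
import re

# Heuristic (assumption, not an official signature): a Jinja `if`/`elif` tag
# that checks whether a string literal occurs in something named "content",
# e.g.  {% if 'word' in message['content'] %}
CONTENT_TEST = re.compile(
    r"""\{%-?\s*(?:if|elif)\b[^%]*?(['"])(?P<literal>[^'"]+)\1\s+in\s+[^%]*?content[^%]*?%\}""",
    re.IGNORECASE,
)


def find_content_triggers(template: str) -> list[str]:
    """Return string literals that the template tests against message content."""
    return [m.group("literal") for m in CONTENT_TEST.finditer(template)]


if __name__ == "__main__":
    # Hypothetical templates for illustration only; the body is an inert placeholder.
    benign = (
        "{% for m in messages %}{% if m['role'] == 'user' %}"
        "{{ m['content'] }}{% endif %}{% endfor %}"
    )
    suspicious = (
        "{% for m in messages %}{% if 'activate' in m['content'] %}"
        "PLACEHOLDER{% endif %}{% endfor %}"
    )
    print(find_content_triggers(benign))      # []
    print(find_content_triggers(suspicious))  # ['activate']
```

A non-empty result is only a prompt for manual review: legitimate templates rarely branch on the text of a user message, but a static pattern like this cannot prove a template is clean.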
## Affected
llamafile v0.10.0 (all versions with Jinja2 support, since llama.cpp PR #18462)
## Responsible disclosure
Submitted to the huntr.com Model Format Vulnerability program.