Instructions to use Rushit21/llamafile-chat-template-poc with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Rushit21/llamafile-chat-template-poc with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="Rushit21/llamafile-chat-template-poc", filename="poc_chat_template_backdoor_v2.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Rushit21/llamafile-chat-template-poc with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: llama-cli -hf Rushit21/llamafile-chat-template-poc
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: llama-cli -hf Rushit21/llamafile-chat-template-poc
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: ./llama-cli -hf Rushit21/llamafile-chat-template-poc
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: ./build/bin/llama-cli -hf Rushit21/llamafile-chat-template-poc
Use Docker
docker model run hf.co/Rushit21/llamafile-chat-template-poc
- LM Studio
- Jan
- Ollama
How to use Rushit21/llamafile-chat-template-poc with Ollama:
ollama run hf.co/Rushit21/llamafile-chat-template-poc
- Unsloth Studio new
How to use Rushit21/llamafile-chat-template-poc with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Rushit21/llamafile-chat-template-poc to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Rushit21/llamafile-chat-template-poc to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for Rushit21/llamafile-chat-template-poc to start chatting
- Docker Model Runner
How to use Rushit21/llamafile-chat-template-poc with Docker Model Runner:
docker model run hf.co/Rushit21/llamafile-chat-template-poc
- Lemonade
How to use Rushit21/llamafile-chat-template-poc with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull Rushit21/llamafile-chat-template-poc
Run and chat with the model
lemonade run user.llamafile-chat-template-poc-{{QUANT_TAG}}List all available models
lemonade list
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Rushit21/llamafile-chat-template-poc# Run inference directly in the terminal:
llama-cli -hf Rushit21/llamafile-chat-template-pocInstall from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Rushit21/llamafile-chat-template-poc# Run inference directly in the terminal:
llama-cli -hf Rushit21/llamafile-chat-template-pocUse pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Rushit21/llamafile-chat-template-poc# Run inference directly in the terminal:
./llama-cli -hf Rushit21/llamafile-chat-template-pocBuild from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Rushit21/llamafile-chat-template-poc# Run inference directly in the terminal:
./build/bin/llama-cli -hf Rushit21/llamafile-chat-template-pocUse Docker
docker model run hf.co/Rushit21/llamafile-chat-template-pocllamafile Inference-Time Backdoor via chat_template β PoC
Security research proof-of-concept for a bug bounty submission on huntr.com.
What this is
poc_chat_template_backdoor_v2.gguf is a GGUF model file that demonstrates an
inference-time backdoor via a malicious tokenizer.chat_template metadata field
in llamafile v0.10.0.
When loaded with llamafile, the embedded Jinja-compatible template silently injects
a hidden system instruction into the model's prompt whenever any user message in the
conversation contains the trigger word activate. The model behaves completely
normally for all other inputs.
Reproduction
pip install gguf jinja2 numpy
python poc_verify.py # all 7 checks pass in ~1 second
To test with a real llamafile binary:
# Positive control β injection fires
./llamafile -m poc_chat_template_backdoor_v2.gguf --cli --verbose-prompt \
-p "please activate the assistant"
# Negative control β clean
./llamafile -m poc_chat_template_backdoor_v2.gguf --cli --verbose-prompt \
-p "what is the capital of France?"
Scanner bypass
- ProtectAI ModelScan v0.8.8: no
.ggufhandler β full bypass - ModelAudit: no Jinja2 gadget detection β bypass
- PickleScan: not a pickle file β bypass
- 24-keyword static ACE signature scan: 0 hits
Affected
llamafile v0.10.0 (all versions with Jinja2 support, since llama.cpp PR #18462)
Responsible disclosure
Submitted to huntr.com Model Format Vulnerability program.
- Downloads last month
- -
We're not able to determine the quantization variants.
# Gated model: Login with a HF token with gated access permission hf auth login