Instructions to use Rushit21/llamafile-chat-template-poc with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Rushit21/llamafile-chat-template-poc with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="Rushit21/llamafile-chat-template-poc", filename="poc_chat_template_backdoor_v2.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Rushit21/llamafile-chat-template-poc with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: llama-cli -hf Rushit21/llamafile-chat-template-poc
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: llama-cli -hf Rushit21/llamafile-chat-template-poc
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: ./llama-cli -hf Rushit21/llamafile-chat-template-poc
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf Rushit21/llamafile-chat-template-poc # Run inference directly in the terminal: ./build/bin/llama-cli -hf Rushit21/llamafile-chat-template-poc
Use Docker
docker model run hf.co/Rushit21/llamafile-chat-template-poc
- LM Studio
- Jan
- Ollama
How to use Rushit21/llamafile-chat-template-poc with Ollama:
ollama run hf.co/Rushit21/llamafile-chat-template-poc
- Unsloth Studio new
How to use Rushit21/llamafile-chat-template-poc with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Rushit21/llamafile-chat-template-poc to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Rushit21/llamafile-chat-template-poc to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for Rushit21/llamafile-chat-template-poc to start chatting
- Docker Model Runner
How to use Rushit21/llamafile-chat-template-poc with Docker Model Runner:
docker model run hf.co/Rushit21/llamafile-chat-template-poc
- Lemonade
How to use Rushit21/llamafile-chat-template-poc with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull Rushit21/llamafile-chat-template-poc
Run and chat with the model
lemonade run user.llamafile-chat-template-poc-{{QUANT_TAG}}List all available models
lemonade list
| license: mit | |
| tags: | |
| - security | |
| - llamafile | |
| - gguf | |
| - vulnerability | |
| - poc | |
| # llamafile Inference-Time Backdoor via chat_template β PoC | |
| **Security research proof-of-concept for a bug bounty submission on huntr.com.** | |
| ## What this is | |
| `poc_chat_template_backdoor_v2.gguf` is a GGUF model file that demonstrates an | |
| inference-time backdoor via a malicious `tokenizer.chat_template` metadata field | |
| in llamafile v0.10.0. | |
| When loaded with llamafile, the embedded Jinja-compatible template silently injects | |
| a hidden system instruction into the model's prompt whenever any user message in the | |
| conversation contains the trigger word `activate`. The model behaves completely | |
| normally for all other inputs. | |
| ## Reproduction | |
| ```bash | |
| pip install gguf jinja2 numpy | |
| python poc_verify.py # all 7 checks pass in ~1 second | |
| ``` | |
| To test with a real llamafile binary: | |
| ```bash | |
| # Positive control β injection fires | |
| ./llamafile -m poc_chat_template_backdoor_v2.gguf --cli --verbose-prompt \ | |
| -p "please activate the assistant" | |
| # Negative control β clean | |
| ./llamafile -m poc_chat_template_backdoor_v2.gguf --cli --verbose-prompt \ | |
| -p "what is the capital of France?" | |
| ``` | |
| ## Scanner bypass | |
| - ProtectAI ModelScan v0.8.8: no `.gguf` handler β full bypass | |
| - ModelAudit: no Jinja2 gadget detection β bypass | |
| - PickleScan: not a pickle file β bypass | |
| - 24-keyword static ACE signature scan: 0 hits | |
| ## Affected | |
| llamafile v0.10.0 (all versions with Jinja2 support, since llama.cpp PR #18462) | |
| ## Responsible disclosure | |
| Submitted to huntr.com Model Format Vulnerability program. | |