PoC: GGUF Heap OOB Read in UGM Tokenizer (Undersized Charsmap)

Format: GGUF
Target: llama.cpp (ggml-org/llama.cpp)
CWE: CWE-125 (Out-of-bounds Read)

Vulnerability

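The llm_tokenizer_ugm constructor performs a heap-buffer-overflow read when precompiled_charsmap is shorter than 4 bytes. The code reads 4 bytes from the start of the buffer as a uint32_t; in this PoC the buffer is only 3 bytes long, so the read runs 1 byte past the end of the heap allocation.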
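
To make the missing check concrete, here is a minimal sketch of a bounds-checked read of the 4-byte prefix. The helper name read_u32_prefix and its signature are hypothetical and do not exist in llama.cpp; the sketch only assumes the charsmap is held in a byte vector. It illustrates the class of fix, not the project's actual patch.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical helper (not part of llama.cpp): return the leading 4-byte
// value of a buffer only if the buffer is large enough to contain it.
std::optional<uint32_t> read_u32_prefix(const std::vector<char> & buf) {
    // The check that is missing for an undersized charsmap: a 3-byte
    // buffer cannot hold a uint32_t, so refuse to read from it.
    if (buf.size() < sizeof(uint32_t)) {
        return std::nullopt;
    }
    uint32_t value = 0;
    // memcpy reads exactly 4 bytes and avoids alignment and aliasing
    // problems that a pointer cast to uint32_t could introduce.
    std::memcpy(&value, buf.data(), sizeof(value));
    return value;
}
```

A caller would treat an empty result as a malformed model file and reject it, for example by throwing an error, instead of continuing to parse the charsmap.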

Reproduction

git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug \
  -DCMAKE_C_FLAGS="-fsanitize=address -fno-omit-frame-pointer" \
  -DCMAKE_CXX_FLAGS="-fsanitize=address -fno-omit-frame-pointer"
cmake --build . --target llama-tokenize -j$(nproc)
./bin/llama-tokenize -m ../poc_charsmap_undersize.gguf -p "hello"

ASan reports: heap-buffer-overflow READ at llama-vocab.cpp:809

Tested: llama.cpp commit 2b089c7 (tag b8083)
