PoC: Introspection Gap - dill._dill.loads BINBYTES Payload Undetected by PickleScan and ModelScan

⚠️ SECURITY RESEARCH: DO NOT LOAD THESE FILES IN AN UNTRUSTED ENVIRONMENT

This repository contains proof-of-concept pickle files that demonstrate a scanner introspection gap in PickleScan and ModelScan. The payloads are benign (each creates an empty canary file), but they demonstrate arbitrary code execution.

Summary

Two PoCs isolate the root cause: the scanners do not introspect the bytes argument of a REDUCE callable, so a payload embedded as opaque SHORT_BINBYTES is never analyzed. Together they form a controlled experiment:

| PoC | Carrier | PickleScan (PyPI) | PickleScan (HF) | Inner payload found? |
|---|---|---|---|---|
| poc_dill_nested.pkl | BINBYTES | BENIGN ✗ | SUSPICIOUS ✗ (same as LinearRegression) | No: BINBYTES not traversed |
| poc_torch_chain.pt | zip | BENIGN ✗ | MALICIOUS ✓ | Yes: zip traversal already implemented |

The HF scanner catching the torch chain shows that introspection of nested content works where it is implemented. The dill chain evades detection because PickleScan traverses zip archives but not pickles embedded in BINBYTES. Both PoCs achieve verified arbitrary code execution.

Reproduction

# Install dependencies
pip install dill torch picklescan modelscan

# --- PoC 1: dill._dill.loads ---
picklescan -p poc_dill_nested.pkl    # BENIGN (false negative)
modelscan -p poc_dill_nested.pkl     # BENIGN (false negative)
python poc_build.py --verify         # [+] CANARY FIRED: /tmp/canary_dill_bypass_poc

# --- PoC 2: torch.storage._load_from_bytes ---
picklescan -p poc_torch_chain.pt     # BENIGN (false negative)
python poc_torch_chain.py --verify   # [+] CANARY FIRED: /tmp/canary_torch_chain_bypass_poc

Root Cause

Both PickleScan and ModelScan extract (module, name) pairs from GLOBAL/STACK_GLOBAL/INST opcodes and check them against a denylist. Neither scanner recurses into the bytes argument passed to a REDUCE callable. Two callables are absent from all denylists:

  • dill._dill / dill._dill.loads: absent from PickleScan _unsafe_globals (lines 120–226) and ModelScan unsafe_globals (lines 94–142)
  • torch.storage._load_from_bytes: absent from both denylists; calls torch.load(io.BytesIO(b), weights_only=False) internally, re-entering pickle deserialization

A TODO comment at PickleScan scanner.py:229 acknowledges numpy.load, pandas.read_pickle, joblib.load, and torch.load as known unhandled nested loaders. Neither dill._dill.loads nor torch.storage._load_from_bytes appears in that list.
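To illustrate the gap described above, here is a minimal sketch of flat opcode-level global extraction built on the standard library's pickletools. It is not PickleScan's or ModelScan's code: the helper name extract_globals is hypothetical, and the harmless stand-ins (pickle.loads as the outer callable, len as the inner one) are assumptions chosen so that nothing dangerous is constructed. The point is only that the inner pickle travels as a bytes constant, so its globals never appear in the outer scan.

```python
import pickle
import pickletools

# Opcodes that push a text string; STACK_GLOBAL consumes the two most recent.
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def extract_globals(data: bytes) -> set:
    """Collect (module, name) pairs the way a flat opcode scan would.

    Simplification: strings fetched back from the memo (BINGET etc.) are
    not tracked, so this is an illustration, not a complete scanner.
    """
    found = set()
    strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports these as a single "module name" string.
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name in STRING_OPS:
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.add((strings[-2], strings[-1]))
    return found


class Inner:
    """Harmless inner object: unpickling it would call len("harmless")."""

    def __reduce__(self):
        return (len, ("harmless",))


class Outer:
    """Outer object whose REDUCE callable re-enters deserialization."""

    def __init__(self, blob: bytes):
        self.blob = blob

    def __reduce__(self):
        return (pickle.loads, (self.blob,))


inner_blob = pickle.dumps(Inner(), protocol=4)
outer_blob = pickle.dumps(Outer(inner_blob), protocol=4)

inner_globals = extract_globals(inner_blob)
outer_globals = extract_globals(outer_blob)

print("inner globals:", inner_globals)
print("outer globals:", outer_globals)
```

The outer scan reports only the loader callable; the inner callable is visible only when the embedded bytes are themselves parsed as a pickle.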

Payload Details

| PoC | Payload | Canary |
|---|---|---|
| poc_dill_nested.pkl | os.system('touch /tmp/canary_dill_bypass_poc') | /tmp/canary_dill_bypass_poc |
| poc_torch_chain.pt | posix.system('touch /tmp/canary_torch_chain_bypass_poc') | /tmp/canary_torch_chain_bypass_poc |

All payloads are benign: create an empty file, no destructive behavior.
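When re-running the PoCs, a stale canary from an earlier run can look like a fresh hit. The following sketch, with hypothetical helper names canary_status and clear_canaries, checks for the two canary paths listed above and removes them between runs. It does not load any pickle.

```python
from pathlib import Path

# Canary paths taken from the payload table above.
CANARIES = [
    Path("/tmp/canary_dill_bypass_poc"),
    Path("/tmp/canary_torch_chain_bypass_poc"),
]


def canary_status(paths) -> dict:
    """Map each canary path to whether the file currently exists."""
    return {str(p): Path(p).exists() for p in paths}


def clear_canaries(paths) -> None:
    """Delete leftover canaries so the next run starts from a clean state."""
    for p in paths:
        Path(p).unlink(missing_ok=True)


if __name__ == "__main__":
    for path, present in canary_status(CANARIES).items():
        print(f"{path}: {'present' if present else 'absent'}")
```

Calling clear_canaries(CANARIES) before each verification run makes a subsequent "present" result attributable to that run.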

Files

| File | Description |
|---|---|
| poc_dill_nested.pkl | 79-byte PoC 1 (pre-built) |
| poc_build.py | PoC 1 regeneration script (no binary deps) |
| poc_torch_chain.pt | 457-byte PoC 2 (pre-built) |
| poc_torch_chain.py | PoC 2 regeneration script (torch for verify) |
| README.md | This file |

Suggested Fix

Minimal fix: Add ('dill._dill', 'loads'), ('dill._dill', 'load'), and ('dill', 'loads') to the scanner denylists.
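As a rough illustration of the minimal fix, the sketch below models a denylist as a mapping from module name to a set of denied attribute names, with "*" meaning every name in the module. This layout, the BASE_DENYLIST contents, and the helpers merge_denylists and is_denied are assumptions for illustration and do not reproduce either scanner's actual data structures.

```python
# Hypothetical baseline; real scanner denylists are much longer.
BASE_DENYLIST = {
    "os": {"*"},
    "posix": {"*"},
}

# The three entries proposed in the minimal fix.
DILL_ADDITIONS = {
    "dill._dill": {"loads", "load"},
    "dill": {"loads"},
}


def merge_denylists(base: dict, extra: dict) -> dict:
    """Return a new denylist containing the entries of both inputs."""
    merged = {}
    for module in base.keys() | extra.keys():
        merged[module] = set(base.get(module, ())) | set(extra.get(module, ()))
    return merged


def is_denied(module: str, name: str, denylist: dict) -> bool:
    """True if (module, name) is denied explicitly or by a "*" wildcard."""
    names = denylist.get(module)
    return names is not None and ("*" in names or name in names)


patched = merge_denylists(BASE_DENYLIST, DILL_ADDITIONS)

print(is_denied("dill._dill", "loads", BASE_DENYLIST))
print(is_denied("dill._dill", "loads", patched))
```

A check of this kind only covers the names it lists, which is why the structural fix below targets the whole class of nested loaders.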

Structural fix: Implement recursive inner-pickle scanning for all callables that re-enter deserialization (pickle.loads, dill.loads, cloudpickle.loads, joblib.load, numpy.load, pandas.read_pickle, torch.load, torch.storage._load_from_bytes, etc.). This addresses the entire nested-deserialization weakness class rather than individual qualnames.
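One way the structural fix could look is sketched below, again using only the standard library. It is a heuristic under stated assumptions, not either scanner's implementation: rather than tracking which callable receives a bytes argument, it tries to parse every bytes constant as a pickle and merges whatever globals it finds. The function name scan_recursive and the max_depth limit are hypothetical, and bytes that only partially parse as a pickle may yield spurious pairs.

```python
import pickle
import pickletools

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}
BYTES_OPS = {"SHORT_BINBYTES", "BINBYTES", "BINBYTES8", "BYTEARRAY8"}


def scan_recursive(data: bytes, depth: int = 0, max_depth: int = 8) -> set:
    """Collect (module, name) pairs, descending into embedded bytes."""
    found = set()
    strings = []
    try:
        for opcode, arg, _pos in pickletools.genops(data):
            if opcode.name in ("GLOBAL", "INST"):
                module, name = arg.split(" ", 1)
                found.add((module, name))
            elif opcode.name in STRING_OPS:
                strings.append(arg)
            elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
                found.add((strings[-2], strings[-1]))
            elif opcode.name in BYTES_OPS and depth < max_depth:
                # Treat every bytes constant as a candidate nested pickle.
                found |= scan_recursive(bytes(arg), depth + 1, max_depth)
    except Exception:
        # Not a well-formed pickle: keep whatever was seen before the error.
        pass
    return found


class Inner:
    """Harmless stand-in for an inner payload."""

    def __reduce__(self):
        return (len, ("harmless",))


class Outer:
    """Harmless stand-in for a carrier that re-enters deserialization."""

    def __init__(self, blob: bytes):
        self.blob = blob

    def __reduce__(self):
        return (pickle.loads, (self.blob,))


outer_blob = pickle.dumps(Outer(pickle.dumps(Inner(), protocol=4)), protocol=4)
nested_globals = scan_recursive(outer_blob)
print(nested_globals)
```

With this traversal the inner callable is reported alongside the outer loader, so a denylist check can be applied to both without knowing in advance which loader carried the payload.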

Responsible Disclosure

This PoC was reported to the maintainers of PickleScan and ModelScan via Huntr before public disclosure. Access to this repository is gated; protectai-bot has been granted access for triage.
