PoC: Introspection Gap — `dill._dill.loads` BINBYTES Payload Undetected by PickleScan and ModelScan

⚠️ SECURITY RESEARCH — DO NOT LOAD THESE FILES IN AN UNTRUSTED ENVIRONMENT

This repository contains proof-of-concept pickle files that demonstrate a scanner introspection gap in PickleScan and ModelScan. The payloads are benign (each creates an empty canary file), but they demonstrate arbitrary code execution.

Summary

Two PoCs isolate the root cause: the scanners do not inspect the bytes argument passed to a REDUCE callable, so a payload embedded as an opaque bytes object (SHORT_BINBYTES or BINBYTES) is never analyzed. Together, the two PoCs form a controlled experiment:

PoC	Carrier	PickleScan (PyPI)	PickleScan (HF)	Inner payload found?
`poc_dill_nested.pkl`	BINBYTES	BENIGN ✗	SUSPICIOUS ✗ (same as `LinearRegression`)	No — BINBYTES not traversed
`poc_torch_chain.pt`	zip	BENIGN ✗	MALICIOUS ✓	Yes — zip traversal already implemented

The HF-hosted PickleScan flags the torch chain, which shows that inspection of nested content works where it is implemented. The dill chain evades detection because PickleScan traverses zip archives but not pickles embedded in BINBYTES arguments. Both PoCs achieve verified arbitrary code execution.
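To see which opcode carries an embedded bytes object without loading a file, the stream can be walked with the standard-library `pickletools.genops`, which parses opcodes but never executes them. The following is a minimal sketch; the helper name `bytes_carriers` is hypothetical and the sample input is a harmless bytes object, not one of the PoC files.

```python
import pickle
import pickletools


def bytes_carriers(data: bytes) -> list:
    """Return (offset, opcode name, payload length) for each bytes-like opcode argument.

    Hypothetical helper for illustration; it only parses opcodes and never unpickles.
    """
    carriers = []
    for opcode, arg, pos in pickletools.genops(data):
        if isinstance(arg, (bytes, bytearray)):
            carriers.append((pos, opcode.name, len(arg)))
    return carriers


# Harmless samples: a short and a long bytes object pickled with protocol 4.
short_sample = pickle.dumps(b"\x00" * 10, protocol=4)
long_sample = pickle.dumps(b"\x00" * 300, protocol=4)

print(bytes_carriers(short_sample))
print(bytes_carriers(long_sample))
```

Bytes objects shorter than 256 bytes are written as SHORT_BINBYTES and longer ones as BINBYTES, so a scanner that wants to inspect embedded payloads has to handle both opcodes (plus BINBYTES8 and BYTEARRAY8 for very large or bytearray values).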

Reproduction

# Install dependencies
pip install dill torch picklescan modelscan

# --- PoC 1: dill._dill.loads ---
picklescan -p poc_dill_nested.pkl    # BENIGN (false negative)
modelscan -p poc_dill_nested.pkl     # BENIGN (false negative)
python poc_build.py --verify         # [+] CANARY FIRED: /tmp/canary_dill_bypass_poc

# --- PoC 2: torch.storage._load_from_bytes ---
picklescan -p poc_torch_chain.pt     # BENIGN (false negative)
python poc_torch_chain.py --verify   # [+] CANARY FIRED: /tmp/canary_torch_chain_bypass_poc

Root Cause

Both PickleScan and ModelScan extract (module, name) pairs from GLOBAL/STACK_GLOBAL/INST opcodes and check them against a denylist. Neither scanner recurses into the bytes argument passed to a REDUCE callable. Two callables are absent from all denylists:

dill._dill / dill._dill.loads — absent from PickleScan _unsafe_globals (lines 120–226) and ModelScan unsafe_globals (lines 94–142)
torch.storage._load_from_bytes — absent from both denylists; calls torch.load(io.BytesIO(b), weights_only=False) internally, re-entering pickle deserialization

A TODO comment at PickleScan scanner.py:229 acknowledges numpy.load, pandas.read_pickle, joblib.load, and torch.load as known unhandled nested loaders. Neither dill._dill.loads nor torch.storage._load_from_bytes appears in that acknowledged list.
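The flat extraction described in this section can be sketched with `pickletools.genops`. This is an illustration of the general approach under simplifying assumptions, not the scanners' actual code: the function name `extract_globals` is hypothetical, and STACK_GLOBAL is resolved from the two most recent string pushes, which ignores strings fetched from the memo.

```python
import pickle
import pickletools
from collections import OrderedDict


def extract_globals(data: bytes) -> set:
    """Collect (module, name) pairs referenced by GLOBAL, INST and STACK_GLOBAL.

    Hypothetical helper. Bytes arguments are skipped entirely, which is the
    gap this section describes.
    """
    found = set()
    strings = []  # recent string pushes, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # genops reports the argument as "module name" in one string.
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.add((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


# Harmless samples: protocol 4 uses STACK_GLOBAL, the handwritten stream uses GLOBAL.
modern = pickle.dumps(OrderedDict(), protocol=4)
legacy = b"ccollections\nOrderedDict\n."

print(extract_globals(modern))
print(extract_globals(legacy))
```

Because the loop only looks at string-valued arguments, any (module, name) pair that lives inside a bytes argument never reaches the denylist check.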

Payload Details

PoC	Payload	Canary
`poc_dill_nested.pkl`	`os.system('touch /tmp/canary_dill_bypass_poc')`	`/tmp/canary_dill_bypass_poc`
`poc_torch_chain.pt`	`posix.system('touch /tmp/canary_torch_chain_bypass_poc')`	`/tmp/canary_torch_chain_bypass_poc`

All payloads are benign: each creates an empty file and has no destructive behavior.

Files

File	Description
`poc_dill_nested.pkl`	79-byte PoC 1 (pre-built)
`poc_build.py`	PoC 1 regeneration script (no binary dependencies)
`poc_torch_chain.pt`	457-byte PoC 2 (pre-built)
`poc_torch_chain.py`	PoC 2 regeneration script (torch needed for `--verify`)
`README.md`	This file

Suggested Fix

Minimal fix: Add ('dill._dill', 'loads'), ('dill._dill', 'load'), and ('dill', 'loads') to the scanner denylists.
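As a rough illustration of the minimal fix, the extra entries can be expressed as a module-to-names mapping and checked per extracted global. The dict-of-sets shape and the names `EXTRA_UNSAFE_GLOBALS` and `is_denied` are assumptions made for this sketch; the scanners' real data structures may differ.

```python
# Hypothetical denylist fragment holding only the entries proposed above.
EXTRA_UNSAFE_GLOBALS = {
    "dill._dill": {"loads", "load"},
    "dill": {"loads"},
}


def is_denied(module: str, name: str, denylist=EXTRA_UNSAFE_GLOBALS) -> bool:
    """Return True if (module, name) is listed; "*" marks a whole module as unsafe."""
    names = denylist.get(module)
    if names is None:
        return False
    return names == "*" or name in names


print(is_denied("dill._dill", "loads"))
print(is_denied("collections", "OrderedDict"))
```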

Structural fix: Implement recursive inner-pickle scanning for all callables that re-enter deserialization (pickle.loads, dill.loads, cloudpickle.loads, joblib.load, numpy.load, pandas.read_pickle, torch.load, torch.storage._load_from_bytes, etc.). This addresses the entire nested-deserialization weakness class rather than individual qualnames.
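The structural fix can be sketched as a scan that recurses into every bytes argument that itself parses as a pickle. This is a minimal sketch, not a patch for either scanner: `scan_globals`, `BYTES_OPCODES`, and the depth limit are hypothetical, STACK_GLOBAL resolution is simplified, and Python 2 style string opcodes are not covered. The test input is harmless (a nested `OrderedDict` pickle passed to `pickle.loads`) and is only parsed, never loaded.

```python
import pickle
import pickletools
from collections import OrderedDict

# Opcodes whose argument is a bytes-like object that may hide an inner pickle.
BYTES_OPCODES = {"SHORT_BINBYTES", "BINBYTES", "BINBYTES8", "BYTEARRAY8"}


def scan_globals(data: bytes, recurse: bool = True, max_depth: int = 8) -> set:
    """Return (module, name) pairs referenced by a pickle stream.

    With recurse=True, bytes arguments that parse as pickles are scanned too.
    Hypothetical helper; it parses opcodes only and never unpickles.
    """
    found = set()
    strings = []  # recent string pushes, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.add((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
        elif recurse and max_depth > 0 and opcode.name in BYTES_OPCODES:
            try:
                found |= scan_globals(bytes(arg), True, max_depth - 1)
            except Exception:
                pass  # not a parseable pickle; treat the bytes as opaque data
    return found


# Harmless nested sample: the outer stream passes an inner pickle to pickle.loads.
inner = pickle.dumps(OrderedDict(), protocol=4)


class _NestedLoader:
    def __reduce__(self):
        return (pickle.loads, (inner,))


outer = pickle.dumps(_NestedLoader(), protocol=4)

print(sorted(scan_globals(outer, recurse=False)))  # outer callable only
print(sorted(scan_globals(outer)))                 # outer callable plus inner globals
```

Recursing on anything that parses, rather than only on the arguments of known loaders, avoids maintaining a list of re-entrant callables, at the cost of occasional false positives when ordinary binary data happens to form a valid opcode stream.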

Responsible Disclosure

These PoCs were reported to the maintainers of PickleScan and ModelScan via Huntr before public disclosure. Access to this repository is gated; protectai-bot has been granted access for triage.
