Deeplearning4j model-file RCE -- security research PoC (huntr MFV)

DISCLAIMER / SAFETY. This repository is a security-research proof-of-concept for the huntr Model File Vulnerabilities (MFV) program. malicious-dl4j-model.zip is a crafted Deeplearning4j (DL4J) model that executes code when loaded by the standard DL4J load API (ModelSerializer.restoreMultiLayerNetwork) on a classpath that carries a common Java deserialization gadget (e.g. the Spark/Hadoop stack DL4J is normally deployed on). The embedded payload is benign: it writes a marker file and runs id. Do not load untrusted model files. This repo should be Gated or Private; the Protect AI scan still runs on gated/private models, so the evasion demonstration holds.

Summary

Unauthenticated deserialization of untrusted data (CWE-502) in DL4J ModelSerializer. The documented primary model-load API reads the preprocessor.bin entry from the model .zip and passes it straight to new ObjectInputStream(...).readObject() with no ObjectInputFilter, no allowlist, no class check -- before the result is cast and before any config/weights validation. Loading a malicious model file therefore deserializes attacker-controlled Java objects, achieving remote code execution when a deserialization gadget is present on the classpath (the case in DL4J's canonical Spark/Hadoop deployment).

  • Format (huntr MFV in scope): Deeplearning4j model archive (.zip)
  • Affected: all releases through 1.0.0-M2.1 (the latest and final release; the project is dormant, so the issue is unpatched)
  • Class: CWE-502, deserialization at model-load time
  • Sink: deeplearning4j-nn .../util/ModelSerializer.java, loadZipData (reached from restoreMultiLayerNetwork / restoreComputationGraph):
ZipEntry preprocessor = zipFile.getEntry(PREPROCESSOR_BIN);   // "preprocessor.bin"
ObjectInputStream ois = new ObjectInputStream(zipFile.getInputStream(preprocessor)); // no filter
preProcessor = (DataSetPreProcessor) ois.readObject();        // gadget executes HERE, before the cast
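The ordering noted in the sink above (readObject completes before the cast is checked) can be illustrated with a small, benign sketch that does not involve DL4J. All names here (CastAfterReadDemo, Noisy, ExpectedType) are hypothetical stand-ins; the only "side effect" is a counter increment inside a custom readObject hook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class CastAfterReadDemo {
    // Counts how many times the deserialization hook below has run.
    static int sideEffects = 0;

    // Hypothetical stand-in for the type the loader expects (DataSetPreProcessor in DL4J).
    interface ExpectedType {}

    // A serializable class of the WRONG type whose readObject hook has a harmless side effect.
    static final class Noisy implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffects++; // runs during readObject(), before the caller can cast
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    // Mirrors the read-then-cast pattern; returns false when the cast fails.
    static boolean loadAsExpected(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            ExpectedType ignored = (ExpectedType) ois.readObject();
            return true;
        } catch (ClassCastException e) {
            return false; // too late: the object's hooks have already executed
        }
    }
}
```

Calling loadAsExpected(serialize(new Noisy())) fails the cast, yet sideEffects has already been incremented. The cast therefore offers no protection against code that runs during deserialization.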

Bypasses Protect AI modelscan (the MFV value)

Protect AI's own scanner, modelscan (v0.8.8), was run against this model and against a malicious pickle control. The DL4J model passes clean: modelscan has no Java-deserialization check, so it runs its pickle/PyTorch opcode scanner on preprocessor.bin and finds nothing. The pickle control is correctly flagged CRITICAL, which shows the scanner itself is functioning. Full output is in poc/modelscan_evasion.txt:

# malicious DL4J model:
Scanning .../malicious-dl4j-model.zip:preprocessor.bin using modelscan.scanners.PyTorchUnsafeOpScan
--- Summary ---
 No issues found!

# malicious pickle (control):
--- Summary ---
Total Issues: 1   (CRITICAL: 1 -- "unsafe operator 'system' from module 'posix'")

A model carrying this payload can be uploaded to a model hub and would evade the Protect AI safety scan -- the exact attack class the MFV program rewards.
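On the defensive side, the gap described above could in principle be narrowed with a very simple heuristic: flag any archive entry that begins with the Java serialization stream header (bytes AC ED 00 05, i.e. STREAM_MAGIC followed by STREAM_VERSION). The sketch below is an assumption about what such a check might look like, not how modelscan works. The class and method names are hypothetical, and the check cannot distinguish a benign serialized object from a malicious one; it only reports that native Java serialization is present.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class JavaSerializationScanner {
    // True if the buffer starts with STREAM_MAGIC (0xACED) and STREAM_VERSION (0x0005).
    static boolean startsWithSerializationHeader(byte[] head, int length) {
        return length >= 4
                && (head[0] & 0xFF) == 0xAC
                && (head[1] & 0xFF) == 0xED
                && head[2] == 0x00
                && head[3] == 0x05;
    }

    // Returns the names of all zip entries whose content begins with the serialization header.
    static List<String> findSerializedEntries(InputStream zipBytes) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(zipBytes)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] head = new byte[4];
                int n = 0;
                // Read up to 4 bytes of this entry; read() may return fewer than requested.
                while (n < 4) {
                    int r = zin.read(head, n, 4 - n);
                    if (r < 0) {
                        break;
                    }
                    n += r;
                }
                if (startsWithSerializationHeader(head, n)) {
                    hits.add(entry.getName());
                }
            }
        }
        return hits;
    }
}
```

Running findSerializedEntries over a DL4J archive would list preprocessor.bin whenever it holds a serialized Java object, which is enough to route the file to manual review or to a stricter loader.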

Reproduce the code execution

Verified on Temurin JDK 8 (1.8.0_492) and 11 (11.0.31). Build a realistic DL4J + Spark/Hadoop classpath (which transitively ships commons-beanutils:1.7.0, a working gadget), then call the documented load API on the model. Sources to rebuild the gadget and run the victim are in poc/ (run_poc_spark_jdk.sh, MakeEvilModelRaw.java, GenBeanutils1.java, EvilTranslet.java.tmpl, RunLoad.java). The victim RunLoad.java does nothing but call ModelSerializer.restoreMultiLayerNetwork(file) -- it contains zero command-execution code; the execution comes entirely from deserializing the model file.

Observed (benign payload), command run from inside readObject:

[*] Victim calling ModelSerializer.restoreMultiLayerNetwork(malicious-dl4j-model.zip)
*** RCE: marker file written during restoreMultiLayerNetwork ***
uid=1000(...) gid=1000(...) groups=...        # live `id` output produced by the gadget

A benign-model control (a preprocessor.bin that is an ordinary serialized object) loaded via the same API produces no execution -- isolating the code execution to the malicious deserialized object.

Impact and severity

Unauthenticated CWE-502 deserialization of untrusted data, reachable from the primary load API. Per established scoring convention for unauthenticated ObjectInputStream.readObject on untrusted input, the primitive is scored High/Critical regardless of the specific victim gadget (comparable: CVE-2024-52046 Apache MINA 9.8/10.0; CVE-2017-1000353 Jenkins 9.8; CVE-2015-4852 WebLogic 9.8; CVE-2016-1000027 Spring 9.8). RCE is demonstrated on DL4J's documented production topology (Spark-on-Hadoop), where a gadget is an unavoidable transitive dependency.

Scope note

The RCE is gadget-conditional. A strictly single-node, inference-only install whose classpath contains only deeplearning4j-core + nd4j-native-platform (verified to ship no deserialization gadget) yields a ClassCastException after readObject, i.e. denial of service rather than code execution on that minimal classpath. A JDK-only chain (no third-party libraries) was tested and does not fire on modern JDK 8u/11. The unsafe readObject primitive itself is, however, reached on every classpath (shown with a URLDNS payload, which relies only on JDK classes and fully deserialized), so CWE-502 applies universally; only the RCE gadget is deployment-specific. This is not claimed as unconditional/JDK-only RCE.

Remediation

Replace the unfiltered ObjectInputStream with a strict ObjectInputFilter allowlist limited to the expected DataSetPreProcessor types, or stop using Java serialization for preprocessor.bin (serialize it as JSON like the rest of the model). Reject unknown classes before instantiation.
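As a rough illustration of the allowlist approach described above, the following sketch attaches an ObjectInputFilter that rejects every class not explicitly named. It assumes Java 9 or later (where java.io.ObjectInputFilter is a public API) and is not the DL4J maintainers' fix. FilteredPreprocessorReader and its methods are hypothetical names, and the allowlist contents are left to the caller; in DL4J they would be the concrete DataSetPreProcessor implementations and the field types they serialize.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputFilter;
import java.io.ObjectInputFilter.Status;
import java.io.ObjectInputStream;
import java.util.Set;

final class FilteredPreprocessorReader {
    // Builds a filter that allows only the exact class names given; everything else is rejected.
    static ObjectInputFilter allowlist(Set<String> allowedClassNames) {
        return info -> {
            Class<?> c = info.serialClass();
            if (c == null) {
                return Status.UNDECIDED; // depth/reference/size checks carry no class
            }
            while (c.isArray()) {
                c = c.getComponentType(); // judge arrays by their element type
            }
            if (c.isPrimitive()) {
                return Status.ALLOWED;
            }
            return allowedClassNames.contains(c.getName()) ? Status.ALLOWED : Status.REJECTED;
        };
    }

    // Reads one object; a class outside the allowlist raises InvalidClassException
    // before that class is instantiated.
    static Object readFiltered(InputStream in, Set<String> allowedClassNames)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            ois.setObjectInputFilter(allowlist(allowedClassNames));
            return ois.readObject();
        }
    }
}
```

The filter is consulted for every class descriptor in the stream, including superclasses, so the allowlist has to name each class in the expected object graph. An exact-name set is deliberately stricter than package wildcards, which tend to readmit gadget classes.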

Dedupe

No valid CVE/GHSA. Verified via raw APIs: the GitHub advisory database has 0 advisories for deeplearning4j; the only DL4J CVE (CVE-2022-36022) is an unrelated unclaimed-S3-bucket issue in NLP examples; CVE-2025-53001 is NVD-Rejected ("issued in error") and not about DL4J. Appears novel.

Files

  • malicious-dl4j-model.zip -- the PoC model (benign touch + id payload)
  • poc/ -- sources to rebuild the gadget and reproduce, plus modelscan_evasion.txt