Ray RLlib MessagePack Checkpoint Restore Reaches msgpack-numpy Pickle Execution
Summary
This repository contains a benign proof of concept for a Ray RLlib checkpoint
restore path that loads MessagePack checkpoint state through msgpack-numpy.
Ray RLlib's checkpoint utility imports msgpack_numpy and calls
msgpack_numpy.patch(). With that patch active, MessagePack objects encoded as
NumPy object dtype arrays are decoded by msgpack-numpy through an embedded
pickle blob. A malicious RLlib-style checkpoint state file can therefore carry
a pickle payload that executes during normal checkpoint restore.
The PoC payload is intentionally benign: it uses pathlib.Path.write_text to
write a local marker file containing RAY_RLLIB_MSGPACK_NUMPY_MARKER. It does
not spawn a shell, make network connections, persist beyond the marker file, or
perform destructive actions.
This report concerns a mismatch between Ray RLlib's checkpoint restore behavior
and scanner coverage. It is not a generic claim that "pickle is unsafe"; the
reportable path is that RLlib's MessagePack checkpoint restore reaches the
msgpack-numpy object-dtype pickle decoder when loading a checkpoint artifact.
Impact
- Artifact-carried benign code execution during RLlib checkpoint restore.
- Triggered through Checkpointable.restore_from_path() on state.msgpack.
- The same payload shape also triggers through RLlib-style algorithm_state.msgpck and policy_state.msgpck loads when using the patched MessagePack module returned by try_import_msgpack().
- ModelScan 0.8.8 skips .msgpack/.msgpck checkpoint state files with zero findings.
- If the embedded pickle is extracted as .pkl, ModelScan flags it as CRITICAL, demonstrating that the dangerous payload is present but hidden inside an unsupported MessagePack checkpoint container.
Affected Versions Tested
- Python 3.12.3
- Ray 2.55.1
- msgpack 1.1.2
- msgpack-numpy 0.4.8
- NumPy 2.4.4
- ModelScan 0.8.8
See evidence/python_freeze.txt for the local environment package snapshot.
Files
ray_rllib_msgpack_checkpoint_poc/state.msgpack
Primary RLlib checkpoint state file consumed by Checkpointable.restore_from_path().
ray_rllib_msgpack_checkpoint_poc/algorithm_state.msgpck
Same MessagePack payload with RLlib algorithm-state filename convention.
ray_rllib_msgpack_checkpoint_poc/policy_state.msgpck
Same MessagePack payload with RLlib policy-state filename convention.
ray_rllib_msgpack_checkpoint_poc.zip
Zip packaging of the checkpoint directory for scanner/container behavior checks.
verify_ray_rllib_msgpack_checkpoint_poc.py
Benign local verifier.
scripts/gen_ray_rllib_msgpack_checkpoint_poc.py
Generator used to create the PoC artifacts.
evidence/ray_rllib_extracted_embedded_pickle.pkl
Extracted embedded pickle, included only for scanner comparison and disassembly.
evidence/ray_rllib_embedded_pickle_disassembly.txt
Disassembly of the embedded benign pickle payload.
evidence/ray_rllib_msgpack_checkpoint_runtime_output.txt
Captured runtime output from the local verification run.
evidence/modelscan_ray_rllib_msgpack_outputs.txt
Captured ModelScan output for MessagePack checkpoint artifacts.
evidence/modelscan_ray_rllib_extracted_pickle_output.txt
Captured ModelScan output for the extracted pickle payload.
Reproduction
Install matching dependencies in an isolated environment, then run:
python verify_ray_rllib_msgpack_checkpoint_poc.py
Expected output highlights:
python 3.12.3
ray 2.55.1
msgpack 1.1.2
msgpack_numpy 0.4.8
checkpoint_state .../ray_rllib_msgpack_checkpoint_poc/state.msgpack
sha256 32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e
restored_state {b'weights': 31, b'safe_metadata': b'ray-rllib-msgpack-checkpoint'}
marker_exists True
marker_text RAY_RLLIB_MSGPACK_NUMPY_MARKER
algorithm_state.msgpck_marker True RAY_RLLIB_MSGPACK_NUMPY_MARKER
policy_state.msgpck_marker True RAY_RLLIB_MSGPACK_NUMPY_MARKER
The marker file is written locally at lab/ray_rllib_checkpoint_marker.txt
under the current working directory by the benign pickle payload. The verifier
creates the lab/ directory automatically.
Scanner Behavior
Run ModelScan against the checkpoint state file:
modelscan -p ray_rllib_msgpack_checkpoint_poc/state.msgpack -r json --show-skipped
Observed locally:
{
  "total_issues": 0,
  "scanned": {"total_scanned": 0},
  "skipped": {
    "total_skipped": 1,
    "skipped_files": [
      {
        "category": "SCAN_NOT_SUPPORTED",
        "source": "state.msgpack"
      }
    ]
  }
}
Run ModelScan against the extracted embedded pickle:
modelscan -p evidence/ray_rllib_extracted_embedded_pickle.pkl -r json --show-skipped
Observed locally:
{
  "total_issues_by_severity": {
    "CRITICAL": 1
  },
  "issues": [
    {
      "description": "Use of unsafe operator 'getattr' from module 'builtins'",
      "operator": "getattr",
      "module": "builtins",
      "severity": "CRITICAL"
    }
  ]
}
This demonstrates a scanner/runtime mismatch: RLlib consumes the MessagePack checkpoint and reaches the embedded pickle path, while ModelScan skips the MessagePack artifact itself.
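A zero-issue report is only meaningful when something was actually scanned. The sketch below shows one way a pipeline could treat a report like the first one above as inconclusive rather than clean. The helper names are hypothetical, and the key layout is an assumption based on the excerpted output shown here; the sketch also looks under a "summary" key in case the full ModelScan report nests these fields.

```python
import json
from typing import Any, Dict, List


def _summary(report: Dict[str, Any]) -> Dict[str, Any]:
    # Assumption: the counters sit either at the top level (as excerpted in
    # this README) or under a "summary" key in the full report.
    return report.get("summary", report)


def unscanned_sources(report: Dict[str, Any]) -> List[str]:
    """Return the sources the scanner reported as skipped."""
    skipped = _summary(report).get("skipped", {})
    return [entry.get("source", "<unknown>")
            for entry in skipped.get("skipped_files", [])]


def is_conclusive(report: Dict[str, Any]) -> bool:
    """True only if at least one file was scanned and none were skipped."""
    summary = _summary(report)
    scanned = summary.get("scanned", {}).get("total_scanned", 0)
    skipped = summary.get("skipped", {}).get("total_skipped", 0)
    return scanned > 0 and skipped == 0


if __name__ == "__main__":
    observed = json.loads("""
    {"total_issues": 0,
     "scanned": {"total_scanned": 0},
     "skipped": {"total_skipped": 1,
                 "skipped_files": [{"category": "SCAN_NOT_SUPPORTED",
                                    "source": "state.msgpack"}]}}
    """)
    print("conclusive:", is_conclusive(observed))
    print("not scanned:", unscanned_sources(observed))
```

With the report observed for state.msgpack, is_conclusive() returns False, so the artifact would be routed to manual review instead of being treated as passing.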
Root Cause
RLlib's checkpoint utility imports and patches MessagePack with
msgpack-numpy. In Ray 2.55.1, try_import_msgpack() calls
msgpack_numpy.patch(). RLlib checkpoint restore paths then load MessagePack
state with that patched module.
msgpack-numpy's object-dtype decoder treats MessagePack maps with NumPy
object-array metadata as pickle containers and calls pickle.loads() on the
embedded data field. The PoC places a benign pickle in that field.
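The decoder behavior described above suggests a defensive pre-check: decode the checkpoint with plain msgpack and flag object-dtype maps before any patched loader sees the file. The following is a minimal sketch, not a complete scanner. It assumes the map keys nd, type, and data (as bytes or str), which is how msgpack-numpy's array encoding is understood to look; verify the key names against the installed version. The function names are hypothetical. The data field is only measured and never passed to pickle.

```python
from typing import Any, Iterator, List, Tuple


def _as_text(value: Any) -> str:
    """Normalize bytes or str metadata to str for comparison."""
    if isinstance(value, bytes):
        return value.decode("latin-1")
    return str(value)


def find_object_dtype_blobs(node: Any, path: str = "$") -> Iterator[Tuple[str, int]]:
    """Yield (path, blob_size) for maps that look like object-dtype arrays.

    Heuristic: a map with nd/type/data keys whose type descriptor contains
    "O" and whose data is a byte string. This may over-report.
    """
    if isinstance(node, dict):
        keys = {_as_text(key): key for key in node}
        if {"nd", "type", "data"} <= keys.keys():
            dtype_descr = _as_text(node[keys["type"]])
            data = node[keys["data"]]
            if "O" in dtype_descr and isinstance(data, (bytes, bytearray)):
                yield path, len(data)
                return
        for key, value in node.items():
            yield from find_object_dtype_blobs(value, f"{path}.{_as_text(key)}")
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            yield from find_object_dtype_blobs(item, f"{path}[{index}]")


def scan_file(path: str) -> List[Tuple[str, int]]:
    """Decode with plain msgpack and report suspicious maps.

    Run this in a process where msgpack_numpy.patch() has NOT been applied
    (for example, one that has not imported RLlib's checkpoint utilities);
    otherwise the patched decoder would handle the maps first.
    """
    import msgpack  # plain msgpack only

    with open(path, "rb") as handle:
        decoded = msgpack.unpackb(handle.read(), raw=True, strict_map_key=False)
    return list(find_object_dtype_blobs(decoded))
```

A non-empty result from scan_file() on a checkpoint state file indicates that restoring it through the patched module would reach the pickle path, so the file should not be loaded from an untrusted source.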
Relevant local source paths from verification:
ray/rllib/utils/checkpoints.py
try_import_msgpack() patches msgpack with msgpack_numpy.
Checkpointable.restore_from_path() reads state.msgpack with msgpack.load(...).
ray/rllib/algorithms/algorithm.py
RLlib algorithm checkpoint state can be loaded from algorithm_state.msgpck.
ray/rllib/policy/policy.py
RLlib policy checkpoint state can be loaded from policy_state.msgpck.
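To confirm locally that the patch is active in a given process, one option is to inspect where the module's decode entry points are defined. The sketch below is an assumption-laden check, not an official API: it relies on msgpack_numpy.patch() replacing attributes on the msgpack module with objects defined in msgpack_numpy, so that their __module__ attribute names that package. The helper name and the list of entry points are hypothetical.

```python
from typing import Any, List, Sequence

# Entry points assumed to be replaced by msgpack_numpy.patch().
DECODE_ENTRY_POINTS = ("load", "loads", "unpack", "unpackb", "Unpacker")


def patched_entry_points(module: Any,
                         names: Sequence[str] = DECODE_ENTRY_POINTS,
                         marker: str = "msgpack_numpy") -> List[str]:
    """Return the attribute names on `module` that are defined in `marker`."""
    found = []
    for name in names:
        attr = getattr(module, name, None)
        defined_in = getattr(attr, "__module__", None) or ""
        if defined_in.startswith(marker):
            found.append(name)
    return found
```

For example, calling patched_entry_points(msgpack) before and after invoking RLlib's try_import_msgpack() should show the list going from empty to populated if the patch behaves as described in this section.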
Hashes
32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e ray_rllib_msgpack_checkpoint_poc/state.msgpack
32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e ray_rllib_msgpack_checkpoint_poc/algorithm_state.msgpck
32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e ray_rllib_msgpack_checkpoint_poc/policy_state.msgpck
9c5c7c4a8e65ed2c3ca7bf9d2e79af28bf854588977d8da6db72b1092643fc9f ray_rllib_msgpack_checkpoint_poc.zip
d1e69a95aa49a6287181470e116c904f6b676b5b06abb2ba6673c7715442810d evidence/ray_rllib_extracted_embedded_pickle.pkl
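Before running the verifier, the artifacts can be checked against the hashes listed above. This is a minimal sketch using only the standard library; the function names are hypothetical, and the paths are assumed to be relative to the repository root.

```python
import hashlib
from pathlib import Path
from typing import Dict

# SHA-256 values copied from the Hashes list above.
EXPECTED: Dict[str, str] = {
    "ray_rllib_msgpack_checkpoint_poc/state.msgpack":
        "32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e",
    "ray_rllib_msgpack_checkpoint_poc/algorithm_state.msgpck":
        "32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e",
    "ray_rllib_msgpack_checkpoint_poc/policy_state.msgpck":
        "32df5f4f2bd1ea1a0e3be59627d22220bf74518ca3bf2003bc444d400726394e",
    "ray_rllib_msgpack_checkpoint_poc.zip":
        "9c5c7c4a8e65ed2c3ca7bf9d2e79af28bf854588977d8da6db72b1092643fc9f",
    "evidence/ray_rllib_extracted_embedded_pickle.pkl":
        "d1e69a95aa49a6287181470e116c904f6b676b5b06abb2ba6673c7715442810d",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_hashes(expected: Dict[str, str], root: str = ".") -> Dict[str, bool]:
    """Map each relative path to True if the file exists and its hash matches."""
    results = {}
    for relative, want in expected.items():
        target = Path(root) / relative
        results[relative] = target.is_file() and sha256_of(target) == want.lower()
    return results


if __name__ == "__main__":
    for name, ok in verify_hashes(EXPECTED).items():
        print("OK      " if ok else "MISMATCH", name)
```

Reading the files as bytes for hashing does not decode them, so this check is safe to run outside the isolated environment.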
Safety Notes
- The PoC writes only a benign local marker file.
- No destructive commands, shells, network callbacks, credential access, or persistence are used.
- Do not run the verifier outside an isolated test environment.
- The extracted pickle is included only to make the scanner comparison and payload disassembly explicit.
Limitations
- msgpack-numpy documents object dtype fallback behavior involving pickle. The submission should therefore focus on Ray RLlib's normal checkpoint restore path and the scanner/runtime mismatch, not on rediscovering generic pickle risk.
- This PoC demonstrates benign marker execution at checkpoint restore time; it does not attempt privilege escalation or persistence.
- ModelScan unsupported-format behavior is not itself the vulnerability. The vulnerability is that a normal RLlib checkpoint load reaches a pickle payload hidden inside MessagePack state that the scanner does not inspect.
Duplicate Check Notes
Searches were performed for Ray RLlib MessagePack checkpoint pickle,
msgpack-numpy security advisories, msgpack_numpy.patch() checkpoint restore,
and ModelScan MessagePack support. No public advisory matching this Ray RLlib
checkpoint path was found during local triage.
References checked during triage:
- Ray RLlib checkpoint documentation
- Ray RLlib checkpoints.py module documentation
- msgpack-numpy source and README
- ModelScan supported-format documentation