ONNX Runtime Sparse Tensor External Data Validation Bypass PoC
Benign security proof-of-concept for an ONNX Runtime external-data validation gap in sparse TensorProto conversion paths.
Summary
ONNX Runtime 1.26.0 validates external data paths for ordinary graph initializers, but two sparse TensorProto paths read external data before that initializer-validation loop covers them:
- A Constant node with a sparse_value attribute whose sparse values TensorProto uses external data.
- A GraphProto.sparse_initializer whose sparse values TensorProto uses external data.
For both paths, default onnxruntime.InferenceSession(model_path) reads external bytes and returns the marker as model output for malicious locations including:
- ../outside_dir/marker.bin
- absolute path to outside_dir/marker.bin
- parent directory symlink: link_parent/marker.bin
- hardlink: hardlink.bin
The ONNX 1.21.0 checker rejects the malicious variants, so this is a checker/runtime mismatch and appears to be an incomplete fix adjacent to ONNX Runtime's earlier external-data traversal fix.
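To check whether a given model uses either path before handing it to ONNX Runtime, the two locations named above can be inspected directly on the loaded ModelProto. The following is a minimal sketch, not part of the PoC scripts: the function names are hypothetical, the enum values are copied from onnx.proto so the block does not need to import onnx, and only the top-level graph is walked (subgraphs are ignored).

```python
# Enum values copied from onnx.proto so this sketch has no onnx import.
EXTERNAL = 1         # TensorProto.DataLocation.EXTERNAL
SPARSE_TENSOR = 11   # AttributeProto.AttributeType.SPARSE_TENSOR


def external_location(tensor):
    """Return the external-data 'location' entry of a TensorProto, or None."""
    if tensor.data_location != EXTERNAL:
        return None
    for entry in tensor.external_data:
        if entry.key == "location":
            return entry.value
    return None


def sparse_external_locations(model):
    """List (origin, location) pairs for sparse values tensors that use external data.

    Hypothetical helper: inspects only model.graph, not nested subgraphs.
    """
    found = []
    graph = model.graph

    # Path 1: GraphProto.sparse_initializer entries.
    for sparse in graph.sparse_initializer:
        location = external_location(sparse.values)
        if location is not None:
            found.append(("sparse_initializer", location))

    # Path 2: Constant nodes carrying a sparse_value attribute.
    for node in graph.node:
        if node.op_type != "Constant":
            continue
        for attr in node.attribute:
            if attr.name == "sparse_value" and attr.type == SPARSE_TENSOR:
                location = external_location(attr.sparse_tensor.values)
                if location is not None:
                    found.append(("Constant.sparse_value", location))
    return found
```

With onnx installed, the model can be parsed without touching any external file via onnx.load(path, load_external_data=False) and then passed to sparse_external_locations. Any returned location deserves the same scrutiny as an ordinary initializer's external-data path.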
Severity
Medium, CVSS 6.5.
Rationale: loading an untrusted ONNX model with the default runtime settings results in a local file read. This is not RCE. The attacker needs the ORT process to have read access to the target file and must shape the offset, length, dtype, and graph output to fit the read.
Tested Versions
- onnxruntime==1.26.0
- onnx==1.21.0
- modelscan==0.8.8
- Python 3.12.3
- Linux
Files
- sparse_constant/verify.py: Generates and verifies the Constant(sparse_value=...) variant.
- sparse_initializer/verify.py: Generates and verifies the GraphProto.sparse_initializer variant.
- verify_all.py: Runs both variants and summarizes impact lines.
- results/: Captured local runtime and ModelScan outputs from research.
- requirements.txt: Tested dependency versions.
The verifier scripts generate PoC models locally so absolute-path cases remain valid after download.
Reproduce
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
python verify_all.py
Expected result:
- all_passed: true
- sparse_constant impact contains sparse_constant_external_data_bypass
- sparse_initializer impact contains sparse_initializer_external_data_bypass
You can also run each variant directly:
python sparse_constant/verify.py
python sparse_initializer/verify.py
Runtime Evidence
For malicious sparse external-data variants, the expected behavior is:
- onnx.checker.check_model(path) rejects ../, absolute path, parent symlink, and hardlink variants.
- onnxruntime.InferenceSession(path) returns the marker:
  - ORT_SPARSE_CONSTANT_EXT_READ for the sparse Constant variant.
  - ORT_SPARSE_INITIALIZER_EXT_READ for the sparse initializer variant.
- onnxruntime.InferenceSession(model_bytes, SessionOptions(...folder_path...)) also returns the marker.
ModelScan 0.8.8 skips .onnx files as unsupported. The main claim concerns the difference in behavior between the ONNX checker and the default ONNX Runtime loader.
Impact
A malicious ONNX model can use sparse TensorProto external data to make ONNX Runtime read local files outside the model directory during default model loading/inference. The attacker controls the external data location, offset, length, dtype, and graph output shape.
The PoC is benign and reads only controlled marker files generated in the local PoC directory.
Limitations
- Not remote code execution.
- Requires files readable by the ONNX Runtime process.
- Practical arbitrary byte reads require matching dtype/shape/length to the target bytes.
- This is adjacent to the earlier ONNX Runtime external-data traversal fix, but the PoC reproduces on 1.26.0 through sparse TensorProto conversion paths.
Suggested Mitigation
Apply the same external-data validation used for ordinary graph initializers to sparse TensorProto values before any external data is opened. Cover Constant node sparse_value attributes and GraphProto.sparse_initializer conversion paths, including ../, absolute paths, symlinks, parent-directory symlinks, and hardlinks.
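The checks listed above can be illustrated with a small path-containment routine. This is a sketch under stated assumptions, not ONNX Runtime's actual validation code: check_external_location is a hypothetical name, symlinks that stay inside the model directory are allowed here (a stricter policy could reject every symlink), and the hardlink test simply refuses any file whose link count exceeds one.

```python
import os


def check_external_location(model_dir, location):
    """Return the resolved path if `location` is acceptable, else raise ValueError.

    Hypothetical helper illustrating the mitigation; not an ONNX Runtime API.
    """
    if os.path.isabs(location):
        raise ValueError("absolute external data path")

    # Reject any '..' component, whichever separator the model used.
    if ".." in location.replace("\\", "/").split("/"):
        raise ValueError("parent-directory traversal")

    base = os.path.realpath(model_dir)
    # realpath follows symlinks, so a symlinked parent directory that leads
    # out of the model directory shows up as a path outside `base`.
    resolved = os.path.realpath(os.path.join(base, location))
    if os.path.commonpath([base, resolved]) != base:
        raise ValueError("resolves outside the model directory")

    # A regular file with more than one link may alias a file elsewhere.
    if os.path.isfile(resolved) and os.stat(resolved).st_nlink > 1:
        raise ValueError("file has multiple hard links")

    return resolved
```

Because these checks run before the file is opened, a real implementation would likely need to repeat the relevant ones on the opened file descriptor to avoid a race between validation and read. The key point for this report is that the same routine should run for sparse values tensors, not only for ordinary initializers.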