	# Vulnerability Report: DoS via Unbounded Initializer Allocation in ONNX Runtime

	## Summary
	The ONNX Runtime parser is vulnerable to a Denial of Service (DoS) attack caused by unbounded memory allocation when it processes model initializers. An attacker can craft an ONNX model whose initializer declares extreme tensor dimensions, which triggers a `std::bad_alloc` exception and crashes the inference session during initialization.

	## Description
	The vulnerability exists in the core graph loading logic of ONNX Runtime. When a model contains an initializer (a constant tensor, stored as a `TensorProto`), the runtime attempts to allocate memory based on the dimensions given in the Protobuf message before verifying that the total size is reasonable or that the data is actually present in the file.

	In `onnxruntime/core/graph/graph.cc` (or related initializer loading code), multiplying the dimensions can yield an enormous element count. While ONNX Runtime has some integer overflow checks, it does not bound the total allocation size against available system memory or a sane maximum limit for model metadata.

	### Root Cause
	Unvalidated memory allocation based on attacker-controlled `TensorProto.dims` during session initialization.
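
To make the mismatch concrete, the following is a minimal sketch of the arithmetic, not ONNX Runtime code. The helper `declared_nbytes`, the `ELEM_SIZE` table, and the example shape are illustrative assumptions; the actual PoC model may use different values.

```python
from math import prod

# Bytes per element for a few common tensor element types.
# The table and its key names are illustrative, not an ONNX Runtime API.
ELEM_SIZE = {"float32": 4, "float16": 2, "int64": 8, "uint8": 1}


def declared_nbytes(dims, dtype):
    """Bytes a tensor would need if its declared shape were taken at face value."""
    return prod(dims) * ELEM_SIZE[dtype]


# Assumed example: a 2**20 x 2**20 float32 shape paired with a 4-byte payload.
dims = [2**20, 2**20]
payload = b"\x00" * 4

declared = declared_nbytes(dims, "float32")
print(f"declared: {declared} bytes ({declared / 2**40:.0f} TiB)")
print(f"present:  {len(payload)} bytes")
```

A model file of a few hundred bytes can therefore request terabytes of memory if the loader trusts `TensorProto.dims` without comparing it to the data that is actually shipped.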

	### Affected Code Location
	File: `onnxruntime/core/graph/model_load_utils.h` and `onnxruntime/core/session/inference_session.cc`
	Confirmed on: ONNX Runtime v1.24.4 (latest)

	## Impact
	A remote attacker who can supply a malicious ONNX model can cause a complete Denial of Service of any application that loads it with ONNX Runtime (e.g., cloud inference servers, edge devices, local AI tools).

	## CVSS v3.1 Score
	Base Score: 7.5 (High)
	Vector: `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H`

	## Reproduction Steps
	1. Install ONNX Runtime: `pip install onnxruntime`
	2. Generate the exploit model using the provided `poc_oom_gen.py`.
	3. Load the model:
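	```python
	import onnxruntime as ort
	try:
	    # Creating the session parses the model and loads its initializers.
	    ort.InferenceSession('oom.onnx')
	except Exception as e:
	    # Reached if the allocation failure propagates as a Python exception.
	    print(f"Crash confirmed: {e}")
	```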

	## Proof of Concept
	See `poc_oom_gen.py` and `verification_results.md`.

	## Remediation
	Implement a maximum allocation limit for initializers during model loading. The size implied by each initializer's dimensions should be validated against a global `MAX_MODEL_METADATA_SIZE` before any `std::vector` or buffer allocation.
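
The check described above might look roughly like the sketch below. It is written in Python for readability and is not a patch for ONNX Runtime, where the real fix would live in the C++ loader. `validate_initializer`, `InitializerTooLarge`, and the 2 GiB value are hypothetical. Comparing against the payload length assumes the tensor data is stored inline as raw bytes, which is a simplification.

```python
# Hypothetical cap named after the report's proposal; it is not an existing
# ONNX Runtime constant, and 2 GiB is an arbitrary illustrative value.
MAX_MODEL_METADATA_SIZE = 2 * 1024**3


class InitializerTooLarge(ValueError):
    """Raised when a declared tensor size exceeds the configured cap."""


def validate_initializer(dims, elem_size, payload_len,
                         limit=MAX_MODEL_METADATA_SIZE):
    """Return the tensor size in bytes, or raise before anything is allocated.

    dims        -- declared shape (attacker-controlled)
    elem_size   -- bytes per element for the declared element type
    payload_len -- bytes of tensor data actually present in the model
    """
    if any(d < 0 for d in dims):
        raise ValueError(f"negative dimension in {dims}")
    if 0 in dims:
        nbytes = 0
    else:
        nbytes = elem_size
        for d in dims:
            nbytes *= d
            # Checking after every multiplication rejects oversized shapes
            # early. Python integers cannot overflow; fixed-width code would
            # need a pre-multiplication check instead.
            if nbytes > limit:
                raise InitializerTooLarge(
                    f"declared size exceeds limit of {limit} bytes")
    if nbytes != payload_len:
        raise ValueError(
            f"shape implies {nbytes} bytes but {payload_len} are present")
    return nbytes


print(validate_initializer([2, 3], 4, 24))
try:
    validate_initializer([2**20, 2**20], 4, 4)
except InitializerTooLarge as exc:
    print(f"rejected: {exc}")
```

Running both checks before allocation means a malformed model is rejected with a normal error instead of exhausting memory.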