# Vulnerability Report: DoS via Unbounded Initializer Allocation in ONNX Runtime
## Summary
The ONNX Runtime parser is vulnerable to a Denial of Service (DoS) attack due to unbounded memory allocation when processing model initializers. By crafting an ONNX model with extreme tensor dimensions in an initializer, an attacker can trigger a `std::bad_alloc` exception, crashing the inference session during initialization.
## Description
The vulnerability exists in the core graph loading logic of ONNX Runtime. When a model contains an initializer (a constant tensor), the runtime attempts to allocate memory based on the dimensions provided in the Protobuf message before verifying whether the total size is reasonable or whether the data is actually present in the file.
In `onnxruntime/core/graph/graph.cc` (or related initializer loading code), the multiplication of dimensions can lead to massive values. While ONNX Runtime has some integer overflow checks, it fails to bound the total allocation size against available system memory or a sane maximum limit for model metadata.
### Root Cause
Unvalidated memory allocation based on attacker-controlled `TensorProto.dims` during session initialization.
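To make the scale concrete, the sketch below computes the byte size implied by a set of declared dimensions. The helper name `declared_nbytes`, the example shape, and the 4-byte element size (float32) are illustrative assumptions, not values taken from the PoC.

```python
import math

def declared_nbytes(dims, elem_size):
    """Bytes implied by a tensor's declared shape (hypothetical helper)."""
    # Python integers are arbitrary precision, so this product cannot overflow.
    # An empty shape is a scalar: math.prod([]) is 1.
    return math.prod(dims) * elem_size

# Illustrative shape: two dimensions of 2**20 elements each, float32 (4 bytes).
dims = [2**20, 2**20]
nbytes = declared_nbytes(dims, 4)
print(f"{nbytes} bytes = {nbytes / 2**40:.0f} TiB")
```

Two modest-looking dimensions are enough to request terabytes, even though the model file that declares them can be only a few bytes long.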
### Affected Code Location
**Files:** `onnxruntime/core/graph/model_load_utils.h` and `onnxruntime/core/session/inference_session.cc`
**Confirmed on:** ONNX Runtime v1.24.4 (latest)
## Impact
A remote attacker who can supply a malicious ONNX model can cause a complete Denial of Service of any application that loads it with ONNX Runtime (e.g., cloud inference servers, edge devices, local AI tools).
## CVSS v3.1 Score
**Base Score:** 7.5 (High)
**Vector:** `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H`
## Reproduction Steps
1. Install ONNX Runtime: `pip install onnxruntime`
2. Generate the exploit model using the provided `poc_oom_gen.py`.
3. Load the model:
```python
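import onnxruntime as ort

try:
    # Session creation parses the model and loads its initializers.
    ort.InferenceSession('oom.onnx')
except Exception as e:
    # Reached only if the allocation failure surfaces as a Python exception.
    print(f"Crash confirmed: {e}")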
```
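If the allocation failure terminates the interpreter instead of raising a Python exception, the `except` branch above never runs. One way to observe either outcome is to perform the load in a child process and inspect its exit status. The helper below is a minimal sketch using only the standard library; `run_isolated` is a hypothetical name, not part of the PoC.

```python
import subprocess
import sys

def run_isolated(code, timeout=60):
    """Run a Python snippet in a child process; return (exit code, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # On POSIX, a negative exit code -N means the child was killed by signal N.
    return result.returncode, result.stderr

# To try this against the PoC, pass the load statement as the snippet:
# run_isolated("import onnxruntime as ort; ort.InferenceSession('oom.onnx')")
```

A non-zero exit code from the child indicates that the load failed, and the parent process survives to record it.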
## Proof of Concept
See `poc_oom_gen.py` and `verification_results.md`.
## Remediation
Implement a maximum allocation limit for initializers during model loading. Dimensions should be validated against a global `MAX_MODEL_METADATA_SIZE` before `std::vector` or buffer allocation.
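The check described above can be sketched in Python as follows. This illustrates the validation logic only and is not ONNX Runtime code: the function name, the 2 GiB value assigned to `MAX_MODEL_METADATA_SIZE`, and the `payload_len` parameter (the number of tensor data bytes actually present in the file) are all assumptions made for the example.

```python
import math

# Hypothetical cap: the report names MAX_MODEL_METADATA_SIZE but gives no value.
MAX_MODEL_METADATA_SIZE = 2 * 1024**3  # 2 GiB, chosen only for illustration

def check_initializer(dims, elem_size, payload_len, limit=MAX_MODEL_METADATA_SIZE):
    """Return the declared byte size, or raise ValueError if it is unsafe."""
    if any(d < 0 for d in dims):
        raise ValueError("negative dimension in initializer shape")
    # Arbitrary-precision integers: the product is exact, with no wraparound.
    declared = math.prod(dims) * elem_size
    if declared > limit:
        raise ValueError(f"declared size {declared} exceeds limit {limit}")
    if declared != payload_len:
        raise ValueError(
            f"declared size {declared} does not match payload of {payload_len} bytes"
        )
    return declared
```

Running both comparisons before any buffer is reserved means a model that declares terabytes but ships a few bytes of data is rejected without a large allocation. A C++ implementation would additionally need overflow-safe multiplication, which Python's integers provide for free.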