# Integer Overflow in Tensor Element Count Leads to Heap Buffer Overflow in ExecuTorch PTE Loading

## Target
pytorch/executorch

## Vulnerability Type
Integer Overflow to Buffer Overflow (CWE-190, CWE-122)

## Severity
**CRITICAL** (CVSS 3.1: 9.8 -- AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)

A crafted .pte model file can trigger an integer overflow in tensor size calculations, causing a small buffer to be allocated for what should be a very large tensor. Subsequent data loading writes past the end of this undersized buffer, achieving heap corruption that can lead to arbitrary code execution.

## Summary

When ExecuTorch loads a .pte (FlatBuffer-based) model file, it deserializes tensor metadata, including dimension sizes and scalar type. The `compute_numel()` function in `tensor_impl.cpp` multiplies all dimension sizes together to compute the total number of elements (`numel`), but performs this multiplication using signed `ssize_t` arithmetic **without any overflow check**. The result is then used to compute `nbytes = numel * elementSize(type)` in `TensorImpl::nbytes()`, also without overflow protection.

Critically, the overflow checks were **deliberately commented out** in `program_validation.cpp` (lines 35-58 and 67-79), leaving a known gap in the validation pipeline.

## Root Cause

### File 1: `runtime/core/portable_type/tensor_impl.cpp`, lines 30-44

```cpp
ssize_t compute_numel(const TensorImpl::SizesType* sizes, ssize_t dim) {
  ET_CHECK_MSG(
      dim == 0 || sizes != nullptr,
      "Sizes must be provided for non-scalar tensors");
  ssize_t numel = 1;
  for (const auto i : c10::irange(dim)) {
    ET_CHECK_MSG(
        sizes[i] >= 0,
        "Size must be non-negative, got %zd at dimension %zd",
        static_cast<ssize_t>(sizes[i]),
        i);
    numel *= sizes[i]; // <-- NO OVERFLOW CHECK
  }
  return numel;
}
```

The only validation is that individual sizes are non-negative; there is no check that the running product stays in range. With `SizesType = int32_t` (max ~2.1 billion) and `ssize_t` being 64-bit on most platforms, an attacker can craft sizes whose product reaches or exceeds 2^63. For example, `{65536, 65536, 65536, 65536}` multiplies to exactly 2^64, which wraps `numel` to 0 (signed overflow is undefined behavior in C++, but wraps on common two's-complement implementations).

### File 2: `runtime/core/portable_type/tensor_impl.cpp`, lines 71-73

```cpp
size_t TensorImpl::nbytes() const {
  return numel_ * elementSize(type_); // <-- NO OVERFLOW CHECK
}
```

Even if `numel_` did not overflow, `nbytes()` multiplies by the element size without checking. A `numel_` of 2^61 with an 8-byte element size multiplies to 2^64, which wraps to 0.

### File 3: `runtime/executor/program_validation.cpp`, lines 35-79

The overflow checks that would catch this were **explicitly commented out**:

```cpp
// ssize_t numel = 1;
// ...
// bool overflow =
//     c10::mul_overflows(numel, static_cast<ssize_t>(size), &numel);
// if (overflow) {
//   ...
//   return Error::InvalidProgram;
// }
```

And:

```cpp
// size_t nbytes;
// bool nbytes_overflow = c10::mul_overflows(
//     static_cast<size_t>(numel),
//     executorch::runtime::elementSize(scalar_type),
//     &nbytes);
// if (nbytes_overflow) {
//   ...
//   return Error::InvalidProgram;
// }
```

These commented-out checks show the developers were aware of the risk but left the protection disabled.

## Exploitation Flow

1. **Attacker crafts a .pte file** with a Tensor whose `sizes` field contains values that multiply to overflow `ssize_t`. For example, sizes = `{2147483647, 2147483647, 4}` (two max int32 values and a 4) with scalar_type Float32 (4 bytes).

2. **During `parseTensor()`** (`tensor_parser_portable.cpp`):
   - Sizes are validated only for non-negativity (lines 125-132) -- all pass.
   - The `TensorImpl` constructor calls `compute_numel()`, which overflows silently, producing a wrapped (zero, small, or negative) `numel_`.
   - `tensor_impl->nbytes()` returns a correspondingly small value (e.g., 0).

3. **`getTensorDataPtr()`** is called with this tiny `nbytes`:
   - For constant tensors: `program->get_constant_buffer_data(data_buffer_idx, nbytes)` -- the bounds check at line 398 (`offset + nbytes <= size`) passes because `nbytes` is tiny.
   - For memory-planned tensors: `allocator->get_offset_address(memory_id, memory_offset, nbytes)` -- the bounds check passes because `nbytes` is tiny.

4. **During execution**, kernel operators read and write the tensor using the actual (huge) logical dimensions, while the physical buffer is tiny. Any kernel that iterates over the tensor (e.g., copy, add, matmul) writes far beyond the allocated buffer, causing a **heap buffer overflow**.

### Concrete Example on 64-bit System

- `sizes = [2147483647, 2147483647, 4]` (three int32 values)
- `numel = 2147483647 * 2147483647 * 4 = 18446744056529682436`, which as `ssize_t` wraps to `-17179869180`
- `nbytes = numel_ * 4` then wraps/truncates again in the multiply and the signed-to-unsigned conversion
- Carefully chosen sizes (e.g., `[65536, 65536, 65536, 65536]`, whose product is exactly 2^64) wrap `numel` to 0 and `nbytes` to 0
- Actual tensor data is ~16 exabytes logically, but only a few bytes (or zero) are allocated

Even simpler on 32-bit embedded targets (ExecuTorch's primary deployment):
- `ssize_t` is 32-bit, `SizesType` is `int32_t`
- `sizes = [65536, 65536]` -> `numel = 65536 * 65536 = 4294967296`, which wraps to 0 as a 32-bit integer
- `nbytes = 0 * 4 = 0`
- Zero bytes are allocated, so any write is a heap overflow

## Impact

- **Heap buffer overflow**: Kernel operations write past allocated buffer boundaries
- **Arbitrary code execution**: Standard heap corruption exploitation techniques apply
- **Denial of service**: Immediate crash on memory access violation
- **Affects embedded/mobile devices**: ExecuTorch targets resource-constrained environments (Android, iOS, microcontrollers) where ASLR and heap protections may be weaker

## Affected Code Path

```
Program::load()
  -> Method::load()
    -> Method::init()
      -> Method::parse_values()
        -> deserialization::parseTensor()  [tensor_parser_portable.cpp]
          -> TensorImpl::TensorImpl()      [tensor_impl.cpp]
            -> compute_numel()             [OVERFLOW HERE]
          -> TensorImpl::nbytes()          [OVERFLOW HERE]
          -> getTensorDataPtr()            [undersized allocation]
```

## Remediation

1. **Uncomment and enable the overflow checks** in `program_validation.cpp` (lines 35-79). Replace the `c10::mul_overflows` dependency if needed with a portable implementation.

2. **Add overflow checks to `compute_numel()`**:

```cpp
ssize_t compute_numel(const TensorImpl::SizesType* sizes, ssize_t dim) {
  ssize_t numel = 1;
  for (const auto i : c10::irange(dim)) {
    ET_CHECK_MSG(sizes[i] >= 0, ...);
    if (sizes[i] != 0 && numel > SSIZE_MAX / sizes[i]) {
      ET_CHECK_MSG(false, "numel overflow at dimension %zd", i);
    }
    numel *= sizes[i];
  }
  return numel;
}
```

3. **Add overflow check to `TensorImpl::nbytes()`**:

```cpp
size_t TensorImpl::nbytes() const {
  size_t elem = elementSize(type_);
  ET_CHECK_MSG(
      elem == 0 || static_cast<size_t>(numel_) <= SIZE_MAX / elem,
      "nbytes overflow");
  return static_cast<size_t>(numel_) * elem;
}
```

164
+ ## References
165
+
166
+ - `runtime/core/portable_type/tensor_impl.cpp` -- `compute_numel()` overflow
167
+ - `runtime/core/portable_type/tensor_impl.cpp` -- `TensorImpl::nbytes()` overflow
168
+ - `runtime/executor/program_validation.cpp` -- commented-out overflow checks
169
+ - `runtime/executor/tensor_parser_portable.cpp` -- `parseTensor()` overflow propagation
170
+ - `runtime/executor/tensor_parser_exec_aten.cpp` -- `getTensorDataPtr()` undersized allocation
171
+ - `runtime/core/exec_aten/util/dim_order_util.h` -- `dim_order_to_stride_nocheck()` stride overflow