Integer Overflow in Tensor Element Count Leads to Heap Buffer Overflow in ExecuTorch PTE Loading
Target
pytorch/executorch
Vulnerability Type
Integer Overflow to Buffer Overflow (CWE-190, CWE-122)
Severity
CRITICAL (CVSS 3.1: 9.8 -- AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)
A crafted .pte model file can trigger an integer overflow in tensor size calculations, causing a small buffer to be allocated for what should be a very large tensor. Subsequent data loading writes past the end of this undersized buffer, achieving heap corruption that can lead to arbitrary code execution.
Summary
When ExecuTorch loads a .pte (FlatBuffer-based) model file, it deserializes tensor metadata including dimension sizes and scalar type. The compute_numel() function in tensor_impl.cpp multiplies all dimension sizes together to compute the total number of elements (numel), but performs this multiplication using signed ssize_t arithmetic without any overflow check. The result is then used to compute nbytes = numel * elementSize(type) in TensorImpl::nbytes(), also without overflow protection.
Critically, the overflow checks were deliberately commented out in program_validation.cpp (lines 35-58 and 67-79), leaving a known gap in the validation pipeline.
Root Cause
File 1: runtime/core/portable_type/tensor_impl.cpp, lines 30-44
ssize_t compute_numel(const TensorImpl::SizesType* sizes, ssize_t dim) {
  ET_CHECK_MSG(
      dim == 0 || sizes != nullptr,
      "Sizes must be provided for non-scalar tensors");
  ssize_t numel = 1;
  for (const auto i : c10::irange(dim)) {
    ET_CHECK_MSG(
        sizes[i] >= 0,
        "Size must be non-negative, got %zd at dimension %zd",
        static_cast<ssize_t>(sizes[i]),
        i);
    numel *= sizes[i]; // <-- NO OVERFLOW CHECK
  }
  return numel;
}
The only validation is that individual sizes are non-negative; there is no check that the running product stays within range. With SizesType = int32_t (max ~2.1 billion) and ssize_t being 64-bit on most platforms, an attacker can craft sizes whose product exceeds the signed 64-bit range: for example, {65536, 65536, 65536, 65536} produces a mathematical numel of 2^64, which wraps to 0 (and signed overflow is undefined behavior in C++ besides).
File 2: runtime/core/portable_type/tensor_impl.cpp, lines 71-73
size_t TensorImpl::nbytes() const {
  return numel_ * elementSize(type_); // <-- NO OVERFLOW CHECK
}
Even if numel_ did not overflow, nbytes() multiplies by the element size without checking. A numel_ of 2^61 with an 8-byte element size yields a product of 2^64, which wraps the size_t result to 0.
File 3: runtime/executor/program_validation.cpp, lines 35-79
The overflow checks that would catch this were explicitly commented out:
// ssize_t numel = 1;
// ...
// bool overflow =
// c10::mul_overflows(numel, static_cast<ssize_t>(size), &numel);
// if (overflow) {
// ...
// return Error::InvalidProgram;
// }
And:
// size_t nbytes;
// bool nbytes_overflow = c10::mul_overflows(
// static_cast<size_t>(numel),
// executorch::runtime::elementSize(scalar_type),
// &nbytes);
// if (nbytes_overflow) {
// ...
// return Error::InvalidProgram;
// }
These commented-out checks show the developers were aware of the risk but left the protection disabled.
Exploitation Flow
1. The attacker crafts a .pte file with a Tensor whose sizes field contains values that multiply to overflow ssize_t -- for example, sizes = {2147483647, 2147483647, 2} (two max int32 values and a 2) with scalar_type Float32 (4 bytes).
2. During parseTensor() (tensor_parser_portable.cpp):
   - Sizes are validated only for non-negativity (lines 125-132) -- all pass.
   - The TensorImpl constructor calls compute_numel(), which overflows silently, producing a small or zero numel_.
   - tensor_impl->nbytes() returns a very small value (e.g., 0 or 32).
3. getTensorDataPtr() is called with this tiny nbytes:
   - For constant tensors: program->get_constant_buffer_data(data_buffer_idx, nbytes) -- the bounds check at line 398 (offset + nbytes <= size) passes because nbytes is tiny.
   - For memory-planned tensors: allocator->get_offset_address(memory_id, memory_offset, nbytes) -- the bounds check passes because nbytes is tiny.
4. During execution, kernel operators read/write the tensor using the actual (huge) logical dimensions, but the physical buffer is tiny. Any kernel that iterates over the tensor (e.g., copy, add, matmul) writes far beyond the allocated buffer, causing a heap buffer overflow.
Concrete Example on 64-bit System
- sizes = [2147483647, 2147483647, 4] (three int32 values)
- numel = 2147483647 * 2147483647 * 4 = 18446744056529682436, which as ssize_t is -17179869180
- nbytes = (ssize_t)(-17179869180) * 4 wraps/truncates
- The resulting nbytes passed to buffer allocation is a small positive number
- Actual tensor data is ~16 exabytes logically, but only a few bytes are allocated
Even simpler on 32-bit embedded targets (ExecuTorch's primary deployment):
- ssize_t is 32-bit, SizesType is int32_t
- sizes = [65536, 65536] -> numel = 65536 * 65536 = 4294967296, which wraps to 0 as int32
- nbytes = 0 * 4 = 0
- Zero bytes allocated; any write is a heap overflow
Impact
- Heap Buffer Overflow: Kernel operations write past allocated buffer boundaries
- Arbitrary Code Execution: Standard heap corruption exploitation techniques apply
- Denial of Service: Immediate crash on memory access violation
- Affects embedded/mobile devices: ExecuTorch targets resource-constrained environments (Android, iOS, microcontrollers) where ASLR/heap protections may be weaker
Affected Code Path
Program::load()
-> Method::load()
-> Method::init()
-> Method::parse_values()
-> deserialization::parseTensor() [tensor_parser_portable.cpp]
-> TensorImpl::TensorImpl() [tensor_impl.cpp]
-> compute_numel() [OVERFLOW HERE]
-> TensorImpl::nbytes() [OVERFLOW HERE]
-> getTensorDataPtr() [undersized allocation]
Remediation
- Uncomment and enable the overflow checks in program_validation.cpp (lines 35-79), replacing the c10::mul_overflows dependency with a portable implementation if needed.
- Add overflow checks to compute_numel():
compute_numel():
ssize_t compute_numel(const TensorImpl::SizesType* sizes, ssize_t dim) {
  ssize_t numel = 1;
  for (const auto i : c10::irange(dim)) {
    ET_CHECK_MSG(sizes[i] >= 0, ...);
    ET_CHECK_MSG(
        sizes[i] == 0 || numel <= SSIZE_MAX / sizes[i],
        "numel overflow at dimension %zd",
        i);
    numel *= sizes[i];
  }
  return numel;
}
- Add an overflow check to TensorImpl::nbytes():
size_t TensorImpl::nbytes() const {
  size_t elem = elementSize(type_);
  ET_CHECK_MSG(
      elem == 0 || static_cast<size_t>(numel_) <= SIZE_MAX / elem,
      "nbytes overflow");
  return static_cast<size_t>(numel_) * elem;
}
References
- runtime/core/portable_type/tensor_impl.cpp -- compute_numel() overflow
- runtime/core/portable_type/tensor_impl.cpp -- TensorImpl::nbytes() overflow
- runtime/executor/program_validation.cpp -- commented-out overflow checks
- runtime/executor/tensor_parser_portable.cpp -- parseTensor() overflow propagation
- runtime/executor/tensor_parser_exec_aten.cpp -- getTensorDataPtr() undersized allocation
- runtime/core/exec_aten/util/dim_order_util.h -- dim_order_to_stride_nocheck() stride overflow