TensorRT Region Plugin: Integer Overflow -> Heap Buffer Overflow
Vulnerability Summary
Component: plugin/regionPlugin/regionPlugin.cpp in NVIDIA TensorRT OSS
Type: Integer Overflow (CWE-190) leading to Heap Buffer Overflow (CWE-122)
Entry Point: RegionPluginCreator::deserializePlugin() → Region::Region(void const* buffer, size_t length)
Trigger: Loading a crafted TensorRT engine file (.engine / .trt) containing a malicious Region_TRT plugin
Impact: Heap corruption → potential Remote Code Execution
Root Cause
In regionPlugin.cpp, the Region deserialization constructor reads smTreeTemp->n directly from attacker-controlled serialized data (line 117) without any validation:
smTreeTemp->n = read<int32_t>(d); // attacker-controlled
if (leafPresent)
{
    allocateChunk(smTreeTemp->leaf, smTreeTemp->n); // malloc(n * sizeof(int32_t))
}
The allocateChunk template function (lines 39-42) performs:
template <typename T>
void allocateChunk(T*& ptr, int32_t count)
{
    ptr = static_cast<T*>(malloc(count * sizeof(T))); // integer overflow possible
}
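Bug 1 - No bounds check: n can be any int32_t value, including a negative one. The loop condition i < smTreeTemp->n is a signed comparison, so a negative n skips the loop; the risk with a negative n is in the allocation, where count * sizeof(T) converts it to a very large unsigned size.
Bug 2 - Integer overflow: When n = 0x40000001, the mathematical product n * sizeof(int32_t) is 0x100000004. If the multiplication is carried out in 32 bits (for example, on a target where size_t is 32 bits wide), the result truncates to 0x4 and malloc(4) returns a 4-byte buffer. With a 64-bit size_t the product does not wrap for this value.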
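The arithmetic behind Bug 2 can be checked in isolation. The following is a minimal sketch, not TensorRT code: allocSize32 and allocSize64 are hypothetical helper names that model the size computation with a 32-bit and a 64-bit size type respectively.

```cpp
#include <cstdint>

// Hypothetical helper: models count * sizeof(T) when the size type is 32 bits wide.
// Unsigned 32-bit multiplication wraps modulo 2^32.
inline std::uint32_t allocSize32(std::int32_t count, std::uint32_t elemSize)
{
    return static_cast<std::uint32_t>(count) * elemSize;
}

// Hypothetical helper: models the same product when the size type is 64 bits wide.
// A negative count converts to a very large unsigned value before multiplying.
inline std::uint64_t allocSize64(std::int32_t count, std::uint64_t elemSize)
{
    return static_cast<std::uint64_t>(count) * elemSize;
}
```

With count = 0x40000001 and a 4-byte element, allocSize32 yields 4 while allocSize64 yields 0x100000004, so the 4-byte allocation described above depends on the width in which the product is evaluated.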
The subsequent loop then writes n elements (0x40000001, roughly 1.07 billion int32_t values) into this 4-byte buffer:
for (int32_t i = 0; i < smTreeTemp->n; i++)
{
    if (smTreeTemp->leaf)
    {
        smTreeTemp->leaf[i] = read<int32_t>(d); // HEAP OVERFLOW
    }
}
The same pattern affects parent, child, group, name, groupSize, and groupOffset arrays.
The PLUGIN_VALIDATE(d == a + length) check at line 227 runs after the overflow has already occurred.
Affected Fields (all exploitable via same pattern)
| Field | Line | Allocation | Loop Write |
|---|---|---|---|
| leaf | 121 | allocateChunk(leaf, n) | L152-157 |
| parent | 129 | allocateChunk(parent, n) | L158-161 |
| child | 137 | allocateChunk(child, n) | L162-165 |
| group | 145 | allocateChunk(group, n) | L166-169 |
| name | 174 | allocateChunk(name, n) | L183-190 |
| groupSize | 196 | allocateChunk(groupSize, groups) | L212-215 |
| groupOffset | 204 | allocateChunk(groupOffset, groups) | L216-219 |
Attack Vector
- Attacker crafts a TensorRT engine file containing a Region_TRT plugin
- The serialized plugin data has n = 0x40000001 (or other overflow-triggering value)
- Victim loads the engine file using TensorRT runtime (trtexec, Python API, or C++ API)
- RegionPluginCreator::deserializePlugin() is called automatically
- Region::Region(buffer, length) triggers heap overflow
- Heap metadata corruption → potential arbitrary write → RCE
Files
- trigger.cpp: Standalone C++ PoC demonstrating the heap overflow (no TensorRT installation needed)
- gen_payload.py: Python script generating a malicious serialized payload
- malicious_region_payload.bin: Pre-generated malicious plugin serialization data
- trigger_deserialize.py: Python trigger using the real TensorRT deserialize_plugin() API
Note: malicious_region_payload.bin is the raw serialized data for the Region_TRT plugin — the exact bytes that RegionPluginCreator::deserializePlugin() receives. In a real attack, this data would be embedded inside a .engine file and deserialized automatically when loading the model.
Reproduction
Method 1: Standalone ASan PoC (no TensorRT needed)
clang++ -fsanitize=address -g -O0 -o trigger trigger.cpp
./trigger
Method 2: Real TensorRT Deserialization
# Build TensorRT OSS plugins (optionally with ASan)
git clone https://github.com/NVIDIA/TensorRT.git && cd TensorRT
mkdir build && cd build
cmake .. -DCMAKE_CXX_FLAGS="-fsanitize=address" -DBUILD_PARSERS=OFF -DBUILD_SAMPLES=OFF
make -j$(nproc) nvinfer_plugin
# Generate payload and trigger
cd /path/to/this/repo
python3 gen_payload.py
LD_LIBRARY_PATH=/path/to/TensorRT/build/out python3 trigger_deserialize.py
Expected Output (Method 1)
=== TensorRT Region Plugin Heap Overflow PoC ===
[*] ...
[*] Simulating integer overflow: allocating 4 elements but writing 128
[*] Writing 128 elements into 4-element buffer...
=================================================================
==XXXXX==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
WRITE of size 4 at ... thread T0
Generate Standalone Payload
python3 gen_payload.py
# Output: malicious_region_payload.bin
Suggested Fix
Add bounds validation for n and groups before allocation:
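smTreeTemp->n = read<int32_t>(d);
// Validate n against the bytes remaining in the serialized buffer.
// Compare in size_t so a large remaining length is not truncated by a cast to int32_t.
size_t const remaining = length - static_cast<size_t>(d - a);
if (smTreeTemp->n < 0 || static_cast<size_t>(smTreeTemp->n) > remaining / sizeof(int32_t))
{
    PLUGIN_VALIDATE(false && "Invalid softmaxTree.n in serialized data");
    return;
}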
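The same idea, checking the element count against the bytes that are actually left before allocating or writing anything, can be tried outside TensorRT. This is a minimal sketch under stated assumptions: BoundedReader and readCountedArray are hypothetical names, and the layout (one int32_t count followed by that many int32_t values) is a simplification for illustration, not the plugin's real serialization format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical cursor (not part of TensorRT) that refuses to read past the end.
class BoundedReader
{
public:
    BoundedReader(void const* buffer, std::size_t length)
        : mCur(static_cast<unsigned char const*>(buffer))
        , mRemaining(length)
    {
    }

    template <typename T>
    T read()
    {
        if (sizeof(T) > mRemaining)
        {
            throw std::out_of_range("serialized buffer too short");
        }
        T value;
        std::memcpy(&value, mCur, sizeof(T)); // avoids unaligned access
        mCur += sizeof(T);
        mRemaining -= sizeof(T);
        return value;
    }

    std::size_t remaining() const
    {
        return mRemaining;
    }

private:
    unsigned char const* mCur;
    std::size_t mRemaining;
};

// Reads a count, validates it against the remaining bytes, then reads the values.
inline std::vector<std::int32_t> readCountedArray(BoundedReader& r)
{
    auto const n = r.read<std::int32_t>();
    if (n < 0 || static_cast<std::size_t>(n) > r.remaining() / sizeof(std::int32_t))
    {
        throw std::out_of_range("element count exceeds remaining data");
    }
    std::vector<std::int32_t> values(static_cast<std::size_t>(n));
    for (auto& v : values)
    {
        v = r.read<std::int32_t>();
    }
    return values;
}
```

In this sketch a count of 0x40000001 followed by only a few bytes of data is rejected before any allocation happens, whereas a check placed after the loops only fires once the writes have already taken place.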
And add overflow check in allocateChunk:
template <typename T>
void allocateChunk(T*& ptr, int32_t count)
{
    if (count <= 0 || (size_t)count > SIZE_MAX / sizeof(T)) {
        ptr = nullptr;
        return;
    }
    ptr = static_cast<T*>(malloc((size_t)count * sizeof(T)));
}
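One possible variation, sketched here as an assumption rather than as the upstream fix, is to have the allocator report failure so that the caller can stop deserializing instead of continuing with a null pointer. The name allocateChunkChecked is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Hypothetical variant of allocateChunk that returns false on any invalid request.
template <typename T>
bool allocateChunkChecked(T*& ptr, std::int32_t count)
{
    ptr = nullptr;
    if (count <= 0)
    {
        return false; // reject zero and negative counts
    }
    auto const n = static_cast<std::size_t>(count);
    if (n > std::numeric_limits<std::size_t>::max() / sizeof(T))
    {
        return false; // n * sizeof(T) would not fit in size_t
    }
    ptr = static_cast<T*>(std::malloc(n * sizeof(T)));
    return ptr != nullptr; // also report allocation failure
}
```

Note that with a 64-bit size_t and 4-byte elements the overflow branch cannot trigger for any int32_t count, so this helper complements the check of n against the remaining buffer length rather than replacing it.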