TensorRT Region Plugin: Integer Overflow -> Heap Buffer Overflow

Vulnerability Summary

Component: plugin/regionPlugin/regionPlugin.cpp in NVIDIA TensorRT OSS

Type: Integer Overflow (CWE-190) leading to Heap Buffer Overflow (CWE-122)

Entry Point: RegionPluginCreator::deserializePlugin() → Region::Region(void const* buffer, size_t length)

Trigger: Loading a crafted TensorRT engine file (.engine / .trt) containing a malicious Region_TRT plugin

Impact: Heap corruption → potential Remote Code Execution

Root Cause

In regionPlugin.cpp, the Region deserialization constructor reads smTreeTemp->n directly from attacker-controlled serialized data (line 117) without any validation:

smTreeTemp->n = read<int32_t>(d);  // attacker-controlled

if (leafPresent)
{
    allocateChunk(smTreeTemp->leaf, smTreeTemp->n);  // malloc(n * sizeof(int32_t))
}

The allocateChunk template function (lines 39-42) performs:

template <typename T>
void allocateChunk(T*& ptr, int32_t count)
{
    ptr = static_cast<T*>(malloc(count * sizeof(T)));  // integer overflow possible
}

Bug 1 - No bounds check: n can be any value. A negative n is converted to a very large unsigned value in the allocation size computation; the int32_t write loop itself does not execute for negative n.

Bug 2 - Integer overflow: When n = 0x40000001, n * sizeof(int32_t) = 0x100000004. Where size_t is 32 bits wide, this truncates to 0x4, so malloc(4) returns a 4-byte buffer. Where size_t is 64 bits wide, the product does not wrap and the request is for roughly 4 GiB.
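The size arithmetic above depends on how wide size_t is. The sketch below models both cases with fixed-width integers so the result does not depend on the machine it runs on. The helper names allocBytes32 and allocBytes64 are hypothetical and do not exist in TensorRT.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of TensorRT: they model the expression
// count * sizeof(int32_t) for two different widths of size_t.

// Model of a platform where size_t is 32 bits: the product wraps modulo 2^32.
constexpr std::uint32_t allocBytes32(std::int32_t count)
{
    return static_cast<std::uint32_t>(count)
        * static_cast<std::uint32_t>(sizeof(std::int32_t));
}

// Model of a platform where size_t is 64 bits: count is sign-extended to
// 64 bits and converted to unsigned before the multiplication.
constexpr std::uint64_t allocBytes64(std::int32_t count)
{
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(count))
        * sizeof(std::int32_t);
}
```

With count = 0x40000001, allocBytes32 yields 4 while allocBytes64 yields 0x100000004, so the 4-byte allocation corresponds to the 32-bit model. With count = -1, both models produce a value close to the maximum of their type, which a real malloc would most likely refuse.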

The subsequent loop writes n elements (0x40000001, roughly 1.07 billion int32_t values) into this 4-byte buffer:

for (int32_t i = 0; i < smTreeTemp->n; i++)
{
    if (smTreeTemp->leaf)
    {
        smTreeTemp->leaf[i] = read<int32_t>(d);  // HEAP OVERFLOW
    }
}

The same pattern affects the parent, child, group, and name arrays (sized by n) and the groupSize and groupOffset arrays (sized by groups).

The PLUGIN_VALIDATE(d == a + length) check at line 227 runs after the overflow has already occurred.
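Because the only length check runs at the end, each read<int32_t>(d) call can also walk past the end of the serialized buffer. A minimal sketch of a bounds-checked reader follows. checkedRead is a hypothetical name, not a TensorRT function, and throwing std::runtime_error is an assumption made for illustration; the plugin code would more likely report the error through its own validation macros.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical bounds-checked counterpart to the plugin's read<T>() helper.
// `cursor` is the current read position and `end` is one past the last
// valid byte of the serialized buffer (a + length in the text above).
template <typename T>
T checkedRead(char const*& cursor, char const* end)
{
    if (cursor > end || static_cast<std::size_t>(end - cursor) < sizeof(T))
    {
        throw std::runtime_error("serialized plugin data is truncated");
    }
    T value;
    std::memcpy(&value, cursor, sizeof(T)); // copy instead of casting the pointer
    cursor += sizeof(T);                    // advance only after a successful read
    return value;
}
```

This bounds the source side only. It stops a loop once the input is exhausted, but it does not make an undersized destination buffer safe, so the element count still has to be validated before allocation.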

Affected Fields (all exploitable via same pattern)

| Field | Line | Allocation | Loop Write |
|---|---|---|---|
| `leaf` | 121 | `allocateChunk(leaf, n)` | L152-157 |
| `parent` | 129 | `allocateChunk(parent, n)` | L158-161 |
| `child` | 137 | `allocateChunk(child, n)` | L162-165 |
| `group` | 145 | `allocateChunk(group, n)` | L166-169 |
| `name` | 174 | `allocateChunk(name, n)` | L183-190 |
| `groupSize` | 196 | `allocateChunk(groupSize, groups)` | L212-215 |
| `groupOffset` | 204 | `allocateChunk(groupOffset, groups)` | L216-219 |

Attack Vector

1. The attacker crafts a TensorRT engine file containing a Region_TRT plugin.
2. The serialized plugin data has n = 0x40000001 (or another overflow-triggering value).
3. The victim loads the engine file using the TensorRT runtime (trtexec, Python API, or C++ API).
4. RegionPluginCreator::deserializePlugin() is called automatically.
5. Region::Region(buffer, length) triggers the heap overflow.
6. Heap metadata corruption → potential arbitrary write → RCE.

Files

trigger.cpp — Standalone C++ PoC demonstrating the heap overflow (no TensorRT installation needed)
gen_payload.py — Python script generating a malicious serialized payload
malicious_region_payload.bin — Pre-generated malicious plugin serialization data
trigger_deserialize.py — Python trigger using real TensorRT deserialize_plugin() API

Note: malicious_region_payload.bin is the raw serialized data for the Region_TRT plugin — the exact bytes that RegionPluginCreator::deserializePlugin() receives. In a real attack, this data would be embedded inside a .engine file and deserialized automatically when loading the model.

Reproduction

Method 1: Standalone ASan PoC (no TensorRT needed)

clang++ -fsanitize=address -g -O0 -o trigger trigger.cpp
./trigger

Method 2: Real TensorRT Deserialization

# Build TensorRT OSS plugins (optionally with ASan)
git clone https://github.com/NVIDIA/TensorRT.git && cd TensorRT
mkdir build && cd build
cmake .. -DCMAKE_CXX_FLAGS="-fsanitize=address" -DBUILD_PARSERS=OFF -DBUILD_SAMPLES=OFF
make -j$(nproc) nvinfer_plugin

# Generate payload and trigger
cd /path/to/this/repo
python3 gen_payload.py
LD_LIBRARY_PATH=/path/to/TensorRT/build/out python3 trigger_deserialize.py

Expected Output (Method 1)

=== TensorRT Region Plugin Heap Overflow PoC ===
[*] ...
[*] Simulating integer overflow: allocating 4 elements but writing 128
[*] Writing 128 elements into 4-element buffer...
=================================================================
==XXXXX==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
WRITE of size 4 at ... thread T0

Generate Standalone Payload

python3 gen_payload.py
# Output: malicious_region_payload.bin

Suggested Fix

Add bounds validation for n and groups before allocation:

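smTreeTemp->n = read<int32_t>(d);

// Validate n against the remaining buffer size.
// The comparison is done in size_t so a large remaining length is not narrowed to int32_t.
if (smTreeTemp->n < 0
    || static_cast<size_t>(smTreeTemp->n) > (length - static_cast<size_t>(d - a)) / sizeof(int32_t)) {
    PLUGIN_VALIDATE(false && "Invalid softmaxTree.n in serialized data");
    return;
}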
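The same check can be factored into a small helper so it can be reused for groups and unit-tested in isolation. This is a sketch under stated assumptions: countFitsInBuffer is a hypothetical name, and consumed stands for the number of bytes already read (d - a in the snippet above).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of TensorRT: returns true when `count`
// elements of `elemSize` bytes could still fit in the unread part of the
// serialized buffer. `length` is the total buffer size and `consumed` is
// the number of bytes already read.
inline bool countFitsInBuffer(
    std::int32_t count, std::size_t elemSize, std::size_t length, std::size_t consumed)
{
    if (count < 0 || elemSize == 0 || consumed > length)
    {
        return false;
    }
    std::size_t const remaining = length - consumed;
    // Divide instead of multiply so the check itself cannot overflow.
    return static_cast<std::size_t>(count) <= remaining / elemSize;
}
```

This is only an upper bound. Several arrays are sized by the same n, so the data actually required is larger than n * sizeof(int32_t) and a stricter limit may be appropriate.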

And add overflow check in allocateChunk:

template <typename T>
void allocateChunk(T*& ptr, int32_t count)
{
    if (count <= 0 || (size_t)count > SIZE_MAX / sizeof(T)) {
        ptr = nullptr;
        return;
    }
    ptr = static_cast<T*>(malloc((size_t)count * sizeof(T)));
}
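One possible refinement, shown here as a self-contained sketch rather than as the upstream fix, is to have the allocator report failure so the caller can abort deserialization instead of continuing with a null pointer. tryAllocateChunk is a hypothetical name introduced for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Hypothetical variant of allocateChunk that reports failure to the caller
// instead of silently leaving ptr null.
template <typename T>
bool tryAllocateChunk(T*& ptr, std::int32_t count)
{
    ptr = nullptr;
    if (count <= 0)
    {
        return false; // reject empty and negative counts
    }
    auto const n = static_cast<std::size_t>(count);
    if (n > std::numeric_limits<std::size_t>::max() / sizeof(T))
    {
        return false; // n * sizeof(T) would not fit in size_t
    }
    ptr = static_cast<T*>(std::malloc(n * sizeof(T)));
    return ptr != nullptr; // also false when malloc itself fails
}
```

A caller could then write something like `if (!tryAllocateChunk(smTreeTemp->leaf, smTreeTemp->n)) { /* reject the payload */ }`. Whether an empty tree (count of 0) should be rejected or accepted is a design decision this sketch does not settle.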
