TENSORRT-09: PriorBox plugin deserialize() heap out-of-bounds read

Summary

NVIDIA TensorRT ships a built-in PriorBox plugin (plugin/priorBoxPlugin) that implements SSD-style prior-box layers and is reconstructed from a serialized engine or plugin blob at load time.

PriorBox::deserialize() reads a PriorBoxParameters struct from the serialized buffer, including three attacker-controlled 32-bit counts (numMinSize, numMaxSize, numAspectRatios). For each count it loops that many times, calling the shared helper read<float>(), which performs a memcpy from a raw pointer and advances it without checking that the read stays inside the buffer. The only bounds check, PLUGIN_VALIDATE(d == data + length), runs after all the reads have already happened. A crafted buffer with a large numMinSize and a short total length therefore causes an out-of-bounds heap read while the engine or plugin is being deserialized, before the declared counts are ever checked against length.

Vulnerable code (cloned NVIDIA/TensorRT OSS release, plugin/priorBoxPlugin/priorBoxPlugin.cpp):

void PriorBox::deserialize(uint8_t const* data, size_t length)
{
    auto const* d{data};
    mParam = read<PriorBoxParameters>(d);
    auto readArray = [&d](int32_t size, std::vector<float>& dstVec, float*& dstPtr) {
        PLUGIN_VALIDATE(size >= 0);
        dstVec.resize(size);
        for (int32_t i = 0; i < size; i++)
        {
            dstVec[i] = read<float>(d);   // unchecked memcpy, no length test
        }
        dstPtr = dstVec.data();
    };
    readArray(mParam.numMinSize, mMinSizeCPU, mParam.minSize);
    readArray(mParam.numMaxSize, mMaxSizeCPU, mParam.maxSize);
    readArray(mParam.numAspectRatios, mAspectRatiosCPU, mParam.aspectRatios);
    mH = read<int32_t>(d);
    mW = read<int32_t>(d);
    PLUGIN_VALIDATE(d == data + length);   // only checked AFTER all the reads above
}

and the shared helper (plugin/common/plugin.h):

template <typename OutType, typename BufferType>
OutType read(BufferType const*& buffer)
{
    static_assert(sizeof(BufferType) == 1, "BufferType must be a 1 byte type.");
    OutType val{};
    std::memcpy(&val, static_cast<void const*>(buffer), sizeof(OutType));
    buffer += sizeof(OutType);
    return val;
}
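
For contrast with the helper above, the following is a minimal sketch of what a bounds-checked variant could look like. It is not TensorRT code: the name checkedRead, the extra end parameter, and the choice to throw std::out_of_range are all assumptions made for illustration (the real plugins report errors through PLUGIN_VALIDATE instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical bounds-checked counterpart to read<T>(); NOT part of TensorRT.
// `end` is assumed to point one past the last valid byte (data + length).
template <typename OutType, typename BufferType>
OutType checkedRead(BufferType const*& buffer, BufferType const* end)
{
    static_assert(sizeof(BufferType) == 1, "BufferType must be a 1 byte type.");
    // Refuse the read if fewer than sizeof(OutType) bytes remain in the buffer.
    if (buffer > end || static_cast<std::size_t>(end - buffer) < sizeof(OutType))
    {
        throw std::out_of_range("serialized plugin data is truncated");
    }
    OutType val{};
    std::memcpy(&val, static_cast<void const*>(buffer), sizeof(OutType));
    buffer += sizeof(OutType);
    return val;
}

// Usage sketch: an 80-byte buffer holds exactly 20 floats, so a 21st
// read must be rejected instead of touching memory past the allocation.
inline bool rejectsReadPastEnd()
{
    std::uint8_t buf[80] = {};
    std::uint8_t const* d = buf;
    std::uint8_t const* const end = buf + sizeof(buf);
    for (int i = 0; i < 20; ++i)
    {
        (void) checkedRead<float>(d, end);
    }
    assert(d == end);
    try
    {
        (void) checkedRead<float>(d, end);
    }
    catch (std::out_of_range const&)
    {
        return true; // the out-of-bounds read was refused
    }
    return false;
}
```

With a check of this shape, a count larger than the remaining data fails on the first read that does not fit, rather than only at the final d == data + length comparison.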

The entry point reached from IRuntime::deserializeCudaEngine() for a PriorBox layer is:

PriorBoxPluginCreator::deserializePlugin(name, serialData, serialLength)
  -> PriorBox::PriorBox(data, length)
       -> PriorBox::deserialize(data, length)
            -> readArray(mParam.numMinSize, ...) { ... read<float>(d); ... }

What this proof of concept demonstrates

This PoC calls the real, unmodified PriorBoxPluginCreator::deserializePlugin() (the exact production entry point above) directly with a crafted 80-byte buffer and produces a live AddressSanitizer heap-buffer-overflow report. This confirms that the out-of-bounds read is real and reachable, rather than a claim based only on reading the code.

Files:

  • gen_payload.py builds tensorrt09_priorbox_payload.bin, an 80-byte serialized PriorBoxParameters header with numMinSize = 8192 (and numMaxSize = 0, numAspectRatios = 0), i.e. a declared array count far larger than the buffer that actually backs it.
  • tensorrt09_priorbox_payload.bin is the generated 80-byte payload.
  • standalone_build/harness_standalone.cpp loads that payload into its own new unsigned char[80] heap allocation and calls PriorBoxPluginCreator::deserializePlugin("pb", data, 80) on it, exactly like the real TensorRT runtime would during engine deserialization.
  • standalone_build/build.sh compiles the harness together with the actual, unmodified plugin/priorBoxPlugin/priorBoxPlugin.cpp and plugin/vc/checkMacrosPlugin.cpp from the cloned NVIDIA/TensorRT repository, with AddressSanitizer enabled (-fsanitize=address).
  • standalone_build/stubs/ and standalone_build/stub_defs.cpp provide minimal link-time stand-ins for the handful of CUDA Toolkit and closed-source TensorRT runtime symbols (cudaMalloc, cudaMemcpy, cudaFree, cudaDeviceGetAttribute, cudaGetDevice, cudaGetErrorString, cudaDeviceReset, priorBoxInference, nvinfer1::getLogger()) that are not available on the build host, which has neither the CUDA Toolkit nor the full closed-source libnvinfer/libnvinfer_plugin installed. None of these stubbed symbols execute on the crash path: the AddressSanitizer abort happens on the very first out-of-bounds read<float>() call inside the readArray loop, strictly before PriorBox::setupDeviceMemory() (the only place any CUDA call is made) and strictly before any PLUGIN_VALIDATE failure path (the only place the logger would be used). The stubs exist purely so the real vulnerable source compiles and links into a runnable binary; they are declarations and no-op definitions, not a reimplementation of the vulnerable logic.
  • standalone_build/asan_crash_log.txt is the captured output of an actual run.
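
To make the size mismatch in the payload concrete, the sketch below computes how many bytes deserialize() would consume for a given set of counts and compares that with the supplied length. The helper names requiredBytes and lengthMatchesCounts are hypothetical, and the 80-byte header size is taken from the payload description above rather than from sizeof(PriorBoxParameters) on any particular platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, NOT part of TensorRT: bytes consumed by
// PriorBox::deserialize() for the given (non-negative) counts.
// Layout assumed from the quoted code: header, three float arrays,
// then two int32_t values (mH and mW).
inline std::uint64_t requiredBytes(std::uint64_t headerBytes, std::int32_t numMinSize,
    std::int32_t numMaxSize, std::int32_t numAspectRatios)
{
    std::uint64_t const floats = static_cast<std::uint64_t>(numMinSize)
        + static_cast<std::uint64_t>(numMaxSize) + static_cast<std::uint64_t>(numAspectRatios);
    return headerBytes + floats * sizeof(float) + 2 * sizeof(std::int32_t);
}

// Hypothetical up-front check: true only if the declared counts are
// non-negative and account for exactly `length` bytes.
inline bool lengthMatchesCounts(std::uint64_t headerBytes, std::int32_t numMinSize,
    std::int32_t numMaxSize, std::int32_t numAspectRatios, std::uint64_t length)
{
    if (numMinSize < 0 || numMaxSize < 0 || numAspectRatios < 0)
    {
        return false;
    }
    return requiredBytes(headerBytes, numMinSize, numMaxSize, numAspectRatios) == length;
}
```

For the PoC values (header of 80 bytes, numMinSize = 8192, the other two counts 0), requiredBytes gives 80 + 8192 * 4 + 8 = 32856 bytes, while the buffer is only 80 bytes long, so lengthMatchesCounts(80, 8192, 0, 0, 80) is false. The very first read<float>() therefore starts at offset 80, immediately past the end of the allocation.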

How to reproduce

On Linux (or WSL) with g++ and libasan available:

cd standalone_build
TRT_REPO=/path/to/cloned/NVIDIA/TensorRT ./build.sh
ASAN_OPTIONS=halt_on_error=1 ./harness_standalone ../tensorrt09_priorbox_payload.bin

Observed result

AddressSanitizer reports a heap-buffer-overflow READ of size 4, 0 bytes after an 80-byte heap region, with this call stack:

READ of size 4 at ... thread T0
    #0 float nvinfer1::plugin::read<float, unsigned char>(unsigned char const*&)
    #1 operator() priorBoxPlugin.cpp:152          (the readArray lambda body)
    #2 nvinfer1::plugin::PriorBox::deserialize(unsigned char const*, unsigned long)  priorBoxPlugin.cpp:156
    #3 nvinfer1::plugin::PriorBox::PriorBox(void const*, unsigned long)              priorBoxPlugin.cpp:139
    #4 std::make_unique<nvinfer1::plugin::PriorBox, ...>(...)
    #5 nvinfer1::plugin::PriorBoxPluginCreator::deserializePlugin(char const*, void const*, unsigned long)  priorBoxPlugin.cpp:516
    #6 main

The full output is in standalone_build/asan_crash_log.txt.

Impact

Any application that loads a TensorRT engine or a standalone serialized PriorBox plugin blob from an untrusted source (a common pattern, since ML model files are routinely distributed and loaded like other model formats) can be made to perform an out-of-bounds heap read during deserialization purely through crafted numMinSize / numMaxSize / numAspectRatios fields. Depending on heap layout, this can crash the process (denial of service) or copy adjacent heap memory into the plugin's internal float arrays. Because those arrays influence later plugin output, the latter is a memory disclosure primitive.

The same shared read<T>() helper with the same missing bounds check is used by several other bundled TensorRT plugins (GenerateDetection, ProposalLayer, PyramidROIAlign, RPROI, MultilevelCropAndResize, ResizeNearest, EfficientNMS); PriorBox is reported here as the representative instance.
