TensorRT QKV runnerStateBuffer PluginField length validation PoC package
Status
This repository is a triager-side source-level PoC package, not a final mutated TensorRT .engine file.
The bug candidate is a validation gap in NVIDIA TensorRT OSS QKV plugin runtime deserialization:
- runnerStateBuffer is serialized as PluginFieldType::kUNKNOWN with mRunnerStateBuffer.size().
- Runtime plugin creators store only fc->fields[i].data.
- They do not validate fc->fields[i].type or fc->fields[i].length.
- The constructor later calls dispatcher deserialization using dispatcher->getSerializationSize(), not the actual serialized field length.
The included ASan harness models the exact plugin-level validation gap and demonstrates an out-of-bounds read if a short runnerStateBuffer reaches the runtime creator.
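The pattern described above can be sketched without TensorRT headers. The following is a minimal model, not the contents of qkv_runnerstate_harness.cpp: FieldModel, CreatorResult, createRuntimeModel, and overreadBytes are hypothetical names standing in for the plugin field, the creator, and the size mismatch. It computes how far a deserializer would read past the supplied data instead of performing the out-of-bounds read.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a plugin field: only the members discussed here.
struct FieldModel
{
    std::string name;
    void const* data;
    int32_t length; // number of bytes actually supplied in `data`
};

// Hypothetical stand-in for what the runtime creator keeps from the field.
struct CreatorResult
{
    void const* runnerStateBuffer = nullptr;
};

// Models the gap: the creator copies the data pointer and drops the length.
CreatorResult createRuntimeModel(std::vector<FieldModel> const& fields)
{
    CreatorResult result;
    for (auto const& field : fields)
    {
        if (field.name == "runnerStateBuffer")
        {
            result.runnerStateBuffer = field.data; // field.length is ignored
        }
    }
    return result;
}

// Bytes a deserializer would read past the field if it trusted
// `expectedSize` (an independently computed size) instead of field.length.
std::size_t overreadBytes(FieldModel const& field, std::size_t expectedSize)
{
    std::size_t const actual = field.length > 0 ? static_cast<std::size_t>(field.length) : 0;
    return expectedSize > actual ? expectedSize - actual : 0;
}
```

In this model, a 4-byte field combined with an expected size of 16 bytes gives overreadBytes == 12, which is the kind of mismatch AddressSanitizer would flag in a real read.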
Affected source snapshot
Tested TensorRT source snapshot / commit:
5302b288c5603895508711bdf2df11dc29c8ad92
Primary files:
plugin/bertQKVToContextPlugin/qkvToContextPlugin.cpp
plugin/bertQKVToContextPlugin/mhaRunner.cu
plugin/common/plugin.cpp
include/NvInferPluginBase.h
include/NvInferRuntime.h
Files in this package
README.md
qkv_runnerstate_harness.cpp
qkv_runnerstate_fixed_harness.cpp
expected_vulnerable_asan_output.txt
expected_fixed_output.txt
qkv_runnerstate_proposed_patch.diff
tensorrt_qkv_key_lines.txt
description_for_huntr.md
runtime_poc_requirements.md
NO_REAL_ENGINE_FILE_INCLUDED.txt
Reproduce the plugin-level issue
On Linux with clang or g++ and AddressSanitizer:
clang++ -std=c++17 -fsanitize=address -g -O1 qkv_runnerstate_harness.cpp -o qkv_runnerstate_harness
./qkv_runnerstate_harness
Expected vulnerable result:
ERROR: AddressSanitizer: heap-buffer-overflow
READ of size 4
The expected output is also included in:
expected_vulnerable_asan_output.txt
Reproduce the fixed behavior
clang++ -std=c++17 -fsanitize=address -g -O1 qkv_runnerstate_fixed_harness.cpp -o qkv_runnerstate_fixed_harness
./qkv_runnerstate_fixed_harness
Expected fixed behavior:
Rejected safely: invalid runnerStateBuffer length: rejected before deserialize
Important limitation
This package does not include a malformed TensorRT .engine/.trt/.mytrt file.
The remaining triage question is whether TensorRT core allows a malformed serialized plugin field with a short runnerStateBuffer.length to reach:
createPlugin(..., TensorRTPhase::kRUNTIME)
If TensorRT core does not validate that field length before invoking the plugin creator, then the QKV plugin code can overread because it ignores the actual PluginField.length.
Suggested triager-side runtime test
NVIDIA/TensorRT triage can test the candidate by creating or mutating a serialized engine containing a QKV plugin runtime field collection where:
field.name = "runnerStateBuffer"
field.type = PluginFieldType::kUNKNOWN
field.data = non-null
field.length = shorter than dispatcher->getSerializationSize()
Then deserialize the engine and verify whether the plugin creator reaches the runtime constructor and whether the dispatcher deserializer reads beyond the supplied field data.
Suggested fix
Validate the runtime runnerStateBuffer field metadata before deserialization:
PLUGIN_VALIDATE(fc->fields[i].type == PluginFieldType::kUNKNOWN,
"invalid runnerStateBuffer field type");
runnerStateBufferLength = fc->fields[i].length;
runnerStateBuffer = static_cast<void const*>(fc->fields[i].data);
Before constructing/deserializing:
PLUGIN_VALIDATE(runnerStateBufferLength == expectedSerializationSize,
"invalid runnerStateBuffer field length");
Prefer passing the validated actual length into the constructor/deserializer instead of recomputing an expected length independently of the serialized field metadata.
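As an illustration of that preference, here is a minimal standalone sketch in which the deserializer receives the field metadata and copies only the validated number of bytes. It does not use TensorRT types: RunnerStateModel, FieldTypeModel, CheckedField, and deserializeChecked are hypothetical names, and the fixed-size state struct is an assumption made to keep the example small.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical runner state; the real dispatcher layout is not modeled here.
struct RunnerStateModel
{
    int32_t a;
    int32_t b;
};

// Hypothetical stand-in for the field type enum.
enum class FieldTypeModel
{
    kUNKNOWN,
    kINT32
};

// Hypothetical stand-in for a plugin field, including its metadata.
struct CheckedField
{
    void const* data;
    FieldTypeModel type;
    int32_t length;
};

// Validates type and length first, then copies exactly `length` bytes.
RunnerStateModel deserializeChecked(CheckedField const& field)
{
    if (field.type != FieldTypeModel::kUNKNOWN)
    {
        throw std::invalid_argument("invalid runnerStateBuffer field type");
    }
    if (field.data == nullptr || field.length < 0
        || static_cast<std::size_t>(field.length) != sizeof(RunnerStateModel))
    {
        throw std::invalid_argument("invalid runnerStateBuffer field length");
    }
    RunnerStateModel state{};
    // The copy size comes from the validated field length, not from a
    // size recomputed independently of the serialized metadata.
    std::memcpy(&state, field.data, static_cast<std::size_t>(field.length));
    return state;
}
```

With this shape, a field whose length is shorter than the expected state size is rejected by an exception before any bytes are read.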