| &&&& RUNNING TensorRT.trtexec [TensorRT v101401] [b48] # trtexec --onnx=checkpoints/deimv2_dinov3_s_coco.onnx --saveEngine=checkpoints/deimv2_dinov3_s_coco.engine --fp16 --optShapes=images:1x3x640x640,orig_target_sizes:1x2 --memPoolSize=workspace:4096 --builderOptimizationLevel=3 |
| [01/20/2026-06:55:08] [W] optShapes is being broadcasted to minShapes for tensor orig_target_sizes |
| [01/20/2026-06:55:08] [W] optShapes is being broadcasted to maxShapes for tensor orig_target_sizes |
| [01/20/2026-06:55:08] [W] optShapes is being broadcasted to minShapes for tensor images |
| [01/20/2026-06:55:08] [W] optShapes is being broadcasted to maxShapes for tensor images |
| [01/20/2026-06:55:08] [W] Weakly-typed networks have been deprecated in TensorRT. You can use the AutoCast tool (https: |
| [01/20/2026-06:55:08] [I] === Model Options === |
| [01/20/2026-06:55:08] [I] Format: ONNX |
| [01/20/2026-06:55:08] [I] Model: checkpoints/deimv2_dinov3_s_coco.onnx |
| [01/20/2026-06:55:08] [I] Output: |
| [01/20/2026-06:55:08] [I] === Build Options === |
| [01/20/2026-06:55:08] [I] Memory Pools: workspace: 4096 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default |
| [01/20/2026-06:55:08] [I] avgTiming: 8 |
| [01/20/2026-06:55:08] [I] Precision: FP32+FP16 |
| [01/20/2026-06:55:08] [I] LayerPrecisions: |
| [01/20/2026-06:55:08] [I] Layer Device Types: |
| [01/20/2026-06:55:08] [I] Decomposable Attentions: |
| [01/20/2026-06:55:08] [I] Calibration: |
| [01/20/2026-06:55:08] [I] Refit: Disabled |
| [01/20/2026-06:55:08] [I] Strip weights: Disabled |
| [01/20/2026-06:55:08] [I] Version Compatible: Disabled |
| [01/20/2026-06:55:08] [I] ONNX Plugin InstanceNorm: Disabled |
| [01/20/2026-06:55:08] [I] ONNX kENABLE_UINT8_AND_ASYMMETRIC_QUANTIZATION_DLA flag: Disabled |
| [01/20/2026-06:55:08] [I] TensorRT runtime: full |
| [01/20/2026-06:55:08] [I] Lean DLL Path: |
| [01/20/2026-06:55:08] [I] Tempfile Controls: { in_memory: allow, temporary: allow } |
| [01/20/2026-06:55:08] [I] Exclude Lean Runtime: Disabled |
| [01/20/2026-06:55:08] [I] Sparsity: Disabled |
| [01/20/2026-06:55:08] [I] Safe mode: Disabled |
| [01/20/2026-06:55:08] [I] Build DLA standalone loadable: Disabled |
| [01/20/2026-06:55:08] [I] Allow GPU fallback for DLA: Disabled |
| [01/20/2026-06:55:08] [I] DirectIO mode: Disabled |
| [01/20/2026-06:55:08] [I] Restricted mode: Disabled |
| [01/20/2026-06:55:08] [I] Skip inference: Disabled |
| [01/20/2026-06:55:08] [I] Save engine: checkpoints/deimv2_dinov3_s_coco.engine |
| [01/20/2026-06:55:08] [I] Load engine: |
| [01/20/2026-06:55:08] [I] Profiling verbosity: 0 |
| [01/20/2026-06:55:08] [I] Tactic sources: Using default tactic sources |
| [01/20/2026-06:55:08] [I] timingCacheMode: local |
| [01/20/2026-06:55:08] [I] timingCacheFile: |
| [01/20/2026-06:55:08] [I] Enable Compilation Cache: Enabled |
| [01/20/2026-06:55:08] [I] Enable Monitor Memory: Disabled |
| [01/20/2026-06:55:08] [I] errorOnTimingCacheMiss: Disabled |
| [01/20/2026-06:55:08] [I] Preview Features: Use default preview flags. |
| [01/20/2026-06:55:08] [I] MaxAuxStreams: -1 |
| [01/20/2026-06:55:08] [I] BuilderOptimizationLevel: 3 |
| [01/20/2026-06:55:08] [I] MaxTactics: -1 |
| [01/20/2026-06:55:08] [I] Calibration Profile Index: 0 |
| [01/20/2026-06:55:08] [I] Weight Streaming: Disabled |
| [01/20/2026-06:55:08] [I] Runtime Platform: Same As Build |
| [01/20/2026-06:55:08] [I] Debug Tensors: |
| [01/20/2026-06:55:08] [I] Distributive Independence: Disabled |
| [01/20/2026-06:55:08] [I] Mark Unfused Tensors As Debug Tensors: Disabled |
| [01/20/2026-06:55:08] [I] Input(s)s format: fp32:CHW |
| [01/20/2026-06:55:08] [I] Output(s)s format: fp32:CHW |
| [01/20/2026-06:55:08] [I] Input build shape (profile 0): images=1x3x640x640+1x3x640x640+1x3x640x640 |
| [01/20/2026-06:55:08] [I] Input build shape (profile 0): orig_target_sizes=1x2+1x2+1x2 |
| [01/20/2026-06:55:08] [I] Input calibration shapes: model |
| [01/20/2026-06:55:08] [I] === System Options === |
| [01/20/2026-06:55:08] [I] Device: 0 |
| [01/20/2026-06:55:08] [I] DLACore: |
| [01/20/2026-06:55:08] [I] Plugins: |
| [01/20/2026-06:55:08] [I] setPluginsToSerialize: |
| [01/20/2026-06:55:08] [I] dynamicPlugins: |
| [01/20/2026-06:55:08] [I] ignoreParsedPluginLibs: 0 |
| [01/20/2026-06:55:08] [I] |
| [01/20/2026-06:55:08] [I] === Inference Options === |
| [01/20/2026-06:55:08] [I] Batch: Explicit |
| [01/20/2026-06:55:08] [I] Input inference shape : orig_target_sizes=1x2 |
| [01/20/2026-06:55:08] [I] Input inference shape : images=1x3x640x640 |
| [01/20/2026-06:55:08] [I] Iterations: 10 |
| [01/20/2026-06:55:08] [I] Duration: 3s (+ 200ms warm up) |
| [01/20/2026-06:55:08] [I] Sleep time: 0ms |
| [01/20/2026-06:55:08] [I] Idle time: 0ms |
| [01/20/2026-06:55:08] [I] Inference Streams: 1 |
| [01/20/2026-06:55:08] [I] ExposeDMA: Disabled |
| [01/20/2026-06:55:08] [I] Data transfers: Enabled |
| [01/20/2026-06:55:08] [I] Spin-wait: Disabled |
| [01/20/2026-06:55:08] [I] Multithreading: Disabled |
| [01/20/2026-06:55:08] [I] CUDA Graph: Disabled |
| [01/20/2026-06:55:08] [I] Separate profiling: Disabled |
| [01/20/2026-06:55:08] [I] Time Deserialize: Disabled |
| [01/20/2026-06:55:08] [I] Time Refit: Disabled |
| [01/20/2026-06:55:08] [I] NVTX verbosity: 0 |
| [01/20/2026-06:55:08] [I] Persistent Cache Ratio: 0 |
| [01/20/2026-06:55:08] [I] Optimization Profile Index: 0 |
| [01/20/2026-06:55:08] [I] Weight Streaming Budget: 100.000000% |
| [01/20/2026-06:55:08] [I] Inputs: |
| [01/20/2026-06:55:08] [I] Debug Tensor Save Destinations: |
| [01/20/2026-06:55:08] [I] Dump All Debug Tensor in Formats: |
| [01/20/2026-06:55:08] [I] === Reporting Options === |
| [01/20/2026-06:55:08] [I] Verbose: Disabled |
| [01/20/2026-06:55:08] [I] Averages: 10 inferences |
| [01/20/2026-06:55:08] [I] Percentiles: 90,95,99 |
| [01/20/2026-06:55:08] [I] Dump refittable layers:Disabled |
| [01/20/2026-06:55:08] [I] Dump output: Disabled |
| [01/20/2026-06:55:08] [I] Profile: Disabled |
| [01/20/2026-06:55:08] [I] Export timing to JSON file: |
| [01/20/2026-06:55:08] [I] Export output to JSON file: |
| [01/20/2026-06:55:08] [I] Export profile to JSON file: |
| [01/20/2026-06:55:08] [I] |
| [01/20/2026-06:55:08] [I] === Device Information === |
| [01/20/2026-06:55:08] [I] Available Devices: |
| [01/20/2026-06:55:08] [I] Device 0: "NVIDIA GeForce RTX 4090" UUID: GPU-55c23db9-433c-0d6c-46e7-9387266e5ddb |
| [01/20/2026-06:55:08] [I] Selected Device: NVIDIA GeForce RTX 4090 |
| [01/20/2026-06:55:08] [I] Selected Device ID: 0 |
| [01/20/2026-06:55:08] [I] Selected Device UUID: GPU-55c23db9-433c-0d6c-46e7-9387266e5ddb |
| [01/20/2026-06:55:08] [I] Compute Capability: 8.9 |
| [01/20/2026-06:55:08] [I] SMs: 128 |
| [01/20/2026-06:55:08] [I] Device Global Memory: 24071 MiB |
| [01/20/2026-06:55:08] [I] Shared Memory per SM: 100 KiB |
| [01/20/2026-06:55:08] [I] Memory Bus Width: 384 bits (ECC disabled) |
| [01/20/2026-06:55:08] [I] Application Compute Clock Rate: 2.52 GHz |
| [01/20/2026-06:55:08] [I] Application Memory Clock Rate: 10.501 GHz |
| [01/20/2026-06:55:08] [I] |
| [01/20/2026-06:55:08] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at. |
| [01/20/2026-06:55:08] [I] |
| [01/20/2026-06:55:08] [I] TensorRT version: 10.14.1 |
| [01/20/2026-06:55:08] [I] Loading standard plugins |
| [01/20/2026-06:55:08] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 29, GPU 10549 (MiB) |
| [01/20/2026-06:55:08] [I] Start parsing network model. |
| [01/20/2026-06:55:08] [I] [TRT] ---------------------------------------------------------------- |
| [01/20/2026-06:55:08] [I] [TRT] Input filename: checkpoints/deimv2_dinov3_s_coco.onnx |
| [01/20/2026-06:55:08] [I] [TRT] ONNX IR version: 0.0.8 |
| [01/20/2026-06:55:08] [I] [TRT] Opset version: 17 |
| [01/20/2026-06:55:08] [I] [TRT] Producer name: pytorch |
| [01/20/2026-06:55:08] [I] [TRT] Producer version: 2.10.0 |
| [01/20/2026-06:55:08] [I] [TRT] Domain: |
| [01/20/2026-06:55:08] [I] [TRT] Model version: 0 |
| [01/20/2026-06:55:08] [I] [TRT] Doc string: |
| [01/20/2026-06:55:08] [I] [TRT] ---------------------------------------------------------------- |
| [01/20/2026-06:55:08] [W] [TRT] ModelImporter.cpp:661: Make sure input orig_target_sizes has Int64 binding. |
| [01/20/2026-06:55:09] [W] [TRT] ModelImporter.cpp:908: Make sure output labels has Int64 binding. |
| [01/20/2026-06:55:09] [I] Finished parsing network model. Parse time: 0.0945442 |
| [01/20/2026-06:55:09] [I] Set shape of input tensor images for optimization profile 0 to: MIN=1x3x640x640 OPT=1x3x640x640 MAX=1x3x640x640 |
| [01/20/2026-06:55:09] [I] Set shape of input tensor orig_target_sizes for optimization profile 0 to: MIN=1x2 OPT=1x2 MAX=1x2 |
| [01/20/2026-06:55:09] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +204, GPU +4, now: CPU 571, GPU 10553 (MiB) |
| [01/20/2026-06:55:09] [W] [TRT] Detected layernorm nodes in FP16. |
| [01/20/2026-06:55:09] [W] [TRT] Running layernorm after self-attention with FP16 Reduce or Pow may cause overflow. Forcing Reduce or Pow Layers in FP32 precision, or exporting the model to use INormalizationLayer (available with ONNX opset >= 17) can help preserving accuracy. |
| [01/20/2026-06:55:09] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored. |
| [01/20/2026-06:55:52] [I] [TRT] Compiler backend is used during engine build. |
| [01/20/2026-06:57:39] [I] [TRT] Detected 2 inputs and 3 output network tensors. |
| [01/20/2026-06:57:39] [I] [TRT] Total Host Persistent Memory: 281504 bytes |
| [01/20/2026-06:57:39] [I] [TRT] Total Device Persistent Memory: 3072 bytes |
| [01/20/2026-06:57:39] [I] [TRT] Max Scratch Memory: 9665024 bytes |
| [01/20/2026-06:57:39] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 91 steps to complete. |
| [01/20/2026-06:57:39] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 2.89015ms to assign 11 blocks to 91 nodes requiring 21496320 bytes. |
| [01/20/2026-06:57:39] [I] [TRT] Total Activation Memory: 21496320 bytes |
| [01/20/2026-06:57:39] [I] [TRT] Total Weights Memory: 19740416 bytes |
| [01/20/2026-06:57:40] [I] [TRT] Compiler backend is used during engine execution. |
| [01/20/2026-06:57:40] [I] [TRT] Engine generation completed in 150.685 seconds. |
| [01/20/2026-06:57:40] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 11 MiB, GPU 93 MiB |
| [01/20/2026-06:57:40] [I] Created engine with size: 25.2802 MiB |
| [01/20/2026-06:57:40] [I] Engine built in 151.034 sec. |
| [01/20/2026-06:57:40] [I] [TRT] Loaded engine size: 25 MiB |
| [01/20/2026-06:57:40] [I] Engine deserialized in 0.0153845 sec. |
| [01/20/2026-06:57:40] [I] [TRT] [MS] Running engine with multi stream info |
| [01/20/2026-06:57:40] [I] [TRT] [MS] Number of aux streams is 2 |
| [01/20/2026-06:57:40] [I] [TRT] [MS] Number of total worker streams is 3 |
| [01/20/2026-06:57:40] [I] [TRT] [MS] The main stream provided by execute/enqueue calls is the first worker stream |
| [01/20/2026-06:57:40] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +21, now: CPU 0, GPU 39 (MiB) |
| [01/20/2026-06:57:40] [I] Setting persistentCacheLimit to 0 bytes. |
| [01/20/2026-06:57:40] [I] Created execution context with device memory size: 20.5005 MiB |
| [01/20/2026-06:57:40] [I] Using random values for input images |
| [01/20/2026-06:57:40] [I] Input binding for images with dimensions 1x3x640x640 is created. |
| [01/20/2026-06:57:40] [I] Using random values for input orig_target_sizes |
| [01/20/2026-06:57:40] [I] Input binding for orig_target_sizes with dimensions 1x2 is created. |
| [01/20/2026-06:57:40] [I] Output binding for labels with dimensions 1x300 is created. |
| [01/20/2026-06:57:40] [I] Output binding for boxes with dimensions 1x300x4 is created. |
| [01/20/2026-06:57:40] [I] Output binding for scores with dimensions 1x300 is created. |
| [01/20/2026-06:57:40] [I] Starting inference |
| [01/20/2026-06:57:43] [I] Warmup completed 146 queries over 200 ms |
| [01/20/2026-06:57:43] [I] Timing trace has 2199 queries over 3.00392 s |
| [01/20/2026-06:57:43] [I] |
| [01/20/2026-06:57:43] [I] === Trace details === |
| [01/20/2026-06:57:43] [I] Trace averages of 10 runs: |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.35834 ms - Host latency: 1.58537 ms (enqueue 0.439809 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36411 ms - Host latency: 1.59102 ms (enqueue 0.443648 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36554 ms - Host latency: 1.59115 ms (enqueue 0.441405 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36572 ms - Host latency: 1.59173 ms (enqueue 0.450684 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36576 ms - Host latency: 1.59286 ms (enqueue 0.442368 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36451 ms - Host latency: 1.58991 ms (enqueue 0.441461 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36351 ms - Host latency: 1.58781 ms (enqueue 0.449054 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36335 ms - Host latency: 1.58888 ms (enqueue 0.447015 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3633 ms - Host latency: 1.59006 ms (enqueue 0.444852 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3643 ms - Host latency: 1.59003 ms (enqueue 0.444052 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36356 ms - Host latency: 1.58947 ms (enqueue 0.443555 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36461 ms - Host latency: 1.59058 ms (enqueue 0.444547 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36305 ms - Host latency: 1.58903 ms (enqueue 0.443799 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36307 ms - Host latency: 1.58965 ms (enqueue 0.44136 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36357 ms - Host latency: 1.58973 ms (enqueue 0.445804 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36357 ms - Host latency: 1.58859 ms (enqueue 0.439398 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36337 ms - Host latency: 1.58787 ms (enqueue 0.458829 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36386 ms - Host latency: 1.58777 ms (enqueue 0.470523 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36315 ms - Host latency: 1.58919 ms (enqueue 0.432932 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36291 ms - Host latency: 1.58851 ms (enqueue 0.44093 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3643 ms - Host latency: 1.59142 ms (enqueue 0.436078 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36438 ms - Host latency: 1.59136 ms (enqueue 0.441199 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36407 ms - Host latency: 1.58911 ms (enqueue 0.437918 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36408 ms - Host latency: 1.59089 ms (enqueue 0.447729 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.59078 ms (enqueue 0.435645 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36422 ms - Host latency: 1.59025 ms (enqueue 0.434961 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3635 ms - Host latency: 1.591 ms (enqueue 0.440906 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36263 ms - Host latency: 1.58801 ms (enqueue 0.434052 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36351 ms - Host latency: 1.58947 ms (enqueue 0.462219 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36393 ms - Host latency: 1.59056 ms (enqueue 0.443243 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36391 ms - Host latency: 1.59028 ms (enqueue 0.438641 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36307 ms - Host latency: 1.58887 ms (enqueue 0.445636 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36342 ms - Host latency: 1.59011 ms (enqueue 0.444208 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36346 ms - Host latency: 1.59023 ms (enqueue 0.444598 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36316 ms - Host latency: 1.58997 ms (enqueue 0.445331 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36346 ms - Host latency: 1.59006 ms (enqueue 0.439972 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36443 ms - Host latency: 1.58953 ms (enqueue 0.478406 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36407 ms - Host latency: 1.58942 ms (enqueue 0.472168 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.59003 ms (enqueue 0.450946 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36269 ms - Host latency: 1.58821 ms (enqueue 0.452094 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3634 ms - Host latency: 1.58981 ms (enqueue 0.436835 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36327 ms - Host latency: 1.58932 ms (enqueue 0.447278 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36411 ms - Host latency: 1.58897 ms (enqueue 0.771808 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36356 ms - Host latency: 1.58953 ms (enqueue 0.460181 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36378 ms - Host latency: 1.59033 ms (enqueue 0.450714 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36313 ms - Host latency: 1.58844 ms (enqueue 0.436713 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36343 ms - Host latency: 1.59 ms (enqueue 0.440601 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36338 ms - Host latency: 1.58936 ms (enqueue 0.438214 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36384 ms - Host latency: 1.58855 ms (enqueue 0.452051 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36315 ms - Host latency: 1.58967 ms (enqueue 0.441931 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36335 ms - Host latency: 1.58993 ms (enqueue 0.439587 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.364 ms - Host latency: 1.59079 ms (enqueue 0.441016 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3636 ms - Host latency: 1.59019 ms (enqueue 0.434497 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36348 ms - Host latency: 1.59001 ms (enqueue 0.440436 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36329 ms - Host latency: 1.58978 ms (enqueue 0.456458 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.363 ms - Host latency: 1.58828 ms (enqueue 0.451471 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36321 ms - Host latency: 1.58891 ms (enqueue 0.444556 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36294 ms - Host latency: 1.58796 ms (enqueue 0.443604 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36411 ms - Host latency: 1.59097 ms (enqueue 0.445068 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3638 ms - Host latency: 1.59024 ms (enqueue 0.443127 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36346 ms - Host latency: 1.58997 ms (enqueue 0.440167 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36389 ms - Host latency: 1.5899 ms (enqueue 0.445129 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.59065 ms (enqueue 0.446313 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36328 ms - Host latency: 1.58978 ms (enqueue 0.444324 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36334 ms - Host latency: 1.59003 ms (enqueue 0.439246 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36407 ms - Host latency: 1.59083 ms (enqueue 0.438879 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36386 ms - Host latency: 1.59032 ms (enqueue 0.441052 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36317 ms - Host latency: 1.58983 ms (enqueue 0.438794 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36306 ms - Host latency: 1.58959 ms (enqueue 0.438867 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36301 ms - Host latency: 1.58944 ms (enqueue 0.438855 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36361 ms - Host latency: 1.59039 ms (enqueue 0.441309 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36296 ms - Host latency: 1.58978 ms (enqueue 0.44148 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3641 ms - Host latency: 1.59025 ms (enqueue 0.442981 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36499 ms - Host latency: 1.59066 ms (enqueue 0.446216 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36328 ms - Host latency: 1.58986 ms (enqueue 0.44093 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3645 ms - Host latency: 1.59019 ms (enqueue 0.435937 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36423 ms - Host latency: 1.59133 ms (enqueue 0.439709 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36431 ms - Host latency: 1.58954 ms (enqueue 0.442932 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36331 ms - Host latency: 1.58868 ms (enqueue 0.445911 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36373 ms - Host latency: 1.58958 ms (enqueue 0.438513 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3636 ms - Host latency: 1.5901 ms (enqueue 0.435034 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36407 ms - Host latency: 1.59045 ms (enqueue 0.4354 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36396 ms - Host latency: 1.59104 ms (enqueue 0.460461 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36338 ms - Host latency: 1.58949 ms (enqueue 0.454236 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36307 ms - Host latency: 1.58958 ms (enqueue 0.442126 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36404 ms - Host latency: 1.58927 ms (enqueue 0.439563 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36359 ms - Host latency: 1.59045 ms (enqueue 0.442273 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36388 ms - Host latency: 1.59047 ms (enqueue 0.442029 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36395 ms - Host latency: 1.58977 ms (enqueue 0.440356 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36344 ms - Host latency: 1.58864 ms (enqueue 0.445386 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36396 ms - Host latency: 1.58995 ms (enqueue 0.444177 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36354 ms - Host latency: 1.58967 ms (enqueue 0.442737 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36423 ms - Host latency: 1.58945 ms (enqueue 0.440112 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3634 ms - Host latency: 1.59 ms (enqueue 0.437964 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36364 ms - Host latency: 1.59041 ms (enqueue 0.438586 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36342 ms - Host latency: 1.58943 ms (enqueue 0.441638 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36395 ms - Host latency: 1.59001 ms (enqueue 0.438611 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36306 ms - Host latency: 1.58876 ms (enqueue 0.437866 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36351 ms - Host latency: 1.58987 ms (enqueue 0.441199 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36368 ms - Host latency: 1.59054 ms (enqueue 0.443579 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36366 ms - Host latency: 1.58575 ms (enqueue 0.514673 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36473 ms - Host latency: 1.58927 ms (enqueue 0.471899 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36323 ms - Host latency: 1.58971 ms (enqueue 0.443347 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36383 ms - Host latency: 1.59017 ms (enqueue 0.43667 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.59031 ms (enqueue 0.436035 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36333 ms - Host latency: 1.58923 ms (enqueue 0.445215 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36353 ms - Host latency: 1.58886 ms (enqueue 0.43667 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36255 ms - Host latency: 1.58766 ms (enqueue 0.435669 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36307 ms - Host latency: 1.58983 ms (enqueue 0.440649 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36399 ms - Host latency: 1.58915 ms (enqueue 0.43988 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36412 ms - Host latency: 1.5907 ms (enqueue 0.446997 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3639 ms - Host latency: 1.59098 ms (enqueue 0.44856 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36451 ms - Host latency: 1.59078 ms (enqueue 0.437244 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36326 ms - Host latency: 1.58932 ms (enqueue 0.445728 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36398 ms - Host latency: 1.59054 ms (enqueue 0.439539 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3637 ms - Host latency: 1.59021 ms (enqueue 0.442529 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36335 ms - Host latency: 1.58967 ms (enqueue 0.438489 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36267 ms - Host latency: 1.58927 ms (enqueue 0.439697 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36331 ms - Host latency: 1.58962 ms (enqueue 0.440845 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36307 ms - Host latency: 1.58955 ms (enqueue 0.440918 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36354 ms - Host latency: 1.59021 ms (enqueue 0.434119 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36375 ms - Host latency: 1.58995 ms (enqueue 0.448096 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36323 ms - Host latency: 1.59006 ms (enqueue 0.442773 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3635 ms - Host latency: 1.58994 ms (enqueue 0.443115 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36307 ms - Host latency: 1.58976 ms (enqueue 0.442371 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36318 ms - Host latency: 1.58975 ms (enqueue 0.439624 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36328 ms - Host latency: 1.59004 ms (enqueue 0.44209 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36339 ms - Host latency: 1.58943 ms (enqueue 0.447375 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36345 ms - Host latency: 1.5887 ms (enqueue 0.446582 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3639 ms - Host latency: 1.59076 ms (enqueue 0.447876 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36449 ms - Host latency: 1.59119 ms (enqueue 0.446411 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36377 ms - Host latency: 1.59092 ms (enqueue 0.443127 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36384 ms - Host latency: 1.58966 ms (enqueue 0.437134 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36396 ms - Host latency: 1.59044 ms (enqueue 0.439563 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36298 ms - Host latency: 1.58964 ms (enqueue 0.436792 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36285 ms - Host latency: 1.58849 ms (enqueue 0.454407 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3644 ms - Host latency: 1.59087 ms (enqueue 0.446143 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36292 ms - Host latency: 1.58948 ms (enqueue 0.439722 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.59043 ms (enqueue 0.439331 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36282 ms - Host latency: 1.58804 ms (enqueue 0.463892 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36533 ms - Host latency: 1.59038 ms (enqueue 0.445728 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.58921 ms (enqueue 0.451514 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36377 ms - Host latency: 1.59084 ms (enqueue 0.444482 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36379 ms - Host latency: 1.58928 ms (enqueue 0.471216 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36372 ms - Host latency: 1.58584 ms (enqueue 1.03264 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36384 ms - Host latency: 1.59001 ms (enqueue 0.439404 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.58979 ms (enqueue 0.432397 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36406 ms - Host latency: 1.59026 ms (enqueue 0.438867 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36399 ms - Host latency: 1.59099 ms (enqueue 0.441235 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36431 ms - Host latency: 1.58938 ms (enqueue 0.439136 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3637 ms - Host latency: 1.58992 ms (enqueue 0.435034 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36431 ms - Host latency: 1.59045 ms (enqueue 0.446167 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36372 ms - Host latency: 1.5905 ms (enqueue 0.439868 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36387 ms - Host latency: 1.59048 ms (enqueue 0.437817 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36426 ms - Host latency: 1.59072 ms (enqueue 0.437793 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36399 ms - Host latency: 1.5906 ms (enqueue 0.436206 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36384 ms - Host latency: 1.5897 ms (enqueue 0.457056 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36401 ms - Host latency: 1.58594 ms (enqueue 0.492017 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36375 ms - Host latency: 1.58823 ms (enqueue 0.459448 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36404 ms - Host latency: 1.5905 ms (enqueue 0.438794 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36399 ms - Host latency: 1.59077 ms (enqueue 0.441895 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36423 ms - Host latency: 1.59053 ms (enqueue 0.44104 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36379 ms - Host latency: 1.59041 ms (enqueue 0.439673 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36335 ms - Host latency: 1.58899 ms (enqueue 0.459937 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36431 ms - Host latency: 1.5896 ms (enqueue 0.444531 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36477 ms - Host latency: 1.59155 ms (enqueue 0.439697 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3637 ms - Host latency: 1.58909 ms (enqueue 0.438525 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3637 ms - Host latency: 1.58887 ms (enqueue 0.438477 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36343 ms - Host latency: 1.58784 ms (enqueue 0.444067 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36494 ms - Host latency: 1.59221 ms (enqueue 0.439429 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36348 ms - Host latency: 1.58997 ms (enqueue 0.446143 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36438 ms - Host latency: 1.59109 ms (enqueue 0.44978 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36401 ms - Host latency: 1.59106 ms (enqueue 0.446118 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36528 ms - Host latency: 1.59194 ms (enqueue 0.448413 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36492 ms - Host latency: 1.59028 ms (enqueue 0.45022 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36418 ms - Host latency: 1.59033 ms (enqueue 0.449512 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36418 ms - Host latency: 1.58845 ms (enqueue 0.497168 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36526 ms - Host latency: 1.59045 ms (enqueue 0.459277 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36362 ms - Host latency: 1.58826 ms (enqueue 0.460718 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3645 ms - Host latency: 1.59138 ms (enqueue 0.449902 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36389 ms - Host latency: 1.58931 ms (enqueue 0.452368 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3636 ms - Host latency: 1.58979 ms (enqueue 0.448291 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36406 ms - Host latency: 1.59009 ms (enqueue 0.449634 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3637 ms - Host latency: 1.58862 ms (enqueue 0.46543 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36409 ms - Host latency: 1.59097 ms (enqueue 0.450562 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36399 ms - Host latency: 1.5906 ms (enqueue 0.452661 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36404 ms - Host latency: 1.59038 ms (enqueue 0.44873 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36404 ms - Host latency: 1.59102 ms (enqueue 0.445532 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36331 ms - Host latency: 1.59001 ms (enqueue 0.447095 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3644 ms - Host latency: 1.59131 ms (enqueue 0.448535 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3634 ms - Host latency: 1.58972 ms (enqueue 0.448926 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36367 ms - Host latency: 1.58906 ms (enqueue 0.449585 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36404 ms - Host latency: 1.59087 ms (enqueue 0.441382 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36362 ms - Host latency: 1.59041 ms (enqueue 0.437012 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36333 ms - Host latency: 1.59031 ms (enqueue 0.441211 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36487 ms - Host latency: 1.59082 ms (enqueue 0.438501 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36321 ms - Host latency: 1.59004 ms (enqueue 0.435645 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36292 ms - Host latency: 1.58921 ms (enqueue 0.438892 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36433 ms - Host latency: 1.59072 ms (enqueue 0.434082 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36384 ms - Host latency: 1.58962 ms (enqueue 0.438867 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36387 ms - Host latency: 1.59011 ms (enqueue 0.4448 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36426 ms - Host latency: 1.59116 ms (enqueue 0.438916 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36318 ms - Host latency: 1.58914 ms (enqueue 0.440454 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36372 ms - Host latency: 1.59043 ms (enqueue 0.43689 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36462 ms - Host latency: 1.59084 ms (enqueue 0.449536 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36365 ms - Host latency: 1.5905 ms (enqueue 0.443726 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36409 ms - Host latency: 1.59077 ms (enqueue 0.439819 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3635 ms - Host latency: 1.58977 ms (enqueue 0.448389 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3634 ms - Host latency: 1.59011 ms (enqueue 0.440186 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36335 ms - Host latency: 1.58994 ms (enqueue 0.436816 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36379 ms - Host latency: 1.59033 ms (enqueue 0.43562 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36387 ms - Host latency: 1.59082 ms (enqueue 0.438452 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3634 ms - Host latency: 1.58987 ms (enqueue 0.439624 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3637 ms - Host latency: 1.59036 ms (enqueue 0.44209 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36389 ms - Host latency: 1.58984 ms (enqueue 0.443994 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36423 ms - Host latency: 1.59028 ms (enqueue 0.436646 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.3646 ms - Host latency: 1.59121 ms (enqueue 0.435034 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36331 ms - Host latency: 1.58965 ms (enqueue 0.443726 ms) |
| [01/20/2026-06:57:43] [I] Average on 10 runs - GPU latency: 1.36426 ms - Host latency: 1.58589 ms (enqueue 0.503564 ms) |
| [01/20/2026-06:57:43] [I] |
| [01/20/2026-06:57:43] [I] === Performance summary === |
| [01/20/2026-06:57:43] [I] Throughput: 732.043 qps |
| [01/20/2026-06:57:43] [I] Latency: min = 1.57953 ms, max = 1.59692 ms, mean = 1.58984 ms, median = 1.59009 ms, percentile(90%) = 1.59253 ms, percentile(95%) = 1.59326 ms, percentile(99%) = 1.59448 ms |
| [01/20/2026-06:57:43] [I] Enqueue Time: min = 0.426849 ms, max = 1.68213 ms, mean = 0.449191 ms, median = 0.439941 ms, percentile(90%) = 0.460449 ms, percentile(95%) = 0.486328 ms, percentile(99%) = 0.567993 ms |
| [01/20/2026-06:57:43] [I] H2D Latency: min = 0.213867 ms, max = 0.227295 ms, mean = 0.22142 ms, median = 0.221924 ms, percentile(90%) = 0.222534 ms, percentile(95%) = 0.222717 ms, percentile(99%) = 0.223145 ms |
| [01/20/2026-06:57:43] [I] GPU Compute Time: min = 1.3568 ms, max = 1.36914 ms, mean = 1.36374 ms, median = 1.36389 ms, percentile(90%) = 1.36597 ms, percentile(95%) = 1.36621 ms, percentile(99%) = 1.36792 ms |
| [01/20/2026-06:57:43] [I] D2H Latency: min = 0.00415039 ms, max = 0.00634766 ms, mean = 0.00469234 ms, median = 0.0045166 ms, percentile(90%) = 0.00561523 ms, percentile(95%) = 0.00585938 ms, percentile(99%) = 0.00610352 ms |
| [01/20/2026-06:57:43] [I] Total Host Walltime: 3.00392 s |
| [01/20/2026-06:57:43] [I] Total GPU Compute Time: 2.99885 s |
| [01/20/2026-06:57:43] [I] Explanations of the performance metrics are printed in the verbose logs. |
| [01/20/2026-06:57:43] [I] |
| &&&& PASSED TensorRT.trtexec [TensorRT v101401] [b48] # trtexec --onnx=checkpoints/deimv2_dinov3_s_coco.onnx --saveEngine=checkpoints/deimv2_dinov3_s_coco.engine --fp16 --optShapes=images:1x3x640x640,orig_target_sizes:1x2 --memPoolSize=workspace:4096 --builderOptimizationLevel=3 |
|
|
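The figures in the performance summary can be cross-checked against each other. The sketch below is a minimal consistency check, not part of the trtexec output. It assumes the reported mean latency is the sum of the H2D, GPU compute, and D2H means, which the numbers above bear out. It also estimates the number of timed queries as throughput times host walltime. The log does not print that count, so it is an inference. All variable names are illustrative, and the values are copied from the summary lines.

```python
# Values copied from the "Performance summary" lines of the log above.
H2D_MEAN_MS = 0.22142
GPU_MEAN_MS = 1.36374
D2H_MEAN_MS = 0.00469234
LATENCY_MEAN_MS = 1.58984
THROUGHPUT_QPS = 732.043
HOST_WALLTIME_S = 3.00392
TOTAL_GPU_TIME_S = 2.99885


def summarize():
    """Derive cross-check figures from the reported summary values."""
    # Assumption: mean latency = H2D + GPU compute + D2H.
    latency_sum_ms = H2D_MEAN_MS + GPU_MEAN_MS + D2H_MEAN_MS
    # Estimated (not logged) number of timed queries.
    queries = round(THROUGHPUT_QPS * HOST_WALLTIME_S)
    # Per-query GPU time implied by the totals, in milliseconds.
    gpu_per_query_ms = TOTAL_GPU_TIME_S / queries * 1000.0
    # Fraction of the host wall time during which the GPU was computing.
    gpu_busy_fraction = TOTAL_GPU_TIME_S / HOST_WALLTIME_S
    return {
        "latency_sum_ms": latency_sum_ms,
        "queries": queries,
        "gpu_per_query_ms": gpu_per_query_ms,
        "gpu_busy_fraction": gpu_busy_fraction,
    }


summary = summarize()
for name, value in summary.items():
    print(f"{name}: {value}")
```

If the derived latency sum and per-query GPU time agree with the reported means to within rounding, the summary is internally consistent. A GPU busy fraction close to 1 suggests the run was bound by GPU compute rather than by enqueue or transfer overhead.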