
Environment

  • Platform: Google Colab
  • GPU: NVIDIA Tesla T4
  • Input size: 640×640
  • Batch size: 1
  • Warm-up runs: 30
  • Measured runs: 200
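The measurement script itself is not included here, so the following is only a minimal sketch of how a protocol with 30 warm-up runs and 200 measured runs could be implemented. The names `benchmark` and `run_once` are hypothetical, and the nearest-rank P95 rule is an assumption; the original numbers may have been produced with a different percentile method.

```python
import statistics
import time


def benchmark(run_once, warmup_runs=30, measured_runs=200):
    """Time run_once() and summarize per-call latency in milliseconds.

    run_once is a hypothetical zero-argument callable that performs one
    inference at batch size 1. For GPU backends it must block until the
    result is ready (for example by synchronizing the device), otherwise
    only the kernel launch is timed.
    """
    # Warm-up: let caches, autotuners, and lazy initialization settle.
    for _ in range(warmup_runs):
        run_once()

    latencies_ms = []
    for _ in range(measured_runs):
        start = time.perf_counter()
        run_once()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    median_ms = statistics.median(latencies_ms)

    # Nearest-rank P95 (assumed method): smallest value with at least
    # 95% of the samples at or below it. Integer math avoids float drift.
    rank = (95 * len(latencies_ms) + 99) // 100
    p95_ms = latencies_ms[max(rank, 1) - 1]

    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "median_ms": median_ms,
        "p95_ms": p95_ms,
        # Throughput derived from the median latency at batch size 1.
        "fps_median": 1000.0 / median_ms if median_ms > 0 else float("inf"),
    }
```

Reporting the median alongside the mean is useful because a few slow outliers can pull the mean far above the typical run.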

Results

| Artifact | Mean Latency (ms) | Median Latency (ms) | P95 Latency (ms) | FPS (Median) |
|---|---|---|---|---|
| ONNX INT8 | 733.704 | 634.253 | 1196.094 | 1.58 |
| TorchScript FP16 | 15.526 | 15.174 | 17.666 | 65.90 |
| TensorRT INT8 | 12.956 | 12.774 | 14.836 | 78.28 |

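TensorRT INT8 achieved the lowest latency and highest throughput on the NVIDIA Tesla T4 GPU, with a median of 12.774 ms (78.28 FPS). TorchScript FP16 was close behind at 15.174 ms median, about 19% slower. The ONNX INT8 artifact was roughly 50 times slower in this environment, with a median of 634.253 ms, and its mean (733.704 ms) and P95 (1196.094 ms) sit well above the median, which indicates a long tail of slow runs.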
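As a quick consistency check, the FPS column in the results can be reproduced from the median latencies as 1000 / median (ms) at batch size 1, and the same numbers give the relative slowdown of each artifact. The sketch below uses only the median values reported above; the variable and function names are illustrative.

```python
# Median latencies in milliseconds, copied from the results table.
median_ms = {
    "ONNX INT8": 634.253,
    "TorchScript FP16": 15.174,
    "TensorRT INT8": 12.774,
}


def fps_from_latency(latency_ms):
    """Frames per second for one image per call (batch size 1)."""
    return 1000.0 / latency_ms


# FPS per artifact, rounded to two decimals like the table.
fps = {name: round(fps_from_latency(ms), 2) for name, ms in median_ms.items()}

# Slowdown of each artifact relative to the fastest median latency.
fastest = min(median_ms, key=median_ms.get)
slowdown = {name: ms / median_ms[fastest] for name, ms in median_ms.items()}

for name in median_ms:
    print(f"{name}: {fps[name]:.2f} FPS, {slowdown[name]:.2f}x the {fastest} median")
```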