---
license: mit
base_model:
- timm/eva02_large_patch14_448.mim_m38m_ft_in22k_in1k
pipeline_tag: image-classification
---

![PyTorch to ONNX-TensorRT](https://dicksonneoh.com/images/portfolio/supercharge_your_pytorch_image_models/post_image.png)

This repository contains code to optimize PyTorch image models with ONNX Runtime and TensorRT, achieving up to 8x faster inference. Read the full blog post [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/).

## Installation
Create and activate a conda environment:

```bash
conda create -n supercharge_timm_tensorrt python=3.11
conda activate supercharge_timm_tensorrt
```
Install required packages:

```bash
pip install timm
pip install onnx
pip install onnxruntime-gpu==1.19.2
pip install cupy-cuda12x
pip install tensorrt==10.1.0 tensorrt-cu12==10.1.0 tensorrt-cu12-bindings==10.1.0 tensorrt-cu12-libs==10.1.0
```

Install CUDA dependencies:
```bash
conda install -c nvidia cuda=12.2.2 cuda-tools=12.2.2 cuda-toolkit=12.2.2 cuda-version=12.2 cuda-command-line-tools=12.2.2 cuda-compiler=12.2.2 cuda-runtime=12.2.2
```

Install cuDNN:
```bash
conda install cudnn==9.2.1.18
```

Set up library paths so ONNX Runtime and TensorRT can find the CUDA shared libraries (with the conda environment activated, `$CONDA_PREFIX` points at its root):
```bash
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib/python3.11/site-packages/tensorrt_libs:$LD_LIBRARY_PATH"
```

## Running the code

The following scripts correspond to the steps in the blog post.

### PyTorch latency benchmark:
```bash
python 01_pytorch_latency_benchmark.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-baseline-latency)
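
The gist of such a benchmark, as a minimal sketch (the batch size and iteration counts here are illustrative, not necessarily what the script uses):

```python
import time

import timm
import torch

# Load the EVA02 model this repository is based on (see the front matter above).
model = timm.create_model(
    "eva02_large_patch14_448.mim_m38m_ft_in22k_in1k", pretrained=True
).eval().cuda()

x = torch.randn(1, 3, 448, 448, device="cuda")

with torch.inference_mode():
    # Warm up so one-time CUDA initialization does not skew the numbers.
    for _ in range(10):
        model(x)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()

print(f"Average latency: {(time.perf_counter() - start) / 100 * 1000:.2f} ms")
```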

### Convert model to ONNX:
```bash
python 02_convert_to_onnx.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-convert-to-onnx)
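
The conversion boils down to `torch.onnx.export`; a minimal sketch (the output filename and opset version are assumptions):

```python
import timm
import torch

model = timm.create_model(
    "eva02_large_patch14_448.mim_m38m_ft_in22k_in1k", pretrained=True
).eval()

torch.onnx.export(
    model,
    torch.randn(1, 3, 448, 448),  # dummy input at the model's native resolution
    "eva02_large_patch14_448.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=17,
)
```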

### ONNX Runtime CPU inference:
```bash
python 03_onnx_cpu_inference.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-onnx-runtime-on-cpu)
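
Loading the exported graph with the CPU execution provider, as a sketch (the file and tensor names follow the export sketch above):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "eva02_large_patch14_448.onnx",
    providers=["CPUExecutionProvider"],
)

x = np.random.rand(1, 3, 448, 448).astype(np.float32)
logits = session.run(None, {"input": x})[0]
print(logits.shape)  # (1, 1000) for the ImageNet-1k head
```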

### ONNX Runtime CUDA inference:
```bash
python 04_onnx_cuda_inference.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-onnx-runtime-on-cuda)
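
Moving to the GPU is essentially a change to the provider list; ONNX Runtime falls back to the CPU provider for any op the CUDA provider cannot handle:

```python
import onnxruntime as ort

session = ort.InferenceSession(
    "eva02_large_patch14_448.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # confirm CUDA was actually picked up
```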

### ONNX Runtime TensorRT inference:
```bash
python 05_onnx_trt_inference.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-onnx-runtime-on-tensorrt)
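
The TensorRT execution provider accepts tuning options; a sketch with FP16 and engine caching enabled (the option values are illustrative, and the first session creation takes a while because TensorRT builds the engine):

```python
import onnxruntime as ort

providers = [
    (
        "TensorrtExecutionProvider",
        {
            "trt_fp16_enable": True,          # allow FP16 kernels
            "trt_engine_cache_enable": True,  # reuse the built engine across runs
            "trt_engine_cache_path": "./trt_cache",
        },
    ),
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

session = ort.InferenceSession("eva02_large_patch14_448.onnx", providers=providers)
```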

### Export preprocessing to ONNX:
```bash
python 06_export_preprocessing_onnx.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-bake-pre-processing-into-onnx)
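
One way to make preprocessing exportable is to express it as an `nn.Module` built from tensor ops; a sketch under that assumption (the blog post describes the exact approach, and the tensor names here are chosen to line up with the merge step below):

```python
import timm
import torch
import torch.nn.functional as F


class Preprocess(torch.nn.Module):
    """Resize, rescale to [0, 1], and normalize a raw uint8 image."""

    def __init__(self, size, mean, std):
        super().__init__()
        self.size = size
        self.register_buffer("mean", torch.tensor(mean).view(1, 3, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, 3, 1, 1))

    def forward(self, image):  # (1, H, W, 3) uint8
        x = image.permute(0, 3, 1, 2).float() / 255.0
        x = F.interpolate(
            x, size=(self.size, self.size), mode="bilinear", align_corners=False
        )
        return (x - self.mean) / self.std


# Pull the normalization constants from timm's data config for this model.
cfg = timm.data.resolve_data_config(
    {}, model=timm.create_model("eva02_large_patch14_448.mim_m38m_ft_in22k_in1k")
)
pre = Preprocess(448, cfg["mean"], cfg["std"]).eval()

torch.onnx.export(
    pre,
    torch.zeros(1, 448, 448, 3, dtype=torch.uint8),
    "preprocessing.onnx",
    input_names=["image"],
    output_names=["preprocessed"],
    dynamic_axes={"image": {1: "height", 2: "width"}},
    opset_version=17,
)
```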

### Merge preprocessing and model ONNX:
```bash
python 07_onnx_compose_merge.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-bake-pre-processing-into-onnx)
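
`onnx.compose.merge_models` stitches two graphs together by wiring outputs to inputs; a sketch matching the names used above (both graphs must be exported at the same opset):

```python
import onnx
from onnx import compose

pre = onnx.load("preprocessing.onnx")
model = onnx.load("eva02_large_patch14_448.onnx")

# Feed the preprocessing graph's output into the model graph's input.
merged = compose.merge_models(pre, model, io_map=[("preprocessed", "input")])
onnx.save(merged, "merged_model.onnx")
```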

### Run inference on merged model:
```bash
python 08_inference_merged_model.py
```
Read more [here](https://dicksonneoh.com/portfolio/supercharge_your_pytorch_image_models/#-bake-pre-processing-into-onnx)
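
The merged graph takes a raw uint8 image, so no NumPy preprocessing is needed on the caller's side; a sketch (`sample.jpg` is a hypothetical test image):

```python
import cv2
import onnxruntime as ort

session = ort.InferenceSession(
    "merged_model.onnx",
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
)

# OpenCV loads BGR; the preprocessing sketch above assumes RGB.
image = cv2.cvtColor(cv2.imread("sample.jpg"), cv2.COLOR_BGR2RGB)
logits = session.run(None, {"image": image[None]})[0]
print(int(logits.argmax()))
```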

### Run inference on video:
```bash
python 09_video_inference.py sample.mp4 output.mp4 --live
```

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6195f404c07573b03c61702c/lOmu7KaqrihRDVcQVJDi0.mp4"></video>

To run with a webcam as the input source:
```bash
python 09_video_inference.py --webcam --live
```