Commit 0985519 (verified) · Vranlee committed · Parent: 74f4a08

Upload 552 files

This view is limited to 50 files because the commit contains too many changes.

Files changed (50)
  1. .gitattributes +1 -0
  2. .gitignore +1 -0
  3. LICENSE +21 -0
  4. README.md +200 -3
  5. assets/Fig.PNG +3 -0
  6. conda_env.yaml +122 -0
  7. deploy/ONNXRuntime/README.md +19 -0
  8. deploy/ONNXRuntime/onnx_inference.py +161 -0
  9. deploy/TensorRT/cpp/CMakeLists.txt +39 -0
  10. deploy/TensorRT/cpp/README.md +58 -0
  11. deploy/TensorRT/cpp/include/BYTETracker.h +49 -0
  12. deploy/TensorRT/cpp/include/STrack.h +50 -0
  13. deploy/TensorRT/cpp/include/dataType.h +36 -0
  14. deploy/TensorRT/cpp/include/kalmanFilter.h +31 -0
  15. deploy/TensorRT/cpp/include/lapjv.h +63 -0
  16. deploy/TensorRT/cpp/include/logging.h +503 -0
  17. deploy/TensorRT/cpp/src/BYTETracker.cpp +241 -0
  18. deploy/TensorRT/cpp/src/STrack.cpp +192 -0
  19. deploy/TensorRT/cpp/src/bytetrack.cpp +505 -0
  20. deploy/TensorRT/cpp/src/kalmanFilter.cpp +152 -0
  21. deploy/TensorRT/cpp/src/lapjv.cpp +343 -0
  22. deploy/TensorRT/cpp/src/utils.cpp +429 -0
  23. deploy/TensorRT/python/README.md +22 -0
  24. deploy/ncnn/cpp/CMakeLists.txt +84 -0
  25. deploy/ncnn/cpp/README.md +103 -0
  26. deploy/ncnn/cpp/include/BYTETracker.h +49 -0
  27. deploy/ncnn/cpp/include/STrack.h +50 -0
  28. deploy/ncnn/cpp/include/dataType.h +36 -0
  29. deploy/ncnn/cpp/include/kalmanFilter.h +31 -0
  30. deploy/ncnn/cpp/include/lapjv.h +63 -0
  31. deploy/ncnn/cpp/src/BYTETracker.cpp +241 -0
  32. deploy/ncnn/cpp/src/STrack.cpp +192 -0
  33. deploy/ncnn/cpp/src/bytetrack.cpp +396 -0
  34. deploy/ncnn/cpp/src/kalmanFilter.cpp +152 -0
  35. deploy/ncnn/cpp/src/lapjv.cpp +343 -0
  36. deploy/ncnn/cpp/src/utils.cpp +429 -0
  37. deploy/scripts/export_onnx.py +102 -0
  38. deploy/scripts/trt.py +74 -0
  39. docs/DEPLOY.md +38 -0
  40. exps/SU-T-ReID.py +162 -0
  41. exps/SU-T.py +152 -0
  42. exps/default/nano.py +39 -0
  43. exps/default/yolov3.py +89 -0
  44. exps/default/yolox_l.py +15 -0
  45. exps/default/yolox_m.py +15 -0
  46. exps/default/yolox_s.py +15 -0
  47. exps/default/yolox_tiny.py +19 -0
  48. exps/default/yolox_x.py +15 -0
  49. fast_reid/CHANGELOG.md +39 -0
  50. fast_reid/GETTING_STARTED.md +62 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/Fig.PNG filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1 @@
+ .DS_Store
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 LI Weiran
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,3 +1,200 @@
- ---
- license: apache-2.0
- ---
+ # When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking
+
+ The official implementation of the paper:
+ > [**When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking**](https://vranlee.github.io/SU-T/)
+ > Weiran Li, Yeqiang Liu, Qiannan Guo, Yijie Wei, Hwa Liang Leo, Zhenbo Li*
+ > [**\[Project\]**](https://vranlee.github.io/SU-T/) [**\[Paper\]**](https://arxiv.org/abs/2507.06400) [**\[Code\]**](https://github.com/vranlee/SU-T)
+
+ <div align="center">
+ <img src="assets/Fig.PNG" width="900"/>
+ </div>
+
+ > Contact: vranlee@cau.edu.cn or weiranli@u.nus.edu. Any questions or discussion are welcome!
+ >
+ > If you like this work, a star 🌟 would be much appreciated!
+
+ -----
+
+ ## 📌Updates
+ + [2025.07] Paper released to arXiv.
+ + [2025.07] Fixed bugs.
+ + [2025.04] We have released the MFT25 dataset and the code of SU-T!
+ -----
+
+ ## 💡Abstract
+ Multiple object tracking (MOT) technology has made significant progress in terrestrial applications, but underwater tracking scenarios remain underexplored despite their importance to marine ecology and aquaculture. We present the Multiple Fish Tracking Dataset 2025 (MFT25), the first comprehensive dataset specifically designed for underwater multiple fish tracking, featuring 15 diverse video sequences with 408,578 meticulously annotated bounding boxes across 48,066 frames. Our dataset captures various underwater environments, fish species, and challenging conditions including occlusions, similar appearances, and erratic motion patterns. Additionally, we introduce the Scale-aware and Unscented Tracker (SU-T), a specialized tracking framework featuring an Unscented Kalman Filter (UKF) optimized for non-linear fish swimming patterns and a novel Fish-Intersection-over-Union (FishIoU) matching that accounts for the unique morphological characteristics of aquatic species. Extensive experiments demonstrate that our SU-T baseline achieves state-of-the-art performance on MFT25, with 34.1 HOTA and 44.6 IDF1, while revealing fundamental differences between fish tracking and terrestrial object tracking scenarios. MFT25 establishes a robust foundation for advancing research in underwater tracking systems with important applications in marine biology, aquaculture monitoring, and ecological conservation.
+
+ ## 🏆Contributions
+
+ + We introduce MFT25, the first comprehensive multiple fish tracking dataset featuring 15 diverse video sequences with 408,578 meticulously annotated bounding boxes across 48,066 frames, capturing various underwater environments, fish species, and challenging conditions including occlusions, rapid direction changes, and visually similar appearances.
+
+ + We propose SU-T, a specialized tracking framework featuring an Unscented Kalman Filter (UKF) optimized for non-linear fish swimming patterns and a novel Fish-Intersection-over-Union (FishIoU) matching that accounts for the unique morphological characteristics and erratic movement behaviors of aquatic species.
+
+ + We conduct extensive comparative experiments demonstrating that our tracker achieves state-of-the-art performance on MFT25, with 34.1 HOTA and 44.6 IDF1. Through quantitative analysis, we highlight the fundamental differences between fish tracking and land-based object tracking scenarios.
+
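The unscented prediction/update cycle behind the UKF mentioned above can be sketched in plain numpy. This is a minimal illustration, not the SU-T implementation: the 8-dimensional bounding-box state layout, the constant-velocity `fx` (a stand-in for a non-linear fish motion model, which would slot in unchanged), and all noise settings are assumptions.

```python
import numpy as np

def sigma_points(x, P, lam):
    """2n+1 symmetric sigma points for state x with covariance P."""
    n = x.size
    S = np.linalg.cholesky((n + lam) * P)
    return np.vstack([x, x + S.T, x - S.T])

def ukf_step(x, P, z, fx, hx, Q, R):
    """One unscented predict + update cycle (minimal weights, no alpha/beta tuning)."""
    n = x.size
    lam = 3.0 - n
    W = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    W[0] = lam / (n + lam)

    # predict: push sigma points through the (possibly non-linear) motion model
    X = np.array([fx(s) for s in sigma_points(x, P, lam)])
    x_pred = W @ X
    P_pred = Q + (W * (X - x_pred).T) @ (X - x_pred)

    # update: regenerate sigma points around the prediction, push through hx
    X2 = sigma_points(x_pred, P_pred, lam)
    Z = np.array([hx(s) for s in X2])
    z_pred = W @ Z
    P_zz = R + (W * (Z - z_pred).T) @ (Z - z_pred)
    P_xz = (W * (X2 - x_pred).T) @ (Z - z_pred)
    K = P_xz @ np.linalg.inv(P_zz)
    return x_pred + K @ (z - z_pred), P_pred - K @ P_zz @ K.T

# Assumed state: [cx, cy, aspect, h] plus their velocities; measurement: [cx, cy, aspect, h]
F = np.eye(8)
F[:4, 4:] = np.eye(4)          # constant-velocity stand-in for a fish motion model
fx = lambda s: F @ s
hx = lambda s: s[:4]           # observe position/shape components only

x = np.array([100., 50., 0.4, 30., 0., 0., 0., 0.])
P = np.eye(8)
z = np.array([102., 51., 0.4, 31.])  # detection for this frame
x, P = ukf_step(x, P, z, fx, hx, Q=np.eye(8), R=np.eye(4))
```

Because sigma points are propagated through `fx` itself, replacing the linear `fx` with a turning or burst-swimming model requires no other changes, which is the appeal of the UKF for erratic fish motion.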
+ ## 🛠️Installation Guide
+
+ ### Prerequisites
+ - CUDA >= 10.2
+ - Python >= 3.7
+ - PyTorch >= 1.7.0
+ - Ubuntu 18.04 or later (Windows is also supported but may require additional setup)
+
+ ### Step-by-Step Installation
+
+ 1. **Clone the Repository**
+ ```bash
+ git clone https://github.com/vranlee/SU-T.git
+ cd SU-T
+ ```
+
+ 2. **Create and Activate Conda Environment**
+ ```bash
+ # Create the environment from the yaml file
+ conda env create -f conda_env.yaml
+
+ # Activate the environment
+ conda activate su_t
+ ```
+
+ 3. **Download Required Resources**
+ - Download pretrained models from [BaiduYun (Password: 9uqc)](https://pan.baidu.com/s/1AkIuViwXCPz5l5Oo-UgtaQ?pwd=9uqc)
+ - Download the MFT25 dataset from [BaiduYun (Password: wrbg)](https://pan.baidu.com/s/11TkRqNIq4poNAU5dyoL5hA?pwd=wrbg)
+
+ 4. **Organize the Directory Structure**
+ ```
+ SU-T/
+ ├── pretrained/
+ │   └── Checkpoint.pth.tar
+ ├── MFT25/
+ │   ├── train/
+ │   └── test/
+ └── ...
+ ```
+
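A quick sanity check of the layout above before launching training can be sketched as follows; the paths are the ones shown in the directory tree, and the helper name is ours, not part of the repo:

```python
from pathlib import Path

# Entries taken from the directory tree above
expected = ["pretrained/Checkpoint.pth.tar", "MFT25/train", "MFT25/test"]

def missing_paths(root="."):
    """Return the expected entries that do not exist under root."""
    return [p for p in expected if not (Path(root) / p).exists()]
```

In a fresh checkout without the downloads, `missing_paths()` returns all three entries; an empty list means the layout matches.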
+ ## 🍭Usage Guide
+
+ ### Training
+
+ 1. **Basic Training Command**
+ ```bash
+ # -f: base model configuration   -d: number of GPUs   -b: batch size
+ # --fp16: enable mixed precision training   -o: occupy GPU memory ahead of training
+ # -c: path to pretrained weights
+ python tools/train.py \
+     -f exps/SU-T.py \
+     -d 8 \
+     -b 48 \
+     --fp16 \
+     -o \
+     -c pretrained/Checkpoint.pth.tar
+ ```
+
+ 2. **Training with ReID Module**
+ ```bash
+ python tools/train.py \
+     -f exps/SU-T-ReID.py \
+     -d 8 \
+     -b 48 \
+     --fp16 \
+     -o \
+     -c pretrained/Checkpoint.pth.tar
+ ```
+
+ ### Testing
+
+ 1. **Basic Testing Command**
+ ```bash
+ # -f: model configuration   -b: batch size   -d: number of GPUs
+ # --fp16: enable mixed precision   --fuse: enable model fusion   --expn: experiment name
+ python tools/su_tracker.py \
+     -f exps/SU-T.py \
+     -b 1 \
+     -d 1 \
+     --fp16 \
+     --fuse \
+     --expn your_exp_name
+ ```
+
+ 2. **Testing with ReID Module**
+ ```bash
+ python tools/su_tracker.py \
+     -f exps/SU-T-ReID.py \
+     -b 1 \
+     -d 1 \
+     --fp16 \
+     --fuse \
+     --expn your_exp_name
+ ```
+
+ ### Additional Configuration Options
+
+ - **Model Configuration**: Edit `exps/SU-T.py` or `exps/SU-T-ReID.py` to modify:
+   - Learning rate
+   - Training epochs
+   - Data augmentation parameters
+   - Model architecture settings
+
+ - **Training Parameters**:
+   ```bash
+   # Additional training options
+   --cache          # Cache images in RAM
+   --resume         # Resume from a specific checkpoint
+   --trt            # Export TensorRT model
+   ```
+
+ - **Testing Parameters**:
+   ```bash
+   # Additional testing options
+   --tsize          # Test image size
+   --conf           # Confidence threshold
+   --nms            # NMS threshold
+   --track_thresh   # Tracking threshold
+   ```
+
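The extra flags listed above can be wired into an `argparse` parser along these lines; the defaults here are illustrative assumptions, not the values used by `tools/su_tracker.py`:

```python
import argparse

def make_extra_parser():
    # Flags mirror the "Additional Configuration Options" lists above;
    # default values are assumptions for illustration only.
    p = argparse.ArgumentParser("additional training/testing options (sketch)")
    p.add_argument("--cache", action="store_true", help="cache images in RAM")
    p.add_argument("--resume", action="store_true", help="resume from a checkpoint")
    p.add_argument("--trt", action="store_true", help="export a TensorRT model")
    p.add_argument("--tsize", type=int, default=None, help="test image size")
    p.add_argument("--conf", type=float, default=None, help="confidence threshold")
    p.add_argument("--nms", type=float, default=None, help="NMS threshold")
    p.add_argument("--track_thresh", type=float, default=0.6, help="tracking threshold")
    return p

args = make_extra_parser().parse_args(["--conf", "0.25", "--cache"])
```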
+ ## 📜Tracking Performance
+
+ ### Comparisons on the MFT25 dataset
+
+ | Method | Class | Year | HOTA↑ | IDF1↑ | MOTA↑ | AssA↑ | DetA↑ | IDs↓ | IDFP↓ | IDFN↓ | Frag↓ |
+ |--------|-------|------|-------|-------|-------|-------|-------|------|-------|-------|-------|
+ | FairMOT | JDE | 2021 | 22.226 | 26.867 | 47.509 | 13.910 | 35.606 | 939 | 58198 | 113393 | 3768 |
+ | CMFTNet | JDE | 2022 | 22.432 | 27.659 | 46.365 | 14.278 | 35.452 | 1301 | 64754 | 111263 | 2769 |
+ | TransTrack | TF | 2021 | 30.426 | 35.215 | 68.983 | 18.525 | _50.458_ | 1116 | 96045 | 93418 | 2588 |
+ | TransCenter | TF | 2023 | 27.896 | 30.278 | 68.693 | **30.255** | 30.301 | 807 | 101223 | 101002 | 1992 |
+ | TrackFormer | TF | 2022 | 30.361 | 35.285 | **74.609** | 17.661 | **52.649** | 718 | 89391 | 94720 | 1729 |
+ | TFMFT | TF | 2024 | 25.440 | 33.950 | 49.725 | 17.112 | 38.059 | 719 | 63125 | 102378 | 3251 |
+ | SORT | SDE | 2016 | 29.063 | 34.119 | 69.038 | 16.952 | 50.195 | 778 | 88928 | 96815 | _1726_ |
+ | ByteTrack | SDE | 2022 | 31.758 | 40.355 | _69.586_ | 20.392 | 49.712 | **489** | 80765 | 87866 | **1555** |
+ | BoT-SORT | SDE | 2022 | 26.848 | 36.847 | 49.108 | 19.446 | 37.241 | _500_ | 57581 | 99181 | 2704 |
+ | OC-SORT | SDE | 2023 | 25.017 | 34.620 | 46.706 | 17.783 | 35.369 | 550 | **52934** | 103495 | 3651 |
+ | Deep-OC-SORT | SDE | 2023 | 24.848 | 34.176 | 46.721 | 17.537 | 35.373 | 550 | _53478_ | 104024 | 3659 |
+ | HybridSORT | SDE | 2024 | 32.258 | 38.421 | 68.905 | 20.936 | 49.992 | 613 | 85924 | 90022 | 1931 |
+ | HybridSORT† | SDE | 2024 | 32.705 | _41.727_ | 69.167 | 21.701 | 49.697 | 562 | 79189 | 85830 | 1963 |
+ | **SU-T (Ours)** | SDE | 2025 | _33.351_ | 41.717 | 68.450 | 22.425 | 49.943 | 607 | 83111 | _84814_ | 2006 |
+ | **SU-T† (Ours)** | SDE | 2025 | **34.067** | **44.643** | 68.958 | _23.594_ | 49.531 | 544 | 76440 | **81304** | 2011 |
+
+ *Note: † indicates the integration of the ReID module; **bold** marks the best result and _italics_ the second-best.*
+
+ ## ⁉️Troubleshooting
+
+ ### Common Issues
+
+ 1. **CUDA Out of Memory**
+    - Reduce the batch size
+    - Use a smaller input resolution
+    - Enable mixed precision training
+
+ 2. **Installation Failures**
+    - Ensure the CUDA toolkit matches the PyTorch version
+    - Try creating the environment with `pip` if conda fails
+    - Check system CUDA compatibility
+
+ 3. **Training Issues**
+    - Verify the dataset path and structure
+    - Check GPU memory usage
+    - Monitor the learning rate and loss curves
+
+ ## 💕Acknowledgement
+ A large part of the code is borrowed from [ByteTrack](https://github.com/ifzhang/ByteTrack), [OC_SORT](https://github.com/noahcao/OC_SORT), and [HybridSORT](https://github.com/ymzis69/HybridSORT). Thanks for their wonderful work!
+
+ ## 📖Citation
+ The citation format will be provided once the manuscript is accepted. Please cite the arXiv version for now.
+
+ ## 📑License
+ This project is released under the [MIT License](LICENSE).
assets/Fig.PNG ADDED

Git LFS Details

  • SHA256: 85d5a071290117b9f1e6e918981726b6b894ff9f1675d121867db94a8172cfc5
  • Pointer size: 132 Bytes
  • Size of remote file: 2.15 MB
conda_env.yaml ADDED
@@ -0,0 +1,122 @@
+ name: su_t
+ channels:
+   - defaults
+ dependencies:
+   - _libgcc_mutex=0.1=main
+   - _openmp_mutex=5.1=1_gnu
+   - bzip2=1.0.8=h5eee18b_6
+   - ca-certificates=2025.2.25=h06a4308_0
+   - ld_impl_linux-64=2.40=h12ee557_0
+   - libffi=3.3=he6710b0_2
+   - libgcc-ng=11.2.0=h1234567_1
+   - libgomp=11.2.0=h1234567_1
+   - libstdcxx-ng=11.2.0=h1234567_1
+   - libuuid=1.41.5=h5eee18b_0
+   - ncurses=6.4=h6a678d5_0
+   - openssl=1.1.1w=h7f8727e_0
+   - pip=25.0=py310h06a4308_0
+   - python=3.10.0=h12debd9_5
+   - readline=8.2=h5eee18b_0
+   - setuptools=75.8.0=py310h06a4308_0
+   - sqlite=3.45.3=h5eee18b_0
+   - tk=8.6.14=h39e8969_0
+   - wheel=0.45.1=py310h06a4308_0
+   - xz=5.6.4=h5eee18b_1
+   - zlib=1.2.13=h5eee18b_1
+   - pip:
+     - absl-py==2.1.0
+     - beautifulsoup4==4.13.3
+     - certifi==2025.1.31
+     - charset-normalizer==3.4.1
+     - coloredlogs==15.0.1
+     - contourpy==1.3.1
+     - cycler==0.12.1
+     - cython==3.0.12
+     - cython-bbox==0.1.5
+     - easydict==1.13
+     - einops==0.8.1
+     - faiss-gpu==1.7.2
+     - filelock==3.13.1
+     - filterpy==1.4.5
+     - flatbuffers==25.2.10
+     - fonttools==4.56.0
+     - fsspec==2024.6.1
+     - gdown==5.2.0
+     - grpcio==1.70.0
+     - h5py==3.13.0
+     - humanfriendly==10.0
+     - idna==3.10
+     - imageio==2.37.0
+     - jinja2==3.1.4
+     - joblib==1.4.2
+     - kiwisolver==1.4.8
+     - lap==0.5.12
+     - lazy-loader==0.4
+     - loguru==0.7.3
+     - markdown==3.7
+     - markdown-it-py==3.0.0
+     - markupsafe==2.1.5
+     - matplotlib==3.10.1
+     - mdurl==0.1.2
+     - mpmath==1.3.0
+     - networkx==3.3
+     - ninja==1.11.1.3
+     - numpy==1.23.4
+     - nvidia-cublas-cu11==11.11.3.6
+     - nvidia-cuda-cupti-cu11==11.8.87
+     - nvidia-cuda-nvrtc-cu11==11.8.89
+     - nvidia-cuda-runtime-cu11==11.8.89
+     - nvidia-cudnn-cu11==9.1.0.70
+     - nvidia-cufft-cu11==10.9.0.58
+     - nvidia-curand-cu11==10.3.0.86
+     - nvidia-cusolver-cu11==11.4.1.48
+     - nvidia-cusparse-cu11==11.7.5.86
+     - nvidia-ml-py==12.570.86
+     - nvidia-nccl-cu11==2.21.5
+     - nvidia-nvtx-cu11==11.8.86
+     - nvitop==1.4.2
+     - onnx==1.17.0
+     - onnx-simplifier==0.4.36
+     - onnxruntime==1.12.0
+     - opencv-python==4.11.0.86
+     - packaging==24.2
+     - pandas==2.2.3
+     - pillow==11.0.0
+     - prettytable==3.15.1
+     - protobuf==6.30.0
+     - psutil==7.0.0
+     - pygments==2.19.1
+     - pyparsing==3.2.1
+     - pysocks==1.7.1
+     - python-dateutil==2.9.0.post0
+     - pytz==2025.1
+     - pyyaml==6.0.2
+     - requests==2.32.3
+     - rich==13.9.4
+     - scikit-image==0.24.0
+     - scikit-learn==1.6.1
+     - scipy==1.13.1
+     - six==1.17.0
+     - soupsieve==2.6
+     - sympy==1.13.1
+     - tabulate==0.9.0
+     - tensorboard==2.19.0
+     - tensorboard-data-server==0.7.2
+     - termcolor==2.5.0
+     - thop==0.1.1-2209072238
+     - threadpoolctl==3.5.0
+     - tifffile==2025.2.18
+     - torch==2.6.0+cu118
+     - torchaudio==2.6.0+cu118
+     - torchvision==0.21.0+cu118
+     - tqdm==4.67.1
+     - triton==3.2.0
+     - typing-extensions==4.12.2
+     - tzdata==2025.1
+     - urllib3==2.3.0
+     - vit-pytorch==1.9.2
+     - wcwidth==0.2.13
+     - werkzeug==3.1.3
+     - xmltodict==0.14.2
+     - yacs==0.1.8
+ prefix: /home/weiranli/anaconda3/envs/su_t
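After activating the environment, a small sanity check that the pins above actually resolved can be sketched with the standard library; the helper and the sample of pinned packages are chosen for illustration:

```python
import importlib.metadata as md

def check_pins(pins):
    """Map each package to (pinned_version, installed_version_or_None)."""
    report = {}
    for pkg, pinned in pins.items():
        try:
            installed = md.version(pkg)
        except md.PackageNotFoundError:
            installed = None
        report[pkg] = (pinned, installed)
    return report

# A few of the pins from conda_env.yaml above
report = check_pins({"numpy": "1.23.4", "opencv-python": "4.11.0.86", "loguru": "0.7.3"})
mismatched = {k: v for k, v in report.items() if v[1] != v[0]}
```

A non-empty `mismatched` (or a `None` installed version) usually means the env was created partially or a pip step failed.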
deploy/ONNXRuntime/README.md ADDED
@@ -0,0 +1,19 @@
+ ## ByteTrack-ONNXRuntime in Python
+
+ This doc introduces how to convert your PyTorch model into ONNX, and how to run an ONNXRuntime demo to verify your conversion.
+
+ ### Convert Your Model to ONNX
+
+ ```shell
+ cd <ByteTrack_HOME>
+ python3 tools/export_onnx.py --output-name bytetrack_s.onnx -f exps/example/mot/yolox_s_mix_det.py -c pretrained/bytetrack_s_mot17.pth.tar
+ ```
+
+ ### ONNXRuntime Demo
+
+ You can run the ONNX demo at **16 FPS** (on a 96-core Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz):
+
+ ```shell
+ cd <ByteTrack_HOME>/deploy/ONNXRuntime
+ python3 onnx_inference.py
+ ```
deploy/ONNXRuntime/onnx_inference.py ADDED
@@ -0,0 +1,161 @@
+ import argparse
+ import os
+
+ import cv2
+ import numpy as np
+ from loguru import logger
+
+ import onnxruntime
+
+ from yolox.data.data_augment import preproc as preprocess
+ from yolox.utils import mkdir, multiclass_nms, demo_postprocess, vis
+ from yolox.utils.visualize import plot_tracking
+ from trackers.ocsort_tracker.ocsort import OCSort
+ from trackers.tracking_utils.timer import Timer
+
+
+ def make_parser():
+     parser = argparse.ArgumentParser("onnxruntime inference sample")
+     parser.add_argument(
+         "-m",
+         "--model",
+         type=str,
+         default="../../ocsort.onnx",
+         help="Input your onnx model.",
+     )
+     parser.add_argument(
+         "-i",
+         "--video_path",
+         type=str,
+         default='../../videos/dance_demo.mp4',
+         help="Path to your input video.",
+     )
+     parser.add_argument(
+         "-o",
+         "--output_dir",
+         type=str,
+         default='demo_output',
+         help="Path to your output directory.",
+     )
+     parser.add_argument(
+         "-s",
+         "--score_thr",
+         type=float,
+         default=0.1,
+         help="Score threshold to filter the result.",
+     )
+     parser.add_argument(
+         "-n",
+         "--nms_thr",
+         type=float,
+         default=0.7,
+         help="NMS threshold.",
+     )
+     parser.add_argument(
+         "--input_shape",
+         type=str,
+         default="800,1440",
+         help="Specify an input shape for inference.",
+     )
+     parser.add_argument(
+         "--with_p6",
+         action="store_true",
+         help="Whether your model uses p6 in FPN/PAN.",
+     )
+     # tracking args
+     parser.add_argument("--track_thresh", type=float, default=0.6, help="tracking confidence threshold")
+     parser.add_argument("--iou_thresh", type=float, default=0.3, help="IoU threshold for association")
+     parser.add_argument("--track_buffer", type=int, default=30, help="frames to keep lost tracks")
+     parser.add_argument("--match_thresh", type=float, default=0.8, help="matching threshold for tracking")
+     parser.add_argument('--min-box-area', type=float, default=10, help='filter out tiny boxes')
+     parser.add_argument("--mot20", dest="mot20", default=False, action="store_true", help="test mot20.")
+     return parser
+
+
+ class Predictor(object):
+     def __init__(self, args):
+         self.rgb_means = (0.485, 0.456, 0.406)
+         self.std = (0.229, 0.224, 0.225)
+         self.args = args
+         self.session = onnxruntime.InferenceSession(args.model)
+         self.input_shape = tuple(map(int, args.input_shape.split(',')))
+
+     def inference(self, ori_img, timer):
+         img_info = {"id": 0}
+         height, width = ori_img.shape[:2]
+         img_info["height"] = height
+         img_info["width"] = width
+         img_info["raw_img"] = ori_img
+
+         img, ratio = preprocess(ori_img, self.input_shape, self.rgb_means, self.std)
+         img_info["ratio"] = ratio
+         ort_inputs = {self.session.get_inputs()[0].name: img[None, :, :, :]}
+         timer.tic()
+         output = self.session.run(None, ort_inputs)
+         predictions = demo_postprocess(output[0], self.input_shape, p6=self.args.with_p6)[0]
+
+         boxes = predictions[:, :4]
+         scores = predictions[:, 4:5] * predictions[:, 5:]
+
+         # convert (cx, cy, w, h) boxes to (x1, y1, x2, y2) and undo the resize ratio
+         boxes_xyxy = np.ones_like(boxes)
+         boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
+         boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
+         boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
+         boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
+         boxes_xyxy /= ratio
+         dets = multiclass_nms(boxes_xyxy, scores, nms_thr=self.args.nms_thr, score_thr=self.args.score_thr)
+         return dets[:, :-1], img_info
+
+
+ def imageflow_demo(predictor, args):
+     cap = cv2.VideoCapture(args.video_path)
+     width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)   # float
+     height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)  # float
+     fps = cap.get(cv2.CAP_PROP_FPS)
+     save_folder = args.output_dir
+     os.makedirs(save_folder, exist_ok=True)
+     save_path = os.path.join(save_folder, args.video_path.split("/")[-1])
+     logger.info(f"video save_path is {save_path}")
+     vid_writer = cv2.VideoWriter(
+         save_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (int(width), int(height))
+     )
+     tracker = OCSort(det_thresh=args.track_thresh, iou_threshold=args.iou_thresh)
+     timer = Timer()
+     frame_id = 0
+     results = []
+     while True:
+         if frame_id % 20 == 0:
+             logger.info('Processing frame {} ({:.2f} fps)'.format(frame_id, 1. / max(1e-5, timer.average_time)))
+         ret_val, frame = cap.read()
+         if ret_val:
+             outputs, img_info = predictor.inference(frame, timer)
+             online_targets = tracker.update(outputs, [img_info['height'], img_info['width']], [img_info['height'], img_info['width']])
+             online_tlwhs = []
+             online_ids = []
+             for t in online_targets:
+                 tlwh = [t[0], t[1], t[2] - t[0], t[3] - t[1]]
+                 tid = t[4]
+                 vertical = tlwh[2] / tlwh[3] > 1.6
+                 if tlwh[2] * tlwh[3] > args.min_box_area and not vertical:
+                     online_tlwhs.append(tlwh)
+                     online_ids.append(tid)
+             timer.toc()
+             results.append((frame_id + 1, online_tlwhs, online_ids))
+             online_im = plot_tracking(img_info['raw_img'], online_tlwhs, online_ids, frame_id=frame_id + 1,
+                                       fps=1. / timer.average_time)
+             vid_writer.write(online_im)
+             ch = cv2.waitKey(1)
+             if ch == 27 or ch == ord("q") or ch == ord("Q"):
+                 break
+         else:
+             break
+         frame_id += 1
+
+
+ if __name__ == '__main__':
+     args = make_parser().parse_args()
+
+     predictor = Predictor(args)
+     imageflow_demo(predictor, args)
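The box decoding inside `Predictor.inference` (center format to corner format, then undoing the preprocessing resize ratio) can be isolated and checked on its own; the function name is ours:

```python
import numpy as np

def cxcywh_to_xyxy(boxes, ratio=1.0):
    """Convert (cx, cy, w, h) boxes to (x1, y1, x2, y2) and undo the resize
    ratio, mirroring the decoding step in onnx_inference.py."""
    boxes = np.asarray(boxes, dtype=float)
    out = np.empty_like(boxes)
    out[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.0
    out[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.0
    out[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.0
    out[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.0
    return out / ratio
```

For example, `cxcywh_to_xyxy([[10, 10, 4, 6]])` yields `[[8, 7, 12, 13]]`: the half-width and half-height are subtracted and added around the center.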
deploy/TensorRT/cpp/CMakeLists.txt ADDED
@@ -0,0 +1,39 @@
+ cmake_minimum_required(VERSION 2.6)
+
+ project(bytetrack)
+
+ add_definitions(-std=c++11)
+
+ option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
+ set(CMAKE_CXX_STANDARD 11)
+ set(CMAKE_BUILD_TYPE Debug)
+
+ find_package(CUDA REQUIRED)
+
+ include_directories(${PROJECT_SOURCE_DIR}/include)
+ include_directories(/usr/local/include/eigen3)
+ link_directories(${PROJECT_SOURCE_DIR}/include)
+ # include and link dirs of cuda and tensorrt; adapt them if yours differ
+ # cuda
+ include_directories(/usr/local/cuda/include)
+ link_directories(/usr/local/cuda/lib64)
+ # cudnn
+ include_directories(/data/cuda/cuda-10.2/cudnn/v8.0.4/include)
+ link_directories(/data/cuda/cuda-10.2/cudnn/v8.0.4/lib64)
+ # tensorrt
+ include_directories(/opt/tiger/demo/TensorRT-7.2.3.4/include)
+ link_directories(/opt/tiger/demo/TensorRT-7.2.3.4/lib)
+
+ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wall -Ofast -Wfatal-errors -D_MWAITXINTRIN_H_INCLUDED")
+
+ find_package(OpenCV)
+ include_directories(${OpenCV_INCLUDE_DIRS})
+
+ file(GLOB My_Source_Files ${PROJECT_SOURCE_DIR}/src/*.cpp)
+ add_executable(bytetrack ${My_Source_Files})
+ target_link_libraries(bytetrack nvinfer)
+ target_link_libraries(bytetrack cudart)
+ target_link_libraries(bytetrack ${OpenCV_LIBS})
+
+ add_definitions(-O2 -pthread)
deploy/TensorRT/cpp/README.md ADDED
@@ -0,0 +1,58 @@
+ # ByteTrack-TensorRT in C++
+
+ ## Installation
+
+ Install OpenCV with ```sudo apt-get install libopencv-dev``` (a newer build such as v3.3+ is not required).
+
+ Install eigen-3.3.9 [[google]](https://drive.google.com/file/d/1rqO74CYCNrmRAg8Rra0JP3yZtJ-rfket/view?usp=sharing), [[baidu(code:ueq4)]](https://pan.baidu.com/s/15kEfCxpy-T7tz60msxxExg).
+
+ ```shell
+ unzip eigen-3.3.9.zip
+ cd eigen-3.3.9
+ mkdir build
+ cd build
+ cmake ..
+ sudo make install
+ ```
+
+ ## Prepare the serialized engine file
+
+ Follow the TensorRT Python demo to convert and save the serialized engine file.
+
+ Check for the 'model_trt.engine' file, which is automatically saved in the YOLOX_outputs dir.
+
+ ## Build the demo
+
+ You should set the TensorRT path and CUDA path in CMakeLists.txt.
+
+ For the bytetrack_s model, we set the input frame size to 1088 x 608. For the bytetrack_m, bytetrack_l, and bytetrack_x models, we set it to 1440 x 800. You can modify INPUT_W and INPUT_H in src/bytetrack.cpp:
+
+ ```c++
+ static const int INPUT_W = 1088;
+ static const int INPUT_H = 608;
+ ```
+
+ Build the demo:
+
+ ```shell
+ cd <ByteTrack_HOME>/deploy/TensorRT/cpp
+ mkdir build
+ cd build
+ cmake ..
+ make
+ ```
+
+ Then you can run the demo at **200 FPS**:
+
+ ```shell
+ ./bytetrack ../../../../YOLOX_outputs/yolox_s_mix_det/model_trt.engine -i ../../../../videos/palace.mp4
+ ```
+
+ (If you find the output video loses some frames, you can convert the input video by running:
+
+ ```shell
+ cd <ByteTrack_HOME>
+ python3 tools/convert_video.py
+ ```
+
+ to generate an appropriate input video for the TensorRT C++ demo.)
deploy/TensorRT/cpp/include/BYTETracker.h ADDED
@@ -0,0 +1,49 @@
+ #pragma once
+
+ #include "STrack.h"
+
+ struct Object
+ {
+     cv::Rect_<float> rect;
+     int label;
+     float prob;
+ };
+
+ class BYTETracker
+ {
+ public:
+     BYTETracker(int frame_rate = 30, int track_buffer = 30);
+     ~BYTETracker();
+
+     vector<STrack> update(const vector<Object>& objects);
+     Scalar get_color(int idx);
+
+ private:
+     vector<STrack*> joint_stracks(vector<STrack*> &tlista, vector<STrack> &tlistb);
+     vector<STrack> joint_stracks(vector<STrack> &tlista, vector<STrack> &tlistb);
+
+     vector<STrack> sub_stracks(vector<STrack> &tlista, vector<STrack> &tlistb);
+     void remove_duplicate_stracks(vector<STrack> &resa, vector<STrack> &resb, vector<STrack> &stracksa, vector<STrack> &stracksb);
+
+     void linear_assignment(vector<vector<float> > &cost_matrix, int cost_matrix_size, int cost_matrix_size_size, float thresh,
+         vector<vector<int> > &matches, vector<int> &unmatched_a, vector<int> &unmatched_b);
+     vector<vector<float> > iou_distance(vector<STrack*> &atracks, vector<STrack> &btracks, int &dist_size, int &dist_size_size);
+     vector<vector<float> > iou_distance(vector<STrack> &atracks, vector<STrack> &btracks);
+     vector<vector<float> > ious(vector<vector<float> > &atlbrs, vector<vector<float> > &btlbrs);
+
+     double lapjv(const vector<vector<float> > &cost, vector<int> &rowsol, vector<int> &colsol,
+         bool extend_cost = false, float cost_limit = LONG_MAX, bool return_cost = true);
+
+ private:
+     float track_thresh;
+     float high_thresh;
+     float match_thresh;
+     int frame_id;
+     int max_time_lost;
+
+     vector<STrack> tracked_stracks;
+     vector<STrack> lost_stracks;
+     vector<STrack> removed_stracks;
+     byte_kalman::KalmanFilter kalman_filter;
+ };
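The `linear_assignment`/`lapjv` pair declared above solves a one-to-one matching over an IoU cost matrix. A Python equivalent using `scipy` (pinned in conda_env.yaml) looks like this sketch; the helper names and the `match_thresh` gating are illustrative:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_matrix(a, b):
    """Pairwise IoU between two sets of (x1, y1, x2, y2) boxes."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    tl = np.maximum(a[:, None, :2], b[None, :, :2])        # intersection top-left
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])        # intersection bottom-right
    inter = np.prod(np.clip(br - tl, 0, None), axis=2)
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def associate(tracks, dets, match_thresh=0.8):
    """Match tracks to detections on IoU cost, dropping pairs above match_thresh."""
    cost = 1.0 - iou_matrix(tracks, dets)
    rows, cols = linear_sum_assignment(cost)               # Hungarian-style solver
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= match_thresh]
```

`linear_sum_assignment` plays the role of `lapjv` here; both minimize the total assignment cost, after which the threshold discards low-overlap pairs.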
deploy/TensorRT/cpp/include/STrack.h ADDED
@@ -0,0 +1,50 @@
+ #pragma once
+
+ #include <opencv2/opencv.hpp>
+ #include "kalmanFilter.h"
+
+ using namespace cv;
+ using namespace std;
+
+ enum TrackState { New = 0, Tracked, Lost, Removed };
+
+ class STrack
+ {
+ public:
+     STrack(vector<float> tlwh_, float score);
+     ~STrack();
+
+     vector<float> static tlbr_to_tlwh(vector<float> &tlbr);
+     void static multi_predict(vector<STrack*> &stracks, byte_kalman::KalmanFilter &kalman_filter);
+     void static_tlwh();
+     void static_tlbr();
+     vector<float> tlwh_to_xyah(vector<float> tlwh_tmp);
+     vector<float> to_xyah();
+     void mark_lost();
+     void mark_removed();
+     int next_id();
+     int end_frame();
+
+     void activate(byte_kalman::KalmanFilter &kalman_filter, int frame_id);
+     void re_activate(STrack &new_track, int frame_id, bool new_id = false);
+     void update(STrack &new_track, int frame_id);
+
+ public:
+     bool is_activated;
+     int track_id;
+     int state;
+
+     vector<float> _tlwh;
+     vector<float> tlwh;
+     vector<float> tlbr;
+     int frame_id;
+     int tracklet_len;
+     int start_frame;
+
+     KAL_MEAN mean;
+     KAL_COVA covariance;
+     float score;
+
+ private:
+     byte_kalman::KalmanFilter kalman_filter;
+ };
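The tlwh/tlbr/xyah conversions declared above have straightforward definitions; following the ByteTrack convention this header mirrors, they look like this in Python (function names are ours):

```python
import numpy as np

def tlwh_to_xyah(tlwh):
    """(top-left x, top-left y, w, h) -> (center x, center y, aspect w/h, h),
    the measurement format fed to the Kalman filter."""
    ret = np.asarray(tlwh, dtype=float).copy()
    ret[:2] += ret[2:] / 2.0   # top-left -> center
    ret[2] /= ret[3]           # width -> aspect ratio
    return ret

def tlbr_to_tlwh(tlbr):
    """(x1, y1, x2, y2) -> (top-left x, top-left y, w, h)."""
    ret = np.asarray(tlbr, dtype=float).copy()
    ret[2:] -= ret[:2]         # corners -> width/height
    return ret
```

For example, a 2x4 box at the origin maps to center (1, 2) with aspect ratio 0.5 and height 4.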
deploy/TensorRT/cpp/include/dataType.h ADDED
@@ -0,0 +1,36 @@
+ #pragma once
+
+ #include <cstddef>
+ #include <vector>
+
+ #include <Eigen/Core>
+ #include <Eigen/Dense>
+ typedef Eigen::Matrix<float, 1, 4, Eigen::RowMajor> DETECTBOX;
+ typedef Eigen::Matrix<float, -1, 4, Eigen::RowMajor> DETECTBOXSS;
+ typedef Eigen::Matrix<float, 1, 128, Eigen::RowMajor> FEATURE;
+ typedef Eigen::Matrix<float, Eigen::Dynamic, 128, Eigen::RowMajor> FEATURESS;
+ //typedef std::vector<FEATURE> FEATURESS;
+
+ // Kalman filter
+ //typedef Eigen::Matrix<float, 8, 8, Eigen::RowMajor> KAL_FILTER;
+ typedef Eigen::Matrix<float, 1, 8, Eigen::RowMajor> KAL_MEAN;
+ typedef Eigen::Matrix<float, 8, 8, Eigen::RowMajor> KAL_COVA;
+ typedef Eigen::Matrix<float, 1, 4, Eigen::RowMajor> KAL_HMEAN;
+ typedef Eigen::Matrix<float, 4, 4, Eigen::RowMajor> KAL_HCOVA;
+ using KAL_DATA = std::pair<KAL_MEAN, KAL_COVA>;
+ using KAL_HDATA = std::pair<KAL_HMEAN, KAL_HCOVA>;
+
+ // main
+ using RESULT_DATA = std::pair<int, DETECTBOX>;
+
+ // tracker
+ using TRACKER_DATA = std::pair<int, FEATURESS>;
+ using MATCH_DATA = std::pair<int, int>;
+ typedef struct t {
+     std::vector<MATCH_DATA> matches;
+     std::vector<int> unmatched_tracks;
+     std::vector<int> unmatched_detections;
+ } TRACHER_MATCHD;
+
+ // linear_assignment
+ typedef Eigen::Matrix<float, -1, -1, Eigen::RowMajor> DYNAMICM;
deploy/TensorRT/cpp/include/kalmanFilter.h ADDED
@@ -0,0 +1,31 @@
+ #pragma once
+
+ #include "dataType.h"
+
+ namespace byte_kalman
+ {
+     class KalmanFilter
+     {
+     public:
+         static const double chi2inv95[10];
+         KalmanFilter();
+         KAL_DATA initiate(const DETECTBOX& measurement);
+         void predict(KAL_MEAN& mean, KAL_COVA& covariance);
+         KAL_HDATA project(const KAL_MEAN& mean, const KAL_COVA& covariance);
+         KAL_DATA update(const KAL_MEAN& mean,
+             const KAL_COVA& covariance,
+             const DETECTBOX& measurement);
+
+         Eigen::Matrix<float, 1, -1> gating_distance(
+             const KAL_MEAN& mean,
+             const KAL_COVA& covariance,
+             const std::vector<DETECTBOX>& measurements,
+             bool only_position = false);
+
+     private:
+         Eigen::Matrix<float, 8, 8, Eigen::RowMajor> _motion_mat;
+         Eigen::Matrix<float, 4, 8, Eigen::RowMajor> _update_mat;
+         float _std_weight_position;
+         float _std_weight_velocity;
+     };
+ }
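The header above only declares the tracker's 8-state (xyah plus velocities) filter; the Eigen implementation lives in kalmanFilter.cpp. As a self-contained illustration of the same predict/update cycle, and not the repo's actual code, here is a minimal 1-D constant-velocity Kalman filter in plain C++. The `Kalman1D` name and its noise parameters are hypothetical, chosen only for this sketch.

```cpp
#include <cassert>
#include <cmath>

// State: position p and velocity v; we measure position only (H = [1 0]).
struct Kalman1D {
    double p = 0.0, v = 0.0;            // state mean
    double P[2][2] = {{1, 0}, {0, 1}};  // state covariance
    double q = 1e-3;                    // process noise (assumed)
    double r = 1e-1;                    // measurement noise (assumed)

    // Predict: propagate state through F = [[1, dt], [0, 1]], grow covariance.
    void predict(double dt) {
        p += v * dt;
        double P00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q;
        double P01 = P[0][1] + dt * P[1][1];
        double P10 = P[1][0] + dt * P[1][1];
        double P11 = P[1][1] + q;
        P[0][0] = P00; P[0][1] = P01; P[1][0] = P10; P[1][1] = P11;
    }

    // Update with a position measurement z.
    void update(double z) {
        double S = P[0][0] + r;    // innovation covariance
        double K0 = P[0][0] / S;   // Kalman gain (position row)
        double K1 = P[1][0] / S;   // Kalman gain (velocity row)
        double y = z - p;          // innovation
        p += K0 * y;
        v += K1 * y;
        double P00 = (1 - K0) * P[0][0];
        double P01 = (1 - K0) * P[0][1];
        double P10 = P[1][0] - K1 * P[0][0];
        double P11 = P[1][1] - K1 * P[0][1];
        P[0][0] = P00; P[0][1] = P01; P[1][0] = P10; P[1][1] = P11;
    }
}; 
```

The repo's `predict`/`update` pair applies the same two equations, just with an 8-dimensional state and a 4-dimensional xyah measurement.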
deploy/TensorRT/cpp/include/lapjv.h ADDED
@@ -0,0 +1,63 @@
+ #ifndef LAPJV_H
+ #define LAPJV_H
+
+ #define LARGE 1000000
+
+ #if !defined TRUE
+ #define TRUE 1
+ #endif
+ #if !defined FALSE
+ #define FALSE 0
+ #endif
+
+ #define NEW(x, t, n) if ((x = (t *)malloc(sizeof(t) * (n))) == 0) { return -1; }
+ #define FREE(x) if (x != 0) { free(x); x = 0; }
+ #define SWAP_INDICES(a, b) { int_t _temp_index = a; a = b; b = _temp_index; }
+
+ #if 0
+ #include <assert.h>
+ #define ASSERT(cond) assert(cond)
+ #define PRINTF(fmt, ...) printf(fmt, ##__VA_ARGS__)
+ #define PRINT_COST_ARRAY(a, n) \
+     while (1) { \
+         printf(#a" = ["); \
+         if ((n) > 0) { \
+             printf("%f", (a)[0]); \
+             for (uint_t j = 1; j < n; j++) { \
+                 printf(", %f", (a)[j]); \
+             } \
+         } \
+         printf("]\n"); \
+         break; \
+     }
+ #define PRINT_INDEX_ARRAY(a, n) \
+     while (1) { \
+         printf(#a" = ["); \
+         if ((n) > 0) { \
+             printf("%d", (a)[0]); \
+             for (uint_t j = 1; j < n; j++) { \
+                 printf(", %d", (a)[j]); \
+             } \
+         } \
+         printf("]\n"); \
+         break; \
+     }
+ #else
+ #define ASSERT(cond)
+ #define PRINTF(fmt, ...)
+ #define PRINT_COST_ARRAY(a, n)
+ #define PRINT_INDEX_ARRAY(a, n)
+ #endif
+
+ typedef signed int int_t;
+ typedef unsigned int uint_t;
+ typedef double cost_t;
+ typedef char boolean;
+ typedef enum fp_t { FP_1 = 1, FP_2 = 2, FP_DYNAMIC = 3 } fp_t;
+
+ extern int_t lapjv_internal(
+     const uint_t n, cost_t *cost[],
+     int_t *x, int_t *y);
+
+ #endif // LAPJV_H
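`lapjv_internal` implements the Jonker-Volgenant algorithm for the linear assignment problem, which the tracker uses to match tracks to detections under an IoU cost matrix. To make the objective it solves concrete, here is a self-contained brute-force solver over all permutations; this is only a sketch of the problem, not of the LAPJV algorithm, and `brute_force_assignment` is an illustrative name that does not exist in the repo. LAPJV solves the same problem in O(n^3) instead of O(n!).

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Minimum-cost one-to-one assignment of rows to columns, by exhaustive
// search over all column permutations. Feasible only for tiny n.
double brute_force_assignment(const std::vector<std::vector<double>>& cost,
                              std::vector<int>& best_col_of_row) {
    int n = (int)cost.size();
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);  // 0, 1, ..., n-1
    double best = 1e18;
    do {
        double total = 0.0;
        for (int i = 0; i < n; i++) total += cost[i][perm[i]];
        if (total < best) { best = total; best_col_of_row = perm; }
    } while (std::next_permutation(perm.begin(), perm.end()));
    return best;
}
```

For a 3x3 cost matrix such as `{{4, 1, 3}, {2, 0, 5}, {3, 2, 2}}` the optimal matching assigns row 0 to column 1, row 1 to column 0, and row 2 to column 2, for a total cost of 5.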
deploy/TensorRT/cpp/include/logging.h ADDED
@@ -0,0 +1,503 @@
+ /*
+  * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
+  *
+  * Licensed under the Apache License, Version 2.0 (the "License");
+  * you may not use this file except in compliance with the License.
+  * You may obtain a copy of the License at
+  *
+  *     http://www.apache.org/licenses/LICENSE-2.0
+  *
+  * Unless required by applicable law or agreed to in writing, software
+  * distributed under the License is distributed on an "AS IS" BASIS,
+  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  * See the License for the specific language governing permissions and
+  * limitations under the License.
+  */
+
+ #ifndef TENSORRT_LOGGING_H
+ #define TENSORRT_LOGGING_H
+
+ #include "NvInferRuntimeCommon.h"
+ #include <cassert>
+ #include <ctime>
+ #include <iomanip>
+ #include <iostream>
+ #include <ostream>
+ #include <sstream>
+ #include <string>
+
+ using Severity = nvinfer1::ILogger::Severity;
+
+ class LogStreamConsumerBuffer : public std::stringbuf
+ {
+ public:
+     LogStreamConsumerBuffer(std::ostream& stream, const std::string& prefix, bool shouldLog)
+         : mOutput(stream)
+         , mPrefix(prefix)
+         , mShouldLog(shouldLog)
+     {
+     }
+
+     LogStreamConsumerBuffer(LogStreamConsumerBuffer&& other)
+         : mOutput(other.mOutput)
+     {
+     }
+
+     ~LogStreamConsumerBuffer()
+     {
+         // std::streambuf::pbase() gives a pointer to the beginning of the buffered part of the output sequence
+         // std::streambuf::pptr() gives a pointer to the current position of the output sequence
+         // if the pointer to the beginning is not equal to the pointer to the current position,
+         // call putOutput() to log the output to the stream
+         if (pbase() != pptr())
+         {
+             putOutput();
+         }
+     }
+
+     // synchronizes the stream buffer and returns 0 on success
+     // synchronizing the stream buffer consists of inserting the buffer contents into the stream,
+     // resetting the buffer and flushing the stream
+     virtual int sync()
+     {
+         putOutput();
+         return 0;
+     }
+
+     void putOutput()
+     {
+         if (mShouldLog)
+         {
+             // prepend timestamp
+             std::time_t timestamp = std::time(nullptr);
+             tm* tm_local = std::localtime(&timestamp);
+             std::cout << "[";
+             std::cout << std::setw(2) << std::setfill('0') << 1 + tm_local->tm_mon << "/";
+             std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_mday << "/";
+             std::cout << std::setw(4) << std::setfill('0') << 1900 + tm_local->tm_year << "-";
+             std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_hour << ":";
+             std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_min << ":";
+             std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_sec << "] ";
+             // std::stringbuf::str() gets the string contents of the buffer
+             // insert the buffer contents pre-appended by the appropriate prefix into the stream
+             mOutput << mPrefix << str();
+             // set the buffer to empty
+             str("");
+             // flush the stream
+             mOutput.flush();
+         }
+     }
+
+     void setShouldLog(bool shouldLog)
+     {
+         mShouldLog = shouldLog;
+     }
+
+ private:
+     std::ostream& mOutput;
+     std::string mPrefix;
+     bool mShouldLog;
+ };
+
+ //!
+ //! \class LogStreamConsumerBase
+ //! \brief Convenience object used to initialize LogStreamConsumerBuffer before std::ostream in LogStreamConsumer
+ //!
+ class LogStreamConsumerBase
+ {
+ public:
+     LogStreamConsumerBase(std::ostream& stream, const std::string& prefix, bool shouldLog)
+         : mBuffer(stream, prefix, shouldLog)
+     {
+     }
+
+ protected:
+     LogStreamConsumerBuffer mBuffer;
+ };
+
+ //!
+ //! \class LogStreamConsumer
+ //! \brief Convenience object used to facilitate use of C++ stream syntax when logging messages.
+ //!  Order of base classes is LogStreamConsumerBase and then std::ostream.
+ //!  This is because the LogStreamConsumerBase class is used to initialize the LogStreamConsumerBuffer member field
+ //!  in LogStreamConsumer and then the address of the buffer is passed to std::ostream.
+ //!  This is necessary to prevent the address of an uninitialized buffer from being passed to std::ostream.
+ //!  Please do not change the order of the parent classes.
+ //!
+ class LogStreamConsumer : protected LogStreamConsumerBase, public std::ostream
+ {
+ public:
+     //! \brief Creates a LogStreamConsumer which logs messages with level severity.
+     //!  Reportable severity determines if the messages are severe enough to be logged.
+     LogStreamConsumer(Severity reportableSeverity, Severity severity)
+         : LogStreamConsumerBase(severityOstream(severity), severityPrefix(severity), severity <= reportableSeverity)
+         , std::ostream(&mBuffer) // links the stream buffer with the stream
+         , mShouldLog(severity <= reportableSeverity)
+         , mSeverity(severity)
+     {
+     }
+
+     LogStreamConsumer(LogStreamConsumer&& other)
+         : LogStreamConsumerBase(severityOstream(other.mSeverity), severityPrefix(other.mSeverity), other.mShouldLog)
+         , std::ostream(&mBuffer) // links the stream buffer with the stream
+         , mShouldLog(other.mShouldLog)
+         , mSeverity(other.mSeverity)
+     {
+     }
+
+     void setReportableSeverity(Severity reportableSeverity)
+     {
+         mShouldLog = mSeverity <= reportableSeverity;
+         mBuffer.setShouldLog(mShouldLog);
+     }
+
+ private:
+     static std::ostream& severityOstream(Severity severity)
+     {
+         return severity >= Severity::kINFO ? std::cout : std::cerr;
+     }
+
+     static std::string severityPrefix(Severity severity)
+     {
+         switch (severity)
+         {
+         case Severity::kINTERNAL_ERROR: return "[F] ";
+         case Severity::kERROR: return "[E] ";
+         case Severity::kWARNING: return "[W] ";
+         case Severity::kINFO: return "[I] ";
+         case Severity::kVERBOSE: return "[V] ";
+         default: assert(0); return "";
+         }
+     }
+
+     bool mShouldLog;
+     Severity mSeverity;
+ };
+
+ //! \class Logger
+ //!
+ //! \brief Class which manages logging of TensorRT tools and samples
+ //!
+ //! \details This class provides a common interface for TensorRT tools and samples to log information to the console,
+ //! and supports logging two types of messages:
+ //!
+ //! - Debugging messages with an associated severity (info, warning, error, or internal error/fatal)
+ //! - Test pass/fail messages
+ //!
+ //! The advantage of having all samples use this class for logging as opposed to emitting directly to stdout/stderr is
+ //! that the logic for controlling the verbosity and formatting of sample output is centralized in one location.
+ //!
+ //! In the future, this class could be extended to support dumping test results to a file in some standard format
+ //! (for example, JUnit XML), and providing additional metadata (e.g. timing the duration of a test run).
+ //!
+ //! TODO: For backwards compatibility with existing samples, this class inherits directly from the nvinfer1::ILogger
+ //! interface, which is problematic since there isn't a clean separation between messages coming from the TensorRT
+ //! library and messages coming from the sample.
+ //!
+ //! In the future (once all samples are updated to use Logger::getTRTLogger() to access the ILogger) we can refactor the
+ //! class to eliminate the inheritance and instead make the nvinfer1::ILogger implementation a member of the Logger
+ //! object.
+
+ class Logger : public nvinfer1::ILogger
+ {
+ public:
+     Logger(Severity severity = Severity::kWARNING)
+         : mReportableSeverity(severity)
+     {
+     }
+
+     //!
+     //! \enum TestResult
+     //! \brief Represents the state of a given test
+     //!
+     enum class TestResult
+     {
+         kRUNNING, //!< The test is running
+         kPASSED,  //!< The test passed
+         kFAILED,  //!< The test failed
+         kWAIVED   //!< The test was waived
+     };
+
+     //!
+     //! \brief Forward-compatible method for retrieving the nvinfer::ILogger associated with this Logger
+     //! \return The nvinfer1::ILogger associated with this Logger
+     //!
+     //! TODO Once all samples are updated to use this method to register the logger with TensorRT,
+     //! we can eliminate the inheritance of Logger from ILogger
+     //!
+     nvinfer1::ILogger& getTRTLogger()
+     {
+         return *this;
+     }
+
+     //!
+     //! \brief Implementation of the nvinfer1::ILogger::log() virtual method
+     //!
+     //! Note samples should not be calling this function directly; it will eventually go away once we eliminate the
+     //! inheritance from nvinfer1::ILogger
+     //!
+     void log(Severity severity, const char* msg) noexcept override
+     {
+         LogStreamConsumer(mReportableSeverity, severity) << "[TRT] " << std::string(msg) << std::endl;
+     }
+
+     //!
+     //! \brief Method for controlling the verbosity of logging output
+     //!
+     //! \param severity The logger will only emit messages that have severity of this level or higher.
+     //!
+     void setReportableSeverity(Severity severity)
+     {
+         mReportableSeverity = severity;
+     }
+
+     //!
+     //! \brief Opaque handle that holds logging information for a particular test
+     //!
+     //! This object is an opaque handle to information used by the Logger to print test results.
+     //! The sample must call Logger::defineTest() in order to obtain a TestAtom that can be used
+     //! with Logger::reportTest{Start,End}().
+     //!
+     class TestAtom
+     {
+     public:
+         TestAtom(TestAtom&&) = default;
+
+     private:
+         friend class Logger;
+
+         TestAtom(bool started, const std::string& name, const std::string& cmdline)
+             : mStarted(started)
+             , mName(name)
+             , mCmdline(cmdline)
+         {
+         }
+
+         bool mStarted;
+         std::string mName;
+         std::string mCmdline;
+     };
+
+     //!
+     //! \brief Define a test for logging
+     //!
+     //! \param[in] name The name of the test. This should be a string starting with
+     //!                  "TensorRT" and containing dot-separated strings containing
+     //!                  the characters [A-Za-z0-9_].
+     //!                  For example, "TensorRT.sample_googlenet"
+     //! \param[in] cmdline The command line used to reproduce the test
+     //
+     //! \return a TestAtom that can be used in Logger::reportTest{Start,End}().
+     //!
+     static TestAtom defineTest(const std::string& name, const std::string& cmdline)
+     {
+         return TestAtom(false, name, cmdline);
+     }
+
+     //!
+     //! \brief A convenience overloaded version of defineTest() that accepts an array of command-line arguments
+     //!        as input
+     //!
+     //! \param[in] name The name of the test
+     //! \param[in] argc The number of command-line arguments
+     //! \param[in] argv The array of command-line arguments (given as C strings)
+     //!
+     //! \return a TestAtom that can be used in Logger::reportTest{Start,End}().
+     static TestAtom defineTest(const std::string& name, int argc, char const* const* argv)
+     {
+         auto cmdline = genCmdlineString(argc, argv);
+         return defineTest(name, cmdline);
+     }
+
+     //!
+     //! \brief Report that a test has started.
+     //!
+     //! \pre reportTestStart() has not been called yet for the given testAtom
+     //!
+     //! \param[in] testAtom The handle to the test that has started
+     //!
+     static void reportTestStart(TestAtom& testAtom)
+     {
+         reportTestResult(testAtom, TestResult::kRUNNING);
+         assert(!testAtom.mStarted);
+         testAtom.mStarted = true;
+     }
+
+     //!
+     //! \brief Report that a test has ended.
+     //!
+     //! \pre reportTestStart() has been called for the given testAtom
+     //!
+     //! \param[in] testAtom The handle to the test that has ended
+     //! \param[in] result The result of the test. Should be one of TestResult::kPASSED,
+     //!                   TestResult::kFAILED, TestResult::kWAIVED
+     //!
+     static void reportTestEnd(const TestAtom& testAtom, TestResult result)
+     {
+         assert(result != TestResult::kRUNNING);
+         assert(testAtom.mStarted);
+         reportTestResult(testAtom, result);
+     }
+
+     static int reportPass(const TestAtom& testAtom)
+     {
+         reportTestEnd(testAtom, TestResult::kPASSED);
+         return EXIT_SUCCESS;
+     }
+
+     static int reportFail(const TestAtom& testAtom)
+     {
+         reportTestEnd(testAtom, TestResult::kFAILED);
+         return EXIT_FAILURE;
+     }
+
+     static int reportWaive(const TestAtom& testAtom)
+     {
+         reportTestEnd(testAtom, TestResult::kWAIVED);
+         return EXIT_SUCCESS;
+     }
+
+     static int reportTest(const TestAtom& testAtom, bool pass)
+     {
+         return pass ? reportPass(testAtom) : reportFail(testAtom);
+     }
+
+     Severity getReportableSeverity() const
+     {
+         return mReportableSeverity;
+     }
+
+ private:
+     //!
+     //! \brief returns an appropriate string for prefixing a log message with the given severity
+     //!
+     static const char* severityPrefix(Severity severity)
+     {
+         switch (severity)
+         {
+         case Severity::kINTERNAL_ERROR: return "[F] ";
+         case Severity::kERROR: return "[E] ";
+         case Severity::kWARNING: return "[W] ";
+         case Severity::kINFO: return "[I] ";
+         case Severity::kVERBOSE: return "[V] ";
+         default: assert(0); return "";
+         }
+     }
+
+     //!
+     //! \brief returns an appropriate string for prefixing a test result message with the given result
+     //!
+     static const char* testResultString(TestResult result)
+     {
+         switch (result)
+         {
+         case TestResult::kRUNNING: return "RUNNING";
+         case TestResult::kPASSED: return "PASSED";
+         case TestResult::kFAILED: return "FAILED";
+         case TestResult::kWAIVED: return "WAIVED";
+         default: assert(0); return "";
+         }
+     }
+
+     //!
+     //! \brief returns an appropriate output stream (cout or cerr) to use with the given severity
+     //!
+     static std::ostream& severityOstream(Severity severity)
+     {
+         return severity >= Severity::kINFO ? std::cout : std::cerr;
+     }
+
+     //!
+     //! \brief method that implements logging test results
+     //!
+     static void reportTestResult(const TestAtom& testAtom, TestResult result)
+     {
+         severityOstream(Severity::kINFO) << "&&&& " << testResultString(result) << " " << testAtom.mName << " # "
+                                          << testAtom.mCmdline << std::endl;
+     }
+
+     //!
+     //! \brief generate a command line string from the given (argc, argv) values
+     //!
+     static std::string genCmdlineString(int argc, char const* const* argv)
+     {
+         std::stringstream ss;
+         for (int i = 0; i < argc; i++)
+         {
+             if (i > 0)
+                 ss << " ";
+             ss << argv[i];
+         }
+         return ss.str();
+     }
+
+     Severity mReportableSeverity;
+ };
+
+ namespace
+ {
+
+ //!
+ //! \brief produces a LogStreamConsumer object that can be used to log messages of severity kVERBOSE
+ //!
+ //! Example usage:
+ //!
+ //!     LOG_VERBOSE(logger) << "hello world" << std::endl;
+ //!
+ inline LogStreamConsumer LOG_VERBOSE(const Logger& logger)
+ {
+     return LogStreamConsumer(logger.getReportableSeverity(), Severity::kVERBOSE);
+ }
+
+ //!
+ //! \brief produces a LogStreamConsumer object that can be used to log messages of severity kINFO
+ //!
+ //! Example usage:
+ //!
+ //!     LOG_INFO(logger) << "hello world" << std::endl;
+ //!
+ inline LogStreamConsumer LOG_INFO(const Logger& logger)
+ {
+     return LogStreamConsumer(logger.getReportableSeverity(), Severity::kINFO);
+ }
+
+ //!
+ //! \brief produces a LogStreamConsumer object that can be used to log messages of severity kWARNING
+ //!
+ //! Example usage:
+ //!
+ //!     LOG_WARN(logger) << "hello world" << std::endl;
+ //!
+ inline LogStreamConsumer LOG_WARN(const Logger& logger)
+ {
+     return LogStreamConsumer(logger.getReportableSeverity(), Severity::kWARNING);
+ }
+
+ //!
+ //! \brief produces a LogStreamConsumer object that can be used to log messages of severity kERROR
+ //!
+ //! Example usage:
+ //!
+ //!     LOG_ERROR(logger) << "hello world" << std::endl;
+ //!
+ inline LogStreamConsumer LOG_ERROR(const Logger& logger)
+ {
+     return LogStreamConsumer(logger.getReportableSeverity(), Severity::kERROR);
+ }
+
+ //!
+ //! \brief produces a LogStreamConsumer object that can be used to log messages of severity kINTERNAL_ERROR
+ // ("fatal" severity)
+ //!
+ //! Example usage:
+ //!
+ //!     LOG_FATAL(logger) << "hello world" << std::endl;
+ //!
+ inline LogStreamConsumer LOG_FATAL(const Logger& logger)
+ {
+     return LogStreamConsumer(logger.getReportableSeverity(), Severity::kINTERNAL_ERROR);
+ }
+
+ } // anonymous namespace
+
+ #endif // TENSORRT_LOGGING_H
deploy/TensorRT/cpp/src/BYTETracker.cpp ADDED
@@ -0,0 +1,241 @@
+ #include "BYTETracker.h"
+ #include <fstream>
+
+ BYTETracker::BYTETracker(int frame_rate, int track_buffer)
+ {
+     track_thresh = 0.5;
+     high_thresh = 0.6;
+     match_thresh = 0.8;
+
+     frame_id = 0;
+     max_time_lost = int(frame_rate / 30.0 * track_buffer);
+     cout << "Init ByteTrack!" << endl;
+ }
+
+ BYTETracker::~BYTETracker()
+ {
+ }
+
+ vector<STrack> BYTETracker::update(const vector<Object>& objects)
+ {
+     ////////////////// Step 1: Get detections //////////////////
+     this->frame_id++;
+     vector<STrack> activated_stracks;
+     vector<STrack> refind_stracks;
+     vector<STrack> removed_stracks;
+     vector<STrack> lost_stracks;
+     vector<STrack> detections;
+     vector<STrack> detections_low;
+
+     vector<STrack> detections_cp;
+     vector<STrack> tracked_stracks_swap;
+     vector<STrack> resa, resb;
+     vector<STrack> output_stracks;
+
+     vector<STrack*> unconfirmed;
+     vector<STrack*> tracked_stracks;
+     vector<STrack*> strack_pool;
+     vector<STrack*> r_tracked_stracks;
+
+     if (objects.size() > 0)
+     {
+         for (int i = 0; i < objects.size(); i++)
+         {
+             vector<float> tlbr_;
+             tlbr_.resize(4);
+             tlbr_[0] = objects[i].rect.x;
+             tlbr_[1] = objects[i].rect.y;
+             tlbr_[2] = objects[i].rect.x + objects[i].rect.width;
+             tlbr_[3] = objects[i].rect.y + objects[i].rect.height;
+
+             float score = objects[i].prob;
+
+             STrack strack(STrack::tlbr_to_tlwh(tlbr_), score);
+             if (score >= track_thresh)
+             {
+                 detections.push_back(strack);
+             }
+             else
+             {
+                 detections_low.push_back(strack);
+             }
+         }
+     }
+
+     // Add newly detected tracklets to tracked_stracks
+     for (int i = 0; i < this->tracked_stracks.size(); i++)
+     {
+         if (!this->tracked_stracks[i].is_activated)
+             unconfirmed.push_back(&this->tracked_stracks[i]);
+         else
+             tracked_stracks.push_back(&this->tracked_stracks[i]);
+     }
+
+     ////////////////// Step 2: First association, with IoU //////////////////
+     strack_pool = joint_stracks(tracked_stracks, this->lost_stracks);
+     STrack::multi_predict(strack_pool, this->kalman_filter);
+
+     vector<vector<float> > dists;
+     int dist_size = 0, dist_size_size = 0;
+     dists = iou_distance(strack_pool, detections, dist_size, dist_size_size);
+
+     vector<vector<int> > matches;
+     vector<int> u_track, u_detection;
+     linear_assignment(dists, dist_size, dist_size_size, match_thresh, matches, u_track, u_detection);
+
+     for (int i = 0; i < matches.size(); i++)
+     {
+         STrack *track = strack_pool[matches[i][0]];
+         STrack *det = &detections[matches[i][1]];
+         if (track->state == TrackState::Tracked)
+         {
+             track->update(*det, this->frame_id);
+             activated_stracks.push_back(*track);
+         }
+         else
+         {
+             track->re_activate(*det, this->frame_id, false);
+             refind_stracks.push_back(*track);
+         }
+     }
+
+     ////////////////// Step 3: Second association, using low score dets //////////////////
+     for (int i = 0; i < u_detection.size(); i++)
+     {
+         detections_cp.push_back(detections[u_detection[i]]);
+     }
+     detections.clear();
+     detections.assign(detections_low.begin(), detections_low.end());
+
+     for (int i = 0; i < u_track.size(); i++)
+     {
+         if (strack_pool[u_track[i]]->state == TrackState::Tracked)
+         {
+             r_tracked_stracks.push_back(strack_pool[u_track[i]]);
+         }
+     }
+
+     dists.clear();
+     dists = iou_distance(r_tracked_stracks, detections, dist_size, dist_size_size);
+
+     matches.clear();
+     u_track.clear();
+     u_detection.clear();
+     linear_assignment(dists, dist_size, dist_size_size, 0.5, matches, u_track, u_detection);
+
+     for (int i = 0; i < matches.size(); i++)
+     {
+         STrack *track = r_tracked_stracks[matches[i][0]];
+         STrack *det = &detections[matches[i][1]];
+         if (track->state == TrackState::Tracked)
+         {
+             track->update(*det, this->frame_id);
+             activated_stracks.push_back(*track);
+         }
+         else
+         {
+             track->re_activate(*det, this->frame_id, false);
+             refind_stracks.push_back(*track);
+         }
+     }
+
+     for (int i = 0; i < u_track.size(); i++)
+     {
+         STrack *track = r_tracked_stracks[u_track[i]];
+         if (track->state != TrackState::Lost)
+         {
+             track->mark_lost();
+             lost_stracks.push_back(*track);
+         }
+     }
+
+     // Deal with unconfirmed tracks, usually tracks with only one beginning frame
+     detections.clear();
+     detections.assign(detections_cp.begin(), detections_cp.end());
+
+     dists.clear();
+     dists = iou_distance(unconfirmed, detections, dist_size, dist_size_size);
+
+     matches.clear();
+     vector<int> u_unconfirmed;
+     u_detection.clear();
+     linear_assignment(dists, dist_size, dist_size_size, 0.7, matches, u_unconfirmed, u_detection);
+
+     for (int i = 0; i < matches.size(); i++)
+     {
+         unconfirmed[matches[i][0]]->update(detections[matches[i][1]], this->frame_id);
+         activated_stracks.push_back(*unconfirmed[matches[i][0]]);
+     }
+
+     for (int i = 0; i < u_unconfirmed.size(); i++)
+     {
+         STrack *track = unconfirmed[u_unconfirmed[i]];
+         track->mark_removed();
+         removed_stracks.push_back(*track);
+     }
+
+     ////////////////// Step 4: Init new stracks //////////////////
+     for (int i = 0; i < u_detection.size(); i++)
+     {
+         STrack *track = &detections[u_detection[i]];
+         if (track->score < this->high_thresh)
+             continue;
+         track->activate(this->kalman_filter, this->frame_id);
+         activated_stracks.push_back(*track);
+     }
+
+     ////////////////// Step 5: Update state //////////////////
+     for (int i = 0; i < this->lost_stracks.size(); i++)
+     {
+         if (this->frame_id - this->lost_stracks[i].end_frame() > this->max_time_lost)
+         {
+             this->lost_stracks[i].mark_removed();
+             removed_stracks.push_back(this->lost_stracks[i]);
+         }
+     }
+
+     for (int i = 0; i < this->tracked_stracks.size(); i++)
+     {
+         if (this->tracked_stracks[i].state == TrackState::Tracked)
+         {
+             tracked_stracks_swap.push_back(this->tracked_stracks[i]);
+         }
+     }
+     this->tracked_stracks.clear();
+     this->tracked_stracks.assign(tracked_stracks_swap.begin(), tracked_stracks_swap.end());
+
+     this->tracked_stracks = joint_stracks(this->tracked_stracks, activated_stracks);
+     this->tracked_stracks = joint_stracks(this->tracked_stracks, refind_stracks);
+
+     //std::cout << activated_stracks.size() << std::endl;
+
+     this->lost_stracks = sub_stracks(this->lost_stracks, this->tracked_stracks);
+     for (int i = 0; i < lost_stracks.size(); i++)
+     {
+         this->lost_stracks.push_back(lost_stracks[i]);
+     }
+
+     this->lost_stracks = sub_stracks(this->lost_stracks, this->removed_stracks);
+     for (int i = 0; i < removed_stracks.size(); i++)
+     {
+         this->removed_stracks.push_back(removed_stracks[i]);
+     }
+
+     remove_duplicate_stracks(resa, resb, this->tracked_stracks, this->lost_stracks);
+
+     this->tracked_stracks.clear();
+     this->tracked_stracks.assign(resa.begin(), resa.end());
+     this->lost_stracks.clear();
+     this->lost_stracks.assign(resb.begin(), resb.end());
+
+     for (int i = 0; i < this->tracked_stracks.size(); i++)
+     {
+         if (this->tracked_stracks[i].is_activated)
+         {
+             output_stracks.push_back(this->tracked_stracks[i]);
+         }
+     }
+     return output_stracks;
+ }
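Step 1 above converts detector boxes from tlbr (top-left / bottom-right corners) to tlwh (top-left / width / height) via `STrack::tlbr_to_tlwh`, and `STrack` later converts tlwh to the xyah form (center x, center y, aspect ratio w/h, height) that the Kalman filter tracks. Standalone free-function versions of the two conversions, written here only to show the arithmetic (the repo implements them as `STrack` members with slightly different signatures):

```cpp
#include <cassert>
#include <vector>

// tlbr {x1, y1, x2, y2} -> tlwh {x, y, w, h}
std::vector<float> tlbr_to_tlwh(const std::vector<float>& tlbr) {
    return {tlbr[0], tlbr[1], tlbr[2] - tlbr[0], tlbr[3] - tlbr[1]};
}

// tlwh {x, y, w, h} -> xyah {center_x, center_y, w/h, h}
std::vector<float> tlwh_to_xyah(const std::vector<float>& tlwh) {
    return {tlwh[0] + tlwh[2] / 2, tlwh[1] + tlwh[3] / 2,
            tlwh[2] / tlwh[3], tlwh[3]};
}
```

For example, the tlbr box `{10, 20, 50, 100}` becomes tlwh `{10, 20, 40, 80}` and then xyah `{30, 60, 0.5, 80}`.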
deploy/TensorRT/cpp/src/STrack.cpp ADDED
@@ -0,0 +1,192 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ #include "STrack.h"
+
+ STrack::STrack(vector<float> tlwh_, float score)
+ {
+     _tlwh.resize(4);
+     _tlwh.assign(tlwh_.begin(), tlwh_.end());
+
+     is_activated = false;
+     track_id = 0;
+     state = TrackState::New;
+
+     tlwh.resize(4);
+     tlbr.resize(4);
+
+     static_tlwh();
+     static_tlbr();
+     frame_id = 0;
+     tracklet_len = 0;
+     this->score = score;
+     start_frame = 0;
+ }
+
+ STrack::~STrack()
+ {
+ }
+
+ void STrack::activate(byte_kalman::KalmanFilter &kalman_filter, int frame_id)
+ {
+     this->kalman_filter = kalman_filter;
+     this->track_id = this->next_id();
+
+     vector<float> _tlwh_tmp(4);
+     _tlwh_tmp[0] = this->_tlwh[0];
+     _tlwh_tmp[1] = this->_tlwh[1];
+     _tlwh_tmp[2] = this->_tlwh[2];
+     _tlwh_tmp[3] = this->_tlwh[3];
+     vector<float> xyah = tlwh_to_xyah(_tlwh_tmp);
+     DETECTBOX xyah_box;
+     xyah_box[0] = xyah[0];
+     xyah_box[1] = xyah[1];
+     xyah_box[2] = xyah[2];
+     xyah_box[3] = xyah[3];
+     auto mc = this->kalman_filter.initiate(xyah_box);
+     this->mean = mc.first;
+     this->covariance = mc.second;
+
+     static_tlwh();
+     static_tlbr();
+
+     this->tracklet_len = 0;
+     this->state = TrackState::Tracked;
+     if (frame_id == 1)
+     {
+         this->is_activated = true;
+     }
+     //this->is_activated = true;
+     this->frame_id = frame_id;
+     this->start_frame = frame_id;
+ }
+
+ void STrack::re_activate(STrack &new_track, int frame_id, bool new_id)
+ {
+     vector<float> xyah = tlwh_to_xyah(new_track.tlwh);
+     DETECTBOX xyah_box;
+     xyah_box[0] = xyah[0];
+     xyah_box[1] = xyah[1];
+     xyah_box[2] = xyah[2];
+     xyah_box[3] = xyah[3];
+     auto mc = this->kalman_filter.update(this->mean, this->covariance, xyah_box);
+     this->mean = mc.first;
+     this->covariance = mc.second;
+
+     static_tlwh();
+     static_tlbr();
+
+     this->tracklet_len = 0;
+     this->state = TrackState::Tracked;
+     this->is_activated = true;
+     this->frame_id = frame_id;
+     this->score = new_track.score;
+     if (new_id)
+         this->track_id = next_id();
+ }
+
+ void STrack::update(STrack &new_track, int frame_id)
+ {
+     this->frame_id = frame_id;
+     this->tracklet_len++;
+
+     vector<float> xyah = tlwh_to_xyah(new_track.tlwh);
+     DETECTBOX xyah_box;
+     xyah_box[0] = xyah[0];
+     xyah_box[1] = xyah[1];
+     xyah_box[2] = xyah[2];
+     xyah_box[3] = xyah[3];
+
+     auto mc = this->kalman_filter.update(this->mean, this->covariance, xyah_box);
+     this->mean = mc.first;
+     this->covariance = mc.second;
+
+     static_tlwh();
+     static_tlbr();
+
+     this->state = TrackState::Tracked;
+     this->is_activated = true;
+
+     this->score = new_track.score;
+ }
+
+ void STrack::static_tlwh()
+ {
+     if (this->state == TrackState::New)
+     {
+         tlwh[0] = _tlwh[0];
+         tlwh[1] = _tlwh[1];
+         tlwh[2] = _tlwh[2];
+         tlwh[3] = _tlwh[3];
+         return;
+     }
+
+     tlwh[0] = mean[0];
+     tlwh[1] = mean[1];
+     tlwh[2] = mean[2];
+     tlwh[3] = mean[3];
+
+     tlwh[2] *= tlwh[3];
+     tlwh[0] -= tlwh[2] / 2;
+     tlwh[1] -= tlwh[3] / 2;
+ }
+
+ void STrack::static_tlbr()
+ {
+     tlbr.clear();
+     tlbr.assign(tlwh.begin(), tlwh.end());
+     tlbr[2] += tlbr[0];
+     tlbr[3] += tlbr[1];
+ }
+
+ vector<float> STrack::tlwh_to_xyah(vector<float> tlwh_tmp)
+ {
+     vector<float> tlwh_output = tlwh_tmp;
+     tlwh_output[0] += tlwh_output[2] / 2;
+     tlwh_output[1] += tlwh_output[3] / 2;
+     tlwh_output[2] /= tlwh_output[3];
+     return tlwh_output;
+ }
+
+ vector<float> STrack::to_xyah()
+ {
+     return tlwh_to_xyah(tlwh);
+ }
+
+ vector<float> STrack::tlbr_to_tlwh(vector<float> &tlbr)
+ {
+     tlbr[2] -= tlbr[0];
+     tlbr[3] -= tlbr[1];
+     return tlbr;
+ }
+
+ void STrack::mark_lost()
+ {
+     state = TrackState::Lost;
+ }
+
+ void STrack::mark_removed()
+ {
+     state = TrackState::Removed;
+ }
+
+ int STrack::next_id()
+ {
+     static int _count = 0;
+     _count++;
+     return _count;
+ }
+
+ int STrack::end_frame()
+ {
+     return this->frame_id;
+ }
+
+ void STrack::multi_predict(vector<STrack*> &stracks, byte_kalman::KalmanFilter &kalman_filter)
+ {
+     for (int i = 0; i < stracks.size(); i++)
+     {
+         if (stracks[i]->state != TrackState::Tracked)
+         {
+             stracks[i]->mean[7] = 0;
+         }
+         kalman_filter.predict(stracks[i]->mean, stracks[i]->covariance);
+     }
+ }
deploy/TensorRT/cpp/src/bytetrack.cpp ADDED
@@ -0,0 +1,505 @@
+ #include <fstream>
+ #include <iostream>
+ #include <sstream>
+ #include <numeric>
+ #include <chrono>
+ #include <vector>
+ #include <opencv2/opencv.hpp>
+ #include <dirent.h>
+ #include "NvInfer.h"
+ #include "cuda_runtime_api.h"
+ #include "logging.h"
+ #include "BYTETracker.h"
+
+ #define CHECK(status) \
+     do\
+     {\
+         auto ret = (status);\
+         if (ret != 0)\
+         {\
+             cerr << "Cuda failure: " << ret << endl;\
+             abort();\
+         }\
+     } while (0)
+
+ #define DEVICE 0  // GPU id
+ #define NMS_THRESH 0.7
+ #define BBOX_CONF_THRESH 0.1
+
+ using namespace nvinfer1;
+
+ // stuff we know about the network and the input/output blobs
+ static const int INPUT_W = 1088;
+ static const int INPUT_H = 608;
+ const char* INPUT_BLOB_NAME = "input_0";
+ const char* OUTPUT_BLOB_NAME = "output_0";
+ static Logger gLogger;
+
+ Mat static_resize(Mat& img) {
+     float r = min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));
+     // r = std::min(r, 1.0f);
+     int unpad_w = r * img.cols;
+     int unpad_h = r * img.rows;
+     Mat re(unpad_h, unpad_w, CV_8UC3);
+     resize(img, re, re.size());
+     Mat out(INPUT_H, INPUT_W, CV_8UC3, Scalar(114, 114, 114));
+     re.copyTo(out(Rect(0, 0, re.cols, re.rows)));
+     return out;
+ }
+
+ struct GridAndStride
+ {
+     int grid0;
+     int grid1;
+     int stride;
+ };
+
+ static void generate_grids_and_stride(const int target_w, const int target_h, vector<int>& strides, vector<GridAndStride>& grid_strides)
+ {
+     for (auto stride : strides)
+     {
+         int num_grid_w = target_w / stride;
+         int num_grid_h = target_h / stride;
+         for (int g1 = 0; g1 < num_grid_h; g1++)
+         {
+             for (int g0 = 0; g0 < num_grid_w; g0++)
+             {
+                 grid_strides.push_back((GridAndStride){g0, g1, stride});
+             }
+         }
+     }
+ }
+
+ static inline float intersection_area(const Object& a, const Object& b)
+ {
+     Rect_<float> inter = a.rect & b.rect;
+     return inter.area();
+ }
+
+ static void qsort_descent_inplace(vector<Object>& faceobjects, int left, int right)
+ {
+     int i = left;
+     int j = right;
+     float p = faceobjects[(left + right) / 2].prob;
+
+     while (i <= j)
+     {
+         while (faceobjects[i].prob > p)
+             i++;
+
+         while (faceobjects[j].prob < p)
+             j--;
+
+         if (i <= j)
+         {
+             // swap
+             swap(faceobjects[i], faceobjects[j]);
+
+             i++;
+             j--;
+         }
+     }
+
+     #pragma omp parallel sections
+     {
+         #pragma omp section
+         {
+             if (left < j) qsort_descent_inplace(faceobjects, left, j);
+         }
+         #pragma omp section
+         {
+             if (i < right) qsort_descent_inplace(faceobjects, i, right);
+         }
+     }
+ }
+
+ static void qsort_descent_inplace(vector<Object>& objects)
+ {
+     if (objects.empty())
+         return;
+
+     qsort_descent_inplace(objects, 0, objects.size() - 1);
+ }
+
+ static void nms_sorted_bboxes(const vector<Object>& faceobjects, vector<int>& picked, float nms_threshold)
+ {
+     picked.clear();
+
+     const int n = faceobjects.size();
+
+     vector<float> areas(n);
+     for (int i = 0; i < n; i++)
+     {
+         areas[i] = faceobjects[i].rect.area();
+     }
+
+     for (int i = 0; i < n; i++)
+     {
+         const Object& a = faceobjects[i];
+
+         int keep = 1;
+         for (int j = 0; j < (int)picked.size(); j++)
+         {
+             const Object& b = faceobjects[picked[j]];
+
+             // intersection over union
+             float inter_area = intersection_area(a, b);
+             float union_area = areas[i] + areas[picked[j]] - inter_area;
+             // float IoU = inter_area / union_area
+             if (inter_area / union_area > nms_threshold)
+                 keep = 0;
+         }
+
+         if (keep)
+             picked.push_back(i);
+     }
+ }
+
+
+ static void generate_yolox_proposals(vector<GridAndStride> grid_strides, float* feat_blob, float prob_threshold, vector<Object>& objects)
+ {
+     const int num_class = 1;
+
+     const int num_anchors = grid_strides.size();
+
+     for (int anchor_idx = 0; anchor_idx < num_anchors; anchor_idx++)
+     {
+         const int grid0 = grid_strides[anchor_idx].grid0;
+         const int grid1 = grid_strides[anchor_idx].grid1;
+         const int stride = grid_strides[anchor_idx].stride;
+
+         const int basic_pos = anchor_idx * (num_class + 5);
+
+         // yolox/models/yolo_head.py decode logic
+         float x_center = (feat_blob[basic_pos+0] + grid0) * stride;
+         float y_center = (feat_blob[basic_pos+1] + grid1) * stride;
+         float w = exp(feat_blob[basic_pos+2]) * stride;
+         float h = exp(feat_blob[basic_pos+3]) * stride;
+         float x0 = x_center - w * 0.5f;
+         float y0 = y_center - h * 0.5f;
+
+         float box_objectness = feat_blob[basic_pos+4];
+         for (int class_idx = 0; class_idx < num_class; class_idx++)
+         {
+             float box_cls_score = feat_blob[basic_pos + 5 + class_idx];
+             float box_prob = box_objectness * box_cls_score;
+             if (box_prob > prob_threshold)
+             {
+                 Object obj;
+                 obj.rect.x = x0;
+                 obj.rect.y = y0;
+                 obj.rect.width = w;
+                 obj.rect.height = h;
+                 obj.label = class_idx;
+                 obj.prob = box_prob;
+
+                 objects.push_back(obj);
+             }
+
+         } // class loop
+
+     } // point anchor loop
+ }
+
+ float* blobFromImage(Mat& img){
+     cvtColor(img, img, COLOR_BGR2RGB);
+
+     float* blob = new float[img.total()*3];
+     int channels = 3;
+     int img_h = img.rows;
+     int img_w = img.cols;
+     vector<float> mean = {0.485, 0.456, 0.406};
+     vector<float> std = {0.229, 0.224, 0.225};
+     for (size_t c = 0; c < channels; c++)
+     {
+         for (size_t h = 0; h < img_h; h++)
+         {
+             for (size_t w = 0; w < img_w; w++)
+             {
+                 blob[c * img_w * img_h + h * img_w + w] =
+                     (((float)img.at<Vec3b>(h, w)[c]) / 255.0f - mean[c]) / std[c];
+             }
+         }
+     }
+     return blob;
+ }
+
+
+ static void decode_outputs(float* prob, vector<Object>& objects, float scale, const int img_w, const int img_h) {
+     vector<Object> proposals;
+     vector<int> strides = {8, 16, 32};
+     vector<GridAndStride> grid_strides;
+     generate_grids_and_stride(INPUT_W, INPUT_H, strides, grid_strides);
+     generate_yolox_proposals(grid_strides, prob, BBOX_CONF_THRESH, proposals);
+     //std::cout << "num of boxes before nms: " << proposals.size() << std::endl;
+
+     qsort_descent_inplace(proposals);
+
+     vector<int> picked;
+     nms_sorted_bboxes(proposals, picked, NMS_THRESH);
+
+
+     int count = picked.size();
+
+     //std::cout << "num of boxes: " << count << std::endl;
+
+     objects.resize(count);
+     for (int i = 0; i < count; i++)
+     {
+         objects[i] = proposals[picked[i]];
+
+         // adjust offset to original unpadded
+         float x0 = (objects[i].rect.x) / scale;
+         float y0 = (objects[i].rect.y) / scale;
+         float x1 = (objects[i].rect.x + objects[i].rect.width) / scale;
+         float y1 = (objects[i].rect.y + objects[i].rect.height) / scale;
+
+         // clip
+         // x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
+         // y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
+         // x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
+         // y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);
+
+         objects[i].rect.x = x0;
+         objects[i].rect.y = y0;
+         objects[i].rect.width = x1 - x0;
+         objects[i].rect.height = y1 - y0;
+     }
+ }
+
+ const float color_list[80][3] =
+ {
+     {0.000, 0.447, 0.741},
+     {0.850, 0.325, 0.098},
+     {0.929, 0.694, 0.125},
+     {0.494, 0.184, 0.556},
+     {0.466, 0.674, 0.188},
+     {0.301, 0.745, 0.933},
+     {0.635, 0.078, 0.184},
+     {0.300, 0.300, 0.300},
+     {0.600, 0.600, 0.600},
+     {1.000, 0.000, 0.000},
+     {1.000, 0.500, 0.000},
+     {0.749, 0.749, 0.000},
+     {0.000, 1.000, 0.000},
+     {0.000, 0.000, 1.000},
+     {0.667, 0.000, 1.000},
+     {0.333, 0.333, 0.000},
+     {0.333, 0.667, 0.000},
+     {0.333, 1.000, 0.000},
+     {0.667, 0.333, 0.000},
+     {0.667, 0.667, 0.000},
+     {0.667, 1.000, 0.000},
+     {1.000, 0.333, 0.000},
+     {1.000, 0.667, 0.000},
+     {1.000, 1.000, 0.000},
+     {0.000, 0.333, 0.500},
+     {0.000, 0.667, 0.500},
+     {0.000, 1.000, 0.500},
+     {0.333, 0.000, 0.500},
+     {0.333, 0.333, 0.500},
+     {0.333, 0.667, 0.500},
+     {0.333, 1.000, 0.500},
+     {0.667, 0.000, 0.500},
+     {0.667, 0.333, 0.500},
+     {0.667, 0.667, 0.500},
+     {0.667, 1.000, 0.500},
+     {1.000, 0.000, 0.500},
+     {1.000, 0.333, 0.500},
+     {1.000, 0.667, 0.500},
+     {1.000, 1.000, 0.500},
+     {0.000, 0.333, 1.000},
+     {0.000, 0.667, 1.000},
+     {0.000, 1.000, 1.000},
+     {0.333, 0.000, 1.000},
+     {0.333, 0.333, 1.000},
+     {0.333, 0.667, 1.000},
+     {0.333, 1.000, 1.000},
+     {0.667, 0.000, 1.000},
+     {0.667, 0.333, 1.000},
+     {0.667, 0.667, 1.000},
+     {0.667, 1.000, 1.000},
+     {1.000, 0.000, 1.000},
+     {1.000, 0.333, 1.000},
+     {1.000, 0.667, 1.000},
+     {0.333, 0.000, 0.000},
+     {0.500, 0.000, 0.000},
+     {0.667, 0.000, 0.000},
+     {0.833, 0.000, 0.000},
+     {1.000, 0.000, 0.000},
+     {0.000, 0.167, 0.000},
+     {0.000, 0.333, 0.000},
+     {0.000, 0.500, 0.000},
+     {0.000, 0.667, 0.000},
+     {0.000, 0.833, 0.000},
+     {0.000, 1.000, 0.000},
+     {0.000, 0.000, 0.167},
+     {0.000, 0.000, 0.333},
+     {0.000, 0.000, 0.500},
+     {0.000, 0.000, 0.667},
+     {0.000, 0.000, 0.833},
+     {0.000, 0.000, 1.000},
+     {0.000, 0.000, 0.000},
+     {0.143, 0.143, 0.143},
+     {0.286, 0.286, 0.286},
+     {0.429, 0.429, 0.429},
+     {0.571, 0.571, 0.571},
+     {0.714, 0.714, 0.714},
+     {0.857, 0.857, 0.857},
+     {0.000, 0.447, 0.741},
+     {0.314, 0.717, 0.741},
+     {0.50, 0.5, 0}
+ };
+
+ void doInference(IExecutionContext& context, float* input, float* output, const int output_size, Size input_shape) {
+     const ICudaEngine& engine = context.getEngine();
+
+     // Pointers to input and output device buffers to pass to engine.
+     // Engine requires exactly IEngine::getNbBindings() number of buffers.
+     assert(engine.getNbBindings() == 2);
+     void* buffers[2];
+
+     // In order to bind the buffers, we need to know the names of the input and output tensors.
+     // Note that indices are guaranteed to be less than IEngine::getNbBindings()
+     const int inputIndex = engine.getBindingIndex(INPUT_BLOB_NAME);
+
+     assert(engine.getBindingDataType(inputIndex) == nvinfer1::DataType::kFLOAT);
+     const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB_NAME);
+     assert(engine.getBindingDataType(outputIndex) == nvinfer1::DataType::kFLOAT);
+     int mBatchSize = engine.getMaxBatchSize();
+
+     // Create GPU buffers on device
+     CHECK(cudaMalloc(&buffers[inputIndex], 3 * input_shape.height * input_shape.width * sizeof(float)));
+     CHECK(cudaMalloc(&buffers[outputIndex], output_size * sizeof(float)));
+
+     // Create stream
+     cudaStream_t stream;
+     CHECK(cudaStreamCreate(&stream));
+
+     // DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
+     CHECK(cudaMemcpyAsync(buffers[inputIndex], input, 3 * input_shape.height * input_shape.width * sizeof(float), cudaMemcpyHostToDevice, stream));
+     context.enqueue(1, buffers, stream, nullptr);
+     CHECK(cudaMemcpyAsync(output, buffers[outputIndex], output_size * sizeof(float), cudaMemcpyDeviceToHost, stream));
+     cudaStreamSynchronize(stream);
+
+     // Release stream and buffers
+     cudaStreamDestroy(stream);
+     CHECK(cudaFree(buffers[inputIndex]));
+     CHECK(cudaFree(buffers[outputIndex]));
+ }
+
+ int main(int argc, char** argv) {
+     cudaSetDevice(DEVICE);
+
+     // create a model using the API directly and serialize it to a stream
+     char *trtModelStream{nullptr};
+     size_t size{0};
+
+     if (argc == 4 && string(argv[2]) == "-i") {
+         const string engine_file_path {argv[1]};
+         ifstream file(engine_file_path, ios::binary);
+         if (file.good()) {
+             file.seekg(0, file.end);
+             size = file.tellg();
+             file.seekg(0, file.beg);
+             trtModelStream = new char[size];
+             assert(trtModelStream);
+             file.read(trtModelStream, size);
+             file.close();
+         }
+     } else {
+         cerr << "arguments not right!" << endl;
+         cerr << "run 'python3 tools/trt.py -f exps/example/mot/yolox_s_mix_det.py -c pretrained/bytetrack_s_mot17.pth.tar' to serialize model first!" << std::endl;
+         cerr << "Then use the following command:" << endl;
+         cerr << "cd demo/TensorRT/cpp/build" << endl;
+         cerr << "./bytetrack ../../../../YOLOX_outputs/yolox_s_mix_det/model_trt.engine -i ../../../../videos/palace.mp4 // deserialize file and run inference" << std::endl;
+         return -1;
+     }
+     const string input_video_path {argv[3]};
+
+     IRuntime* runtime = createInferRuntime(gLogger);
+     assert(runtime != nullptr);
+     ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream, size);
+     assert(engine != nullptr);
+     IExecutionContext* context = engine->createExecutionContext();
+     assert(context != nullptr);
+     delete[] trtModelStream;
+     auto out_dims = engine->getBindingDimensions(1);
+     auto output_size = 1;
+     for (int j = 0; j < out_dims.nbDims; j++) {
+         output_size *= out_dims.d[j];
+     }
+     static float* prob = new float[output_size];
+
+     VideoCapture cap(input_video_path);
+     if (!cap.isOpened())
+         return 0;
+
+     int img_w = cap.get(CAP_PROP_FRAME_WIDTH);
+     int img_h = cap.get(CAP_PROP_FRAME_HEIGHT);
+     int fps = cap.get(CAP_PROP_FPS);
+     long nFrame = static_cast<long>(cap.get(CAP_PROP_FRAME_COUNT));
+     cout << "Total frames: " << nFrame << endl;
+
+     VideoWriter writer("demo.mp4", VideoWriter::fourcc('m', 'p', '4', 'v'), fps, Size(img_w, img_h));
+
+     Mat img;
+     BYTETracker tracker(fps, 30);
+     int num_frames = 0;
+     int total_ms = 0;
+     while (true)
+     {
+         if (!cap.read(img))
+             break;
+         num_frames++;
+         if (num_frames % 20 == 0)
+         {
+             cout << "Processing frame " << num_frames << " (" << num_frames * 1000000 / total_ms << " fps)" << endl;
+         }
+         if (img.empty())
+             break;
+         Mat pr_img = static_resize(img);
+
+         float* blob;
+         blob = blobFromImage(pr_img);
+         float scale = min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));
+
+         // run inference
+         auto start = chrono::system_clock::now();
+         doInference(*context, blob, prob, output_size, pr_img.size());
+         vector<Object> objects;
+         decode_outputs(prob, objects, scale, img_w, img_h);
+         vector<STrack> output_stracks = tracker.update(objects);
+         auto end = chrono::system_clock::now();
+         total_ms = total_ms + chrono::duration_cast<chrono::microseconds>(end - start).count();
+
+         for (int i = 0; i < output_stracks.size(); i++)
+         {
+             vector<float> tlwh = output_stracks[i].tlwh;
+             if (tlwh[2] * tlwh[3] > 20)
+             {
+                 Scalar s = tracker.get_color(output_stracks[i].track_id);
+                 putText(img, format("%d", output_stracks[i].track_id), Point(tlwh[0], tlwh[1] - 5),
+                         0, 0.6, Scalar(0, 0, 255), 2, LINE_AA);
+                 rectangle(img, Rect(tlwh[0], tlwh[1], tlwh[2], tlwh[3]), s, 2);
+             }
+         }
+         putText(img, format("frame: %d fps: %d num: %d", num_frames, num_frames * 1000000 / total_ms, (int)output_stracks.size()),
+                 Point(0, 30), 0, 0.6, Scalar(0, 0, 255), 2, LINE_AA);
+         writer.write(img);
+
+         delete[] blob;  // allocated with new float[], so delete[] is required
+         char c = waitKey(1);
+         if (c > 0)
+         {
+             break;
+         }
+     }
+     cap.release();
+     cout << "FPS: " << num_frames * 1000000 / total_ms << endl;
+     // destroy the engine
+     context->destroy();
+     engine->destroy();
+     runtime->destroy();
+     return 0;
+ }
deploy/TensorRT/cpp/src/kalmanFilter.cpp ADDED
@@ -0,0 +1,152 @@
+ #include "kalmanFilter.h"
+ #include <Eigen/Cholesky>
+
+ namespace byte_kalman
+ {
+     const double KalmanFilter::chi2inv95[10] = {
+         0,
+         3.8415,
+         5.9915,
+         7.8147,
+         9.4877,
+         11.070,
+         12.592,
+         14.067,
+         15.507,
+         16.919
+     };
+     KalmanFilter::KalmanFilter()
+     {
+         int ndim = 4;
+         double dt = 1.;
+
+         _motion_mat = Eigen::MatrixXf::Identity(8, 8);
+         for (int i = 0; i < ndim; i++) {
+             _motion_mat(i, ndim + i) = dt;
+         }
+         _update_mat = Eigen::MatrixXf::Identity(4, 8);
+
+         this->_std_weight_position = 1. / 20;
+         this->_std_weight_velocity = 1. / 160;
+     }
+
+     KAL_DATA KalmanFilter::initiate(const DETECTBOX &measurement)
+     {
+         DETECTBOX mean_pos = measurement;
+         DETECTBOX mean_vel;
+         for (int i = 0; i < 4; i++) mean_vel(i) = 0;
+
+         KAL_MEAN mean;
+         for (int i = 0; i < 8; i++) {
+             if (i < 4) mean(i) = mean_pos(i);
+             else mean(i) = mean_vel(i - 4);
+         }
+
+         KAL_MEAN std;
+         std(0) = 2 * _std_weight_position * measurement[3];
+         std(1) = 2 * _std_weight_position * measurement[3];
+         std(2) = 1e-2;
+         std(3) = 2 * _std_weight_position * measurement[3];
+         std(4) = 10 * _std_weight_velocity * measurement[3];
+         std(5) = 10 * _std_weight_velocity * measurement[3];
+         std(6) = 1e-5;
+         std(7) = 10 * _std_weight_velocity * measurement[3];
+
+         KAL_MEAN tmp = std.array().square();
+         KAL_COVA var = tmp.asDiagonal();
+         return std::make_pair(mean, var);
+     }
+
+     void KalmanFilter::predict(KAL_MEAN &mean, KAL_COVA &covariance)
+     {
+         //revise the data;
+         DETECTBOX std_pos;
+         std_pos << _std_weight_position * mean(3),
+             _std_weight_position * mean(3),
+             1e-2,
+             _std_weight_position * mean(3);
+         DETECTBOX std_vel;
+         std_vel << _std_weight_velocity * mean(3),
+             _std_weight_velocity * mean(3),
+             1e-5,
+             _std_weight_velocity * mean(3);
+         KAL_MEAN tmp;
+         tmp.block<1, 4>(0, 0) = std_pos;
+         tmp.block<1, 4>(0, 4) = std_vel;
+         tmp = tmp.array().square();
+         KAL_COVA motion_cov = tmp.asDiagonal();
+         KAL_MEAN mean1 = this->_motion_mat * mean.transpose();
+         KAL_COVA covariance1 = this->_motion_mat * covariance * (_motion_mat.transpose());
+         covariance1 += motion_cov;
+
+         mean = mean1;
+         covariance = covariance1;
+     }
+
+     KAL_HDATA KalmanFilter::project(const KAL_MEAN &mean, const KAL_COVA &covariance)
+     {
+         DETECTBOX std;
+         std << _std_weight_position * mean(3), _std_weight_position * mean(3),
+             1e-1, _std_weight_position * mean(3);
+         KAL_HMEAN mean1 = _update_mat * mean.transpose();
+         KAL_HCOVA covariance1 = _update_mat * covariance * (_update_mat.transpose());
+         Eigen::Matrix<float, 4, 4> diag = std.asDiagonal();
+         diag = diag.array().square().matrix();
+         covariance1 += diag;
+         // covariance1.diagonal() << diag;
+         return std::make_pair(mean1, covariance1);
+     }
+
+     KAL_DATA
+     KalmanFilter::update(
+         const KAL_MEAN &mean,
+         const KAL_COVA &covariance,
+         const DETECTBOX &measurement)
+     {
+         KAL_HDATA pa = project(mean, covariance);
+         KAL_HMEAN projected_mean = pa.first;
+         KAL_HCOVA projected_cov = pa.second;
+
+         //chol_factor, lower =
+         //scipy.linalg.cho_factor(projected_cov, lower=True, check_finite=False)
+         //kalman_gain =
+         //scipy.linalg.cho_solve((cho_factor, lower),
+         //np.dot(covariance, self._update_mat.T).T,
+         //check_finite=False).T
+         Eigen::Matrix<float, 4, 8> B = (covariance * (_update_mat.transpose())).transpose();
+         Eigen::Matrix<float, 8, 4> kalman_gain = (projected_cov.llt().solve(B)).transpose(); // eg.8x4
+         Eigen::Matrix<float, 1, 4> innovation = measurement - projected_mean; //eg.1x4
+         auto tmp = innovation * (kalman_gain.transpose());
+         KAL_MEAN new_mean = (mean.array() + tmp.array()).matrix();
+         KAL_COVA new_covariance = covariance - kalman_gain * projected_cov * (kalman_gain.transpose());
+         return std::make_pair(new_mean, new_covariance);
+     }
+
+     Eigen::Matrix<float, 1, -1>
+     KalmanFilter::gating_distance(
+         const KAL_MEAN &mean,
+         const KAL_COVA &covariance,
+         const std::vector<DETECTBOX> &measurements,
+         bool only_position)
+     {
+         KAL_HDATA pa = this->project(mean, covariance);
+         if (only_position) {
+             printf("not implement!");
+             exit(0);
+         }
+         KAL_HMEAN mean1 = pa.first;
+         KAL_HCOVA covariance1 = pa.second;
+
+         // Eigen::Matrix<float, -1, 4, Eigen::RowMajor> d(size, 4);
+         DETECTBOXSS d(measurements.size(), 4);
+         int pos = 0;
+         for (DETECTBOX box : measurements) {
+             d.row(pos++) = box - mean1;
+         }
+         Eigen::Matrix<float, -1, -1, Eigen::RowMajor> factor = covariance1.llt().matrixL();
+         Eigen::Matrix<float, -1, -1> z = factor.triangularView<Eigen::Lower>().solve<Eigen::OnTheRight>(d).transpose();
+         auto zz = ((z.array())*(z.array())).matrix();
+         auto square_maha = zz.colwise().sum();
+         return square_maha;
+     }
+ }
deploy/TensorRT/cpp/src/lapjv.cpp ADDED
@@ -0,0 +1,343 @@
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+
+ #include "lapjv.h"
+
+ /** Column-reduction and reduction transfer for a dense cost matrix.
+  */
+ int_t _ccrrt_dense(const uint_t n, cost_t *cost[],
+                    int_t *free_rows, int_t *x, int_t *y, cost_t *v)
+ {
+     int_t n_free_rows;
+     boolean *unique;
+
+     for (uint_t i = 0; i < n; i++) {
+         x[i] = -1;
+         v[i] = LARGE;
+         y[i] = 0;
+     }
+     for (uint_t i = 0; i < n; i++) {
+         for (uint_t j = 0; j < n; j++) {
+             const cost_t c = cost[i][j];
+             if (c < v[j]) {
+                 v[j] = c;
+                 y[j] = i;
+             }
+             PRINTF("i=%d, j=%d, c[i,j]=%f, v[j]=%f y[j]=%d\n", i, j, c, v[j], y[j]);
+         }
+     }
+     PRINT_COST_ARRAY(v, n);
+     PRINT_INDEX_ARRAY(y, n);
+     NEW(unique, boolean, n);
+     memset(unique, TRUE, n);
+     {
+         int_t j = n;
+         do {
+             j--;
+             const int_t i = y[j];
+             if (x[i] < 0) {
+                 x[i] = j;
+             }
+             else {
+                 unique[i] = FALSE;
+                 y[j] = -1;
+             }
+         } while (j > 0);
+     }
+     n_free_rows = 0;
+     for (uint_t i = 0; i < n; i++) {
+         if (x[i] < 0) {
+             free_rows[n_free_rows++] = i;
+         }
+         else if (unique[i]) {
+             const int_t j = x[i];
+             cost_t min = LARGE;
+             for (uint_t j2 = 0; j2 < n; j2++) {
+                 if (j2 == (uint_t)j) {
+                     continue;
+                 }
+                 const cost_t c = cost[i][j2] - v[j2];
+                 if (c < min) {
+                     min = c;
+                 }
+             }
+             PRINTF("v[%d] = %f - %f\n", j, v[j], min);
+             v[j] -= min;
+         }
+     }
+     FREE(unique);
+     return n_free_rows;
+ }
+
+
+ /** Augmenting row reduction for a dense cost matrix.
+  */
+ int_t _carr_dense(
+     const uint_t n, cost_t *cost[],
+     const uint_t n_free_rows,
+     int_t *free_rows, int_t *x, int_t *y, cost_t *v)
+ {
+     uint_t current = 0;
+     int_t new_free_rows = 0;
+     uint_t rr_cnt = 0;
+     PRINT_INDEX_ARRAY(x, n);
+     PRINT_INDEX_ARRAY(y, n);
+     PRINT_COST_ARRAY(v, n);
+     PRINT_INDEX_ARRAY(free_rows, n_free_rows);
+     while (current < n_free_rows) {
+         int_t i0;
+         int_t j1, j2;
+         cost_t v1, v2, v1_new;
+         boolean v1_lowers;
+
+         rr_cnt++;
+         PRINTF("current = %d rr_cnt = %d\n", current, rr_cnt);
+         const int_t free_i = free_rows[current++];
+         j1 = 0;
+         v1 = cost[free_i][0] - v[0];
+         j2 = -1;
+         v2 = LARGE;
+         for (uint_t j = 1; j < n; j++) {
+             PRINTF("%d = %f %d = %f\n", j1, v1, j2, v2);
+             const cost_t c = cost[free_i][j] - v[j];
+             if (c < v2) {
+                 if (c >= v1) {
+                     v2 = c;
+                     j2 = j;
+                 }
+                 else {
+                     v2 = v1;
+                     v1 = c;
+                     j2 = j1;
+                     j1 = j;
+                 }
+             }
+         }
+         i0 = y[j1];
+         v1_new = v[j1] - (v2 - v1);
+         v1_lowers = v1_new < v[j1];
+         PRINTF("%d %d 1=%d,%f 2=%d,%f v1'=%f(%d,%g) \n", free_i, i0, j1, v1, j2, v2, v1_new, v1_lowers, v[j1] - v1_new);
+         if (rr_cnt < current * n) {
+             if (v1_lowers) {
+                 v[j1] = v1_new;
+             }
+             else if (i0 >= 0 && j2 >= 0) {
+                 j1 = j2;
+                 i0 = y[j2];
+             }
+             if (i0 >= 0) {
+                 if (v1_lowers) {
+                     free_rows[--current] = i0;
+                 }
+                 else {
+                     free_rows[new_free_rows++] = i0;
+                 }
+             }
+         }
+         else {
+             PRINTF("rr_cnt=%d >= %d (current=%d * n=%d)\n", rr_cnt, current * n, current, n);
+             if (i0 >= 0) {
+                 free_rows[new_free_rows++] = i0;
+             }
+         }
+         x[free_i] = j1;
+         y[j1] = free_i;
+     }
+     return new_free_rows;
+ }
+
+
+ /** Find columns with minimum d[j] and put them on the SCAN list.
+  */
+ uint_t _find_dense(const uint_t n, uint_t lo, cost_t *d, int_t *cols, int_t *y)
+ {
+     uint_t hi = lo + 1;
+     cost_t mind = d[cols[lo]];
+     for (uint_t k = hi; k < n; k++) {
+         int_t j = cols[k];
+         if (d[j] <= mind) {
+             if (d[j] < mind) {
+                 hi = lo;
+                 mind = d[j];
+             }
+             cols[k] = cols[hi];
+             cols[hi++] = j;
+         }
+     }
+     return hi;
+ }
+
+
+ // Scan all columns in TODO starting from arbitrary column in SCAN
+ // and try to decrease d of the TODO columns using the SCAN column.
+ int_t _scan_dense(const uint_t n, cost_t *cost[],
+                   uint_t *plo, uint_t *phi,
+                   cost_t *d, int_t *cols, int_t *pred,
+                   int_t *y, cost_t *v)
+ {
+     uint_t lo = *plo;
+     uint_t hi = *phi;
+     cost_t h, cred_ij;
+
+     while (lo != hi) {
+         int_t j = cols[lo++];
+         const int_t i = y[j];
+         const cost_t mind = d[j];
+         h = cost[i][j] - v[j] - mind;
+         PRINTF("i=%d j=%d h=%f\n", i, j, h);
+         // For all columns in TODO
+         for (uint_t k = hi; k < n; k++) {
+             j = cols[k];
+             cred_ij = cost[i][j] - v[j] - h;
+             if (cred_ij < d[j]) {
+                 d[j] = cred_ij;
+                 pred[j] = i;
+                 if (cred_ij == mind) {
+                     if (y[j] < 0) {
+                         return j;
+                     }
+                     cols[k] = cols[hi];
+                     cols[hi++] = j;
+                 }
+             }
+         }
+     }
+     *plo = lo;
+     *phi = hi;
+     return -1;
+ }
+
+
+ /** Single iteration of modified Dijkstra shortest path algorithm as explained in the JV paper.
+  *
+  * This is a dense matrix version.
+  *
+  * \return The closest free column index.
+  */
+ int_t find_path_dense(
+     const uint_t n, cost_t *cost[],
+     const int_t start_i,
+     int_t *y, cost_t *v,
+     int_t *pred)
+ {
+     uint_t lo = 0, hi = 0;
+     int_t final_j = -1;
+     uint_t n_ready = 0;
+     int_t *cols;
+     cost_t *d;
+
+     NEW(cols, int_t, n);
+     NEW(d, cost_t, n);
+
+     for (uint_t i = 0; i < n; i++) {
+         cols[i] = i;
+         pred[i] = start_i;
+         d[i] = cost[start_i][i] - v[i];
+     }
+     PRINT_COST_ARRAY(d, n);
+     while (final_j == -1) {
+         // No columns left on the SCAN list.
+         if (lo == hi) {
+             PRINTF("%d..%d -> find\n", lo, hi);
+             n_ready = lo;
+             hi = _find_dense(n, lo, d, cols, y);
+             PRINTF("check %d..%d\n", lo, hi);
+             PRINT_INDEX_ARRAY(cols, n);
+             for (uint_t k = lo; k < hi; k++) {
+                 const int_t j = cols[k];
+                 if (y[j] < 0) {
+                     final_j = j;
+                 }
+             }
+         }
+         if (final_j == -1) {
+             PRINTF("%d..%d -> scan\n", lo, hi);
+             final_j = _scan_dense(
+                 n, cost, &lo, &hi, d, cols, pred, y, v);
+             PRINT_COST_ARRAY(d, n);
+             PRINT_INDEX_ARRAY(cols, n);
+             PRINT_INDEX_ARRAY(pred, n);
+         }
+     }
+
+     PRINTF("found final_j=%d\n", final_j);
+     PRINT_INDEX_ARRAY(cols, n);
+     {
+         const cost_t mind = d[cols[lo]];
+         for (uint_t k = 0; k < n_ready; k++) {
+             const int_t j = cols[k];
+             v[j] += d[j] - mind;
+         }
+     }
+
+     FREE(cols);
+     FREE(d);
+
+     return final_j;
+ }
+
+
+ /** Augment for a dense cost matrix.
+  */
+ int_t _ca_dense(
+     const uint_t n, cost_t *cost[],
+     const uint_t n_free_rows,
+     int_t *free_rows, int_t *x, int_t *y, cost_t *v)
+ {
+     int_t *pred;
+
+     NEW(pred, int_t, n);
+
+     for (int_t *pfree_i = free_rows; pfree_i < free_rows + n_free_rows; pfree_i++) {
+         int_t i = -1, j;
+         uint_t k = 0;
295
+
296
+ PRINTF("looking at free_i=%d\n", *pfree_i);
297
+ j = find_path_dense(n, cost, *pfree_i, y, v, pred);
298
+ ASSERT(j >= 0);
299
+ ASSERT(j < n);
300
+ while (i != *pfree_i) {
301
+ PRINTF("augment %d\n", j);
302
+ PRINT_INDEX_ARRAY(pred, n);
303
+ i = pred[j];
304
+ PRINTF("y[%d]=%d -> %d\n", j, y[j], i);
305
+ y[j] = i;
306
+ PRINT_INDEX_ARRAY(x, n);
307
+ SWAP_INDICES(j, x[i]);
308
+ k++;
309
+ if (k >= n) {
310
+ ASSERT(FALSE);
311
+ }
312
+ }
313
+ }
314
+ FREE(pred);
315
+ return 0;
316
+ }
317
+
318
+
319
+ /** Solve dense LAP (Jonker-Volgenant).
320
+ */
321
+ int lapjv_internal(
322
+ const uint_t n, cost_t *cost[],
323
+ int_t *x, int_t *y)
324
+ {
325
+ int ret;
326
+ int_t *free_rows;
327
+ cost_t *v;
328
+
329
+ NEW(free_rows, int_t, n);
330
+ NEW(v, cost_t, n);
331
+ ret = _ccrrt_dense(n, cost, free_rows, x, y, v);
332
+ int i = 0;
333
+ while (ret > 0 && i < 2) {
334
+ ret = _carr_dense(n, cost, ret, free_rows, x, y, v);
335
+ i++;
336
+ }
337
+ if (ret > 0) {
338
+ ret = _ca_dense(n, cost, ret, free_rows, x, y, v);
339
+ }
340
+ FREE(v);
341
+ FREE(free_rows);
342
+ return ret;
343
+ }
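As a sanity reference for what `lapjv_internal` computes, a brute-force Python solver for tiny cost matrices (illustration only, exponential time; `brute_force_lap` is a hypothetical helper, not part of the uploaded file): it minimizes the total cost of a one-to-one row-to-column matching and returns the same `x`/`y` pair shape as above.

```python
from itertools import permutations

def brute_force_lap(cost):
    """Minimize sum(cost[i][perm[i]]) over all permutations.

    Returns (x, y): x[i] = column assigned to row i, y[j] = row
    assigned to column j, mirroring lapjv_internal's outputs."""
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    x = list(best_perm)
    y = [0] * n
    for i, j in enumerate(x):
        y[j] = i
    return x, y

cost = [[4.0, 1.0, 3.0],
        [2.0, 0.0, 5.0],
        [3.0, 2.0, 2.0]]
x, y = brute_force_lap(cost)  # x == [1, 0, 2], total cost 1 + 2 + 2 = 5
```

The JV algorithm above reaches the same optimum in roughly cubic time instead of factorial.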
deploy/TensorRT/cpp/src/utils.cpp ADDED
@@ -0,0 +1,429 @@
1
+ #include "BYTETracker.h"
2
+ #include "lapjv.h"
3
+
4
+ vector<STrack*> BYTETracker::joint_stracks(vector<STrack*> &tlista, vector<STrack> &tlistb)
5
+ {
6
+ map<int, int> exists;
7
+ vector<STrack*> res;
8
+ for (int i = 0; i < tlista.size(); i++)
9
+ {
10
+ exists.insert(pair<int, int>(tlista[i]->track_id, 1));
11
+ res.push_back(tlista[i]);
12
+ }
13
+ for (int i = 0; i < tlistb.size(); i++)
14
+ {
15
+ int tid = tlistb[i].track_id;
16
+ if (exists.count(tid) == 0)
17
+ {
18
+ exists[tid] = 1;
19
+ res.push_back(&tlistb[i]);
20
+ }
21
+ }
22
+ return res;
23
+ }
24
+
25
+ vector<STrack> BYTETracker::joint_stracks(vector<STrack> &tlista, vector<STrack> &tlistb)
26
+ {
27
+ map<int, int> exists;
28
+ vector<STrack> res;
29
+ for (int i = 0; i < tlista.size(); i++)
30
+ {
31
+ exists.insert(pair<int, int>(tlista[i].track_id, 1));
32
+ res.push_back(tlista[i]);
33
+ }
34
+ for (int i = 0; i < tlistb.size(); i++)
35
+ {
36
+ int tid = tlistb[i].track_id;
37
+ if (exists.count(tid) == 0)
38
+ {
39
+ exists[tid] = 1;
40
+ res.push_back(tlistb[i]);
41
+ }
42
+ }
43
+ return res;
44
+ }
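The merge rule implemented by both `joint_stracks` overloads (union keyed by `track_id`, entries of the first list winning on duplicates) can be sketched in Python; tracks are stood in for by plain dicts (illustration only, not part of the uploaded file):

```python
def joint_tracks(list_a, list_b):
    """Union of two track lists keyed by track_id.

    list_a entries win on duplicate ids, mirroring the
    joint_stracks overloads above."""
    seen = {t["track_id"] for t in list_a}
    return list_a + [t for t in list_b if t["track_id"] not in seen]

joint_tracks([{"track_id": 1}], [{"track_id": 1}, {"track_id": 2}])
# -> [{'track_id': 1}, {'track_id': 2}]
```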
45
+
46
+ vector<STrack> BYTETracker::sub_stracks(vector<STrack> &tlista, vector<STrack> &tlistb)
47
+ {
48
+ map<int, STrack> stracks;
49
+ for (int i = 0; i < tlista.size(); i++)
50
+ {
51
+ stracks.insert(pair<int, STrack>(tlista[i].track_id, tlista[i]));
52
+ }
53
+ for (int i = 0; i < tlistb.size(); i++)
54
+ {
55
+ int tid = tlistb[i].track_id;
56
+ if (stracks.count(tid) != 0)
57
+ {
58
+ stracks.erase(tid);
59
+ }
60
+ }
61
+
62
+ vector<STrack> res;
63
+ std::map<int, STrack>::iterator it;
64
+ for (it = stracks.begin(); it != stracks.end(); ++it)
65
+ {
66
+ res.push_back(it->second);
67
+ }
68
+
69
+ return res;
70
+ }
71
+
72
+ void BYTETracker::remove_duplicate_stracks(vector<STrack> &resa, vector<STrack> &resb, vector<STrack> &stracksa, vector<STrack> &stracksb)
73
+ {
74
+ vector<vector<float> > pdist = iou_distance(stracksa, stracksb);
75
+ vector<pair<int, int> > pairs;
76
+ for (int i = 0; i < pdist.size(); i++)
77
+ {
78
+ for (int j = 0; j < pdist[i].size(); j++)
79
+ {
80
+ if (pdist[i][j] < 0.15)
81
+ {
82
+ pairs.push_back(pair<int, int>(i, j));
83
+ }
84
+ }
85
+ }
86
+
87
+ vector<int> dupa, dupb;
88
+ for (int i = 0; i < pairs.size(); i++)
89
+ {
90
+ int timep = stracksa[pairs[i].first].frame_id - stracksa[pairs[i].first].start_frame;
91
+ int timeq = stracksb[pairs[i].second].frame_id - stracksb[pairs[i].second].start_frame;
92
+ if (timep > timeq)
93
+ dupb.push_back(pairs[i].second);
94
+ else
95
+ dupa.push_back(pairs[i].first);
96
+ }
97
+
98
+ for (int i = 0; i < stracksa.size(); i++)
99
+ {
100
+ vector<int>::iterator iter = find(dupa.begin(), dupa.end(), i);
101
+ if (iter == dupa.end())
102
+ {
103
+ resa.push_back(stracksa[i]);
104
+ }
105
+ }
106
+
107
+ for (int i = 0; i < stracksb.size(); i++)
108
+ {
109
+ vector<int>::iterator iter = find(dupb.begin(), dupb.end(), i);
110
+ if (iter == dupb.end())
111
+ {
112
+ resb.push_back(stracksb[i]);
113
+ }
114
+ }
115
+ }
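The deduplication rule above (a pair with IoU distance below 0.15 is a duplicate; the track that has been alive longer is kept) in a compact Python sketch, using dicts with `frame_id`/`start_frame` in place of `STrack` (illustration only, not part of the uploaded file):

```python
def remove_duplicates(tracks_a, tracks_b, pdist, thresh=0.15):
    """Drop near-duplicate tracks across two lists.

    pdist[i][j] is the IoU distance between tracks_a[i] and
    tracks_b[j]; of each duplicate pair the longer-lived track
    (larger frame_id - start_frame) survives."""
    dup_a, dup_b = set(), set()
    for i, row in enumerate(pdist):
        for j, d in enumerate(row):
            if d < thresh:
                age_a = tracks_a[i]["frame_id"] - tracks_a[i]["start_frame"]
                age_b = tracks_b[j]["frame_id"] - tracks_b[j]["start_frame"]
                if age_a > age_b:
                    dup_b.add(j)
                else:
                    dup_a.add(i)
    res_a = [t for i, t in enumerate(tracks_a) if i not in dup_a]
    res_b = [t for j, t in enumerate(tracks_b) if j not in dup_b]
    return res_a, res_b
```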
116
+
117
+ void BYTETracker::linear_assignment(vector<vector<float> > &cost_matrix, int cost_matrix_size, int cost_matrix_size_size, float thresh,
118
+ vector<vector<int> > &matches, vector<int> &unmatched_a, vector<int> &unmatched_b)
119
+ {
120
+ if (cost_matrix.size() == 0)
121
+ {
122
+ for (int i = 0; i < cost_matrix_size; i++)
123
+ {
124
+ unmatched_a.push_back(i);
125
+ }
126
+ for (int i = 0; i < cost_matrix_size_size; i++)
127
+ {
128
+ unmatched_b.push_back(i);
129
+ }
130
+ return;
131
+ }
132
+
133
+ vector<int> rowsol; vector<int> colsol;
134
+ float c = lapjv(cost_matrix, rowsol, colsol, true, thresh);
135
+ for (int i = 0; i < rowsol.size(); i++)
136
+ {
137
+ if (rowsol[i] >= 0)
138
+ {
139
+ vector<int> match;
140
+ match.push_back(i);
141
+ match.push_back(rowsol[i]);
142
+ matches.push_back(match);
143
+ }
144
+ else
145
+ {
146
+ unmatched_a.push_back(i);
147
+ }
148
+ }
149
+
150
+ for (int i = 0; i < colsol.size(); i++)
151
+ {
152
+ if (colsol[i] < 0)
153
+ {
154
+ unmatched_b.push_back(i);
155
+ }
156
+ }
157
+ }
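How `linear_assignment` turns the solver's `rowsol`/`colsol` outputs into matches and unmatched indices, restated in Python (illustration only, not part of the uploaded file):

```python
def split_assignment(rowsol, colsol):
    """Mirror of the loops above: rowsol[i] >= 0 yields a match
    (i, rowsol[i]); rowsol[i] < 0 leaves row i unmatched and
    colsol[j] < 0 leaves column j unmatched."""
    matches = [[i, j] for i, j in enumerate(rowsol) if j >= 0]
    unmatched_a = [i for i, j in enumerate(rowsol) if j < 0]
    unmatched_b = [j for j, i in enumerate(colsol) if i < 0]
    return matches, unmatched_a, unmatched_b

split_assignment([1, -1, 0], [2, 0, -1])
# -> ([[0, 1], [2, 0]], [1], [2])
```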
158
+
159
+ vector<vector<float> > BYTETracker::ious(vector<vector<float> > &atlbrs, vector<vector<float> > &btlbrs)
160
+ {
161
+ vector<vector<float> > ious;
162
+ if (atlbrs.size()*btlbrs.size() == 0)
163
+ return ious;
164
+
165
+ ious.resize(atlbrs.size());
166
+ for (int i = 0; i < ious.size(); i++)
167
+ {
168
+ ious[i].resize(btlbrs.size());
169
+ }
170
+
171
+ //bbox_ious
172
+ for (int k = 0; k < btlbrs.size(); k++)
173
+ {
174
+ vector<float> ious_tmp;
175
+ float box_area = (btlbrs[k][2] - btlbrs[k][0] + 1)*(btlbrs[k][3] - btlbrs[k][1] + 1);
176
+ for (int n = 0; n < atlbrs.size(); n++)
177
+ {
178
+ float iw = min(atlbrs[n][2], btlbrs[k][2]) - max(atlbrs[n][0], btlbrs[k][0]) + 1;
179
+ if (iw > 0)
180
+ {
181
+ float ih = min(atlbrs[n][3], btlbrs[k][3]) - max(atlbrs[n][1], btlbrs[k][1]) + 1;
182
+ if(ih > 0)
183
+ {
184
+ float ua = (atlbrs[n][2] - atlbrs[n][0] + 1)*(atlbrs[n][3] - atlbrs[n][1] + 1) + box_area - iw * ih;
185
+ ious[n][k] = iw * ih / ua;
186
+ }
187
+ else
188
+ {
189
+ ious[n][k] = 0.0;
190
+ }
191
+ }
192
+ else
193
+ {
194
+ ious[n][k] = 0.0;
195
+ }
196
+ }
197
+ }
198
+
199
+ return ious;
200
+ }
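The IoU computed above uses an inclusive pixel convention (the `+ 1` on every extent). The same formula for a single box pair in Python (illustration only, not part of the uploaded file):

```python
def iou_tlbr(a, b):
    """IoU of two [x1, y1, x2, y2] boxes, using the same inclusive
    (+1) pixel convention as BYTETracker::ious above."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return iw * ih / (area_a + area_b - iw * ih)

iou_tlbr([0, 0, 9, 9], [0, 0, 9, 9])  # identical boxes -> 1.0
```

`iou_distance` then uses `1 - IoU` as the matching cost.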
201
+
202
+ vector<vector<float> > BYTETracker::iou_distance(vector<STrack*> &atracks, vector<STrack> &btracks, int &dist_size, int &dist_size_size)
203
+ {
204
+ vector<vector<float> > cost_matrix;
205
+ if (atracks.size() * btracks.size() == 0)
206
+ {
207
+ dist_size = atracks.size();
208
+ dist_size_size = btracks.size();
209
+ return cost_matrix;
210
+ }
211
+ vector<vector<float> > atlbrs, btlbrs;
212
+ for (int i = 0; i < atracks.size(); i++)
213
+ {
214
+ atlbrs.push_back(atracks[i]->tlbr);
215
+ }
216
+ for (int i = 0; i < btracks.size(); i++)
217
+ {
218
+ btlbrs.push_back(btracks[i].tlbr);
219
+ }
220
+
221
+ dist_size = atracks.size();
222
+ dist_size_size = btracks.size();
223
+
224
+ vector<vector<float> > _ious = ious(atlbrs, btlbrs);
225
+
226
+ for (int i = 0; i < _ious.size();i++)
227
+ {
228
+ vector<float> _iou;
229
+ for (int j = 0; j < _ious[i].size(); j++)
230
+ {
231
+ _iou.push_back(1 - _ious[i][j]);
232
+ }
233
+ cost_matrix.push_back(_iou);
234
+ }
235
+
236
+ return cost_matrix;
237
+ }
238
+
239
+ vector<vector<float> > BYTETracker::iou_distance(vector<STrack> &atracks, vector<STrack> &btracks)
240
+ {
241
+ vector<vector<float> > atlbrs, btlbrs;
242
+ for (int i = 0; i < atracks.size(); i++)
243
+ {
244
+ atlbrs.push_back(atracks[i].tlbr);
245
+ }
246
+ for (int i = 0; i < btracks.size(); i++)
247
+ {
248
+ btlbrs.push_back(btracks[i].tlbr);
249
+ }
250
+
251
+ vector<vector<float> > _ious = ious(atlbrs, btlbrs);
252
+ vector<vector<float> > cost_matrix;
253
+ for (int i = 0; i < _ious.size(); i++)
254
+ {
255
+ vector<float> _iou;
256
+ for (int j = 0; j < _ious[i].size(); j++)
257
+ {
258
+ _iou.push_back(1 - _ious[i][j]);
259
+ }
260
+ cost_matrix.push_back(_iou);
261
+ }
262
+
263
+ return cost_matrix;
264
+ }
265
+
266
+ double BYTETracker::lapjv(const vector<vector<float> > &cost, vector<int> &rowsol, vector<int> &colsol,
267
+ bool extend_cost, float cost_limit, bool return_cost)
268
+ {
269
+ vector<vector<float> > cost_c;
270
+ cost_c.assign(cost.begin(), cost.end());
271
+
272
+ vector<vector<float> > cost_c_extended;
273
+
274
+ int n_rows = cost.size();
275
+ int n_cols = cost[0].size();
276
+ rowsol.resize(n_rows);
277
+ colsol.resize(n_cols);
278
+
279
+ int n = 0;
280
+ if (n_rows == n_cols)
281
+ {
282
+ n = n_rows;
283
+ }
284
+ else
285
+ {
286
+ if (!extend_cost)
287
+ {
288
+ cout << "set extend_cost=True" << endl;
289
+ system("pause");
290
+ exit(0);
291
+ }
292
+ }
293
+
294
+ if (extend_cost || cost_limit < LONG_MAX)
295
+ {
296
+ n = n_rows + n_cols;
297
+ cost_c_extended.resize(n);
298
+ for (int i = 0; i < cost_c_extended.size(); i++)
299
+ cost_c_extended[i].resize(n);
300
+
301
+ if (cost_limit < LONG_MAX)
302
+ {
303
+ for (int i = 0; i < cost_c_extended.size(); i++)
304
+ {
305
+ for (int j = 0; j < cost_c_extended[i].size(); j++)
306
+ {
307
+ cost_c_extended[i][j] = cost_limit / 2.0;
308
+ }
309
+ }
310
+ }
311
+ else
312
+ {
313
+ float cost_max = -1;
314
+ for (int i = 0; i < cost_c.size(); i++)
315
+ {
316
+ for (int j = 0; j < cost_c[i].size(); j++)
317
+ {
318
+ if (cost_c[i][j] > cost_max)
319
+ cost_max = cost_c[i][j];
320
+ }
321
+ }
322
+ for (int i = 0; i < cost_c_extended.size(); i++)
323
+ {
324
+ for (int j = 0; j < cost_c_extended[i].size(); j++)
325
+ {
326
+ cost_c_extended[i][j] = cost_max + 1;
327
+ }
328
+ }
329
+ }
330
+
331
+ for (int i = n_rows; i < cost_c_extended.size(); i++)
332
+ {
333
+ for (int j = n_cols; j < cost_c_extended[i].size(); j++)
334
+ {
335
+ cost_c_extended[i][j] = 0;
336
+ }
337
+ }
338
+ for (int i = 0; i < n_rows; i++)
339
+ {
340
+ for (int j = 0; j < n_cols; j++)
341
+ {
342
+ cost_c_extended[i][j] = cost_c[i][j];
343
+ }
344
+ }
345
+
346
+ cost_c.clear();
347
+ cost_c.assign(cost_c_extended.begin(), cost_c_extended.end());
348
+ }
349
+
350
+ double **cost_ptr;
351
+ cost_ptr = new double *[n];
352
+ for (int i = 0; i < n; i++)
353
+ cost_ptr[i] = new double[n];
354
+
355
+ for (int i = 0; i < n; i++)
356
+ {
357
+ for (int j = 0; j < n; j++)
358
+ {
359
+ cost_ptr[i][j] = cost_c[i][j];
360
+ }
361
+ }
362
+
363
+ int* x_c = new int[n];
364
+ int *y_c = new int[n];
365
+
366
+ int ret = lapjv_internal(n, cost_ptr, x_c, y_c);
367
+ if (ret != 0)
368
+ {
369
+ cout << "Calculate Wrong!" << endl;
370
+ system("pause");
371
+ exit(0);
372
+ }
373
+
374
+ double opt = 0.0;
375
+
376
+ if (n != n_rows)
377
+ {
378
+ for (int i = 0; i < n; i++)
379
+ {
380
+ if (x_c[i] >= n_cols)
381
+ x_c[i] = -1;
382
+ if (y_c[i] >= n_rows)
383
+ y_c[i] = -1;
384
+ }
385
+ for (int i = 0; i < n_rows; i++)
386
+ {
387
+ rowsol[i] = x_c[i];
388
+ }
389
+ for (int i = 0; i < n_cols; i++)
390
+ {
391
+ colsol[i] = y_c[i];
392
+ }
393
+
394
+ if (return_cost)
395
+ {
396
+ for (int i = 0; i < rowsol.size(); i++)
397
+ {
398
+ if (rowsol[i] != -1)
399
+ {
400
+ //cout << i << "\t" << rowsol[i] << "\t" << cost_ptr[i][rowsol[i]] << endl;
401
+ opt += cost_ptr[i][rowsol[i]];
402
+ }
403
+ }
404
+ }
405
+ }
406
+ else if (return_cost)
407
+ {
408
+ for (int i = 0; i < rowsol.size(); i++)
409
+ {
410
+ opt += cost_ptr[i][rowsol[i]];
411
+ }
412
+ }
413
+
414
+ for (int i = 0; i < n; i++)
415
+ {
416
+ delete[]cost_ptr[i];
417
+ }
418
+ delete[]cost_ptr;
419
+ delete[]x_c;
420
+ delete[]y_c;
421
+
422
+ return opt;
423
+ }
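The wrapper above handles rectangular matrices and the `cost_limit` threshold by embedding the r x c matrix in an (r+c) x (r+c) square one. A Python sketch of that padding (illustration only; `extend_cost_matrix` is a hypothetical helper, not part of the uploaded file):

```python
def extend_cost_matrix(cost, cost_limit):
    """Embed an r x c cost matrix in an (r+c) x (r+c) square one.

    Padding entries cost cost_limit / 2 each, so a real match costing
    more than cost_limit is beaten by routing its row and column to
    dummies (two paddings); the dummy-dummy block is free (0)."""
    r, c = len(cost), len(cost[0])
    n = r + c
    ext = [[cost_limit / 2.0] * n for _ in range(n)]
    for i in range(r, n):          # dummy-row x dummy-column block
        for j in range(c, n):
            ext[i][j] = 0.0
    for i in range(r):             # original costs, top-left block
        for j in range(c):
            ext[i][j] = cost[i][j]
    return ext
```

Rows assigned to a dummy column (index >= c) are then reported as unmatched, which is exactly the `x_c[i] >= n_cols -> -1` fix-up above.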
424
+
425
+ Scalar BYTETracker::get_color(int idx)
426
+ {
427
+ idx += 3;
428
+ return Scalar(37 * idx % 255, 17 * idx % 255, 29 * idx % 255);
429
+ }
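The per-track color is a pure function of the track id; the same arithmetic in Python (illustration only, not part of the uploaded file):

```python
def get_color(idx):
    """Deterministic pseudo-random BGR color for a track id,
    mirroring BYTETracker::get_color above."""
    idx += 3
    return (37 * idx % 255, 17 * idx % 255, 29 * idx % 255)

get_color(0)  # -> (111, 51, 87)
```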
deploy/TensorRT/python/README.md ADDED
@@ -0,0 +1,22 @@
1
+ # ByteTrack-TensorRT in Python
2
+
3
+ ## Install TensorRT Toolkit
4
+ Please follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) and the [torch2trt repo](https://github.com/NVIDIA-AI-IOT/torch2trt) to install TensorRT (version 7 recommended) and torch2trt.
5
+
6
+ ## Convert model
7
+
8
+ You can convert the PyTorch model "bytetrack_s_mot17" to a TensorRT model by running:
9
+
10
+ ```shell
11
+ cd <ByteTrack_HOME>
12
+ python3 tools/trt.py -f exps/example/mot/yolox_s_mix_det.py -c pretrained/bytetrack_s_mot17.pth.tar
13
+ ```
14
+
15
+ ## Run TensorRT demo
16
+
17
+ You can use the converted model_trt.pth to run the TensorRT demo at **130 FPS**:
18
+
19
+ ```shell
20
+ cd <ByteTrack_HOME>
21
+ python3 tools/demo_track.py video -f exps/example/mot/yolox_s_mix_det.py --trt --save_result
22
+ ```
deploy/ncnn/cpp/CMakeLists.txt ADDED
@@ -0,0 +1,84 @@
1
+ macro(ncnn_add_example name)
2
+ add_executable(${name} ${name}.cpp)
3
+ if(OpenCV_FOUND)
4
+ target_include_directories(${name} PRIVATE ${OpenCV_INCLUDE_DIRS})
5
+ target_link_libraries(${name} PRIVATE ncnn ${OpenCV_LIBS})
6
+ elseif(NCNN_SIMPLEOCV)
7
+ target_compile_definitions(${name} PUBLIC USE_NCNN_SIMPLEOCV)
8
+ target_link_libraries(${name} PRIVATE ncnn)
9
+ endif()
10
+
11
+ # add test to a virtual project group
12
+ set_property(TARGET ${name} PROPERTY FOLDER "examples")
13
+ endmacro()
14
+
15
+ if(NCNN_PIXEL)
16
+ find_package(OpenCV QUIET COMPONENTS opencv_world)
17
+ # for opencv 2.4 on ubuntu 16.04, there is no opencv_world but OpenCV_FOUND will be TRUE
18
+ if("${OpenCV_LIBS}" STREQUAL "")
19
+ set(OpenCV_FOUND FALSE)
20
+ endif()
21
+ if(NOT OpenCV_FOUND)
22
+ find_package(OpenCV QUIET COMPONENTS core highgui imgproc imgcodecs videoio)
23
+ endif()
24
+ if(NOT OpenCV_FOUND)
25
+ find_package(OpenCV QUIET COMPONENTS core highgui imgproc)
26
+ endif()
27
+
28
+ if(OpenCV_FOUND OR NCNN_SIMPLEOCV)
29
+ if(OpenCV_FOUND)
30
+ message(STATUS "OpenCV library: ${OpenCV_INSTALL_PATH}")
31
+ message(STATUS " version: ${OpenCV_VERSION}")
32
+ message(STATUS " libraries: ${OpenCV_LIBS}")
33
+ message(STATUS " include path: ${OpenCV_INCLUDE_DIRS}")
34
+
35
+ if(${OpenCV_VERSION_MAJOR} GREATER 3)
36
+ set(CMAKE_CXX_STANDARD 11)
37
+ endif()
38
+ endif()
39
+
40
+ include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../src)
41
+ include_directories(${CMAKE_CURRENT_BINARY_DIR}/../src)
42
+ include_directories(include)
43
+ include_directories(/usr/local/include/eigen3)
44
+
45
+ ncnn_add_example(squeezenet)
46
+ ncnn_add_example(squeezenet_c_api)
47
+ ncnn_add_example(fasterrcnn)
48
+ ncnn_add_example(rfcn)
49
+ ncnn_add_example(yolov2)
50
+ ncnn_add_example(yolov3)
51
+ if(OpenCV_FOUND)
52
+ ncnn_add_example(yolov4)
53
+ endif()
54
+ ncnn_add_example(yolov5)
55
+ ncnn_add_example(yolox)
56
+ ncnn_add_example(mobilenetv2ssdlite)
57
+ ncnn_add_example(mobilenetssd)
58
+ ncnn_add_example(squeezenetssd)
59
+ ncnn_add_example(shufflenetv2)
60
+ ncnn_add_example(peleenetssd_seg)
61
+ ncnn_add_example(simplepose)
62
+ ncnn_add_example(retinaface)
63
+ ncnn_add_example(yolact)
64
+ ncnn_add_example(nanodet)
65
+ ncnn_add_example(scrfd)
66
+ ncnn_add_example(scrfd_crowdhuman)
67
+ ncnn_add_example(rvm)
68
+ file(GLOB My_Source_Files src/*.cpp)
69
+ add_executable(bytetrack ${My_Source_Files})
70
+ if(OpenCV_FOUND)
71
+ target_include_directories(bytetrack PRIVATE ${OpenCV_INCLUDE_DIRS})
72
+ target_link_libraries(bytetrack PRIVATE ncnn ${OpenCV_LIBS})
73
+ elseif(NCNN_SIMPLEOCV)
74
+ target_compile_definitions(bytetrack PUBLIC USE_NCNN_SIMPLEOCV)
75
+ target_link_libraries(bytetrack PRIVATE ncnn)
76
+ endif()
77
+ # add test to a virtual project group
78
+ set_property(TARGET bytetrack PROPERTY FOLDER "examples")
79
+ else()
80
+ message(WARNING "OpenCV not found and NCNN_SIMPLEOCV disabled, examples won't be built")
81
+ endif()
82
+ else()
83
+ message(WARNING "NCNN_PIXEL not enabled, examples won't be built")
84
+ endif()
deploy/ncnn/cpp/README.md ADDED
@@ -0,0 +1,103 @@
1
+ # ByteTrack-CPP-ncnn
2
+
3
+ ## Installation
4
+
5
+ Clone [ncnn](https://github.com/Tencent/ncnn) first, then please following [build tutorial of ncnn](https://github.com/Tencent/ncnn/wiki/how-to-build) to build on your own device.
6
+
7
+ Install eigen-3.3.9 [[google]](https://drive.google.com/file/d/1rqO74CYCNrmRAg8Rra0JP3yZtJ-rfket/view?usp=sharing), [[baidu(code:ueq4)]](https://pan.baidu.com/s/15kEfCxpy-T7tz60msxxExg).
8
+
9
+ ```shell
10
+ unzip eigen-3.3.9.zip
11
+ cd eigen-3.3.9
12
+ mkdir build
13
+ cd build
14
+ cmake ..
15
+ sudo make install
16
+ ```
17
+
18
+ ## Generate onnx file
19
+ Use provided tools to generate onnx file.
20
+ For example, if you want to generate onnx file of bytetrack_s_mot17.pth, please run the following command:
21
+ ```shell
22
+ cd <ByteTrack_HOME>
23
+ python3 tools/export_onnx.py -f exps/example/mot/yolox_s_mix_det.py -c pretrained/bytetrack_s_mot17.pth.tar
24
+ ```
25
+ Then, a bytetrack_s.onnx file is generated under <ByteTrack_HOME>.
26
+
27
+ ## Generate ncnn param and bin file
28
+ Put bytetrack_s.onnx under ncnn/build/tools/onnx and then run:
29
+
30
+ ```shell
31
+ cd ncnn/build/tools/onnx
32
+ ./onnx2ncnn bytetrack_s.onnx bytetrack_s.param bytetrack_s.bin
33
+ ```
34
+
35
+ Since Focus module is not supported in ncnn. Warnings like:
36
+ ```shell
37
+ Unsupported slice step !
38
+ ```
39
+ will be printed. However, don't worry! C++ version of Focus layer is already implemented in src/bytetrack.cpp.
40
+
41
+ ## Modify param file
42
+ Open **bytetrack_s.param**, and modify it.
43
+ Before (just an example):
44
+ ```
45
+ 235 268
46
+ Input images 0 1 images
47
+ Split splitncnn_input0 1 4 images images_splitncnn_0 images_splitncnn_1 images_splitncnn_2 images_splitncnn_3
48
+ Crop Slice_4 1 1 images_splitncnn_3 467 -23309=1,0 -23310=1,2147483647 -23311=1,1
49
+ Crop Slice_9 1 1 467 472 -23309=1,0 -23310=1,2147483647 -23311=1,2
50
+ Crop Slice_14 1 1 images_splitncnn_2 477 -23309=1,0 -23310=1,2147483647 -23311=1,1
51
+ Crop Slice_19 1 1 477 482 -23309=1,1 -23310=1,2147483647 -23311=1,2
52
+ Crop Slice_24 1 1 images_splitncnn_1 487 -23309=1,1 -23310=1,2147483647 -23311=1,1
53
+ Crop Slice_29 1 1 487 492 -23309=1,0 -23310=1,2147483647 -23311=1,2
54
+ Crop Slice_34 1 1 images_splitncnn_0 497 -23309=1,1 -23310=1,2147483647 -23311=1,1
55
+ Crop Slice_39 1 1 497 502 -23309=1,1 -23310=1,2147483647 -23311=1,2
56
+ Concat Concat_40 4 1 472 492 482 502 503 0=0
57
+ ...
58
+ ```
59
+ * Change the first number from 235 to 235 - 9 = 226 (we remove 10 layers and add 1, so the total layer count decreases by 9).
60
+ * Then remove the 10 lines from Split through Concat, but note the second-to-last number on the Concat line (the output blob name): 503.
61
+ * Add a YoloV5Focus layer after Input (reusing the number 503):
62
+ ```
63
+ YoloV5Focus focus 1 1 images 503
64
+ ```
65
+ After (just an example):
66
+ ```
67
+ 226 328
68
+ Input images 0 1 images
69
+ YoloV5Focus focus 1 1 images 503
70
+ ...
71
+ ```
72
+
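The YoloV5Focus layer that replaces the removed Slice/Concat block is a space-to-depth rearrangement: it concatenates the four pixel-parity grids of each channel. A pure-Python sketch for one H x W channel as lists of lists (illustration only, not part of this README's commands):

```python
def focus_2d(x):
    """Space-to-depth for one H x W channel, matching the four strided
    slices the Focus layer concatenates: the (even, even), (odd, even),
    (even, odd) and (odd, odd) pixel grids.  A real Focus layer stacks
    these along the channel axis for every input channel, turning
    (C, H, W) into (4C, H/2, W/2)."""
    slices = []
    for row_off, col_off in ((0, 0), (1, 0), (0, 1), (1, 1)):
        slices.append([row[col_off::2] for row in x[row_off::2]])
    return slices  # four H/2 x W/2 grids

focus_2d([[0, 1],
          [2, 3]])  # -> [[[0]], [[2]], [[1]], [[3]]]
```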
73
+ ## Use ncnn_optimize to generate new param and bin
74
+ ```shell
75
+ # suppose you are still under ncnn/build/tools/onnx dir.
76
+ ../ncnnoptimize bytetrack_s.param bytetrack_s.bin bytetrack_s_op.param bytetrack_s_op.bin 65536
77
+ ```
78
+
79
+ ## Copy files and build ByteTrack
80
+ Copy or move the 'src' and 'include' folders and the 'CMakeLists.txt' file into ncnn/examples. Copy bytetrack_s_op.param, bytetrack_s_op.bin and <ByteTrack_HOME>/videos/palace.mp4 into ncnn/build/examples. Then build ByteTrack:
81
+
82
+ ```shell
83
+ cd ncnn/build/examples
84
+ cmake ..
85
+ make
86
+ ```
87
+
88
+ ## Run the demo
89
+ You can run the ncnn demo at about **5 FPS** (on a 96-core Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz):
90
+ ```shell
91
+ ./bytetrack palace.mp4
92
+ ```
93
+
94
+ You can modify 'num_threads' in [bytetrack.cpp](https://github.com/ifzhang/ByteTrack/blob/2e9a67895da6b47b948015f6861bba0bacd4e72f/deploy/ncnn/cpp/src/bytetrack.cpp#L309) to match the number of CPU cores on your machine:
95
+
96
+ ```
97
+ yolox.opt.num_threads = 20;
98
+ ```
99
+
100
+
101
+ ## Acknowledgement
102
+
103
+ * [ncnn](https://github.com/Tencent/ncnn)
deploy/ncnn/cpp/include/BYTETracker.h ADDED
@@ -0,0 +1,49 @@
1
+ #pragma once
2
+
3
+ #include "STrack.h"
4
+
5
+ struct Object
6
+ {
7
+ cv::Rect_<float> rect;
8
+ int label;
9
+ float prob;
10
+ };
11
+
12
+ class BYTETracker
13
+ {
14
+ public:
15
+ BYTETracker(int frame_rate = 30, int track_buffer = 30);
16
+ ~BYTETracker();
17
+
18
+ vector<STrack> update(const vector<Object>& objects);
19
+ Scalar get_color(int idx);
20
+
21
+ private:
22
+ vector<STrack*> joint_stracks(vector<STrack*> &tlista, vector<STrack> &tlistb);
23
+ vector<STrack> joint_stracks(vector<STrack> &tlista, vector<STrack> &tlistb);
24
+
25
+ vector<STrack> sub_stracks(vector<STrack> &tlista, vector<STrack> &tlistb);
26
+ void remove_duplicate_stracks(vector<STrack> &resa, vector<STrack> &resb, vector<STrack> &stracksa, vector<STrack> &stracksb);
27
+
28
+ void linear_assignment(vector<vector<float> > &cost_matrix, int cost_matrix_size, int cost_matrix_size_size, float thresh,
29
+ vector<vector<int> > &matches, vector<int> &unmatched_a, vector<int> &unmatched_b);
30
+ vector<vector<float> > iou_distance(vector<STrack*> &atracks, vector<STrack> &btracks, int &dist_size, int &dist_size_size);
31
+ vector<vector<float> > iou_distance(vector<STrack> &atracks, vector<STrack> &btracks);
32
+ vector<vector<float> > ious(vector<vector<float> > &atlbrs, vector<vector<float> > &btlbrs);
33
+
34
+ double lapjv(const vector<vector<float> > &cost, vector<int> &rowsol, vector<int> &colsol,
35
+ bool extend_cost = false, float cost_limit = LONG_MAX, bool return_cost = true);
36
+
37
+ private:
38
+
39
+ float track_thresh;
40
+ float high_thresh;
41
+ float match_thresh;
42
+ int frame_id;
43
+ int max_time_lost;
44
+
45
+ vector<STrack> tracked_stracks;
46
+ vector<STrack> lost_stracks;
47
+ vector<STrack> removed_stracks;
48
+ byte_kalman::KalmanFilter kalman_filter;
49
+ };
deploy/ncnn/cpp/include/STrack.h ADDED
@@ -0,0 +1,50 @@
1
+ #pragma once
2
+
3
+ #include <opencv2/opencv.hpp>
4
+ #include "kalmanFilter.h"
5
+
6
+ using namespace cv;
7
+ using namespace std;
8
+
9
+ enum TrackState { New = 0, Tracked, Lost, Removed };
10
+
11
+ class STrack
12
+ {
13
+ public:
14
+ STrack(vector<float> tlwh_, float score);
15
+ ~STrack();
16
+
17
+ vector<float> static tlbr_to_tlwh(vector<float> &tlbr);
18
+ void static multi_predict(vector<STrack*> &stracks, byte_kalman::KalmanFilter &kalman_filter);
19
+ void static_tlwh();
20
+ void static_tlbr();
21
+ vector<float> tlwh_to_xyah(vector<float> tlwh_tmp);
22
+ vector<float> to_xyah();
23
+ void mark_lost();
24
+ void mark_removed();
25
+ int next_id();
26
+ int end_frame();
27
+
28
+ void activate(byte_kalman::KalmanFilter &kalman_filter, int frame_id);
29
+ void re_activate(STrack &new_track, int frame_id, bool new_id = false);
30
+ void update(STrack &new_track, int frame_id);
31
+
32
+ public:
33
+ bool is_activated;
34
+ int track_id;
35
+ int state;
36
+
37
+ vector<float> _tlwh;
38
+ vector<float> tlwh;
39
+ vector<float> tlbr;
40
+ int frame_id;
41
+ int tracklet_len;
42
+ int start_frame;
43
+
44
+ KAL_MEAN mean;
45
+ KAL_COVA covariance;
46
+ float score;
47
+
48
+ private:
49
+ byte_kalman::KalmanFilter kalman_filter;
50
+ };
deploy/ncnn/cpp/include/dataType.h ADDED
@@ -0,0 +1,36 @@
1
+ #pragma once
2
+
3
+ #include <cstddef>
4
+ #include <vector>
5
+
6
+ #include <Eigen/Core>
7
+ #include <Eigen/Dense>
8
+ typedef Eigen::Matrix<float, 1, 4, Eigen::RowMajor> DETECTBOX;
9
+ typedef Eigen::Matrix<float, -1, 4, Eigen::RowMajor> DETECTBOXSS;
10
+ typedef Eigen::Matrix<float, 1, 128, Eigen::RowMajor> FEATURE;
11
+ typedef Eigen::Matrix<float, Eigen::Dynamic, 128, Eigen::RowMajor> FEATURESS;
12
+ //typedef std::vector<FEATURE> FEATURESS;
13
+
14
+ //Kalmanfilter
15
+ //typedef Eigen::Matrix<float, 8, 8, Eigen::RowMajor> KAL_FILTER;
16
+ typedef Eigen::Matrix<float, 1, 8, Eigen::RowMajor> KAL_MEAN;
17
+ typedef Eigen::Matrix<float, 8, 8, Eigen::RowMajor> KAL_COVA;
18
+ typedef Eigen::Matrix<float, 1, 4, Eigen::RowMajor> KAL_HMEAN;
19
+ typedef Eigen::Matrix<float, 4, 4, Eigen::RowMajor> KAL_HCOVA;
20
+ using KAL_DATA = std::pair<KAL_MEAN, KAL_COVA>;
21
+ using KAL_HDATA = std::pair<KAL_HMEAN, KAL_HCOVA>;
22
+
23
+ //main
24
+ using RESULT_DATA = std::pair<int, DETECTBOX>;
25
+
26
+ //tracker:
27
+ using TRACKER_DATA = std::pair<int, FEATURESS>;
28
+ using MATCH_DATA = std::pair<int, int>;
29
+ typedef struct t {
30
+ std::vector<MATCH_DATA> matches;
31
+ std::vector<int> unmatched_tracks;
32
+ std::vector<int> unmatched_detections;
33
+ }TRACHER_MATCHD;
34
+
35
+ //linear_assignment:
36
+ typedef Eigen::Matrix<float, -1, -1, Eigen::RowMajor> DYNAMICM;
deploy/ncnn/cpp/include/kalmanFilter.h ADDED
@@ -0,0 +1,31 @@
1
+ #pragma once
2
+
3
+ #include "dataType.h"
4
+
5
+ namespace byte_kalman
6
+ {
7
+ class KalmanFilter
8
+ {
9
+ public:
10
+ static const double chi2inv95[10];
11
+ KalmanFilter();
12
+ KAL_DATA initiate(const DETECTBOX& measurement);
13
+ void predict(KAL_MEAN& mean, KAL_COVA& covariance);
14
+ KAL_HDATA project(const KAL_MEAN& mean, const KAL_COVA& covariance);
15
+ KAL_DATA update(const KAL_MEAN& mean,
16
+ const KAL_COVA& covariance,
17
+ const DETECTBOX& measurement);
18
+
19
+ Eigen::Matrix<float, 1, -1> gating_distance(
20
+ const KAL_MEAN& mean,
21
+ const KAL_COVA& covariance,
22
+ const std::vector<DETECTBOX>& measurements,
23
+ bool only_position = false);
24
+
25
+ private:
26
+ Eigen::Matrix<float, 8, 8, Eigen::RowMajor> _motion_mat;
27
+ Eigen::Matrix<float, 4, 8, Eigen::RowMajor> _update_mat;
28
+ float _std_weight_position;
29
+ float _std_weight_velocity;
30
+ };
31
+ }
deploy/ncnn/cpp/include/lapjv.h ADDED
@@ -0,0 +1,63 @@
1
+ #ifndef LAPJV_H
2
+ #define LAPJV_H
3
+
4
+ #define LARGE 1000000
5
+
6
+ #if !defined TRUE
7
+ #define TRUE 1
8
+ #endif
9
+ #if !defined FALSE
10
+ #define FALSE 0
11
+ #endif
12
+
13
+ #define NEW(x, t, n) if ((x = (t *)malloc(sizeof(t) * (n))) == 0) { return -1; }
14
+ #define FREE(x) if (x != 0) { free(x); x = 0; }
15
+ #define SWAP_INDICES(a, b) { int_t _temp_index = a; a = b; b = _temp_index; }
16
+
17
+ #if 0
18
+ #include <assert.h>
19
+ #define ASSERT(cond) assert(cond)
20
+ #define PRINTF(fmt, ...) printf(fmt, ##__VA_ARGS__)
21
+ #define PRINT_COST_ARRAY(a, n) \
22
+ while (1) { \
23
+ printf(#a" = ["); \
24
+ if ((n) > 0) { \
25
+ printf("%f", (a)[0]); \
26
+ for (uint_t j = 1; j < n; j++) { \
27
+ printf(", %f", (a)[j]); \
28
+ } \
29
+ } \
30
+ printf("]\n"); \
31
+ break; \
32
+ }
33
+ #define PRINT_INDEX_ARRAY(a, n) \
34
+ while (1) { \
35
+ printf(#a" = ["); \
36
+ if ((n) > 0) { \
37
+ printf("%d", (a)[0]); \
38
+ for (uint_t j = 1; j < n; j++) { \
39
+ printf(", %d", (a)[j]); \
40
+ } \
41
+ } \
42
+ printf("]\n"); \
43
+ break; \
44
+ }
45
+ #else
46
+ #define ASSERT(cond)
47
+ #define PRINTF(fmt, ...)
48
+ #define PRINT_COST_ARRAY(a, n)
49
+ #define PRINT_INDEX_ARRAY(a, n)
50
+ #endif
51
+
52
+
53
+ typedef signed int int_t;
54
+ typedef unsigned int uint_t;
55
+ typedef double cost_t;
56
+ typedef char boolean;
57
+ typedef enum fp_t { FP_1 = 1, FP_2 = 2, FP_DYNAMIC = 3 } fp_t;
58
+
59
+ extern int_t lapjv_internal(
60
+ const uint_t n, cost_t *cost[],
61
+ int_t *x, int_t *y);
62
+
63
+ #endif // LAPJV_H
deploy/ncnn/cpp/src/BYTETracker.cpp ADDED
@@ -0,0 +1,241 @@
+ #include "BYTETracker.h"
+ #include <fstream>
+ 
+ BYTETracker::BYTETracker(int frame_rate, int track_buffer)
+ {
+     track_thresh = 0.5;
+     high_thresh = 0.6;
+     match_thresh = 0.8;
+ 
+     frame_id = 0;
+     max_time_lost = int(frame_rate / 30.0 * track_buffer);
+     cout << "Init ByteTrack!" << endl;
+ }
+ 
+ BYTETracker::~BYTETracker()
+ {
+ }
+ 
+ vector<STrack> BYTETracker::update(const vector<Object>& objects)
+ {
+     ////////////////// Step 1: Get detections //////////////////
+     this->frame_id++;
+     vector<STrack> activated_stracks;
+     vector<STrack> refind_stracks;
+     vector<STrack> removed_stracks;
+     vector<STrack> lost_stracks;
+     vector<STrack> detections;
+     vector<STrack> detections_low;
+ 
+     vector<STrack> detections_cp;
+     vector<STrack> tracked_stracks_swap;
+     vector<STrack> resa, resb;
+     vector<STrack> output_stracks;
+ 
+     vector<STrack*> unconfirmed;
+     vector<STrack*> tracked_stracks;
+     vector<STrack*> strack_pool;
+     vector<STrack*> r_tracked_stracks;
+ 
+     if (objects.size() > 0)
+     {
+         for (int i = 0; i < objects.size(); i++)
+         {
+             vector<float> tlbr_;
+             tlbr_.resize(4);
+             tlbr_[0] = objects[i].rect.x;
+             tlbr_[1] = objects[i].rect.y;
+             tlbr_[2] = objects[i].rect.x + objects[i].rect.width;
+             tlbr_[3] = objects[i].rect.y + objects[i].rect.height;
+ 
+             float score = objects[i].prob;
+ 
+             STrack strack(STrack::tlbr_to_tlwh(tlbr_), score);
+             if (score >= track_thresh)
+             {
+                 detections.push_back(strack);
+             }
+             else
+             {
+                 detections_low.push_back(strack);
+             }
+         }
+     }
+ 
+     // Add newly detected tracklets to tracked_stracks
+     for (int i = 0; i < this->tracked_stracks.size(); i++)
+     {
+         if (!this->tracked_stracks[i].is_activated)
+             unconfirmed.push_back(&this->tracked_stracks[i]);
+         else
+             tracked_stracks.push_back(&this->tracked_stracks[i]);
+     }
+ 
+     ////////////////// Step 2: First association, with IoU //////////////////
+     strack_pool = joint_stracks(tracked_stracks, this->lost_stracks);
+     STrack::multi_predict(strack_pool, this->kalman_filter);
+ 
+     vector<vector<float> > dists;
+     int dist_size = 0, dist_size_size = 0;
+     dists = iou_distance(strack_pool, detections, dist_size, dist_size_size);
+ 
+     vector<vector<int> > matches;
+     vector<int> u_track, u_detection;
+     linear_assignment(dists, dist_size, dist_size_size, match_thresh, matches, u_track, u_detection);
+ 
+     for (int i = 0; i < matches.size(); i++)
+     {
+         STrack *track = strack_pool[matches[i][0]];
+         STrack *det = &detections[matches[i][1]];
+         if (track->state == TrackState::Tracked)
+         {
+             track->update(*det, this->frame_id);
+             activated_stracks.push_back(*track);
+         }
+         else
+         {
+             track->re_activate(*det, this->frame_id, false);
+             refind_stracks.push_back(*track);
+         }
+     }
+ 
+     ////////////////// Step 3: Second association, using low score dets //////////////////
+     for (int i = 0; i < u_detection.size(); i++)
+     {
+         detections_cp.push_back(detections[u_detection[i]]);
+     }
+     detections.clear();
+     detections.assign(detections_low.begin(), detections_low.end());
+ 
+     for (int i = 0; i < u_track.size(); i++)
+     {
+         if (strack_pool[u_track[i]]->state == TrackState::Tracked)
+         {
+             r_tracked_stracks.push_back(strack_pool[u_track[i]]);
+         }
+     }
+ 
+     dists.clear();
+     dists = iou_distance(r_tracked_stracks, detections, dist_size, dist_size_size);
+ 
+     matches.clear();
+     u_track.clear();
+     u_detection.clear();
+     linear_assignment(dists, dist_size, dist_size_size, 0.5, matches, u_track, u_detection);
+ 
+     for (int i = 0; i < matches.size(); i++)
+     {
+         STrack *track = r_tracked_stracks[matches[i][0]];
+         STrack *det = &detections[matches[i][1]];
+         if (track->state == TrackState::Tracked)
+         {
+             track->update(*det, this->frame_id);
+             activated_stracks.push_back(*track);
+         }
+         else
+         {
+             track->re_activate(*det, this->frame_id, false);
+             refind_stracks.push_back(*track);
+         }
+     }
+ 
+     for (int i = 0; i < u_track.size(); i++)
+     {
+         STrack *track = r_tracked_stracks[u_track[i]];
+         if (track->state != TrackState::Lost)
+         {
+             track->mark_lost();
+             lost_stracks.push_back(*track);
+         }
+     }
+ 
+     // Deal with unconfirmed tracks, usually tracks with only one beginning frame
+     detections.clear();
+     detections.assign(detections_cp.begin(), detections_cp.end());
+ 
+     dists.clear();
+     dists = iou_distance(unconfirmed, detections, dist_size, dist_size_size);
+ 
+     matches.clear();
+     vector<int> u_unconfirmed;
+     u_detection.clear();
+     linear_assignment(dists, dist_size, dist_size_size, 0.7, matches, u_unconfirmed, u_detection);
+ 
+     for (int i = 0; i < matches.size(); i++)
+     {
+         unconfirmed[matches[i][0]]->update(detections[matches[i][1]], this->frame_id);
+         activated_stracks.push_back(*unconfirmed[matches[i][0]]);
+     }
+ 
+     for (int i = 0; i < u_unconfirmed.size(); i++)
+     {
+         STrack *track = unconfirmed[u_unconfirmed[i]];
+         track->mark_removed();
+         removed_stracks.push_back(*track);
+     }
+ 
+     ////////////////// Step 4: Init new stracks //////////////////
+     for (int i = 0; i < u_detection.size(); i++)
+     {
+         STrack *track = &detections[u_detection[i]];
+         if (track->score < this->high_thresh)
+             continue;
+         track->activate(this->kalman_filter, this->frame_id);
+         activated_stracks.push_back(*track);
+     }
+ 
+     ////////////////// Step 5: Update state //////////////////
+     for (int i = 0; i < this->lost_stracks.size(); i++)
+     {
+         if (this->frame_id - this->lost_stracks[i].end_frame() > this->max_time_lost)
+         {
+             this->lost_stracks[i].mark_removed();
+             removed_stracks.push_back(this->lost_stracks[i]);
+         }
+     }
+ 
+     for (int i = 0; i < this->tracked_stracks.size(); i++)
+     {
+         if (this->tracked_stracks[i].state == TrackState::Tracked)
+         {
+             tracked_stracks_swap.push_back(this->tracked_stracks[i]);
+         }
+     }
+     this->tracked_stracks.clear();
+     this->tracked_stracks.assign(tracked_stracks_swap.begin(), tracked_stracks_swap.end());
+ 
+     this->tracked_stracks = joint_stracks(this->tracked_stracks, activated_stracks);
+     this->tracked_stracks = joint_stracks(this->tracked_stracks, refind_stracks);
+ 
+     //std::cout << activated_stracks.size() << std::endl;
+ 
+     this->lost_stracks = sub_stracks(this->lost_stracks, this->tracked_stracks);
+     for (int i = 0; i < lost_stracks.size(); i++)
+     {
+         this->lost_stracks.push_back(lost_stracks[i]);
+     }
+ 
+     this->lost_stracks = sub_stracks(this->lost_stracks, this->removed_stracks);
+     for (int i = 0; i < removed_stracks.size(); i++)
+     {
+         this->removed_stracks.push_back(removed_stracks[i]);
+     }
+ 
+     remove_duplicate_stracks(resa, resb, this->tracked_stracks, this->lost_stracks);
+ 
+     this->tracked_stracks.clear();
+     this->tracked_stracks.assign(resa.begin(), resa.end());
+     this->lost_stracks.clear();
+     this->lost_stracks.assign(resb.begin(), resb.end());
+ 
+     for (int i = 0; i < this->tracked_stracks.size(); i++)
+     {
+         if (this->tracked_stracks[i].is_activated)
+         {
+             output_stracks.push_back(this->tracked_stracks[i]);
+         }
+     }
+     return output_stracks;
+ }
deploy/ncnn/cpp/src/STrack.cpp ADDED
@@ -0,0 +1,192 @@
+ #include "STrack.h"
+ 
+ STrack::STrack(vector<float> tlwh_, float score)
+ {
+     _tlwh.resize(4);
+     _tlwh.assign(tlwh_.begin(), tlwh_.end());
+ 
+     is_activated = false;
+     track_id = 0;
+     state = TrackState::New;
+ 
+     tlwh.resize(4);
+     tlbr.resize(4);
+ 
+     static_tlwh();
+     static_tlbr();
+     frame_id = 0;
+     tracklet_len = 0;
+     this->score = score;
+     start_frame = 0;
+ }
+ 
+ STrack::~STrack()
+ {
+ }
+ 
+ void STrack::activate(byte_kalman::KalmanFilter &kalman_filter, int frame_id)
+ {
+     this->kalman_filter = kalman_filter;
+     this->track_id = this->next_id();
+ 
+     vector<float> _tlwh_tmp(4);
+     _tlwh_tmp[0] = this->_tlwh[0];
+     _tlwh_tmp[1] = this->_tlwh[1];
+     _tlwh_tmp[2] = this->_tlwh[2];
+     _tlwh_tmp[3] = this->_tlwh[3];
+     vector<float> xyah = tlwh_to_xyah(_tlwh_tmp);
+     DETECTBOX xyah_box;
+     xyah_box[0] = xyah[0];
+     xyah_box[1] = xyah[1];
+     xyah_box[2] = xyah[2];
+     xyah_box[3] = xyah[3];
+     auto mc = this->kalman_filter.initiate(xyah_box);
+     this->mean = mc.first;
+     this->covariance = mc.second;
+ 
+     static_tlwh();
+     static_tlbr();
+ 
+     this->tracklet_len = 0;
+     this->state = TrackState::Tracked;
+     if (frame_id == 1)
+     {
+         this->is_activated = true;
+     }
+     //this->is_activated = true;
+     this->frame_id = frame_id;
+     this->start_frame = frame_id;
+ }
+ 
+ void STrack::re_activate(STrack &new_track, int frame_id, bool new_id)
+ {
+     vector<float> xyah = tlwh_to_xyah(new_track.tlwh);
+     DETECTBOX xyah_box;
+     xyah_box[0] = xyah[0];
+     xyah_box[1] = xyah[1];
+     xyah_box[2] = xyah[2];
+     xyah_box[3] = xyah[3];
+     auto mc = this->kalman_filter.update(this->mean, this->covariance, xyah_box);
+     this->mean = mc.first;
+     this->covariance = mc.second;
+ 
+     static_tlwh();
+     static_tlbr();
+ 
+     this->tracklet_len = 0;
+     this->state = TrackState::Tracked;
+     this->is_activated = true;
+     this->frame_id = frame_id;
+     this->score = new_track.score;
+     if (new_id)
+         this->track_id = next_id();
+ }
+ 
+ void STrack::update(STrack &new_track, int frame_id)
+ {
+     this->frame_id = frame_id;
+     this->tracklet_len++;
+ 
+     vector<float> xyah = tlwh_to_xyah(new_track.tlwh);
+     DETECTBOX xyah_box;
+     xyah_box[0] = xyah[0];
+     xyah_box[1] = xyah[1];
+     xyah_box[2] = xyah[2];
+     xyah_box[3] = xyah[3];
+ 
+     auto mc = this->kalman_filter.update(this->mean, this->covariance, xyah_box);
+     this->mean = mc.first;
+     this->covariance = mc.second;
+ 
+     static_tlwh();
+     static_tlbr();
+ 
+     this->state = TrackState::Tracked;
+     this->is_activated = true;
+ 
+     this->score = new_track.score;
+ }
+ 
+ void STrack::static_tlwh()
+ {
+     if (this->state == TrackState::New)
+     {
+         tlwh[0] = _tlwh[0];
+         tlwh[1] = _tlwh[1];
+         tlwh[2] = _tlwh[2];
+         tlwh[3] = _tlwh[3];
+         return;
+     }
+ 
+     tlwh[0] = mean[0];
+     tlwh[1] = mean[1];
+     tlwh[2] = mean[2];
+     tlwh[3] = mean[3];
+ 
+     tlwh[2] *= tlwh[3];
+     tlwh[0] -= tlwh[2] / 2;
+     tlwh[1] -= tlwh[3] / 2;
+ }
+ 
+ void STrack::static_tlbr()
+ {
+     tlbr.clear();
+     tlbr.assign(tlwh.begin(), tlwh.end());
+     tlbr[2] += tlbr[0];
+     tlbr[3] += tlbr[1];
+ }
+ 
+ vector<float> STrack::tlwh_to_xyah(vector<float> tlwh_tmp)
+ {
+     vector<float> tlwh_output = tlwh_tmp;
+     tlwh_output[0] += tlwh_output[2] / 2;
+     tlwh_output[1] += tlwh_output[3] / 2;
+     tlwh_output[2] /= tlwh_output[3];
+     return tlwh_output;
+ }
+ 
+ vector<float> STrack::to_xyah()
+ {
+     return tlwh_to_xyah(tlwh);
+ }
+ 
+ vector<float> STrack::tlbr_to_tlwh(vector<float> &tlbr)
+ {
+     tlbr[2] -= tlbr[0];
+     tlbr[3] -= tlbr[1];
+     return tlbr;
+ }
+ 
+ void STrack::mark_lost()
+ {
+     state = TrackState::Lost;
+ }
+ 
+ void STrack::mark_removed()
+ {
+     state = TrackState::Removed;
+ }
+ 
+ int STrack::next_id()
+ {
+     static int _count = 0;
+     _count++;
+     return _count;
+ }
+ 
+ int STrack::end_frame()
+ {
+     return this->frame_id;
+ }
+ 
+ void STrack::multi_predict(vector<STrack*> &stracks, byte_kalman::KalmanFilter &kalman_filter)
+ {
+     for (int i = 0; i < stracks.size(); i++)
+     {
+         if (stracks[i]->state != TrackState::Tracked)
+         {
+             stracks[i]->mean[7] = 0;
+         }
+         kalman_filter.predict(stracks[i]->mean, stracks[i]->covariance);
+     }
+ }
deploy/ncnn/cpp/src/bytetrack.cpp ADDED
@@ -0,0 +1,396 @@
+ #include "layer.h"
+ #include "net.h"
+ 
+ #if defined(USE_NCNN_SIMPLEOCV)
+ #include "simpleocv.h"
+ #else
+ #include <opencv2/core/core.hpp>
+ #include <opencv2/highgui/highgui.hpp>
+ #include <opencv2/imgproc/imgproc.hpp>
+ #include <opencv2/opencv.hpp>
+ #endif
+ #include <float.h>
+ #include <stdio.h>
+ #include <vector>
+ #include <chrono>
+ #include "BYTETracker.h"
+ 
+ #define YOLOX_NMS_THRESH 0.7 // nms threshold
+ #define YOLOX_CONF_THRESH 0.1 // threshold of bounding box prob
+ #define INPUT_W 1088 // target image size w after resize
+ #define INPUT_H 608 // target image size h after resize
+ 
+ Mat static_resize(Mat& img) {
+     float r = min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));
+     // r = std::min(r, 1.0f);
+     int unpad_w = r * img.cols;
+     int unpad_h = r * img.rows;
+     Mat re(unpad_h, unpad_w, CV_8UC3);
+     resize(img, re, re.size());
+     Mat out(INPUT_H, INPUT_W, CV_8UC3, Scalar(114, 114, 114));
+     re.copyTo(out(Rect(0, 0, re.cols, re.rows)));
+     return out;
+ }
+ 
+ // YOLOX uses the same Focus layer as yolov5
+ class YoloV5Focus : public ncnn::Layer
+ {
+ public:
+     YoloV5Focus()
+     {
+         one_blob_only = true;
+     }
+ 
+     virtual int forward(const ncnn::Mat& bottom_blob, ncnn::Mat& top_blob, const ncnn::Option& opt) const
+     {
+         int w = bottom_blob.w;
+         int h = bottom_blob.h;
+         int channels = bottom_blob.c;
+ 
+         int outw = w / 2;
+         int outh = h / 2;
+         int outc = channels * 4;
+ 
+         top_blob.create(outw, outh, outc, 4u, 1, opt.blob_allocator);
+         if (top_blob.empty())
+             return -100;
+ 
+         #pragma omp parallel for num_threads(opt.num_threads)
+         for (int p = 0; p < outc; p++)
+         {
+             const float* ptr = bottom_blob.channel(p % channels).row((p / channels) % 2) + ((p / channels) / 2);
+             float* outptr = top_blob.channel(p);
+ 
+             for (int i = 0; i < outh; i++)
+             {
+                 for (int j = 0; j < outw; j++)
+                 {
+                     *outptr = *ptr;
+ 
+                     outptr += 1;
+                     ptr += 2;
+                 }
+ 
+                 ptr += w;
+             }
+         }
+ 
+         return 0;
+     }
+ };
+ 
+ DEFINE_LAYER_CREATOR(YoloV5Focus)
+ 
+ struct GridAndStride
+ {
+     int grid0;
+     int grid1;
+     int stride;
+ };
+ 
+ static inline float intersection_area(const Object& a, const Object& b)
+ {
+     cv::Rect_<float> inter = a.rect & b.rect;
+     return inter.area();
+ }
+ 
+ static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
+ {
+     int i = left;
+     int j = right;
+     float p = faceobjects[(left + right) / 2].prob;
+ 
+     while (i <= j)
+     {
+         while (faceobjects[i].prob > p)
+             i++;
+ 
+         while (faceobjects[j].prob < p)
+             j--;
+ 
+         if (i <= j)
+         {
+             // swap
+             std::swap(faceobjects[i], faceobjects[j]);
+ 
+             i++;
+             j--;
+         }
+     }
+ 
+     #pragma omp parallel sections
+     {
+         #pragma omp section
+         {
+             if (left < j) qsort_descent_inplace(faceobjects, left, j);
+         }
+         #pragma omp section
+         {
+             if (i < right) qsort_descent_inplace(faceobjects, i, right);
+         }
+     }
+ }
+ 
+ static void qsort_descent_inplace(std::vector<Object>& objects)
+ {
+     if (objects.empty())
+         return;
+ 
+     qsort_descent_inplace(objects, 0, objects.size() - 1);
+ }
+ 
+ static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
+ {
+     picked.clear();
+ 
+     const int n = faceobjects.size();
+ 
+     std::vector<float> areas(n);
+     for (int i = 0; i < n; i++)
+     {
+         areas[i] = faceobjects[i].rect.area();
+     }
+ 
+     for (int i = 0; i < n; i++)
+     {
+         const Object& a = faceobjects[i];
+ 
+         int keep = 1;
+         for (int j = 0; j < (int)picked.size(); j++)
+         {
+             const Object& b = faceobjects[picked[j]];
+ 
+             // intersection over union
+             float inter_area = intersection_area(a, b);
+             float union_area = areas[i] + areas[picked[j]] - inter_area;
+             // float IoU = inter_area / union_area
+             if (inter_area / union_area > nms_threshold)
+                 keep = 0;
+         }
+ 
+         if (keep)
+             picked.push_back(i);
+     }
+ }
+ 
+ static void generate_grids_and_stride(const int target_w, const int target_h, std::vector<int>& strides, std::vector<GridAndStride>& grid_strides)
+ {
+     for (int i = 0; i < (int)strides.size(); i++)
+     {
+         int stride = strides[i];
+         int num_grid_w = target_w / stride;
+         int num_grid_h = target_h / stride;
+         for (int g1 = 0; g1 < num_grid_h; g1++)
+         {
+             for (int g0 = 0; g0 < num_grid_w; g0++)
+             {
+                 GridAndStride gs;
+                 gs.grid0 = g0;
+                 gs.grid1 = g1;
+                 gs.stride = stride;
+                 grid_strides.push_back(gs);
+             }
+         }
+     }
+ }
+ 
+ static void generate_yolox_proposals(std::vector<GridAndStride> grid_strides, const ncnn::Mat& feat_blob, float prob_threshold, std::vector<Object>& objects)
+ {
+     const int num_class = feat_blob.w - 5;
+     const int num_anchors = grid_strides.size();
+ 
+     const float* feat_ptr = feat_blob.channel(0);
+     for (int anchor_idx = 0; anchor_idx < num_anchors; anchor_idx++)
+     {
+         const int grid0 = grid_strides[anchor_idx].grid0;
+         const int grid1 = grid_strides[anchor_idx].grid1;
+         const int stride = grid_strides[anchor_idx].stride;
+ 
+         // yolox/models/yolo_head.py decode logic
+         // outputs[..., :2] = (outputs[..., :2] + grids) * strides
+         // outputs[..., 2:4] = torch.exp(outputs[..., 2:4]) * strides
+         float x_center = (feat_ptr[0] + grid0) * stride;
+         float y_center = (feat_ptr[1] + grid1) * stride;
+         float w = exp(feat_ptr[2]) * stride;
+         float h = exp(feat_ptr[3]) * stride;
+         float x0 = x_center - w * 0.5f;
+         float y0 = y_center - h * 0.5f;
+ 
+         float box_objectness = feat_ptr[4];
+         for (int class_idx = 0; class_idx < num_class; class_idx++)
+         {
+             float box_cls_score = feat_ptr[5 + class_idx];
+             float box_prob = box_objectness * box_cls_score;
+             if (box_prob > prob_threshold)
+             {
+                 Object obj;
+                 obj.rect.x = x0;
+                 obj.rect.y = y0;
+                 obj.rect.width = w;
+                 obj.rect.height = h;
+                 obj.label = class_idx;
+                 obj.prob = box_prob;
+ 
+                 objects.push_back(obj);
+             }
+         } // class loop
+         feat_ptr += feat_blob.w;
+     } // point anchor loop
+ }
+ 
+ static int detect_yolox(ncnn::Mat& in_pad, std::vector<Object>& objects, ncnn::Extractor ex, float scale)
+ {
+     ex.input("images", in_pad);
+ 
+     std::vector<Object> proposals;
+ 
+     {
+         ncnn::Mat out;
+         ex.extract("output", out);
+ 
+         static const int stride_arr[] = {8, 16, 32}; // might have stride=64 in YOLOX
+         std::vector<int> strides(stride_arr, stride_arr + sizeof(stride_arr) / sizeof(stride_arr[0]));
+         std::vector<GridAndStride> grid_strides;
+         generate_grids_and_stride(INPUT_W, INPUT_H, strides, grid_strides);
+         generate_yolox_proposals(grid_strides, out, YOLOX_CONF_THRESH, proposals);
+     }
+ 
+     // sort all proposals by score from highest to lowest
+     qsort_descent_inplace(proposals);
+ 
+     // apply nms with nms_threshold
+     std::vector<int> picked;
+     nms_sorted_bboxes(proposals, picked, YOLOX_NMS_THRESH);
+ 
+     int count = picked.size();
+ 
+     objects.resize(count);
+     for (int i = 0; i < count; i++)
+     {
+         objects[i] = proposals[picked[i]];
+ 
+         // adjust offset to original unpadded image
+         float x0 = (objects[i].rect.x) / scale;
+         float y0 = (objects[i].rect.y) / scale;
+         float x1 = (objects[i].rect.x + objects[i].rect.width) / scale;
+         float y1 = (objects[i].rect.y + objects[i].rect.height) / scale;
+ 
+         // clip
+         // x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
+         // y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
+         // x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
+         // y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);
+ 
+         objects[i].rect.x = x0;
+         objects[i].rect.y = y0;
+         objects[i].rect.width = x1 - x0;
+         objects[i].rect.height = y1 - y0;
+     }
+ 
+     return 0;
+ }
+ 
+ int main(int argc, char** argv)
+ {
+     if (argc != 2)
+     {
+         fprintf(stderr, "Usage: %s [videopath]\n", argv[0]);
+         return -1;
+     }
+ 
+     ncnn::Net yolox;
+ 
+     //yolox.opt.use_vulkan_compute = true;
+     //yolox.opt.use_bf16_storage = true;
+     yolox.opt.num_threads = 20;
+     //ncnn::set_cpu_powersave(0);
+     //ncnn::set_omp_dynamic(0);
+     //ncnn::set_omp_num_threads(20);
+ 
+     // Focus layer from yolov5
+     yolox.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);
+ 
+     yolox.load_param("bytetrack_s_op.param");
+     yolox.load_model("bytetrack_s_op.bin");
+ 
+     ncnn::Extractor ex = yolox.create_extractor();
+ 
+     const char* videopath = argv[1];
+ 
+     VideoCapture cap(videopath);
+     if (!cap.isOpened())
+         return 0;
+ 
+     int img_w = cap.get(CV_CAP_PROP_FRAME_WIDTH);
+     int img_h = cap.get(CV_CAP_PROP_FRAME_HEIGHT);
+     int fps = cap.get(CV_CAP_PROP_FPS);
+     long nFrame = static_cast<long>(cap.get(CV_CAP_PROP_FRAME_COUNT));
+     cout << "Total frames: " << nFrame << endl;
+ 
+     VideoWriter writer("demo.mp4", CV_FOURCC('m', 'p', '4', 'v'), fps, Size(img_w, img_h));
+ 
+     Mat img;
+     BYTETracker tracker(fps, 30);
+     int num_frames = 0;
+     int total_ms = 1;
+     for (;;)
+     {
+         if (!cap.read(img))
+             break;
+         num_frames++;
+         if (num_frames % 20 == 0)
+         {
+             cout << "Processing frame " << num_frames << " (" << num_frames * 1000000 / total_ms << " fps)" << endl;
+         }
+         if (img.empty())
+             break;
+ 
+         float scale = min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));
+         Mat pr_img = static_resize(img);
+         ncnn::Mat in_pad = ncnn::Mat::from_pixels_resize(pr_img.data, ncnn::Mat::PIXEL_BGR2RGB, INPUT_W, INPUT_H, INPUT_W, INPUT_H);
+ 
+         // python 0-1 input tensor with rgb_means = (0.485, 0.456, 0.406), std = (0.229, 0.224, 0.225)
+         // so for 0-255 input image, rgb_mean should multiply 255 and norm should div by std.
+         const float mean_vals[3] = {255.f * 0.485f, 255.f * 0.456f, 255.f * 0.406f};
+         const float norm_vals[3] = {1 / (255.f * 0.229f), 1 / (255.f * 0.224f), 1 / (255.f * 0.225f)};
+ 
+         in_pad.substract_mean_normalize(mean_vals, norm_vals);
+ 
+         std::vector<Object> objects;
+         auto start = chrono::system_clock::now();
+         //detect_yolox(img, objects);
+         detect_yolox(in_pad, objects, ex, scale);
+         vector<STrack> output_stracks = tracker.update(objects);
+         auto end = chrono::system_clock::now();
+         total_ms = total_ms + chrono::duration_cast<chrono::microseconds>(end - start).count();
+         for (int i = 0; i < output_stracks.size(); i++)
+         {
+             vector<float> tlwh = output_stracks[i].tlwh;
+             bool vertical = tlwh[2] / tlwh[3] > 1.6;
+             if (tlwh[2] * tlwh[3] > 20 && !vertical)
+             {
+                 Scalar s = tracker.get_color(output_stracks[i].track_id);
+                 putText(img, format("%d", output_stracks[i].track_id), Point(tlwh[0], tlwh[1] - 5),
+                         0, 0.6, Scalar(0, 0, 255), 2, LINE_AA);
+                 rectangle(img, Rect(tlwh[0], tlwh[1], tlwh[2], tlwh[3]), s, 2);
+             }
+         }
+         putText(img, format("frame: %d fps: %d num: %d", num_frames, num_frames * 1000000 / total_ms, output_stracks.size()),
+                 Point(0, 30), 0, 0.6, Scalar(0, 0, 255), 2, LINE_AA);
+         writer.write(img);
+         char c = waitKey(1);
+         if (c > 0)
+         {
+             break;
+         }
+     }
+     cap.release();
+     cout << "FPS: " << num_frames * 1000000 / total_ms << endl;
+ 
+     return 0;
+ }
deploy/ncnn/cpp/src/kalmanFilter.cpp ADDED
@@ -0,0 +1,152 @@
+ #include "kalmanFilter.h"
+ #include <Eigen/Cholesky>
+ 
+ namespace byte_kalman
+ {
+     const double KalmanFilter::chi2inv95[10] = {
+         0,
+         3.8415,
+         5.9915,
+         7.8147,
+         9.4877,
+         11.070,
+         12.592,
+         14.067,
+         15.507,
+         16.919
+     };
+ 
+     KalmanFilter::KalmanFilter()
+     {
+         int ndim = 4;
+         double dt = 1.;
+ 
+         _motion_mat = Eigen::MatrixXf::Identity(8, 8);
+         for (int i = 0; i < ndim; i++) {
+             _motion_mat(i, ndim + i) = dt;
+         }
+         _update_mat = Eigen::MatrixXf::Identity(4, 8);
+ 
+         this->_std_weight_position = 1. / 20;
+         this->_std_weight_velocity = 1. / 160;
+     }
+ 
+     KAL_DATA KalmanFilter::initiate(const DETECTBOX &measurement)
+     {
+         DETECTBOX mean_pos = measurement;
+         DETECTBOX mean_vel;
+         for (int i = 0; i < 4; i++) mean_vel(i) = 0;
+ 
+         KAL_MEAN mean;
+         for (int i = 0; i < 8; i++) {
+             if (i < 4) mean(i) = mean_pos(i);
+             else mean(i) = mean_vel(i - 4);
+         }
+ 
+         KAL_MEAN std;
+         std(0) = 2 * _std_weight_position * measurement[3];
+         std(1) = 2 * _std_weight_position * measurement[3];
+         std(2) = 1e-2;
+         std(3) = 2 * _std_weight_position * measurement[3];
+         std(4) = 10 * _std_weight_velocity * measurement[3];
+         std(5) = 10 * _std_weight_velocity * measurement[3];
+         std(6) = 1e-5;
+         std(7) = 10 * _std_weight_velocity * measurement[3];
+ 
+         KAL_MEAN tmp = std.array().square();
+         KAL_COVA var = tmp.asDiagonal();
+         return std::make_pair(mean, var);
+     }
+ 
+     void KalmanFilter::predict(KAL_MEAN &mean, KAL_COVA &covariance)
+     {
+         // revise the data
+         DETECTBOX std_pos;
+         std_pos << _std_weight_position * mean(3),
+             _std_weight_position * mean(3),
+             1e-2,
+             _std_weight_position * mean(3);
+         DETECTBOX std_vel;
+         std_vel << _std_weight_velocity * mean(3),
+             _std_weight_velocity * mean(3),
+             1e-5,
+             _std_weight_velocity * mean(3);
+         KAL_MEAN tmp;
+         tmp.block<1, 4>(0, 0) = std_pos;
+         tmp.block<1, 4>(0, 4) = std_vel;
+         tmp = tmp.array().square();
+         KAL_COVA motion_cov = tmp.asDiagonal();
+         KAL_MEAN mean1 = this->_motion_mat * mean.transpose();
+         KAL_COVA covariance1 = this->_motion_mat * covariance * (_motion_mat.transpose());
+         covariance1 += motion_cov;
+ 
+         mean = mean1;
+         covariance = covariance1;
+     }
+ 
+     KAL_HDATA KalmanFilter::project(const KAL_MEAN &mean, const KAL_COVA &covariance)
+     {
+         DETECTBOX std;
+         std << _std_weight_position * mean(3), _std_weight_position * mean(3),
+             1e-1, _std_weight_position * mean(3);
+         KAL_HMEAN mean1 = _update_mat * mean.transpose();
+         KAL_HCOVA covariance1 = _update_mat * covariance * (_update_mat.transpose());
+         Eigen::Matrix<float, 4, 4> diag = std.asDiagonal();
+         diag = diag.array().square().matrix();
+         covariance1 += diag;
+         // covariance1.diagonal() << diag;
+         return std::make_pair(mean1, covariance1);
+     }
+ 
+     KAL_DATA
+     KalmanFilter::update(
+         const KAL_MEAN &mean,
+         const KAL_COVA &covariance,
+         const DETECTBOX &measurement)
+     {
+         KAL_HDATA pa = project(mean, covariance);
+         KAL_HMEAN projected_mean = pa.first;
+         KAL_HCOVA projected_cov = pa.second;
+ 
+         // chol_factor, lower =
+         //     scipy.linalg.cho_factor(projected_cov, lower=True, check_finite=False)
+         // kalman_gain =
+         //     scipy.linalg.cho_solve((chol_factor, lower),
+         //                            np.dot(covariance, self._update_mat.T).T,
+         //                            check_finite=False).T
+         Eigen::Matrix<float, 4, 8> B = (covariance * (_update_mat.transpose())).transpose();
+         Eigen::Matrix<float, 8, 4> kalman_gain = (projected_cov.llt().solve(B)).transpose(); // eg. 8x4
+         Eigen::Matrix<float, 1, 4> innovation = measurement - projected_mean; // eg. 1x4
+         auto tmp = innovation * (kalman_gain.transpose());
+         KAL_MEAN new_mean = (mean.array() + tmp.array()).matrix();
+         KAL_COVA new_covariance = covariance - kalman_gain * projected_cov * (kalman_gain.transpose());
+         return std::make_pair(new_mean, new_covariance);
+     }
+ 
+     Eigen::Matrix<float, 1, -1>
+     KalmanFilter::gating_distance(
+         const KAL_MEAN &mean,
+         const KAL_COVA &covariance,
+         const std::vector<DETECTBOX> &measurements,
+         bool only_position)
+     {
+         KAL_HDATA pa = this->project(mean, covariance);
+         if (only_position) {
+             printf("not implemented!");
+             exit(0);
+         }
+         KAL_HMEAN mean1 = pa.first;
+         KAL_HCOVA covariance1 = pa.second;
+ 
+         // Eigen::Matrix<float, -1, 4, Eigen::RowMajor> d(size, 4);
+         DETECTBOXSS d(measurements.size(), 4);
+         int pos = 0;
+         for (DETECTBOX box : measurements) {
+             d.row(pos++) = box - mean1;
+         }
+         Eigen::Matrix<float, -1, -1, Eigen::RowMajor> factor = covariance1.llt().matrixL();
+         Eigen::Matrix<float, -1, -1> z = factor.triangularView<Eigen::Lower>().solve<Eigen::OnTheRight>(d).transpose();
+         auto zz = ((z.array()) * (z.array())).matrix();
+         auto square_maha = zz.colwise().sum();
+         return square_maha;
+     }
+ }
deploy/ncnn/cpp/src/lapjv.cpp ADDED
@@ -0,0 +1,343 @@
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+ 
+ #include "lapjv.h"
+ 
+ /** Column-reduction and reduction transfer for a dense cost matrix.
+  */
+ int_t _ccrrt_dense(const uint_t n, cost_t *cost[],
+                    int_t *free_rows, int_t *x, int_t *y, cost_t *v)
+ {
+     int_t n_free_rows;
+     boolean *unique;
+ 
+     for (uint_t i = 0; i < n; i++) {
+         x[i] = -1;
+         v[i] = LARGE;
+         y[i] = 0;
+     }
+     for (uint_t i = 0; i < n; i++) {
+         for (uint_t j = 0; j < n; j++) {
+             const cost_t c = cost[i][j];
+             if (c < v[j]) {
+                 v[j] = c;
+                 y[j] = i;
+             }
+             PRINTF("i=%d, j=%d, c[i,j]=%f, v[j]=%f y[j]=%d\n", i, j, c, v[j], y[j]);
+         }
+     }
+     PRINT_COST_ARRAY(v, n);
+     PRINT_INDEX_ARRAY(y, n);
+     NEW(unique, boolean, n);
+     memset(unique, TRUE, n);
+     {
+         int_t j = n;
+         do {
+             j--;
+             const int_t i = y[j];
+             if (x[i] < 0) {
+                 x[i] = j;
+             }
+             else {
+                 unique[i] = FALSE;
+                 y[j] = -1;
+             }
+         } while (j > 0);
+     }
+     n_free_rows = 0;
+     for (uint_t i = 0; i < n; i++) {
+         if (x[i] < 0) {
+             free_rows[n_free_rows++] = i;
+         }
+         else if (unique[i]) {
+             const int_t j = x[i];
+             cost_t min = LARGE;
+             for (uint_t j2 = 0; j2 < n; j2++) {
+                 if (j2 == (uint_t)j) {
+                     continue;
+                 }
+                 const cost_t c = cost[i][j2] - v[j2];
+                 if (c < min) {
+                     min = c;
+                 }
+             }
+             PRINTF("v[%d] = %f - %f\n", j, v[j], min);
+             v[j] -= min;
+         }
+     }
+     FREE(unique);
+     return n_free_rows;
+ }
+ 
+ /** Augmenting row reduction for a dense cost matrix.
+  */
+ int_t _carr_dense(
+     const uint_t n, cost_t *cost[],
+     const uint_t n_free_rows,
+     int_t *free_rows, int_t *x, int_t *y, cost_t *v)
+ {
+     uint_t current = 0;
+     int_t new_free_rows = 0;
+     uint_t rr_cnt = 0;
+     PRINT_INDEX_ARRAY(x, n);
+     PRINT_INDEX_ARRAY(y, n);
+     PRINT_COST_ARRAY(v, n);
+     PRINT_INDEX_ARRAY(free_rows, n_free_rows);
+     while (current < n_free_rows) {
+         int_t i0;
+         int_t j1, j2;
+         cost_t v1, v2, v1_new;
+         boolean v1_lowers;
+ 
+         rr_cnt++;
+         PRINTF("current = %d rr_cnt = %d\n", current, rr_cnt);
+         const int_t free_i = free_rows[current++];
+         j1 = 0;
+         v1 = cost[free_i][0] - v[0];
+         j2 = -1;
+         v2 = LARGE;
+         for (uint_t j = 1; j < n; j++) {
+             PRINTF("%d = %f %d = %f\n", j1, v1, j2, v2);
+             const cost_t c = cost[free_i][j] - v[j];
+             if (c < v2) {
+                 if (c >= v1) {
+                     v2 = c;
+                     j2 = j;
+                 }
+                 else {
+                     v2 = v1;
111
+ v1 = c;
112
+ j2 = j1;
113
+ j1 = j;
114
+ }
115
+ }
116
+ }
117
+ i0 = y[j1];
118
+ v1_new = v[j1] - (v2 - v1);
119
+ v1_lowers = v1_new < v[j1];
120
+ PRINTF("%d %d 1=%d,%f 2=%d,%f v1'=%f(%d,%g) \n", free_i, i0, j1, v1, j2, v2, v1_new, v1_lowers, v[j1] - v1_new);
121
+ if (rr_cnt < current * n) {
122
+ if (v1_lowers) {
123
+ v[j1] = v1_new;
124
+ }
125
+ else if (i0 >= 0 && j2 >= 0) {
126
+ j1 = j2;
127
+ i0 = y[j2];
128
+ }
129
+ if (i0 >= 0) {
130
+ if (v1_lowers) {
131
+ free_rows[--current] = i0;
132
+ }
133
+ else {
134
+ free_rows[new_free_rows++] = i0;
135
+ }
136
+ }
137
+ }
138
+ else {
139
+ PRINTF("rr_cnt=%d >= %d (current=%d * n=%d)\n", rr_cnt, current * n, current, n);
140
+ if (i0 >= 0) {
141
+ free_rows[new_free_rows++] = i0;
142
+ }
143
+ }
144
+ x[free_i] = j1;
145
+ y[j1] = free_i;
146
+ }
147
+ return new_free_rows;
148
+ }
149
+
150
+
151
+ /** Find columns with minimum d[j] and put them on the SCAN list.
152
+ */
153
+ uint_t _find_dense(const uint_t n, uint_t lo, cost_t *d, int_t *cols, int_t *y)
154
+ {
155
+ uint_t hi = lo + 1;
156
+ cost_t mind = d[cols[lo]];
157
+ for (uint_t k = hi; k < n; k++) {
158
+ int_t j = cols[k];
159
+ if (d[j] <= mind) {
160
+ if (d[j] < mind) {
161
+ hi = lo;
162
+ mind = d[j];
163
+ }
164
+ cols[k] = cols[hi];
165
+ cols[hi++] = j;
166
+ }
167
+ }
168
+ return hi;
169
+ }
170
+
171
+
172
+ // Scan all columns in TODO starting from arbitrary column in SCAN
173
+ // and try to decrease d of the TODO columns using the SCAN column.
174
+ int_t _scan_dense(const uint_t n, cost_t *cost[],
175
+ uint_t *plo, uint_t*phi,
176
+ cost_t *d, int_t *cols, int_t *pred,
177
+ int_t *y, cost_t *v)
178
+ {
179
+ uint_t lo = *plo;
180
+ uint_t hi = *phi;
181
+ cost_t h, cred_ij;
182
+
183
+ while (lo != hi) {
184
+ int_t j = cols[lo++];
185
+ const int_t i = y[j];
186
+ const cost_t mind = d[j];
187
+ h = cost[i][j] - v[j] - mind;
188
+ PRINTF("i=%d j=%d h=%f\n", i, j, h);
189
+ // For all columns in TODO
190
+ for (uint_t k = hi; k < n; k++) {
191
+ j = cols[k];
192
+ cred_ij = cost[i][j] - v[j] - h;
193
+ if (cred_ij < d[j]) {
194
+ d[j] = cred_ij;
195
+ pred[j] = i;
196
+ if (cred_ij == mind) {
197
+ if (y[j] < 0) {
198
+ return j;
199
+ }
200
+ cols[k] = cols[hi];
201
+ cols[hi++] = j;
202
+ }
203
+ }
204
+ }
205
+ }
206
+ *plo = lo;
207
+ *phi = hi;
208
+ return -1;
209
+ }
210
+
211
+
212
+ /** Single iteration of modified Dijkstra shortest path algorithm as explained in the JV paper.
213
+ *
214
+ * This is a dense matrix version.
215
+ *
216
+ * \return The closest free column index.
217
+ */
218
+ int_t find_path_dense(
219
+ const uint_t n, cost_t *cost[],
220
+ const int_t start_i,
221
+ int_t *y, cost_t *v,
222
+ int_t *pred)
223
+ {
224
+ uint_t lo = 0, hi = 0;
225
+ int_t final_j = -1;
226
+ uint_t n_ready = 0;
227
+ int_t *cols;
228
+ cost_t *d;
229
+
230
+ NEW(cols, int_t, n);
231
+ NEW(d, cost_t, n);
232
+
233
+ for (uint_t i = 0; i < n; i++) {
234
+ cols[i] = i;
235
+ pred[i] = start_i;
236
+ d[i] = cost[start_i][i] - v[i];
237
+ }
238
+ PRINT_COST_ARRAY(d, n);
239
+ while (final_j == -1) {
240
+ // No columns left on the SCAN list.
241
+ if (lo == hi) {
242
+ PRINTF("%d..%d -> find\n", lo, hi);
243
+ n_ready = lo;
244
+ hi = _find_dense(n, lo, d, cols, y);
245
+ PRINTF("check %d..%d\n", lo, hi);
246
+ PRINT_INDEX_ARRAY(cols, n);
247
+ for (uint_t k = lo; k < hi; k++) {
248
+ const int_t j = cols[k];
249
+ if (y[j] < 0) {
250
+ final_j = j;
251
+ }
252
+ }
253
+ }
254
+ if (final_j == -1) {
255
+ PRINTF("%d..%d -> scan\n", lo, hi);
256
+ final_j = _scan_dense(
257
+ n, cost, &lo, &hi, d, cols, pred, y, v);
258
+ PRINT_COST_ARRAY(d, n);
259
+ PRINT_INDEX_ARRAY(cols, n);
260
+ PRINT_INDEX_ARRAY(pred, n);
261
+ }
262
+ }
263
+
264
+ PRINTF("found final_j=%d\n", final_j);
265
+ PRINT_INDEX_ARRAY(cols, n);
266
+ {
267
+ const cost_t mind = d[cols[lo]];
268
+ for (uint_t k = 0; k < n_ready; k++) {
269
+ const int_t j = cols[k];
270
+ v[j] += d[j] - mind;
271
+ }
272
+ }
273
+
274
+ FREE(cols);
275
+ FREE(d);
276
+
277
+ return final_j;
278
+ }
279
+
280
+
281
+ /** Augment for a dense cost matrix.
282
+ */
283
+ int_t _ca_dense(
284
+ const uint_t n, cost_t *cost[],
285
+ const uint_t n_free_rows,
286
+ int_t *free_rows, int_t *x, int_t *y, cost_t *v)
287
+ {
288
+ int_t *pred;
289
+
290
+ NEW(pred, int_t, n);
291
+
292
+ for (int_t *pfree_i = free_rows; pfree_i < free_rows + n_free_rows; pfree_i++) {
293
+ int_t i = -1, j;
294
+ uint_t k = 0;
295
+
296
+ PRINTF("looking at free_i=%d\n", *pfree_i);
297
+ j = find_path_dense(n, cost, *pfree_i, y, v, pred);
298
+ ASSERT(j >= 0);
299
+ ASSERT(j < n);
300
+ while (i != *pfree_i) {
301
+ PRINTF("augment %d\n", j);
302
+ PRINT_INDEX_ARRAY(pred, n);
303
+ i = pred[j];
304
+ PRINTF("y[%d]=%d -> %d\n", j, y[j], i);
305
+ y[j] = i;
306
+ PRINT_INDEX_ARRAY(x, n);
307
+ SWAP_INDICES(j, x[i]);
308
+ k++;
309
+ if (k >= n) {
310
+ ASSERT(FALSE);
311
+ }
312
+ }
313
+ }
314
+ FREE(pred);
315
+ return 0;
316
+ }
317
+
318
+
319
+ /** Solve dense sparse LAP.
320
+ */
321
+ int lapjv_internal(
322
+ const uint_t n, cost_t *cost[],
323
+ int_t *x, int_t *y)
324
+ {
325
+ int ret;
326
+ int_t *free_rows;
327
+ cost_t *v;
328
+
329
+ NEW(free_rows, int_t, n);
330
+ NEW(v, cost_t, n);
331
+ ret = _ccrrt_dense(n, cost, free_rows, x, y, v);
332
+ int i = 0;
333
+ while (ret > 0 && i < 2) {
334
+ ret = _carr_dense(n, cost, ret, free_rows, x, y, v);
335
+ i++;
336
+ }
337
+ if (ret > 0) {
338
+ ret = _ca_dense(n, cost, ret, free_rows, x, y, v);
339
+ }
340
+ FREE(v);
341
+ FREE(free_rows);
342
+ return ret;
343
+ }
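For intuition, `lapjv_internal` above returns in `x`/`y` a minimum-cost one-to-one row-to-column assignment. A throwaway brute-force reference (illustrative Python, not part of the repo) makes that invariant easy to check on tiny cost matrices:

```python
from itertools import permutations

def brute_force_lap(cost):
    """Exhaustive reference for the square linear assignment problem that
    lapjv_internal solves: pick one column per row minimizing total cost.
    Only feasible for tiny matrices (n! permutations). Illustrative only."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_cost, best_perm = c, perm
    return list(best_perm), best_cost

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
rowsol, total = brute_force_lap(cost)
print(rowsol, total)  # row i is assigned to column rowsol[i]
```

The JV algorithm reaches the same optimum in roughly cubic time instead of factorial, which is why the tracker can afford to run it every frame.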
deploy/ncnn/cpp/src/utils.cpp ADDED
@@ -0,0 +1,429 @@
+ #include "BYTETracker.h"
+ #include "lapjv.h"
+
+ vector<STrack*> BYTETracker::joint_stracks(vector<STrack*> &tlista, vector<STrack> &tlistb)
+ {
+     map<int, int> exists;
+     vector<STrack*> res;
+     for (int i = 0; i < tlista.size(); i++)
+     {
+         exists.insert(pair<int, int>(tlista[i]->track_id, 1));
+         res.push_back(tlista[i]);
+     }
+     for (int i = 0; i < tlistb.size(); i++)
+     {
+         int tid = tlistb[i].track_id;
+         if (exists.count(tid) == 0)
+         {
+             exists[tid] = 1;
+             res.push_back(&tlistb[i]);
+         }
+     }
+     return res;
+ }
+
+ vector<STrack> BYTETracker::joint_stracks(vector<STrack> &tlista, vector<STrack> &tlistb)
+ {
+     map<int, int> exists;
+     vector<STrack> res;
+     for (int i = 0; i < tlista.size(); i++)
+     {
+         exists.insert(pair<int, int>(tlista[i].track_id, 1));
+         res.push_back(tlista[i]);
+     }
+     for (int i = 0; i < tlistb.size(); i++)
+     {
+         int tid = tlistb[i].track_id;
+         if (exists.count(tid) == 0)
+         {
+             exists[tid] = 1;
+             res.push_back(tlistb[i]);
+         }
+     }
+     return res;
+ }
+
+ vector<STrack> BYTETracker::sub_stracks(vector<STrack> &tlista, vector<STrack> &tlistb)
+ {
+     map<int, STrack> stracks;
+     for (int i = 0; i < tlista.size(); i++)
+     {
+         stracks.insert(pair<int, STrack>(tlista[i].track_id, tlista[i]));
+     }
+     for (int i = 0; i < tlistb.size(); i++)
+     {
+         int tid = tlistb[i].track_id;
+         if (stracks.count(tid) != 0)
+         {
+             stracks.erase(tid);
+         }
+     }
+
+     vector<STrack> res;
+     std::map<int, STrack>::iterator it;
+     for (it = stracks.begin(); it != stracks.end(); ++it)
+     {
+         res.push_back(it->second);
+     }
+
+     return res;
+ }
+
+ void BYTETracker::remove_duplicate_stracks(vector<STrack> &resa, vector<STrack> &resb, vector<STrack> &stracksa, vector<STrack> &stracksb)
+ {
+     vector<vector<float> > pdist = iou_distance(stracksa, stracksb);
+     vector<pair<int, int> > pairs;
+     for (int i = 0; i < pdist.size(); i++)
+     {
+         for (int j = 0; j < pdist[i].size(); j++)
+         {
+             if (pdist[i][j] < 0.15)
+             {
+                 pairs.push_back(pair<int, int>(i, j));
+             }
+         }
+     }
+
+     vector<int> dupa, dupb;
+     for (int i = 0; i < pairs.size(); i++)
+     {
+         int timep = stracksa[pairs[i].first].frame_id - stracksa[pairs[i].first].start_frame;
+         int timeq = stracksb[pairs[i].second].frame_id - stracksb[pairs[i].second].start_frame;
+         if (timep > timeq)
+             dupb.push_back(pairs[i].second);
+         else
+             dupa.push_back(pairs[i].first);
+     }
+
+     for (int i = 0; i < stracksa.size(); i++)
+     {
+         vector<int>::iterator iter = find(dupa.begin(), dupa.end(), i);
+         if (iter == dupa.end())
+         {
+             resa.push_back(stracksa[i]);
+         }
+     }
+
+     for (int i = 0; i < stracksb.size(); i++)
+     {
+         vector<int>::iterator iter = find(dupb.begin(), dupb.end(), i);
+         if (iter == dupb.end())
+         {
+             resb.push_back(stracksb[i]);
+         }
+     }
+ }
+
+ void BYTETracker::linear_assignment(vector<vector<float> > &cost_matrix, int cost_matrix_size, int cost_matrix_size_size, float thresh,
+     vector<vector<int> > &matches, vector<int> &unmatched_a, vector<int> &unmatched_b)
+ {
+     if (cost_matrix.size() == 0)
+     {
+         for (int i = 0; i < cost_matrix_size; i++)
+         {
+             unmatched_a.push_back(i);
+         }
+         for (int i = 0; i < cost_matrix_size_size; i++)
+         {
+             unmatched_b.push_back(i);
+         }
+         return;
+     }
+
+     vector<int> rowsol; vector<int> colsol;
+     float c = lapjv(cost_matrix, rowsol, colsol, true, thresh);
+     for (int i = 0; i < rowsol.size(); i++)
+     {
+         if (rowsol[i] >= 0)
+         {
+             vector<int> match;
+             match.push_back(i);
+             match.push_back(rowsol[i]);
+             matches.push_back(match);
+         }
+         else
+         {
+             unmatched_a.push_back(i);
+         }
+     }
+
+     for (int i = 0; i < colsol.size(); i++)
+     {
+         if (colsol[i] < 0)
+         {
+             unmatched_b.push_back(i);
+         }
+     }
+ }
+
+ vector<vector<float> > BYTETracker::ious(vector<vector<float> > &atlbrs, vector<vector<float> > &btlbrs)
+ {
+     vector<vector<float> > ious;
+     if (atlbrs.size()*btlbrs.size() == 0)
+         return ious;
+
+     ious.resize(atlbrs.size());
+     for (int i = 0; i < ious.size(); i++)
+     {
+         ious[i].resize(btlbrs.size());
+     }
+
+     //bbox_ious
+     for (int k = 0; k < btlbrs.size(); k++)
+     {
+         vector<float> ious_tmp;
+         float box_area = (btlbrs[k][2] - btlbrs[k][0] + 1)*(btlbrs[k][3] - btlbrs[k][1] + 1);
+         for (int n = 0; n < atlbrs.size(); n++)
+         {
+             float iw = min(atlbrs[n][2], btlbrs[k][2]) - max(atlbrs[n][0], btlbrs[k][0]) + 1;
+             if (iw > 0)
+             {
+                 float ih = min(atlbrs[n][3], btlbrs[k][3]) - max(atlbrs[n][1], btlbrs[k][1]) + 1;
+                 if (ih > 0)
+                 {
+                     float ua = (atlbrs[n][2] - atlbrs[n][0] + 1)*(atlbrs[n][3] - atlbrs[n][1] + 1) + box_area - iw * ih;
+                     ious[n][k] = iw * ih / ua;
+                 }
+                 else
+                 {
+                     ious[n][k] = 0.0;
+                 }
+             }
+             else
+             {
+                 ious[n][k] = 0.0;
+             }
+         }
+     }
+
+     return ious;
+ }
+
+ vector<vector<float> > BYTETracker::iou_distance(vector<STrack*> &atracks, vector<STrack> &btracks, int &dist_size, int &dist_size_size)
+ {
+     vector<vector<float> > cost_matrix;
+     if (atracks.size() * btracks.size() == 0)
+     {
+         dist_size = atracks.size();
+         dist_size_size = btracks.size();
+         return cost_matrix;
+     }
+     vector<vector<float> > atlbrs, btlbrs;
+     for (int i = 0; i < atracks.size(); i++)
+     {
+         atlbrs.push_back(atracks[i]->tlbr);
+     }
+     for (int i = 0; i < btracks.size(); i++)
+     {
+         btlbrs.push_back(btracks[i].tlbr);
+     }
+
+     dist_size = atracks.size();
+     dist_size_size = btracks.size();
+
+     vector<vector<float> > _ious = ious(atlbrs, btlbrs);
+
+     for (int i = 0; i < _ious.size(); i++)
+     {
+         vector<float> _iou;
+         for (int j = 0; j < _ious[i].size(); j++)
+         {
+             _iou.push_back(1 - _ious[i][j]);
+         }
+         cost_matrix.push_back(_iou);
+     }
+
+     return cost_matrix;
+ }
+
+ vector<vector<float> > BYTETracker::iou_distance(vector<STrack> &atracks, vector<STrack> &btracks)
+ {
+     vector<vector<float> > atlbrs, btlbrs;
+     for (int i = 0; i < atracks.size(); i++)
+     {
+         atlbrs.push_back(atracks[i].tlbr);
+     }
+     for (int i = 0; i < btracks.size(); i++)
+     {
+         btlbrs.push_back(btracks[i].tlbr);
+     }
+
+     vector<vector<float> > _ious = ious(atlbrs, btlbrs);
+     vector<vector<float> > cost_matrix;
+     for (int i = 0; i < _ious.size(); i++)
+     {
+         vector<float> _iou;
+         for (int j = 0; j < _ious[i].size(); j++)
+         {
+             _iou.push_back(1 - _ious[i][j]);
+         }
+         cost_matrix.push_back(_iou);
+     }
+
+     return cost_matrix;
+ }
+
+ double BYTETracker::lapjv(const vector<vector<float> > &cost, vector<int> &rowsol, vector<int> &colsol,
+     bool extend_cost, float cost_limit, bool return_cost)
+ {
+     vector<vector<float> > cost_c;
+     cost_c.assign(cost.begin(), cost.end());
+
+     vector<vector<float> > cost_c_extended;
+
+     int n_rows = cost.size();
+     int n_cols = cost[0].size();
+     rowsol.resize(n_rows);
+     colsol.resize(n_cols);
+
+     int n = 0;
+     if (n_rows == n_cols)
+     {
+         n = n_rows;
+     }
+     else
+     {
+         if (!extend_cost)
+         {
+             cout << "set extend_cost=True" << endl;
+             system("pause");
+             exit(0);
+         }
+     }
+
+     if (extend_cost || cost_limit < LONG_MAX)
+     {
+         n = n_rows + n_cols;
+         cost_c_extended.resize(n);
+         for (int i = 0; i < cost_c_extended.size(); i++)
+             cost_c_extended[i].resize(n);
+
+         if (cost_limit < LONG_MAX)
+         {
+             for (int i = 0; i < cost_c_extended.size(); i++)
+             {
+                 for (int j = 0; j < cost_c_extended[i].size(); j++)
+                 {
+                     cost_c_extended[i][j] = cost_limit / 2.0;
+                 }
+             }
+         }
+         else
+         {
+             float cost_max = -1;
+             for (int i = 0; i < cost_c.size(); i++)
+             {
+                 for (int j = 0; j < cost_c[i].size(); j++)
+                 {
+                     if (cost_c[i][j] > cost_max)
+                         cost_max = cost_c[i][j];
+                 }
+             }
+             for (int i = 0; i < cost_c_extended.size(); i++)
+             {
+                 for (int j = 0; j < cost_c_extended[i].size(); j++)
+                 {
+                     cost_c_extended[i][j] = cost_max + 1;
+                 }
+             }
+         }
+
+         for (int i = n_rows; i < cost_c_extended.size(); i++)
+         {
+             for (int j = n_cols; j < cost_c_extended[i].size(); j++)
+             {
+                 cost_c_extended[i][j] = 0;
+             }
+         }
+         for (int i = 0; i < n_rows; i++)
+         {
+             for (int j = 0; j < n_cols; j++)
+             {
+                 cost_c_extended[i][j] = cost_c[i][j];
+             }
+         }
+
+         cost_c.clear();
+         cost_c.assign(cost_c_extended.begin(), cost_c_extended.end());
+     }
+
+     double **cost_ptr;
+     cost_ptr = new double *[n];
+     for (int i = 0; i < n; i++)
+         cost_ptr[i] = new double[n];
+
+     for (int i = 0; i < n; i++)
+     {
+         for (int j = 0; j < n; j++)
+         {
+             cost_ptr[i][j] = cost_c[i][j];
+         }
+     }
+
+     int *x_c = new int[n];
+     int *y_c = new int[n];
+
+     int ret = lapjv_internal(n, cost_ptr, x_c, y_c);
+     if (ret != 0)
+     {
+         cout << "lapjv_internal failed!" << endl;
+         system("pause");
+         exit(0);
+     }
+
+     double opt = 0.0;
+
+     if (n != n_rows)
+     {
+         for (int i = 0; i < n; i++)
+         {
+             if (x_c[i] >= n_cols)
+                 x_c[i] = -1;
+             if (y_c[i] >= n_rows)
+                 y_c[i] = -1;
+         }
+         for (int i = 0; i < n_rows; i++)
+         {
+             rowsol[i] = x_c[i];
+         }
+         for (int i = 0; i < n_cols; i++)
+         {
+             colsol[i] = y_c[i];
+         }
+
+         if (return_cost)
+         {
+             for (int i = 0; i < rowsol.size(); i++)
+             {
+                 if (rowsol[i] != -1)
+                 {
+                     //cout << i << "\t" << rowsol[i] << "\t" << cost_ptr[i][rowsol[i]] << endl;
+                     opt += cost_ptr[i][rowsol[i]];
+                 }
+             }
+         }
+     }
+     else if (return_cost)
+     {
+         for (int i = 0; i < rowsol.size(); i++)
+         {
+             opt += cost_ptr[i][rowsol[i]];
+         }
+     }
+
+     for (int i = 0; i < n; i++)
+     {
+         delete[] cost_ptr[i];
+     }
+     delete[] cost_ptr;
+     delete[] x_c;
+     delete[] y_c;
+
+     return opt;
+ }
+
+ Scalar BYTETracker::get_color(int idx)
+ {
+     idx += 3;
+     return Scalar(37 * idx % 255, 17 * idx % 255, 29 * idx % 255);
+ }
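`BYTETracker::ious` above uses the pixel-inclusive "+1" convention for box widths and heights, and `iou_distance` turns the result into a cost matrix as `1 - IoU`. A small NumPy sketch of the same IoU computation (illustrative names, not the repo's API):

```python
import numpy as np

def iou_matrix(atlbrs, btlbrs):
    """IoU between two box sets in tlbr format, using the same
    '+1' pixel-inclusive convention as BYTETracker::ious above.
    Sketch for illustration; a real implementation would vectorize."""
    a = np.asarray(atlbrs, dtype=float)
    b = np.asarray(btlbrs, dtype=float)
    ious = np.zeros((len(a), len(b)))
    for n, box_a in enumerate(a):
        for k, box_b in enumerate(b):
            iw = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + 1
            ih = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + 1
            if iw > 0 and ih > 0:
                area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
                area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
                ious[n, k] = iw * ih / (area_a + area_b - iw * ih)
    return ious

m = iou_matrix([[0, 0, 9, 9]], [[0, 0, 9, 9], [20, 20, 29, 29]])
print(m)  # identical boxes -> 1.0, disjoint boxes -> 0.0
```

The cost matrix fed to `lapjv` in `linear_assignment` is then simply `1 - m`, so perfectly overlapping track/detection pairs have cost 0.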
deploy/scripts/export_onnx.py ADDED
@@ -0,0 +1,102 @@
+ from loguru import logger
+
+ import torch
+ from torch import nn
+
+ from yolox.exp import get_exp
+ from yolox.models.network_blocks import SiLU
+ from yolox.utils import replace_module
+
+ import argparse
+ import os
+
+
+ def make_parser():
+     parser = argparse.ArgumentParser("YOLOX onnx deploy")
+     parser.add_argument(
+         "--output-name", type=str, default="ocsort.onnx", help="output name of the model"
+     )
+     parser.add_argument(
+         "--input", default="images", type=str, help="input node name of onnx model"
+     )
+     parser.add_argument(
+         "--output", default="output", type=str, help="output node name of onnx model"
+     )
+     parser.add_argument(
+         "-o", "--opset", default=11, type=int, help="onnx opset version"
+     )
+     parser.add_argument("--no-onnxsim", action="store_true", help="skip the onnx-simplifier pass")
+     parser.add_argument(
+         "-f",
+         "--exp_file",
+         default=None,
+         type=str,
+         help="experiment description file",
+     )
+     parser.add_argument("-expn", "--experiment-name", type=str, default=None)
+     parser.add_argument("-n", "--name", type=str, default=None, help="model name")
+     parser.add_argument("-c", "--ckpt", default=None, type=str, help="ckpt path")
+     parser.add_argument(
+         "opts",
+         help="Modify config options using the command-line",
+         default=None,
+         nargs=argparse.REMAINDER,
+     )
+
+     return parser
+
+
+ @logger.catch
+ def main():
+     args = make_parser().parse_args()
+     logger.info("args value: {}".format(args))
+     exp = get_exp(args.exp_file, args.name)
+     exp.merge(args.opts)
+
+     if not args.experiment_name:
+         args.experiment_name = exp.exp_name
+
+     model = exp.get_model()
+     if args.ckpt is None:
+         file_name = os.path.join(exp.output_dir, args.experiment_name)
+         ckpt_file = os.path.join(file_name, "best_ckpt.pth.tar")
+     else:
+         ckpt_file = args.ckpt
+
+     # load the model state dict
+     ckpt = torch.load(ckpt_file, map_location="cpu")
+
+     model.eval()
+     if "model" in ckpt:
+         ckpt = ckpt["model"]
+     model.load_state_dict(ckpt)
+     model = replace_module(model, nn.SiLU, SiLU)
+     model.head.decode_in_inference = False
+
+     logger.info("loading checkpoint done.")
+     dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
+     torch.onnx._export(
+         model,
+         dummy_input,
+         args.output_name,
+         input_names=[args.input],
+         output_names=[args.output],
+         opset_version=args.opset,
+     )
+     logger.info("generated onnx model named {}".format(args.output_name))
+
+     if not args.no_onnxsim:
+         import onnx
+
+         from onnxsim import simplify
+
+         # use onnx-simplifier to remove redundant nodes from the exported model.
+         onnx_model = onnx.load(args.output_name)
+         model_simp, check = simplify(onnx_model)
+         assert check, "Simplified ONNX model could not be validated"
+         onnx.save(model_simp, args.output_name)
+         logger.info("generated simplified onnx model named {}".format(args.output_name))
+
+
+ if __name__ == "__main__":
+     main()
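The export script above bakes a fixed `test_size` into the ONNX graph, so at inference time each frame must be resized-with-padding (letterboxed) to that shape before being fed to the session. A hedged NumPy sketch of YOLOX-style letterboxing (the pad value, layout, and nearest-neighbour resize here are assumptions for illustration; the repo's own preprocessing may differ):

```python
import numpy as np

def letterbox(img, input_size):
    """Resize by the min ratio, then pad the remainder with a constant,
    YOLOX style. Sketch only: uses a dependency-free nearest-neighbour
    resize; real pipelines typically use cv2.resize with interpolation."""
    h, w = img.shape[:2]
    r = min(input_size[0] / h, input_size[1] / w)
    nh, nw = int(h * r), int(w * r)
    # nearest-neighbour index maps for the resize
    ys = (np.arange(nh) / r).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / r).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    padded = np.full((input_size[0], input_size[1], 3), 114, dtype=img.dtype)
    padded[:nh, :nw] = resized
    return padded, r

frame = np.zeros((720, 1280, 3), dtype=np.uint8)
blob, ratio = letterbox(frame, (800, 1440))
print(blob.shape)
```

Keeping the scale ratio is important: detections produced on the letterboxed image are divided by `ratio` to map them back to the original frame.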
deploy/scripts/trt.py ADDED
@@ -0,0 +1,74 @@
+ from loguru import logger
+
+ import tensorrt as trt
+ import torch
+ from torch2trt import torch2trt
+
+ from yolox.exp import get_exp
+
+ import argparse
+ import os
+ import shutil
+
+
+ def make_parser():
+     parser = argparse.ArgumentParser("YOLOX TensorRT deploy")
+     parser.add_argument("-expn", "--experiment-name", type=str, default=None)
+     parser.add_argument("-n", "--name", type=str, default=None, help="model name")
+
+     parser.add_argument(
+         "-f",
+         "--exp_file",
+         default=None,
+         type=str,
+         help="experiment description file",
+     )
+     parser.add_argument("-c", "--ckpt", default=None, type=str, help="ckpt path")
+     return parser
+
+
+ @logger.catch
+ def main():
+     args = make_parser().parse_args()
+     exp = get_exp(args.exp_file, args.name)
+     if not args.experiment_name:
+         args.experiment_name = exp.exp_name
+
+     model = exp.get_model()
+     file_name = os.path.join(exp.output_dir, args.experiment_name)
+     os.makedirs(file_name, exist_ok=True)
+     if args.ckpt is None:
+         ckpt_file = os.path.join(file_name, "best_ckpt.pth.tar")
+     else:
+         ckpt_file = args.ckpt
+
+     # load the model state dict
+     ckpt = torch.load(ckpt_file, map_location="cpu")
+
+     model.load_state_dict(ckpt["model"])
+     logger.info("loaded checkpoint.")
+     model.eval()
+     model.cuda()
+     model.head.decode_in_inference = False
+     x = torch.ones(1, 3, exp.test_size[0], exp.test_size[1]).cuda()
+     model_trt = torch2trt(
+         model,
+         [x],
+         fp16_mode=True,
+         log_level=trt.Logger.INFO,
+         max_workspace_size=(1 << 32),
+     )
+     torch.save(model_trt.state_dict(), os.path.join(file_name, "model_trt.pth"))
+     logger.info("TensorRT model conversion done.")
+     engine_file = os.path.join(file_name, "model_trt.engine")
+     engine_file_demo = os.path.join("deploy", "TensorRT", "cpp", "model_trt.engine")
+     with open(engine_file, "wb") as f:
+         f.write(model_trt.engine.serialize())
+
+     shutil.copyfile(engine_file, engine_file_demo)
+
+     logger.info("The serialized TensorRT engine file is saved for C++ inference.")
+
+
+ if __name__ == "__main__":
+     main()
docs/DEPLOY.md ADDED
@@ -0,0 +1,38 @@
+ # Deployment
+
+ We provide support for several popular deployment tools. This part is built upon the implementation of [YOLOX Deployment](https://github.com/Megvii-BaseDetection/YOLOX/tree/main/demo) and [the adaptation by ByteTrack](https://github.com/ifzhang/ByteTrack/tree/main/deploy).
+
+
+ ## ONNX support
+
+ 1. Convert the PyTorch model to an ONNX checkpoint; we provide an example here.
+ ```shell
+ # In practice you may want a smaller model for faster inference.
+ python deploy/scripts/export_onnx.py --output-name ocsort.onnx -f exps/example/mot/yolox_x_mix_det.py -c pretrained/bytetrack_x_mot17.pth.tar
+ ```
+
+ 2. Run on the provided demo video:
+ ```shell
+ cd $OCSORT_HOME/deploy/ONNXRuntime
+ python onnx_inference.py
+ ```
+
+ ## TensorRT support (Python)
+
+ 1. Follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) and [torch2trt](https://github.com/NVIDIA-AI-IOT/torch2trt) to install TensorRT (version 7 recommended) and torch2trt.
+
+ 2. Convert the model:
+ ```shell
+ # You have to download the checkpoint bytetrack_s_mot17.pth.tar from the ByteTrack model zoo.
+ python3 deploy/scripts/trt.py -f exps/example/mot/yolox_s_mix_det.py -c pretrained/bytetrack_s_mot17.pth.tar
+ ```
+
+ 3. Run on a demo video:
+ ```shell
+ python3 tools/demo_track.py video -f exps/example/mot/yolox_s_mix_det.py --trt --save_result
+ ```
+
+ *Note: We haven't validated the C++ support for TensorRT yet; please refer to the [ByteTrack guidance](https://github.com/ifzhang/ByteTrack/tree/main/deploy/TensorRT/cpp) for adaptation for now.*
+
+ ## ncnn support
+ Please follow the [guidelines](https://github.com/ifzhang/ByteTrack/tree/main/deploy/ncnn/cpp) from ByteTrack to deploy with ncnn.
exps/SU-T-ReID.py ADDED
@@ -0,0 +1,162 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # encoding: utf-8
2
+ import os
3
+ import random
4
+ import torch
5
+ import torch.nn as nn
6
+ import torch.distributed as dist
7
+
8
+ from yolox.exp import Exp as MyExp
9
+ from yolox.data import get_yolox_datadir
10
+
11
+ class Exp(MyExp):
12
+ def __init__(self):
13
+ super(Exp, self).__init__()
14
+ self.num_classes = 1
15
+ self.depth = 1.33
16
+ self.width = 1.25
17
+ self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
18
+ self.train_ann = "train.json"
19
+ self.val_ann = "test.json"
20
+ self.input_size = (800, 1440)
21
+ self.test_size = (800, 1440)
22
+ self.random_size = (18, 32)
23
+ self.max_epoch = 80
24
+ self.print_interval = 20
25
+ self.eval_interval = 5
26
+ self.test_conf = 0.1
27
+ self.nmsthre = 0.7
28
+ self.no_aug_epochs = 10
29
+ self.basic_lr_per_img = 0.001 / 64.0
30
+ self.warmup_epochs = 1
31
+
32
+ # tracking params
33
+ self.ckpt = "Checkpoint.pth.tar"
34
+ self.use_byte = True
35
+ self.dataset = "mft25"
36
+ self.inertia = 0.05
37
+ self.iou_thresh = 0.25
38
+ self.asso = "fishiou"
39
+ self.TCM_first_step = True
40
+ self.TCM_byte_step = True
41
+ self.TCM_first_step_weight = 1.0
42
+ self.TCM_byte_step_weight = 1.0
43
+ self.with_reid = True
44
+ self.with_fastreid =True
45
+ self.EG_weight_high_score= 1.3
46
+ self.EG_weight_low_score= 1.2
47
+
48
+ self.fast_reid_config = "fast_reid/configs/SBS_S101.yml"
49
+ self.fast_reid_weights = "ReID-Checkpoint.pth"
50
+
51
+ self.with_longterm_reid_correction = True
52
+ self.longterm_reid_correction_thresh = 0.4
53
+ self.longterm_reid_correction_thresh_low = 0.4
54
+
55
+ def get_data_loader(self, batch_size, is_distributed, no_aug=False):
56
+ from yolox.data import (
57
+ MOTDataset,
58
+ TrainTransform,
59
+ YoloBatchSampler,
60
+ DataLoader,
61
+ InfiniteSampler,
62
+ MosaicDetection,
63
+ )
64
+
65
+ dataset = MOTDataset(
66
+ data_dir=os.path.join(get_yolox_datadir(), "mft25"),
67
+ json_file=self.train_ann,
68
+ name='',
69
+ img_size=self.input_size,
70
+ preproc=TrainTransform(
71
+ rgb_means=(0.485, 0.456, 0.406),
72
+ std=(0.229, 0.224, 0.225),
73
+ max_labels=500,
74
+ ),
75
+ )
76
+
77
+ dataset = MosaicDetection(
78
+ dataset,
79
+ mosaic=not no_aug,
80
+ img_size=self.input_size,
81
+ preproc=TrainTransform(
82
+ rgb_means=(0.485, 0.456, 0.406),
83
+ std=(0.229, 0.224, 0.225),
84
+ max_labels=1000,
85
+ ),
86
+ degrees=self.degrees,
87
+ translate=self.translate,
88
+ scale=self.scale,
89
+ shear=self.shear,
90
+ perspective=self.perspective,
91
+ enable_mixup=self.enable_mixup,
92
+ )
93
+
94
+         self.dataset = dataset
+
+         if is_distributed:
+             batch_size = batch_size // dist.get_world_size()
+
+         sampler = InfiniteSampler(
+             len(self.dataset), seed=self.seed if self.seed else 0
+         )
+
+         batch_sampler = YoloBatchSampler(
+             sampler=sampler,
+             batch_size=batch_size,
+             drop_last=False,
+             input_dimension=self.input_size,
+             mosaic=not no_aug,
+         )
+
+         dataloader_kwargs = {"num_workers": self.data_num_workers, "pin_memory": True}
+         dataloader_kwargs["batch_sampler"] = batch_sampler
+         train_loader = DataLoader(self.dataset, **dataloader_kwargs)
+
+         return train_loader
+
+     def get_eval_loader(self, batch_size, is_distributed, testdev=False, run_tracking=False):  # [hgx0411] dataloader related
+         from yolox.data import MOTDataset, ValTransform
+
+         valdataset = MOTDataset(
+             data_dir=os.path.join(get_yolox_datadir(), "mft25"),
+             json_file=self.val_ann,
+             img_size=self.test_size,
+             name='test',
+             preproc=ValTransform(
+                 rgb_means=(0.485, 0.456, 0.406),
+                 std=(0.229, 0.224, 0.225),
+             ),
+             run_tracking=run_tracking
+         )
+
+         if is_distributed:
+             batch_size = batch_size // dist.get_world_size()
+             sampler = torch.utils.data.distributed.DistributedSampler(
+                 valdataset, shuffle=False
+             )
+         else:
+             sampler = torch.utils.data.SequentialSampler(valdataset)
+
+         dataloader_kwargs = {
+             "num_workers": self.data_num_workers,
+             "pin_memory": True,
+             "sampler": sampler,
+         }
+         dataloader_kwargs["batch_size"] = batch_size
+         val_loader = torch.utils.data.DataLoader(valdataset, **dataloader_kwargs)
+
+         return val_loader
+
+     def get_evaluator(self, batch_size, is_distributed, testdev=False):
+         from yolox.evaluators import COCOEvaluator
+
+         val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev, run_tracking=False)  # [hgx0411] dataloader related
+         evaluator = COCOEvaluator(
+             dataloader=val_loader,
+             img_size=self.test_size,
+             confthre=self.test_conf,
+             nmsthre=self.nmsthre,
+             num_classes=self.num_classes,
+             testdev=testdev,
+         )
+         return evaluator
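Both loaders above divide the global batch size by the DDP world size with integer division. A minimal stand-alone sketch of that bookkeeping (the helper name and the example values are hypothetical, not part of the repo):

```python
def per_rank_batch_size(global_batch_size: int, world_size: int) -> int:
    """Mirror of the `batch_size // dist.get_world_size()` split used in the
    loaders above. Integer division, so the global batch size should be
    divisible by the number of ranks to avoid silently shrinking the batch."""
    if global_batch_size % world_size != 0:
        raise ValueError("global batch size must be divisible by world size")
    return global_batch_size // world_size

# e.g. a global batch of 16 spread over 4 GPUs -> 4 images per rank
print(per_rank_batch_size(16, 4))  # -> 4
```

The exp files skip the divisibility check and simply floor-divide; the check here just makes the assumption explicit.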
exps/SU-T.py ADDED
@@ -0,0 +1,152 @@
+ # encoding: utf-8
+ import os
+ import random
+ import torch
+ import torch.nn as nn
+ import torch.distributed as dist
+
+ from yolox.exp import Exp as MyExp
+ from yolox.data import get_yolox_datadir
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.num_classes = 1
+         self.depth = 1.33
+         self.width = 1.25
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
+         self.train_ann = "train.json"
+         self.val_ann = "test.json"
+         self.input_size = (800, 1440)
+         self.test_size = (800, 1440)
+         self.random_size = (18, 32)
+         self.max_epoch = 80
+         self.print_interval = 20
+         self.eval_interval = 5
+         self.test_conf = 0.1
+         self.nmsthre = 0.7
+         self.no_aug_epochs = 10
+         self.basic_lr_per_img = 0.001 / 64.0
+         self.warmup_epochs = 1
+
+         # tracking params
+         self.ckpt = "Checkpoint.pth.tar"
+         self.use_byte = True
+         self.dataset = "mft25"
+         self.inertia = 0.05
+         self.iou_thresh = 0.25
+         self.asso = "fishiou"
+         self.TCM_first_step = True
+         self.TCM_byte_step = True
+         self.TCM_first_step_weight = 1.0
+         self.TCM_byte_step_weight = 1.0
+         self.with_reid = False
+
+     def get_data_loader(self, batch_size, is_distributed, no_aug=False):
+         from yolox.data import (
+             MOTDataset,
+             TrainTransform,
+             YoloBatchSampler,
+             DataLoader,
+             InfiniteSampler,
+             MosaicDetection,
+         )
+
+         dataset = MOTDataset(
+             data_dir=os.path.join(get_yolox_datadir(), "mft25"),
+             json_file=self.train_ann,
+             name='',
+             img_size=self.input_size,
+             preproc=TrainTransform(
+                 rgb_means=(0.485, 0.456, 0.406),
+                 std=(0.229, 0.224, 0.225),
+                 max_labels=500,
+             ),
+         )
+
+         dataset = MosaicDetection(
+             dataset,
+             mosaic=not no_aug,
+             img_size=self.input_size,
+             preproc=TrainTransform(
+                 rgb_means=(0.485, 0.456, 0.406),
+                 std=(0.229, 0.224, 0.225),
+                 max_labels=1000,
+             ),
+             degrees=self.degrees,
+             translate=self.translate,
+             scale=self.scale,
+             shear=self.shear,
+             perspective=self.perspective,
+             enable_mixup=self.enable_mixup,
+         )
+
+         self.dataset = dataset
+
+         if is_distributed:
+             batch_size = batch_size // dist.get_world_size()
+
+         sampler = InfiniteSampler(
+             len(self.dataset), seed=self.seed if self.seed else 0
+         )
+
+         batch_sampler = YoloBatchSampler(
+             sampler=sampler,
+             batch_size=batch_size,
+             drop_last=False,
+             input_dimension=self.input_size,
+             mosaic=not no_aug,
+         )
+
+         dataloader_kwargs = {"num_workers": self.data_num_workers, "pin_memory": True}
+         dataloader_kwargs["batch_sampler"] = batch_sampler
+         train_loader = DataLoader(self.dataset, **dataloader_kwargs)
+
+         return train_loader
+
+     def get_eval_loader(self, batch_size, is_distributed, testdev=False, run_tracking=False):  # [hgx0411] dataloader related
+         from yolox.data import MOTDataset, ValTransform
+
+         valdataset = MOTDataset(
+             data_dir=os.path.join(get_yolox_datadir(), "mft25"),
+             json_file=self.val_ann,
+             img_size=self.test_size,
+             name='test',
+             preproc=ValTransform(
+                 rgb_means=(0.485, 0.456, 0.406),
+                 std=(0.229, 0.224, 0.225),
+             ),
+             run_tracking=run_tracking
+         )
+
+         if is_distributed:
+             batch_size = batch_size // dist.get_world_size()
+             sampler = torch.utils.data.distributed.DistributedSampler(
+                 valdataset, shuffle=False
+             )
+         else:
+             sampler = torch.utils.data.SequentialSampler(valdataset)
+
+         dataloader_kwargs = {
+             "num_workers": self.data_num_workers,
+             "pin_memory": True,
+             "sampler": sampler,
+         }
+         dataloader_kwargs["batch_size"] = batch_size
+         val_loader = torch.utils.data.DataLoader(valdataset, **dataloader_kwargs)
+
+         return val_loader
+
+     def get_evaluator(self, batch_size, is_distributed, testdev=False):
+         from yolox.evaluators import COCOEvaluator
+
+         val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev, run_tracking=False)  # [hgx0411] dataloader related
+         evaluator = COCOEvaluator(
+             dataloader=val_loader,
+             img_size=self.test_size,
+             confthre=self.test_conf,
+             nmsthre=self.nmsthre,
+             num_classes=self.num_classes,
+             testdev=testdev,
+         )
+         return evaluator
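The `rgb_means`/`std` pairs passed to `TrainTransform` and `ValTransform` in the exp files above are the standard ImageNet statistics. A self-contained sketch of the per-channel normalization they imply (the helper name and the sample pixel are hypothetical, for illustration only):

```python
# ImageNet channel statistics, as used by the transforms above
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Normalize one RGB pixel already scaled to [0, 1]:
    (x - mean) / std, applied channel-wise."""
    return tuple((x - m) / s for x, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))

# a mid-gray pixel ends up slightly positive on every channel,
# because every channel mean is below 0.5
print(normalize_pixel((0.5, 0.5, 0.5)))
```

A pixel equal to the channel means maps to exactly `(0.0, 0.0, 0.0)`, which is the usual sanity check for this normalization.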
exps/default/nano.py ADDED
@@ -0,0 +1,39 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+ import torch.nn as nn
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 0.33
+         self.width = 0.25
+         self.scale = (0.5, 1.5)
+         self.random_size = (10, 20)
+         self.test_size = (416, 416)
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
+         self.enable_mixup = False
+
+     def get_model(self, sublinear=False):
+
+         def init_yolo(M):
+             for m in M.modules():
+                 if isinstance(m, nn.BatchNorm2d):
+                     m.eps = 1e-3
+                     m.momentum = 0.03
+
+         if "model" not in self.__dict__:
+             from yolox.models import YOLOX, YOLOPAFPN, YOLOXHead
+             in_channels = [256, 512, 1024]
+             # The NANO model uses depthwise convolutions (depthwise=True),
+             # which is the main difference from the other models.
+             backbone = YOLOPAFPN(self.depth, self.width, in_channels=in_channels, depthwise=True)
+             head = YOLOXHead(self.num_classes, self.width, in_channels=in_channels, depthwise=True)
+             self.model = YOLOX(backbone, head)
+
+         self.model.apply(init_yolo)
+         self.model.head.initialize_biases(1e-2)
+         return self.model
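Every Exp above derives `exp_name` from its own filename via `os.path.split(os.path.realpath(__file__))[1].split(".")[0]`. A small self-contained sketch of that derivation (the helper name and the example path are hypothetical):

```python
import os

def exp_name_from_path(path: str) -> str:
    """Same derivation the exp files above use: take the basename of the
    file and keep everything before the first dot."""
    return os.path.split(path)[1].split(".")[0]

print(exp_name_from_path("/workspace/exps/default/nano.py"))  # -> nano
```

Note that splitting on the first dot differs from `os.path.splitext`: a name like `exp.v2.py` would yield `exp`, not `exp.v2`.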
exps/default/yolov3.py ADDED
@@ -0,0 +1,89 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+ import torch
+ import torch.nn as nn
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 1.0
+         self.width = 1.0
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
+
+     def get_model(self, sublinear=False):
+         def init_yolo(M):
+             for m in M.modules():
+                 if isinstance(m, nn.BatchNorm2d):
+                     m.eps = 1e-3
+                     m.momentum = 0.03
+
+         if "model" not in self.__dict__:
+             from yolox.models import YOLOX, YOLOFPN, YOLOXHead
+             backbone = YOLOFPN()
+             head = YOLOXHead(self.num_classes, self.width, in_channels=[128, 256, 512], act="lrelu")
+             self.model = YOLOX(backbone, head)
+         self.model.apply(init_yolo)
+         self.model.head.initialize_biases(1e-2)
+
+         return self.model
+
+     def get_data_loader(self, batch_size, is_distributed, no_aug=False):
+         from data.datasets.cocodataset import COCODataset
+         from data.datasets.mosaicdetection import MosaicDetection
+         from data.datasets.data_augment import TrainTransform
+         from data.datasets.dataloading import YoloBatchSampler, DataLoader, InfiniteSampler
+         import torch.distributed as dist
+
+         dataset = COCODataset(
+             data_dir='data/COCO/',
+             json_file=self.train_ann,
+             img_size=self.input_size,
+             preproc=TrainTransform(
+                 rgb_means=(0.485, 0.456, 0.406),
+                 std=(0.229, 0.224, 0.225),
+                 max_labels=50
+             ),
+         )
+
+         dataset = MosaicDetection(
+             dataset,
+             mosaic=not no_aug,
+             img_size=self.input_size,
+             preproc=TrainTransform(
+                 rgb_means=(0.485, 0.456, 0.406),
+                 std=(0.229, 0.224, 0.225),
+                 max_labels=120
+             ),
+             degrees=self.degrees,
+             translate=self.translate,
+             scale=self.scale,
+             shear=self.shear,
+             perspective=self.perspective,
+         )
+
+         self.dataset = dataset
+
+         if is_distributed:
+             batch_size = batch_size // dist.get_world_size()
+             sampler = InfiniteSampler(len(self.dataset), seed=self.seed if self.seed else 0)
+         else:
+             sampler = torch.utils.data.RandomSampler(self.dataset)
+
+         batch_sampler = YoloBatchSampler(
+             sampler=sampler,
+             batch_size=batch_size,
+             drop_last=False,
+             input_dimension=self.input_size,
+             mosaic=not no_aug
+         )
+
+         dataloader_kwargs = {"num_workers": self.data_num_workers, "pin_memory": True}
+         dataloader_kwargs["batch_sampler"] = batch_sampler
+         train_loader = DataLoader(self.dataset, **dataloader_kwargs)
+
+         return train_loader
exps/default/yolox_l.py ADDED
@@ -0,0 +1,15 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 1.0
+         self.width = 1.0
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
exps/default/yolox_m.py ADDED
@@ -0,0 +1,15 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 0.67
+         self.width = 0.75
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
exps/default/yolox_s.py ADDED
@@ -0,0 +1,15 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 0.33
+         self.width = 0.50
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
exps/default/yolox_tiny.py ADDED
@@ -0,0 +1,19 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 0.33
+         self.width = 0.375
+         self.scale = (0.5, 1.5)
+         self.random_size = (10, 20)
+         self.test_size = (416, 416)
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
+         self.enable_mixup = False
exps/default/yolox_x.py ADDED
@@ -0,0 +1,15 @@
+ #!/usr/bin/env python3
+ # -*- coding:utf-8 -*-
+ # Copyright (c) Megvii, Inc. and its affiliates.
+
+ import os
+
+ from yolox.exp import Exp as MyExp
+
+
+ class Exp(MyExp):
+     def __init__(self):
+         super(Exp, self).__init__()
+         self.depth = 1.33
+         self.width = 1.25
+         self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
fast_reid/CHANGELOG.md ADDED
@@ -0,0 +1,39 @@
+ # Changelog
+
+ ### v1.3
+
+ #### New Features
+ - Vision Transformer backbone, see config in `configs/Market1501/bagtricks_vit.yml`
+ - Self-Distillation with EMA update
+ - Gradient Clip
+
+ #### Improvements
+ - Faster dataloader with pre-fetch thread and CUDA stream
+ - Optimize DDP training speed by removing `find_unused_parameters` in DDP
+
+
+ ### v1.2 (06/04/2021)
+
+ #### New Features
+
+ - Multiple machine training support
+ - [RepVGG](https://github.com/DingXiaoH/RepVGG) backbone
+ - [Partial FC](projects/FastFace)
+
+ #### Improvements
+
+ - Torch2trt pipeline
+ - Decouple linear transforms and softmax
+ - Config decorator
+
+ ### v1.1 (29/01/2021)
+
+ #### New Features
+
+ - NAIC20 (ReID track) [1st-place solution](projects/NAIC20)
+ - Multi-teacher Knowledge Distillation
+ - TRT network definition APIs in [FastRT](projects/FastRT)
+
+ #### Bug Fixes
+
+ #### Improvements
@@ -0,0 +1,62 @@
 
+ # Getting Started with FastReID
+
+ ## Prepare pre-trained models
+
+ If you use a backbone supported by FastReID, you do not need to do anything: the pre-trained weights are downloaded automatically.
+ If your machine has no network access, download the pre-trained models manually and put them in `~/.cache/torch/checkpoints`.
+
+ If you want to use other pre-trained models, such as a MoCo pre-trained backbone, download them yourself and set the pre-trained model path in `configs/Base-bagtricks.yml`.
+
+ ## Compile with Cython to accelerate evaluation
+
+ ```bash
+ cd fastreid/evaluation/rank_cylib; make all
+ ```
+
+ ## Training & Evaluation in Command Line
+
+ We provide a script, `tools/train_net.py`, that can train all the configs provided in FastReID.
+ You may want to use it as a reference for writing your own training script.
+
+ To train a model with `train_net.py`, first set up the corresponding datasets following [datasets/README.md](https://github.com/JDAI-CV/fast-reid/tree/master/datasets), then run:
+
+ ```bash
+ python3 tools/train_net.py --config-file ./configs/Market1501/bagtricks_R50.yml MODEL.DEVICE "cuda:0"
+ ```
+
+ The configs are made for 1-GPU training.
+
+ To train a model with 4 GPUs, run:
+
+ ```bash
+ python3 tools/train_net.py --config-file ./configs/Market1501/bagtricks_R50.yml --num-gpus 4
+ ```
+
+ To train a model across multiple machines, run:
+
+ ```bash
+ # machine 1
+ export GLOO_SOCKET_IFNAME=eth0
+ export NCCL_SOCKET_IFNAME=eth0
+
+ python3 tools/train_net.py --config-file configs/Market1501/bagtricks_R50.yml \
+ --num-gpus 4 --num-machines 2 --machine-rank 0 --dist-url tcp://ip:port
+
+ # machine 2
+ export GLOO_SOCKET_IFNAME=eth0
+ export NCCL_SOCKET_IFNAME=eth0
+
+ python3 tools/train_net.py --config-file configs/Market1501/bagtricks_R50.yml \
+ --num-gpus 4 --num-machines 2 --machine-rank 1 --dist-url tcp://ip:port
+ ```
+
+ Make sure the dataset path and code are identical on every machine, and that the machines can reach each other over the network.
+
+ To evaluate a model's performance, use
+
+ ```bash
+ python3 tools/train_net.py --config-file ./configs/Market1501/bagtricks_R50.yml --eval-only \
+ MODEL.WEIGHTS /path/to/checkpoint_file MODEL.DEVICE "cuda:0"
+ ```
+
+ For more options, see `python3 tools/train_net.py -h`.