Methodology:
Pretrained YOLO8-l, YOLO11-l and YOLO-World-l models were fine-tuned on the FLIR ADASv2 dataset by applying SGD to the pre-partitioned "train" section of the dataset, with the data augmentation provided by the ultralytics suite [1].
The target epoch count was set to 400, with an early-stopping patience of 50 epochs, measured on the validation partition of the dataset.
Training was performed on the TAU Slurm cluster, with GPU acceleration on "titan xp" and "geforce_rtx_2080" units.
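The training setup described above could be expressed with the ultralytics Python API roughly as follows. This is a minimal sketch rather than the project's actual training script: the dataset file name flir_adas_v2.yaml and the checkpoint file names are assumptions, and every setting not listed is left at the library default.

```python
# Sketch of the fine-tuning configuration described in the text.
# Assumptions (not taken from the project): the dataset YAML file name
# and the exact checkpoint file names.

TRAIN_KWARGS = {
    "data": "flir_adas_v2.yaml",  # hypothetical dataset description file
    "epochs": 400,                # target epoch count
    "patience": 50,               # early-stopping patience, in epochs
    "optimizer": "SGD",
}

# Assumed names for the three pretrained "large" checkpoints.
CHECKPOINTS = ["yolov8l.pt", "yolo11l.pt", "yolov8l-world.pt"]


def fine_tune(checkpoint, **overrides):
    """Fine-tune one pretrained checkpoint; requires the ultralytics package."""
    from ultralytics import YOLO  # imported lazily so this file loads without it

    model = YOLO(checkpoint)
    return model.train(**{**TRAIN_KWARGS, **overrides})
```

Calling fine_tune(CHECKPOINTS[1]) would start one run; augmentation is not configured explicitly because the text relies on what the suite provides.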
YOLO8 converged most rapidly, stopping after 103 epochs and about 25 hours of training.
YOLO11 converged after 200 epochs and took slightly longer, about 30 hours.
YOLO-World reached epoch 69 before hitting the imposed time limit of 72 hours, without having triggered the early-stopping threshold.
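To make the stopping behaviour concrete, the following sketch shows how a patience-based early-stopping rule determines the number of epochs actually run. It is a simplified illustration, not the ultralytics implementation; the function name early_stop_epoch and the fitness curve are hypothetical, and the curve is synthetic rather than taken from any of the runs above.

```python
def early_stop_epoch(fitness_per_epoch, patience=50, max_epochs=400):
    """Return (epochs_run, best_epoch), both 1-indexed.

    Training halts once `patience` epochs have passed since the best
    validation fitness was observed, or when `max_epochs` is reached.
    """
    best_fitness = float("-inf")
    best_epoch = 0
    epochs_run = 0
    for epoch, fitness in enumerate(fitness_per_epoch[:max_epochs], start=1):
        epochs_run = epoch
        if fitness > best_fitness:
            best_fitness, best_epoch = fitness, epoch
        elif epoch - best_epoch >= patience:
            break
    return epochs_run, best_epoch


# Synthetic curve: fitness improves for 30 epochs, then plateaus.
curve = [min(e, 30) / 100 for e in range(1, 401)]
print(early_stop_epoch(curve))  # (80, 30)
```

Under this rule a run that stops well before epoch 400 has gone 50 epochs without a new best validation score, whereas a run cut off by a wall-clock limit has no such guarantee.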
An initial hyper-parameter search was attempted on YOLO11, testing different L2 regularization, learning rate and momentum coefficients via the genetic algorithm provided with the ultralytics suite.
This attempt was initialised from the already-converged YOLO11 heat-vision model, a choice which proved to be a serious mistake, as the results of the search turned out to be statistically meaningless,
although the best resulting model (Best_hypertuned_YOLO11.pt) did show a slight but significant improvement in the mAP50 metric.
For further information on this initial attempt, see appendix B.
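The general idea of a genetic (mutation-based) search over these three coefficients can be sketched as below. This is a simplified illustration and not the tuner shipped with the suite: the bounds in SEARCH_SPACE, the mutation scale sigma, and the helper names mutate and evolve are assumptions, and evaluate stands in for a full train-and-validate run that returns a fitness score such as mAP50.

```python
import random

# Illustrative (assumed) bounds, one (low, high) pair per hyperparameter.
SEARCH_SPACE = {
    "lr0": (1e-5, 1e-1),          # initial learning rate
    "momentum": (0.7, 0.98),      # SGD momentum
    "weight_decay": (0.0, 1e-3),  # L2 regularization coefficient
}


def mutate(parent, rng, sigma=0.2):
    """Scale each value by a random factor, then clip it to its bounds."""
    child = {}
    for name, (low, high) in SEARCH_SPACE.items():
        factor = 1.0 + rng.gauss(0.0, sigma)
        child[name] = min(max(parent[name] * factor, low), high)
    return child


def evolve(evaluate, start, iterations=20, seed=0):
    """Keep the fittest configuration seen so far and mutate it each round."""
    rng = random.Random(seed)
    best, best_score = dict(start), evaluate(start)
    for _ in range(iterations):
        candidate = mutate(best, rng)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy stand-in for a training run: fitness peaks at lr0 = 0.01.
start = {"lr0": 0.02, "momentum": 0.9, "weight_decay": 5e-4}
best, score = evolve(lambda hp: -abs(hp["lr0"] - 0.01), start)
```

Because every candidate is scored by a separate training run, the comparison is only informative if the runs start from a comparable state and the score differences exceed run-to-run noise.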
[(TAL, THIS IS THE PART WHERE YOU PUT ACTUAL HYPER-PARAMETER SEARCH, FOCUSING ONLY ON L2)]
A tracking module was produced to interpret the frame-wise results of the aforementioned models; it is a partial reproduction of BoT-SORT [2] and ByteTrack [3].
As is customary in the field, a constant-velocity motion model was assumed for all state components.
We chose to use the state format of BoT-SORT, in which the internal state model consists of x, y, w, h, v_x, v_y, v_w, v_h,
with x, y designating the coordinates of the center of the bounding box, w, h designating the width and height of the box, respectively, and v_x, v_y, v_w, v_h designating the corresponding velocities.
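As an illustration of what the constant-velocity assumption means for this eight-component state, the sketch below builds the state-transition matrix and propagates a state by one frame. It covers only the mean-prediction step of a Kalman filter (covariance propagation and the measurement update are omitted); the function names and the sample numbers are hypothetical, and the velocity components are listed in the same order as the quantities they belong to.

```python
STATE_NAMES = ("x", "y", "w", "h", "v_x", "v_y", "v_w", "v_h")


def transition_matrix(dt=1.0):
    """8x8 constant-velocity matrix F: x, y, w, h each advance by dt * velocity."""
    n = 4
    F = [[1.0 if r == c else 0.0 for c in range(2 * n)] for r in range(2 * n)]
    for i in range(n):
        F[i][i + n] = dt  # position-like term i is driven by velocity term i + 4
    return F


def predict(state, dt=1.0):
    """Propagate the state mean by one step: s' = F s."""
    F = transition_matrix(dt)
    size = len(state)
    return [sum(F[r][c] * state[c] for c in range(size)) for r in range(size)]


state = [100.0, 50.0, 20.0, 40.0, 2.0, -1.0, 0.5, 0.0]
print(predict(state))  # [102.0, 49.0, 20.5, 40.0, 2.0, -1.0, 0.5, 0.0]
```

The velocities themselves are left unchanged by the prediction, which is exactly the constant-velocity assumption; any change in velocity has to come from the measurement update.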
[(Put LaTeX description of state model here)]
[(Put description of full algorithm here)]
Pseudocode of the algorithm is available in appendix A.
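For orientation, the two-stage association idea published with ByteTrack [3] can be sketched as follows: detections are split by confidence, high-confidence detections are matched to tracks first, and low-confidence detections are then matched to the tracks that remain. This is a rough sketch under stated assumptions, not the module's implementation: greedy IoU matching is swapped in for the optimal assignment step used by the original method, the thresholds high_conf and min_iou are illustrative, and all function names are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x, y, w, h), with (x, y) the box center."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_boxes, det_boxes, min_iou):
    """Pair tracks and detections in order of descending IoU.

    Both arguments map an integer id to a box; returns (track_id, det_id) pairs.
    """
    pairs = sorted(
        ((iou(tb, db), t, d) for t, tb in track_boxes.items() for d, db in det_boxes.items()),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, t, d in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)
    return matches


def associate(tracks, detections, high_conf=0.5, min_iou=0.3):
    """tracks: {track_id: box}; detections: list of (box, confidence)."""
    high = {i: box for i, (box, conf) in enumerate(detections) if conf >= high_conf}
    low = {i: box for i, (box, conf) in enumerate(detections) if conf < high_conf}
    first = greedy_match(tracks, high, min_iou)
    matched = {t for t, _ in first}
    remaining = {t: box for t, box in tracks.items() if t not in matched}
    second = greedy_match(remaining, low, min_iou)  # second chance for weak detections
    return first + second


tracks = {1: (10.0, 10.0, 4.0, 4.0), 2: (50.0, 50.0, 4.0, 4.0)}
detections = [((10.5, 10.0, 4.0, 4.0), 0.9), ((50.0, 50.5, 4.0, 4.0), 0.2)]
print(associate(tracks, detections))  # [(1, 0), (2, 1)]
```

In the example, track 2 is only recovered in the second stage, because its detection falls below the confidence threshold used in the first stage.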