COMPARISON OF YOLOV8 AND PYTORCH-RETINANET FOR VEHICLE DETECTION

About this article

Received: 23/01/25 | Revised: 11/03/25 | Published: 21/03/25

Authors

1. Bui Xuan Tung, Tay Do University
2. Trinh Quang Minh, Tay Do University
3. Ngo Thi Lan, Tay Do University
4. Dang Thi Dung, Can Tho University of Engineering - Technology
5. Huynh Duy Dang, Can Tho University of Engineering - Technology

Abstract


This study evaluates and compares the effectiveness of two deep learning models - PyTorch-RetinaNet and YOLOv8 - for vehicle detection, addressing the challenges posed by objects of varying sizes and shapes under different lighting conditions. The methodology used a dataset of 4,058 vehicle images spanning 12 distinct object classes, training both models at three learning rates (0.001, 0.01, and 0.0001). The dataset was split into training (65%), validation (24%), and testing (11%) sets, and preprocessing techniques including image resizing, brightness normalization, and data augmentation were applied to enhance model performance. The experimental results revealed distinct capabilities for each model: PyTorch-RetinaNet achieved an mAP50 of 38.6% and an mAP50-95 of 24.7%, showing particular strength in detecting large objects (mAP50-95 of 42.0%) and maintaining stable recall metrics (AR@1: 30.9%, AR@10: 54.7%, AR@100: 55.9%). In contrast, YOLOv8 demonstrated superior overall performance with an mAP50 of 45.6%, an mAP50-95 of 33.0%, precision of 48.3%, and recall of 61.5%, and it excelled at handling overlapping objects, with confidence scores of 0.79-0.89. The findings suggest that YOLOv8 is more suitable for real-time applications, while PyTorch-RetinaNet excels in scenarios requiring precise detection across a range of object sizes.
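
As a hedged illustration of the experimental protocol summarized above, the sketch below shows how the YOLOv8 runs could be reproduced with the Ultralytics Python API. Only the three learning rates are taken from the abstract; the model variant (yolov8n.pt), the dataset configuration file name (vehicles.yaml), the epoch count, and the input size are illustrative assumptions rather than values reported by the authors.

# Hedged sketch (not the authors' code): reproducing the YOLOv8 experiments with
# the Ultralytics API. Only the three learning rates come from the abstract; the
# model variant, dataset config file, epoch count, and image size are assumptions.
from ultralytics import YOLO

LEARNING_RATES = [0.001, 0.01, 0.0001]   # the three initial learning rates compared

for lr in LEARNING_RATES:
    model = YOLO("yolov8n.pt")           # pretrained YOLOv8 checkpoint (variant assumed)
    model.train(
        data="vehicles.yaml",            # hypothetical config for the 12-class vehicle dataset
        epochs=100,                      # assumed training budget
        imgsz=640,                       # assumed input size after resizing
        lr0=lr,                          # initial learning rate under test
    )
    results = model.val()                # evaluate on the validation split
    # results.box.map50 is mAP50, results.box.map is mAP50-95
    print(f"lr={lr}: mAP50={results.box.map50:.3f}, mAP50-95={results.box.map:.3f}")

A comparable set of runs for the PyTorch-RetinaNet baseline would reuse the same 65/24/11 split and the same learning-rate grid, so that the mAP and average-recall figures quoted in the abstract are measured on identical test data.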


Keywords


YOLOv8, PyTorch-RetinaNet, Vehicle Detection, Machine Learning, Deep Learning

DOI: https://doi.org/10.34238/tnu-jst.11942
