BUILDING A DEEP LEARNING MODEL FOR MILITARY CAMOUFLAGED OBJECT DETECTION BASED ON YOLOv9

Phạm Thu Hương

doi:10.34238/tnu-jst.11154

Article information

Received: 20/09/2024 Revised: 13/11/2024 Published: 14/11/2024

Authors

Phạm Thu Hương, Institute of Information Technology, Academy of Military Science and Technology

Abstract

Detecting military camouflaged objects is a major challenge because such objects are typically designed to blend into their surroundings. This paper proposes a method for automatically detecting military camouflage using deep learning, specifically the YOLOv9 (You Only Look Once) model. YOLOv9 is one of the state-of-the-art object detection models, notable for its real-time processing capability and high accuracy. The YOLOv9 model is trained and evaluated on a domain-specific dataset consisting of images that contain military camouflaged objects in a variety of scenes. Experimental results show that the YOLOv9 model achieves high performance in detecting military camouflaged objects, with accuracy superior to that of traditional methods. The paper not only demonstrates the feasibility of applying deep learning to camouflage detection but also opens up new directions for research and applications in this field.
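The abstract does not state which metric underlies the reported accuracy. Purely as an illustration, detectors of this kind are commonly scored by matching predicted boxes to ground-truth boxes using intersection over union (IoU). The sketch below assumes boxes in (x_min, y_min, x_max, y_max) form and an IoU threshold of 0.5; the function names and the threshold are hypothetical choices for this example, not the paper's evaluation protocol.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    thr: float = 0.5,  # hypothetical IoU threshold
) -> Tuple[float, float]:
    """Greedy matching: predictions, taken in descending confidence,
    each claim the best still-unmatched ground-truth box."""
    matched = set()
    true_pos = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, truth)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

With one correct and one spurious prediction against two ground-truth boxes, both precision and recall come out at 0.5, which shows how missed camouflaged objects lower recall while false alarms lower precision.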

Keywords

Deep learning; Computer vision; Military camouflage; Object detection; YOLO
