ADVERSARIAL ATTACKS ON DEEP LEARNING MODELS USING PIXEL TRANSFORMATION

Trương Phi Hồ; Hoàng Thanh Nam; Trần Quang Tuấn; Phạm Minh Thuấn; Phạm Duy Trung

doi:10.34238/tnu-jst.6964

Article information

Received: 22/11/22; Revised: 26/12/22; Published: 26/12/22

Authors

1. Trương Phi Hồ, Academy of Cryptography Techniques
2. Hoàng Thanh Nam, Academy of Cryptography Techniques
3. Trần Quang Tuấn, Telecommunications University
4. Phạm Minh Thuấn, Government Cipher Committee (Ban Cơ yếu Chính phủ)
5. Phạm Duy Trung, Academy of Cryptography Techniques

Abstract

Deep learning is developing rapidly and attracting broad research interest; however, deep learning models carry latent security risks that can become serious vulnerabilities in the applications built on them. Adversarial examples are crafted to fool deep neural networks into behaving differently from their intended design, and their success rate is alarmingly high, which raises security concerns for machine learning models. Studying and understanding adversarial attacks helps strengthen the security of these models. In practice, most studies of adversarial attacks show that black-box models can be fooled. This paper uses a pixel-modification method to carry out an adversarial attack that can fool a deep learning system. The pixel-transformation method is evaluated on the Cats and Dogs dataset against the InceptionV3 model. The results show that the proposed method achieves a high success rate in making the deep learning model misclassify inputs as a specified target class.

Keywords

Deep learning; Adversarial attack; Black-box attack; Deep neural network; Model training
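
The abstract describes a targeted black-box attack that changes pixels until the model predicts a chosen class. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a differential-evolution-style search over a few pixels, which the abstract itself does not specify. All names here (`apply_pixels`, `targeted_pixel_attack`, `toy_predict`) and all parameter values are hypothetical. The classifier is a toy stand-in for a real model such as InceptionV3, and images are nested lists so the example needs only the standard library.

```python
import copy
import math
import random


def apply_pixels(image, candidate):
    """Return a copy of `image` with the pixels encoded in `candidate` overwritten.

    `image` is a list of rows, each row a list of [r, g, b] values in [0, 1].
    `candidate` is a flat list of (row, col, r, g, b) groups, one per pixel.
    """
    out = copy.deepcopy(image)
    h, w = len(image), len(image[0])
    for k in range(0, len(candidate), 5):
        row, col, r, g, b = candidate[k:k + 5]
        y = min(max(int(row), 0), h - 1)
        x = min(max(int(col), 0), w - 1)
        out[y][x] = [min(max(v, 0.0), 1.0) for v in (r, g, b)]
    return out


def targeted_pixel_attack(image, predict, target, n_pixels=3, pop_size=40,
                          generations=50, scale=0.5, seed=0):
    """Search for at most `n_pixels` pixel changes that raise predict(...)[target].

    Only the output probabilities of `predict` are used (black-box setting).
    Assumption: mutation-only differential evolution with greedy selection.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    high = [h, w, 1.0, 1.0, 1.0] * n_pixels  # upper bounds; lower bound is 0

    def fitness(candidate):
        return predict(apply_pixels(image, candidate))[target]

    pop = [[rng.uniform(0.0, hi) for hi in high] for _ in range(pop_size)]
    scores = [fitness(c) for c in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample(pop, 3)
            trial = [min(max(x + scale * (y - z), 0.0), hi)
                     for x, y, z, hi in zip(a, b, c, high)]
            s = fitness(trial)
            if s > scores[i]:  # keep the trial only if it helps the target class
                pop[i], scores[i] = trial, s
        best = max(range(pop_size), key=scores.__getitem__)
        probs = predict(apply_pixels(image, pop[best]))
        if probs.index(max(probs)) == target:  # stop once the target class wins
            break
    best = max(range(pop_size), key=scores.__getitem__)
    return apply_pixels(image, pop[best]), scores[best]


def toy_predict(image):
    """Hypothetical two-class model returning [p_cat, p_dog].

    It favours "dog" when the image is redder than it is blue. A real attack
    would query the deployed classifier here instead.
    """
    pixels = [px for row in image for px in row]
    red = sum(px[0] for px in pixels) / len(pixels)
    blue = sum(px[2] for px in pixels) / len(pixels)
    p_dog = 1.0 / (1.0 + math.exp(-(20.0 * (red - blue) - 0.2)))
    return [1.0 - p_dog, p_dog]


# Usage: start from a uniform gray 8x8 image that the toy model labels "cat"
# and push it toward class 1 ("dog") by changing at most three pixels.
image = [[[0.5, 0.5, 0.5] for _ in range(8)] for _ in range(8)]
adversarial, target_probability = targeted_pixel_attack(image, toy_predict, target=1)
```

The search only reads the probability vector returned by `predict`, so swapping in a real classifier would in principle require changing only that function and the image representation.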
