USING A CNN TO EXTRACT FEATURES RELATED TO EMERGENCY MESSAGES ON SOCIAL NETWORKS

Đào Nam Anh; Nguyễn Quỳnh Anh; Lê Mạnh Hùng

doi:10.34238/tnu-jst.6133

Article information

Received: 07/06/22; Revised: 03/08/22; Published: 04/08/22

Authors

1. Đào Nam Anh, Electric Power University
2. Nguyễn Quỳnh Anh, Electric Power University
3. Lê Mạnh Hùng, Electric Power University

Abstract

Determining whether content posted on social networking sites is real or fake is an open research problem in natural language processing (NLP). This paper presents a method for classifying emergency cases in messages on Twitter. Instead of using text features extracted directly from the text messages, the research team represents the text features as image patterns. In NLP techniques, text features are usually selected by segmenting the messages or by statistically analyzing how often keywords occur in them. To increase classification accuracy, the team implemented a method based on recognizing image patterns. Converting text features into images makes it possible to apply convolution operations to recognize patterns, which opens up a combination of NLP and image analysis. The paper uses a convolutional neural network (CNN) operating on the image patterns to classify sentences. The proposed method is also compared with other methods in the comparative simulation section of the study.
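The idea of turning text features into an image and then applying a convolution can be illustrated with a minimal sketch. It is not the authors' encoding, which the abstract does not specify. It assumes, purely for illustration, that each whitespace-separated token is hashed into one cell of a small grid of counts. The names `text_to_image` and `conv2d_valid`, the 8 x 8 grid size, and the 2 x 2 kernel are all hypothetical; in a real CNN the kernel weights would be learned from labeled tweets.

```python
import zlib


def text_to_image(text, side=8):
    """Hash each token into one cell of a side x side grid of counts.

    Assumption for illustration only: this hashing scheme stands in for
    whatever text-to-image encoding the paper actually uses.
    """
    image = [[0.0] * side for _ in range(side)]
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash().
        index = zlib.crc32(token.encode("utf-8")) % (side * side)
        image[index // side][index % side] += 1.0
    return image


def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    As in most CNN layers, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


tweet = "Forest fire near the town, residents asked to evacuate"
image = text_to_image(tweet)

# Hypothetical fixed 2 x 2 filter; a trained CNN would learn these weights.
kernel = [[1.0, 0.0], [0.0, -1.0]]
feature_map = conv2d_valid(image, kernel)

print(len(image), len(image[0]))              # grid size: 8 8
print(len(feature_map), len(feature_map[0]))  # feature-map size: 7 7
```

A classifier would stack several such filters, followed by pooling and a dense layer, to map each tweet image to an "emergency" or "not emergency" label.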

Keywords

Image features; Feature extraction; Language processing; CNN; Social networks

Full text:

PDF (English)

