A METHOD FOR OPTIMIZING FUZZINESS PARAMETERS TO EXTRACT AN OPTIMAL SET OF SUMMARY SENTENCES FROM NUMERIC DATA

Phạm Đình Phong; Phạm Thị Lan; Trần Xuân Thanh

doi:10.34238/tnu-jst.9824

Article information

Received: 01/03/24; Revised: 28/03/24; Published: 29/03/24

Authors

1. Phạm Đình Phong, University of Transport and Communications
2. Phạm Thị Lan, Hanoi National University of Education
3. Trần Xuân Thanh, 1) East Asia University of Technology, 2) Graduate University of Science and Technology - Vietnam Academy of Science and Technology

Abstract

Extracting linguistic summaries from numeric data yields summary sentences, expressed in natural language, that describe the knowledge hidden in a numeric dataset. Several genetic algorithm models have been proposed to extract an optimal set of summary sentences. Among them, the model that combines a genetic algorithm with a greedy strategy, while ensuring that the content of the summary sentences remains interpretable, gives fairly good results. However, the fuzziness parameters of that model are set according to the expert's intuition. In this paper, we propose an algorithm that optimizes the fuzziness parameters in order to improve the quality of the set of summary sentences extracted from numeric data. Experimental results on the creep database show that, with the optimized fuzziness parameters, the extracted set of summary sentences is better on three measures: the fitness function value, the average truth value, and the number of sentences whose quantifier is greater than "a half".

Keywords

Linguistic summarization; Hedge algebras; Interpretability; Multi-semantic structure; Particle swarm optimization
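
The abstract and keywords indicate that the fuzziness parameters are tuned with particle swarm optimization, but this page does not give the algorithm itself. The following is only a minimal sketch of a common inertia-weight PSO variant, not the authors' implementation. Everything named here is an assumption made for illustration: the function `toy_quality` is a hypothetical stand-in for the real quality measure (which would require running the genetic algorithm with the greedy strategy and scoring the resulting summary set), the two parameters and their bounds are hypothetical, and the coefficients `w`, `c1`, `c2` are typical textbook values.

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `objective` over a box with a basic inertia-weight PSO."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the global best found so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp the coordinate back into its bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_quality(params):
    """Hypothetical stand-in for the quality of an extracted summary set.

    It peaks at (0.4, 0.6); the real measure is not reproduced here.
    """
    a, b = params
    return -((a - 0.4) ** 2 + (b - 0.6) ** 2)


# Hypothetical bounds for two fuzziness parameters.
BOUNDS = [(0.1, 0.9), (0.1, 0.9)]
best_params, best_quality = pso(toy_quality, BOUNDS)
```

In a real setting each call to the objective would be expensive, since it involves a full extraction run, so the swarm size and iteration count would need to be chosen with that cost in mind.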
