BUILDING A MODEL FOR CONVERTING SIGN LANGUAGE TO TEXT AND SPEECH
Article information
Received: 04/09/25                Revised: 26/12/25                Published: 31/12/25
Abstract
Keywords
DOI: https://doi.org/10.34238/tnu-jst.13551