A TECHNIQUE FOR SIMULATING THE MOVEMENTS OF A 3D HUMAN HEAD MODEL ACCORDING TO VIETNAMESE SPEECH
Article information
Received: 07/07/23                Revised: 30/08/23                Published: 31/08/23
Abstract
Keywords
References
[1] L. Chen, R. K. Maddox, Z. Duan, and C. Xu, "Hierarchical cross-modal talking face generation with dynamic pixel-wise loss," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7832-7841.
[2] H. Zhou, Y. Liu, Z. Liu, P. Luo, and X. Wang, "Talking face generation by adversarially disentangled audio-visual representation," in Proceedings of the AAAI Conference on Artificial Intelligence, 2019, vol. 33, no. 01, pp. 9299-9306.
[3] Y. Song, J. Zhu, D. Li, A. Wang, and H. Qi, "Talking face generation by conditional recurrent adversarial network," in Proceedings of the 28th International Joint Conference on Artificial Intelligence, 2019, pp. 919-925.
[4] K. Vougioukas, S. Petridis, and M. Pantic, "Realistic speech-driven facial animation with GANs," International Journal of Computer Vision, vol. 128, pp. 1398-1413, 2020.
[5] K. Prajwal, R. Mukhopadhyay, V. P. Namboodiri, and C. Jawahar, "A lip sync expert is all you need for speech to lip generation in the wild," in Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 484-492.
[6] Y. Zhou, X. Han, E. Shechtman, J. Echevarria, E. Kalogerakis, and D. Li, "MakeItTalk: speaker-aware talking-head animation," ACM Transactions on Graphics, vol. 39, no. 6, pp. 1-15, 2020.
[7] Y. Ding, C. Pelachaud, and T. Artières, "Modeling multimodal behaviors from speech prosody," in Proceedings of the 13th International Conference on Intelligent Virtual Agents, Springer, 2013, pp. 217-228.
[8] L. Chen, G. Cui, C. Liu, Z. Li, Z. Kou, Y. Xu, and C. Xu, "Talking-head generation with rhythmic head motion," in European Conference on Computer Vision, Springer, 2020, pp. 35-51.
[9] M. Fratarcangeli and M. Schaerf, "Realistic modeling of animatable faces in MPEG-4," in Proceedings of the 17th Annual Conference on Computer Animation and Social Agents, 2004, pp. 285-297.
[10] L. Turban, D. Girard, N. Kose, and J.-L. Dugelay, "From Kinect video to realistic and animatable MPEG-4 face model: A complete framework," in 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2015, pp. 1-6.
[11] I. R. Ali, H. Kolivand, and M. H. Alkawaz, "Lip syncing method for realistic expressive 3D face model," Multimedia Tools and Applications, vol. 77, pp. 5323-5366, 2018.
[12] Y. Zhao, D. Jiang, and H. Sahli, "3D emotional facial animation synthesis with factored conditional Restricted Boltzmann Machines," in 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), 2015, pp. 797-803.
[13] S. Dahmani, V. Colotte, V. Girard, and S. Ouni, "Conditional variational auto-encoder for text-driven expressive audiovisual speech synthesis," in INTERSPEECH 2019, 20th Annual Conference of the International Speech Communication Association, 2019, pp. 2598-2602.
[14] M. Liu, Y. Duan, R. A. Ince, C. Chen, O. G. Garrod, P. G. Schyns, and R. E. Jack, "Building a generative space of facial expressions of emotions using psychological data-driven methods," in Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, 2020, pp. 1-3.
[15] S. Wang, L. Li, Y. Ding, C. Fan, and X. Yu, "Audio2Head: audio-driven one-shot talking-head generation with natural head motion," arXiv preprint arXiv:2107.09293, 2021.
[16] C. D. Thi and T. L. Son, "3D character expression animation according to Vietnamese sentence semantics," TNU Journal of Science and Technology, vol. 227, no. 16, pp. 20-28, 2022.
DOI: https://doi.org/10.34238/tnu-jst.8297