IMAGE RECOGNITION WITH IMBALANCED DATA BASED ON DEEP LEARNING | Thành | TNU Journal of Science and Technology

IMAGE RECOGNITION WITH IMBALANCED DATA BASED ON DEEP LEARNING

About this article

Received: 19/03/25 | Revised: 09/05/25 | Published: 10/05/25

Authors

1. Tran Van Thanh, Lac Hong University
2. Nguyen Van Dai, University of Science - VNU
3. Ha Manh Toan, Institute of Information Technology - VAST
4. Duong Thi Nhung, Thai Nguyen University

Abstract


Skin cancer has become a serious public health problem in recent years, and patients face life-threatening outcomes if the disease is not detected early. To address this issue, this study works towards the automatic classification of skin lesion images that can be captured with an ordinary camera. Experiments were conducted on the HAM10000 dataset, which contains 7 lesion types with a significant imbalance between classes. Accordingly, the study focuses on handling data imbalance, aiming to improve the recognition of minority classes while maintaining performance on majority classes. Comprehensive comparative experiments were also conducted with popular deep learning architectures, including ConvNeXtTiny, DenseNet 201, Inception-ResNet-v2, and MobileNet-v3 Small, to examine the hypothesis. The study confirmed the superiority of the proposed method, with the highest balanced accuracy of 0.7584 and an overall accuracy of 0.8408 obtained with the ConvNeXtTiny model.
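
The abstract reports both balanced accuracy and overall accuracy. The minimal sketch below illustrates why the two can diverge on imbalanced data. It assumes the usual definitions (overall accuracy as the fraction of correct predictions, balanced accuracy as the unweighted mean of per-class recall), and the function names and toy labels are hypothetical. It is not the authors' evaluation code and does not reproduce the reported HAM10000 figures.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / n for c, n in support.items()]
    return sum(recalls) / len(recalls)


# Toy data (an assumption for illustration): 90 majority vs 10 minority samples.
y_true = ["nv"] * 90 + ["mel"] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = ["nv"] * 100

print(overall_accuracy(y_true, y_pred))   # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

In this toy case the majority-only classifier looks strong on overall accuracy but scores no better than chance on balanced accuracy, which is why a study targeting minority classes would report both.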

Keywords


Skin cancer detection; Class imbalance; HAM10000; Convolutional neural network; Balanced accuracy

DOI: https://doi.org/10.34238/tnu-jst.12337

TNU Journal of Science and Technology