SPARSE STAR COORDINATES: VISUALIZATION FOR HIGH DIMENSION LOW SAMPLE SIZE | Long | TNU Journal of Science and Technology

About this article

Received: 17/04/23                Revised: 24/05/23                Published: 24/05/23

Authors

1. Tran Van Long, University of Transport and Communications, Hanoi
2. Bui Viet Huong, University of Transport and Communications, Hanoi

Abstract


Visual analysis of group structures and trends in high-dimensional data is a central topic in many fields, particularly in genomic data analysis. Gene expression data typically have a small number of observations and a large number of attributes. Traditional statistical methods cannot be applied directly to such high-dimension, low-sample-size data. In this paper, we introduce a new visualization technique for the visual analytics of high-dimension, low-sample-size data. We propose a sparse star coordinates technique, based on star coordinates, in which group structures are preserved through optimal layouts of the star coordinate system in the visual space. Dimensions with larger star coordinate axes are more important in cluster analysis. The sparse star coordinate system is obtained by ranking the dominant attributes according to the quality of the resulting visualization, and is then used to analyze the group structures of high-dimension, low-sample-size data sets. We present the proposed method together with a quality measure and demonstrate its effectiveness on several real data sets.
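
For readers unfamiliar with the two building blocks named in the abstract and keywords, the following is a minimal sketch of a classic star coordinates projection scored with a mean silhouette coefficient. It is not the sparse layout optimization proposed in the paper: the evenly spaced unit axes, the min-max normalization, the function names, and the toy data are all assumptions made for illustration.

```python
import math

def star_axes(n_dims):
    """Unit axis vectors spread evenly around a circle (assumed classic layout)."""
    return [(math.cos(2 * math.pi * j / n_dims), math.sin(2 * math.pi * j / n_dims))
            for j in range(n_dims)]

def min_max_normalize(rows):
    """Rescale every attribute (column) to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi)]
            for r in rows]

def project(rows, axes):
    """Star coordinates: each 2D point is the sum of axis vectors weighted by attribute values."""
    return [(sum(v * ax for v, (ax, _) in zip(r, axes)),
             sum(v * ay for v, (_, ay) in zip(r, axes)))
            for r in rows]

def mean_silhouette(points, labels):
    """Mean silhouette coefficient of labelled 2D points (0 for undefined cases)."""
    scores = []
    for i, (p, li) in enumerate(zip(points, labels)):
        dists = {}  # label -> distances from point i to the other points with that label
        for j, (q, lj) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(lj, []).append(math.dist(p, q))
        if li not in dists or len(dists) < 2:
            scores.append(0.0)  # singleton cluster, or only one cluster present
            continue
        a = sum(dists[li]) / len(dists[li])                      # mean intra-cluster distance
        b = min(sum(d) / len(d) for l, d in dists.items() if l != li)  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: 4 samples, 6 attributes, 2 groups (not from the paper).
rows = [
    [5.0, 1.0, 0.2, 3.0, 7.0, 2.0],
    [5.2, 1.1, 0.1, 3.1, 6.8, 2.2],
    [1.0, 4.0, 0.3, 3.0, 2.0, 2.1],
    [1.2, 4.2, 0.2, 2.9, 2.1, 1.9],
]
labels = [0, 0, 1, 1]

axes = star_axes(len(rows[0]))
points = project(min_max_normalize(rows), axes)
score = mean_silhouette(points, labels)
print(points, score)
```

In this sketch all axes have unit length. Lengthening or shortening an individual axis vector would change how strongly that attribute influences the projection, which is the intuition behind reading longer axes as more important dimensions.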

Keywords


Star coordinates; High dimension low sample size; Data visualization; Silhouette coefficient; Feature importance

DOI: https://doi.org/10.34238/tnu-jst.7768
