From Plain to Sparse Correspondence Analysis: A Principal Component Analysis Approach

Authors

  • Hervé Abdi School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX, USA
  • Vincent Guillemot Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
  • Ruiping Liu School of Applied Science, Beijing Information Science and Technology University, Beijing, China
  • Ndeye Niang Cedric Lab, Conservatoire national des arts et métiers, Paris, France
  • Gilbert Saporta CNAM
  • Ju-Chi Yu Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, Canada

DOI:

https://doi.org/10.26398/IJAS.0035-014

Keywords:

Sparsity, Generalized singular value decomposition, Correspondence analysis, LASSO, Penalized matrix decomposition

Abstract

Correspondence Analysis (CA) is the method of choice to analyze contingency matrices and is widely applied in text analysis, psychometrics, chemometrics, etc. But CA becomes difficult to interpret when the number of rows or columns is large, a configuration routinely found in contemporary statistical practice. For principal component analysis (PCA), this interpretation problem has traditionally been handled with rotation and, more recently, with sparsification methods such as the LASSO. Curiously, despite the strong connections between CA and PCA, sparsifying correspondence analysis remains essentially unexplored. In this paper, we derive an extension of the Penalized Matrix Decomposition (a method based on the singular value decomposition) to sparsify CA. We present some theoretical results and properties of the resulting sparse correspondence analysis and illustrate this new method with an analysis of the causes of death in the United States in 2019.
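
To make the connection between CA and the singular value decomposition concrete, here is a minimal sketch in Python (NumPy). It is not the algorithm derived in the paper. Plain CA is computed in the standard way, as the SVD of the standardized residuals of the contingency table. The sparse step is a simplified rank-one penalized decomposition that alternates soft-thresholding with fixed thresholds, which is an assumption made here for illustration. The function names, the thresholds `lam_u` and `lam_v`, and the table `toy_counts` are all hypothetical.

```python
import numpy as np


def standardized_residuals(counts):
    """Return S = D_r^{-1/2} (P - r c^T) D_c^{-1/2} with the row and column masses."""
    X = np.asarray(counts, dtype=float)
    P = X / X.sum()            # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)          # row masses
    c = P.sum(axis=0)          # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    return S, r, c


def plain_ca(counts):
    """Plain CA: SVD of the standardized residuals, then principal coordinates."""
    S, r, c = standardized_residuals(counts)
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * s) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * s) / np.sqrt(c)[:, None]
    return s, row_coords, col_coords


def soft_threshold(x, lam):
    """LASSO-style shrinkage: entries smaller than lam in magnitude become zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def _unit(x):
    """Scale x to unit Euclidean norm (an all-zero vector is returned unchanged)."""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x


def sparse_rank_one(S, lam_u=0.0, lam_v=0.0, n_iter=200):
    """Simplified rank-one penalized decomposition of S (illustrative only).

    Starts from the leading singular vectors and alternates soft-thresholded
    updates of u and v. With lam_u = lam_v = 0 it returns the leading
    singular triplet of S.
    """
    U, _, Vt = np.linalg.svd(S, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        u = _unit(soft_threshold(S @ v, lam_u))
        v = _unit(soft_threshold(S.T @ u, lam_v))
    return u, v, float(u @ S @ v)


# Hypothetical 4 x 3 contingency table, for illustration only.
toy_counts = np.array([[20, 5, 2],
                       [8, 15, 4],
                       [3, 6, 18],
                       [10, 9, 11]])

S, r, c = standardized_residuals(toy_counts)
singular_values, row_coords, col_coords = plain_ca(toy_counts)
u, v, d = sparse_rank_one(S, lam_u=0.05, lam_v=0.05)
```

In this sketch the squared singular values of `S` sum to the total inertia (the chi-square statistic divided by the grand total), and larger thresholds push more entries of `u` and `v` to exactly zero. The published method works with the generalized SVD and its own constraints, which this simplification does not reproduce.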

Published

2024-07-26

How to Cite

Abdi, H., Guillemot, V., Liu, R., Niang, N., Saporta, G., & Yu, J.-C. (2024). From Plain to Sparse Correspondence Analysis: A Principal Component Analysis Approach. Statistica Applicata - Italian Journal of Applied Statistics, 35(3), 301–338. https://doi.org/10.26398/IJAS.0035-014

Section

Latest articles