From Plain to Sparse Correspondence Analysis: A Principal Component Analysis Approach
DOI:
https://doi.org/10.26398/IJAS.0035-014
Keywords:
Sparsity, Generalized singular value decomposition, Correspondence analysis, LASSO, Penalized matrix decomposition
Abstract
Correspondence Analysis (CA) is the method of choice to analyze contingency matrices and is widely applied in text analysis, psychometrics, chemometrics, and other fields. However, CA becomes difficult to interpret when the number of rows or columns is large, a configuration routinely found in contemporary statistical practice. For principal component analysis (PCA), this interpretation problem has traditionally been handled with rotation and, more recently, with sparsification methods such as the LASSO. Curiously, despite the strong connections between CA and PCA, sparsifying CA remains essentially unexplored. In this paper, we derive an extension of the Penalized Matrix Decomposition (a method based on the singular value decomposition) to sparsify CA. We present some theoretical results and properties of the resulting sparse correspondence analysis and illustrate the new method with an analysis of the causes of death in the United States in 2019.
License
Copyright (c) 2024 Statistica Applicata - Italian Journal of Applied Statistics
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.