A reaction to a challenging example in multiple regression analysis

Authors

  • Ettore Marubini, Department of Clinical Sciences and Community Health, Laboratory of Medical Statistics, Epidemiology and Biometry G. A. Maccacaro, University of Milan, Milan, Italy
  • Annalisa Orienti, Department of Clinical Sciences and Community Health, Laboratory of Medical Statistics, Epidemiology and Biometry G. A. Maccacaro, University of Milan, Milan, Italy

DOI:

https://doi.org/10.26398/IJAS.0029-005

Keywords:

Linear regression analysis, Regression illustrative example, Hazards in regression analysis, Robust regression, Mixture models

Abstract

In a very stimulating paper, Preece gives an artificial dataset that illustrates the hazards of multiple regression and challenges the reader to spot the simple inbuilt features of these data. The present note aims to find out how Preece generated the whole dataset. First, an OLS regression model is fitted to the data; checking the model assumptions raises doubts about the validity of OLS regression, so robust regression estimators are considered as an alternative. The robust estimators give discordant coefficient estimates but, on closer analysis, they agree in highlighting two subsets within the dataset: 9 cases generated by one model and the remaining 8 cases generated by a second model. A mixture model also recognizes this pattern in the data.
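
The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It makes several assumptions that are not taken from the paper: the data are hypothetical (they are not Preece's dataset), there is a single predictor rather than several, and the robust estimator is an elemental-subset approximation of a least-median-of-squares line, which may differ from the estimators the authors used. The two generating lines, the noise values, the 0.5 residual cutoff, and the function names `ols_line`, `lms_line` and `residuals` are all chosen for illustration only.

```python
from itertools import combinations
from statistics import median

# Hypothetical data, NOT Preece's dataset: 17 cases from two straight lines.
noise_a = [0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05]
noise_b = [-0.1, 0.1, -0.05, 0.05, 0.0, -0.1, 0.1, -0.05]
group_a = [(x, 1 + 2 * x + e) for x, e in zip(range(1, 10), noise_a)]  # 9 cases
group_b = [(x, 20 - x + e) for x, e in zip(range(1, 9), noise_b)]      # 8 cases
data = group_a + group_b


def ols_line(points):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(points, intercept, slope):
    """Observed minus fitted values for a given line."""
    return [y - (intercept + slope * x) for x, y in points]


def lms_line(points):
    """Approximate least-median-of-squares line.

    Tries the line through every pair of points with distinct x and keeps
    the one whose median squared residual over all points is smallest.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        crit = median(r * r for r in residuals(points, intercept, slope))
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best[1], best[2]


ols_fit = ols_line(data)
lms_fit = lms_line(data)
ols_res = residuals(data, *ols_fit)
lms_res = residuals(data, *lms_fit)

# Cases far from the robust line are candidates for a second generating model.
cutoff = 0.5  # illustrative threshold, chosen for these synthetic data
majority = [i for i, r in enumerate(lms_res) if abs(r) <= cutoff]
minority = [i for i, r in enumerate(lms_res) if abs(r) > cutoff]

print("OLS (intercept, slope):", ols_fit)
print("LMS (intercept, slope):", lms_fit)
print("cases close to the robust line:", majority)
print("cases far from the robust line:", minority)
```

In this synthetic setting the pooled OLS line fits neither group well, while the robust line follows the 9-case majority and leaves the other 8 cases with large residuals. Refitting each subset separately, or fitting a two-component mixture of regressions, would be the natural next step.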

Published

2020-02-17

How to Cite

Marubini, E., & Orienti, A. (2020). A reaction to a challenging example in multiple regression analysis. Statistica Applicata - Italian Journal of Applied Statistics, 29(1), 95–106. https://doi.org/10.26398/IJAS.0029-005
