Volume 7, Issue 2 p. 137-152
Advanced Review

Variable importance in regression models

Ulrike Grömping

Department II – Mathematics, Physics, Chemistry, Beuth University of Applied Sciences, Berlin, Germany

Correspondence to: [email protected]
First published: 06 February 2015
Additional Supporting Information may be found in the online version of this article.

Conflict of interest: The author has declared no conflicts of interest for this article.

Abstract

Regression analysis is one of the most widely used statistical methods. Often, part of the research question is to identify the most important regressors or to rank the regressors by importance. Most regression models are not specifically suited to answering the variable importance question, so many different proposals have been made. This article reviews in detail the various variable importance metrics for the linear model, with particular emphasis on variance decomposition metrics. All linear model metrics are illustrated by an example analysis. For nonlinear parametric models, several principles from linear models have been adapted, and machine-learning methods have their own set of variable importance methods. These are also briefly covered. Although there are many variable importance metrics, there is still no convincing theoretical basis for them, and they all have a heuristic touch. Nevertheless, some metrics are considered useful for a crude assessment in the absence of a good subject matter theory.

WIREs Comput Stat 2015, 7:137–152. doi: 10.1002/wics.1346

This article is categorized under:

  • Statistical and Graphical Methods of Data Analysis > Multivariate Analysis