Request

To request a blog written on a specific topic, please email James@StatisticsSolutions.com with your suggestion. Thank you!

Thursday, January 8, 2009

Linear Regression Analysis and Logistic Regression Analysis

In this blog I discuss linear regression analysis, multiple regression, and logistic regression analysis: what each technique does, how they differ, and how to interpret the SPSS regression output. At Statistics Solutions we hope you glean a few ideas here.

Linear Regression Analysis in SPSS

Linear regression analysis is a statistical technique that assesses the impact of a predictor variable (the independent variable) on a criterion variable (the dependent variable). Importantly, the independent variable must be continuous (interval-level or ratio-level) or dichotomous, and the dependent variable must be continuous (interval-level or ratio-level). Dissertation students often have research questions that are appropriate to this technique. For example, a dissertation research question might ask what impact smoking has on life expectancy. In this example, smoking is the predictor variable and life expectancy is the criterion variable.

Linear Regression Analysis Assumptions

There are three primary assumptions associated with linear regression: no outliers, linearity, and constant variance. Linear regression analysis is very sensitive to outliers. The easiest way to identify outliers is to standardize the scores by requesting z-scores from SPSS. Any score with an absolute z-value greater than 3 is probably an outlier and should be considered for deletion. The assumptions of linearity and constant variance can be assessed in SPSS by requesting a plot of the standardized residuals (“z-resid” on the y-axis) against the standardized predicted values (“z-pred” on the x-axis). If the scatter plot is neither U-shaped (which would indicate non-linearity) nor cone-shaped (which would indicate non-constant variance), the assumptions are considered met.
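The z-score screen described above is easy to reproduce outside SPSS: standardize each score as (score − mean) / SD and flag anything with an absolute z-value above 3. A small Python sketch with made-up data (the sample deliberately contains one extreme value):

```python
import statistics

# Twenty typical scores plus one extreme value (all numbers made up).
scores = [10.0] * 20 + [100.0]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)      # sample standard deviation

# Flag cases whose absolute z-score exceeds 3 as candidate outliers.
outliers = [s for s in scores if abs((s - mean) / sd) > 3]
print(outliers)                    # [100.0]
```

Note that a single extreme value inflates the standard deviation, so in very small samples an obvious outlier may not cross the |z| > 3 line; the cutoff is a screening rule, not a proof.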

Multiple Linear Regression Analysis

Multiple linear regression is a statistical technique similar to linear regression except that there can be more than one predictor variable. The assumptions regarding outliers, linearity, and constant variance still need to be met. One additional assumption that needs to be examined is multicollinearity: the extent to which the predictor variables are related to each other. Multicollinearity can be assessed by asking SPSS for the Variance Inflation Factor (VIF). While different researchers have different criteria for what constitutes too high a VIF, a VIF of 10 or greater is certainly reason for pause. If the VIF is 10 or greater, consider collapsing the offending variables into a single measure.
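In the special case of exactly two predictors, the VIF has a simple closed form: VIF = 1 / (1 − r²), where r is the Pearson correlation between the two predictors. A small Python sketch with made-up, nearly collinear predictors shows how a high correlation pushes the VIF past the 10 cutoff:

```python
import math

# Two hypothetical, nearly collinear predictors (all numbers made up).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.0, 8.1, 9.9]

n = len(x1)
m1 = sum(x1) / n
m2 = sum(x2) / n

# Pearson correlation between the two predictors.
sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
sxx = sum((a - m1) ** 2 for a in x1)
syy = sum((b - m2) ** 2 for b in x2)
r = sxy / math.sqrt(sxx * syy)

# With two predictors, VIF = 1 / (1 - r^2); 10 or greater is reason for pause.
vif = 1.0 / (1.0 - r ** 2)
print(r > 0.99, vif > 10)          # True True
```

With more than two predictors, SPSS computes each VIF from the R-square of regressing that predictor on all the others, but the 1 / (1 − R²) form is the same.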

Regression Analysis Interpretation

When I speak with dissertation students about their regression analysis, there are four aspects of the SPSS output that I want them to interpret. First is the ANOVA. The ANOVA tells the researcher whether the model is statistically significant, i.e., whether the F-value has an associated probability of .05 or less. The second thing to look for is the R-square value, also called the coefficient of determination. The coefficient of determination is a number between 0 and 1 which, multiplied by 100, indicates what percentage of the variability in the criterion variable can be accounted for by the predictor variable(s). The third aspect to interpret is whether the beta coefficient is statistically significant; the beta’s significance can be found by examining the t-value and its associated significance level for that particular predictor. Fourth, interpret the beta itself, including whether it is positive or negative.
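The R-square figure in the SPSS output can be reproduced from first principles as R² = 1 − (residual sum of squares / total sum of squares). A minimal Python sketch using made-up observed values and model predictions:

```python
# Observed values and the model's predicted values (all numbers made up).
y = [80.0, 76.0, 73.0, 68.0]
y_hat = [79.875, 76.125, 72.375, 68.625]

mean_y = sum(y) / len(y)

# R^2 = 1 - (residual sum of squares / total sum of squares).
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

# Multiply by 100 to read it as "percent of variability accounted for".
print(round(100 * r_squared, 1))   # 98.9
```

Here the predictions sit very close to the observed values, so nearly 99% of the variability in the criterion is accounted for by the predictor.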

Logistic Regression Analysis in SPSS

Logistic regression, also called binary logistic regression, is a statistical technique that assesses the impact of a predictor variable (the independent variable) on a criterion variable (the dependent variable). As in linear regression analysis, the independent variable must be continuous (interval-level or ratio-level) or dichotomous. The difference is that the dependent variable must be dichotomous (i.e., a binary variable). For example, a researcher may want to know whether age predicts the likelihood of going to a doctor (yes vs. no).
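Under the hood, logistic regression models the probability of the “yes” outcome with the logistic (sigmoid) function: p = 1 / (1 + e^−(b0 + b1·x)). The coefficients in this Python sketch are invented purely to illustrate the shape of the model, not estimates from any real data:

```python
import math

# Hypothetical fitted coefficients (made up for illustration).
b0 = -4.0    # intercept
b1 = 0.08    # coefficient for age

def predicted_probability(age):
    """Probability of the 'yes' outcome (e.g., visiting a doctor) at a given age."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * age)))

# At age 50 the linear predictor b0 + b1*age is exactly 0, so the
# probability is 0.5, and it rises as age increases.
print(predicted_probability(50))                              # 0.5
print(predicted_probability(60) > predicted_probability(50))  # True
```

The sigmoid keeps every predicted probability between 0 and 1, which is exactly why an ordinary linear model is unsuitable for a binary outcome.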

Binary Logistic Regression Analysis Interpretation

While binary logistic regression and linear regression analyses differ in the type of criterion variable, there are other differences as well. In logistic regression, to assess whether the model is statistically significant, you look at the chi-square test and whether it is statistically significant; the chi-square in logistic regression analysis is analogous to the ANOVA test in linear regression. The next thing to examine is the Nagelkerke R-square statistic, which is somewhat analogous to the R-square value in linear regression analysis. Next, interpret whether the beta coefficient(s) are statistically significant. If so, look at Exp(B), which tells you the factor by which the odds of the outcome change for a one-unit increase in the predictor.
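The Exp(B) interpretation can be checked numerically: for a one-unit increase in the predictor, the odds p / (1 − p) are multiplied by exactly e^B, regardless of the starting value of the predictor. A short Python sketch with made-up coefficients:

```python
import math

# Hypothetical logistic coefficients (made up for illustration).
b0 = -4.0
b1 = 0.08

def odds(x):
    """Odds p / (1 - p) of the outcome at predictor value x."""
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return p / (1.0 - p)

# The odds ratio for a one-unit increase equals Exp(B) = e^b1.
odds_ratio = odds(51) / odds(50)
print(round(odds_ratio, 4), round(math.exp(b1), 4))   # 1.0833 1.0833
```

Note the distinction this makes concrete: Exp(B) multiplies the *odds*, not the probability, so describing the outcome as “Exp(B) times more likely” is only an approximation when the outcome is rare.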