Checking for multicollinearity

What is the Variance Inflation Factor (VIF) Test?

The Variance Inflation Factor (VIF) is a diagnostic that you can use to check for multicollinearity in your regression model. Multicollinearity occurs when two or more independent variables in a regression model are highly correlated with each other. This inflates the standard errors of the regression coefficients, so the estimates become unstable and hard to interpret.
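For reference, the standard textbook definition of the VIF can be written as follows. The notation is an assumption introduced here for illustration: j indexes the independent variables, and R_j^2 is the coefficient of determination obtained when variable j is regressed on all other independent variables in the model.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

If variable j is uncorrelated with the other independent variables, R_j^2 = 0 and the VIF equals 1. As R_j^2 approaches 1, the VIF grows without bound.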

Before you can perform the VIF test, you must:

  • Have created a linear regression model.
  • Have several independent variables that you think might be correlated with each other.

How is the test implemented?

Implementation in R with an example

In R, you can calculate the VIF using the car package:

# Install (only needed once) and load the package
install.packages("car")
library(car)

# Create a fictional dataset
set.seed(123)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.5)
y <- 3 + 2 * x1 + 0.5 * x2 + rnorm(100)
data <- data.frame(y, x1, x2)

# Create regression model
model <- lm(y ~ x1 + x2, data=data)

# Calculate VIF
vif(model)

This code will return the VIF value for each independent variable in your model.
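To see what such a calculation involves, the following minimal sketch derives the VIF for x1 by hand in base R, without the car package. It recreates the simulated predictors from the example above and applies the standard definition, VIF = 1 / (1 - R²), where R² comes from regressing x1 on the other independent variable. The names aux_model, r_squared, vif_x1, and tolerance_x1 are illustrative and are not part of any package.

```r
# Recreate the simulated predictors from the example above
set.seed(123)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.5)

# Auxiliary regression: explain x1 by the other independent variable
aux_model <- lm(x1 ~ x2)
r_squared <- summary(aux_model)$r.squared

# VIF = 1 / (1 - R^2) of the auxiliary regression
vif_x1 <- 1 / (1 - r_squared)

# Tolerance is the reciprocal of the VIF
tolerance_x1 <- 1 / vif_x1

print(vif_x1)
print(tolerance_x1)
```

With more than two independent variables, the auxiliary regression would include all remaining independent variables on the right-hand side, and the calculation would be repeated for each variable in turn.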

Implementation in SPSS

In SPSS:

Conduct a linear regression: “Analyze” > “Regression” > “Linear”.

Add the dependent variable and the independent variables.

Under “Statistics”, select “Collinearity Diagnostics” and click “Continue”.

Click “OK”.

In the results, you will find the VIF value for each independent variable in the “Coefficients” table, next to the tolerance value.

Implementation in JASP

In JASP:

Select “Regression” > “Linear Regression”.

Add the dependent variable and the independent variables.

Activate the “Collinearity” option.

The “Collinearity” table will show you the VIF for each variable.

How do you interpret and report the Variance Inflation Factor?

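Suppose you get a VIF value of 5 for x1 and 4.5 for x2. These values are hypothetical and assume a model with additional independent variables: in a model with only two independent variables, such as the R example above, both VIF values are always identical.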

A VIF value of 1 means that a variable is uncorrelated with the other independent variables, so there is no multicollinearity. Values above 10 are commonly treated as a cause for concern, although some experts are cautious even with values above 5. In our example, the values of 5 and 4.5 are close to that stricter threshold and suggest that there may be multicollinearity involving x1 and x2.
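These thresholds are easier to judge when translated back into shared variance. The worked arithmetic below assumes the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the share of variance in variable j that is explained by the other independent variables. The notation is introduced here for illustration.

```latex
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}
\qquad\Longrightarrow\qquad
\begin{aligned}
\mathrm{VIF}_j = 10  &: \quad R_j^2 = 1 - \tfrac{1}{10} = 0.90 \\
\mathrm{VIF}_j = 5   &: \quad R_j^2 = 1 - \tfrac{1}{5} = 0.80 \\
\mathrm{VIF}_j = 4.5 &: \quad R_j^2 = 1 - \tfrac{1}{4.5} \approx 0.78
\end{aligned}
```

A VIF of 5 therefore means that 80% of the variance in that variable is already explained by the other independent variables. Because the VIF scales the variance of the coefficient estimate, the standard error is inflated by its square root, which is about 2.24 for a VIF of 5.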

In APA format, you would write: “A check for multicollinearity revealed VIF values of 5 for x1 and 4.5 for x2, indicating potential multicollinearity between these variables.”

Conclusion

The Variance Inflation Factor (VIF) is an indispensable tool if you want to ensure the reliability of your regression model. Multicollinearity can distort results and lead to incorrect conclusions. It is therefore crucial to check for it and correct it if necessary. Whether in R, SPSS, or JASP, the VIF provides clarity on the presence of multicollinearity in your model.