The causes, symptoms, tests and remedies of multicollinearity

Recently, in a regression analysis, I found that the sign of a correlation coefficient was opposite to that of the corresponding regression coefficient. After some investigation, this turned out to be a multicollinearity problem, so I looked into how to deal with it.

The relevant points about multicollinearity are summarized below.

Two explanatory variables may be highly correlated in theory while their observed values are not, and vice versa. Multicollinearity is therefore essentially a data problem.

There are several causes of multicollinearity:

1. All explanatory variables share the same time trend;

2. One explanatory variable is the lag of the other, and they tend to follow the same trend;

3. Because the data were collected from too narrow a base, some explanatory variables may move together;

4. There is a linear relationship between some explanatory variables;

Symptoms:

1. The estimated coefficients have the wrong sign;

2. Some important explanatory variables have low t values even though R squared is not low;

3. Deleting an unimportant explanatory variable changes the regression results significantly;
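
The sign symptom can be reproduced on synthetic data. The following is a minimal sketch, not taken from the book: the data-generating process, the noise levels, and the helper names `ols` and `sign_demo` are all assumptions made for illustration. It builds two highly correlated regressors, then compares the simple correlation of x2 with y, the coefficient of x2 in the full regression, and the coefficient of x2 after x1 is deleted.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients with the intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def sign_demo(n=500, seed=0):
    """Hypothetical example: x2 is highly correlated with x1."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = x1 + 0.3 * rng.normal(size=n)            # assumed collinearity strength
    y = 2.0 * x1 - 1.0 * x2 + 0.5 * rng.normal(size=n)

    r_x2_y = np.corrcoef(x2, y)[0, 1]             # simple correlation of x2 with y
    b_full = ols(np.column_stack([x1, x2]), y)    # [intercept, b1, b2]
    b_drop = ols(x2, y)                           # x1 deleted: [intercept, b2]
    return r_x2_y, b_full[2], b_drop[1]

r, b2_full, b2_dropped = sign_demo()
print("corr(x2, y):          ", r)
print("b2 in full model:     ", b2_full)
print("b2 after dropping x1: ", b2_dropped)
```

In this setup the correlation of x2 with y is positive while its coefficient in the full model is negative, and deleting x1 flips the sign of the remaining coefficient. The full-model coefficient is not wrong here; the point is only that a simple correlation and a partial regression coefficient can disagree when regressors are collinear.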

Tests:

1. Correlation analysis: a correlation coefficient above 0.8 between explanatory variables indicates multicollinearity, but low correlation coefficients do not rule it out, since one variable may still be close to a linear combination of several others;

2. VIF (variance inflation factor) test;

3. Condition number test;
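
The three checks above can be sketched with NumPy alone. This is an illustrative sketch under assumptions: the function names `vif` and `condition_number` and the synthetic data are hypothetical, and the condition number shown is one common variant (columns scaled to unit length, no centering); other conventions exist. The VIF of variable j is 1 / (1 - R_j^2), where R_j^2 comes from regressing variable j on the other explanatory variables.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # regress column j on the rest
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r2)
    return out

def condition_number(X):
    """Largest / smallest singular value after scaling columns to unit length."""
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)
    s = np.linalg.svd(Xs, compute_uv=False)         # sorted in descending order
    return s[0] / s[-1]

# Hypothetical data: x2 is almost a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=300)
x2 = x1 + 0.01 * rng.normal(size=300)
x3 = rng.normal(size=300)
X = np.column_stack([x1, x2, x3])

print("pairwise correlations:\n", np.corrcoef(X, rowvar=False))
print("VIF:", vif(X))
print("condition number:", condition_number(X))
```

Commonly quoted rules of thumb, which are not from the source, treat a VIF above about 10 or a condition number above about 30 as a sign of serious multicollinearity; they are guidelines rather than formal tests.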

Remedies:

1. Collect more data;

2. Impose some constraints on the model;

3. Delete one or more collinear variables;

4. Transform the model appropriately;

5. Principal component regression
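
As an illustration of the last remedy, here is a minimal sketch of principal component regression with NumPy. The function name `pcr_fit`, the choice to standardize the regressors, and the synthetic data are assumptions made for this example, and the number of components to keep is left to the user. The idea is to regress y on the leading principal components of the standardized regressors and then map the result back to coefficients on the original variables.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression; returns (intercept, coefficients)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    Z = (X - mu) / sd                               # standardized regressors

    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T                         # keep the leading directions
    T = Z @ V                                       # component scores

    # Regress centered y on the scores, then map back to the original scale.
    gamma, *_ = np.linalg.lstsq(T, y - y.mean(), rcond=None)
    beta = (V @ gamma) / sd
    intercept = y.mean() - mu @ beta
    return intercept, beta

# Hypothetical data: x1 and x2 are nearly identical.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x2 + 0.5 * x3 + 0.1 * rng.normal(size=n)

intercept, beta = pcr_fit(X, y, n_components=2)
print("PCR intercept:", intercept)
print("PCR coefficients:", beta)
```

Dropping the smallest component removes the direction in which x1 and x2 differ, so their combined effect is split almost evenly between them. The price is some bias whenever the true coefficients are not close to the retained subspace.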

The principles for dealing with multicollinearity are as follows:

1. Multicollinearity is universal, and minor multicollinearity needs no corrective measures;

2. Serious multicollinearity can usually be spotted by experience or from the regression results, for example when a coefficient has the wrong sign or an important explanatory variable has a very low t value. Necessary measures should then be taken according to the situation;

3. If the model is used only for prediction, multicollinearity can be left untreated as long as the fit is good, because a model with multicollinearity often still predicts well.

The above is excerpted from "Econometrics Intermediate Course", edited by Pan Shengchu.
