... | @@ -16,9 +16,9 @@ Multicollinearity occurs when two or more predictors in a regression model are h |
   - **Formula**: The Pearson correlation between two variables $X$ and $Y$ is calculated as:

     $$
     r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}
     $$

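The Pearson formula can be sketched directly in plain Python. This is a minimal illustration, not a replacement for a statistics library. The function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation computed term by term from the formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov_sum = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov_sum / math.sqrt(ss_x * ss_y)


# Hypothetical data: y is roughly 2 * x, so r should be close to 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)
print(round(r, 3))
```

Applying this to every pair of predictors gives the entries of a correlation matrix. Values of $|r|$ close to 1 flag pairs worth a closer look.
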
2. **Variance Inflation Factor (VIF)**:
   - **How it works**: VIF quantifies multicollinearity by measuring how much the variance of a regression coefficient is inflated by that predictor's correlation with the other predictors. As a rule of thumb, a VIF above 5 warrants attention and a VIF above 10 indicates problematic multicollinearity.
... | @@ -26,9 +26,9 @@ Multicollinearity occurs when two or more predictors in a regression model are h |
   - **Formula**: The VIF for a predictor $X_j$ is calculated as:

     $$
     VIF_j = \frac{1}{1 - R_j^2}
     $$

     where $R_j^2$ is the R-squared value obtained by regressing $X_j$ on all other predictors. A high $R_j^2$ means $X_j$ is largely explained by the other predictors, so $VIF_j$ is large.

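The VIF definition can be sketched in plain Python by running the auxiliary regression explicitly. This is an illustrative sketch only: the helper names (`solve_linear_system`, `r_squared`, `vif`) and the toy data are assumptions for this example, and ordinary least squares is solved through the normal equations for brevity. A real analysis would more likely use a statistics library.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row_idx: abs(m[row_idx][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row_idx in range(col + 1, n):
            factor = m[row_idx][col] / m[col][col]
            for c in range(col, n + 1):
                m[row_idx][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(target, others):
    """R^2 from an OLS regression of `target` on `others` plus an intercept."""
    n = len(target)
    rows = [[1.0] + [col[i] for col in others] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * t for row, t in zip(rows, target)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_t = sum(target) / n
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


def vif(predictors, j):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on all the others."""
    others = [col for i, col in enumerate(predictors) if i != j]
    return 1.0 / (1.0 - r_squared(predictors[j], others))


# Hypothetical predictors: x2 is roughly 2 * x1, x3 is unrelated to both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x3 = [5.0, 1.0, 3.0, 3.0, 1.0, 5.0]
predictors = [x1, x2, x3]
vifs = [vif(predictors, j) for j in range(len(predictors))]
print([round(v, 2) for v in vifs])
```

In this toy data the two nearly collinear predictors get very large VIFs, while the unrelated one stays at about 1, the minimum possible value.
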
... | @@ -41,9 +41,10 @@ Multicollinearity occurs when two or more predictors in a regression model are h |
4. **Tolerance**:
   - **How it works**: Tolerance is the reciprocal of the VIF and indicates the proportion of variance in one predictor that is not explained by the other predictors. Low tolerance (below 0.1, equivalent to a VIF above 10) suggests multicollinearity.
   - **Formula**:

     $$
     Tolerance_j = \frac{1}{VIF_j} = 1 - R_j^2
     $$

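A short worked example may help connect tolerance, $R_j^2$, and VIF. The function names and the list of $R_j^2$ values below are hypothetical, chosen only to show how the 0.1 tolerance cutoff lines up with a VIF of 10.

```python
def tolerance_from_r_squared(r_squared):
    """Tolerance_j = 1 - R_j^2."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must lie in [0, 1]")
    return 1.0 - r_squared


def vif_from_tolerance(tolerance):
    """VIF_j is the reciprocal of Tolerance_j."""
    if tolerance <= 0.0:
        raise ValueError("VIF is undefined when tolerance is zero")
    return 1.0 / tolerance


# Illustrative R_j^2 values from hypothetical auxiliary regressions.
for r2 in (0.0, 0.5, 0.8, 0.95, 0.99):
    tol = tolerance_from_r_squared(r2)
    flag = "low tolerance" if tol < 0.1 else "ok"
    print(f"R_j^2={r2:.2f}  tolerance={tol:.2f}  VIF={vif_from_tolerance(tol):.1f}  {flag}")
```

Because the two measures are reciprocals, reporting both adds no new information. Which one to report is a matter of convention.
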
### Common Issues
... | | ... | |