How do you identify and deal with multicollinearity in your models?
The correlation coefficient measures the strength of the linear relationship between two variables. The easiest way to find the correlations is the cor() function, e.g. cor(dataset_name), which returns the pairwise correlation matrix when every column of the data set is numeric. The value of the correlation coefficient, denoted r, ranges from -1 to +1; its magnitude gives the strength of the relationship and its sign tells you whether the relationship is positive or negative. When r is greater than zero, the relationship is positive; when r is less than zero, it is negative. A value of zero indicates that there is no linear relationship between the two variables. Predictors that are strongly correlated with each other (|r| close to 1) are the ones that signal multicollinearity.
The correlation between two variables ranges from -1 to 1.
If the value is:
r > 0: positive relationship
r < 0: negative relationship
r = 0: no linear relationship
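To make the cor() call concrete, here is a minimal sketch. It assumes R's built-in mtcars data set as a stand-in for your own data; the column selection is purely illustrative and not from the original example.

```r
# mtcars ships with R; it stands in for your own data frame here.
vars <- mtcars[, c("mpg", "wt", "hp", "qsec")]

# cor() on an all-numeric data frame returns the full pairwise
# correlation matrix (Pearson by default).
r <- cor(vars)
print(round(r, 2))

# A single pair can also be checked directly.
r_wt_mpg <- cor(vars$wt, vars$mpg)
print(r_wt_mpg)   # negative: heavier cars tend to have lower mpg
```

The diagonal is always 1 (each variable against itself), and the matrix is symmetric, so only the upper or lower triangle needs to be read.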
As you can see in the row below, history_avg_timeonsite has a high correlation with one variable (about 0.60) compared with its correlation with today_sessions (about 0.05).
history_avg_timeonsite 0.0461329219 0.595848332
Apart from this, you can also plot a correlation graph using the package below.
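The package is not named in the text here, so as a stand-in the sketch below draws a correlation heatmap with base R graphics only. It assumes the built-in mtcars data set; a dedicated plotting package would likely give a more polished figure.

```r
# mtcars ships with R; it stands in for your own data frame here.
r <- cor(mtcars[, c("mpg", "wt", "hp", "disp", "qsec")])

# Draw the matrix as a heatmap: no clustering or reordering (Rowv/Colv = NA)
# and no rescaling, so each cell's colour reflects the raw correlation value.
heatmap(r, Rowv = NA, Colv = NA, scale = "none", symm = TRUE,
        main = "Correlation matrix")
```

Cells at either end of the colour scale mark the strongly correlated pairs worth a closer look.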
So, by looking at the correlation values, you can drop one variable from each highly correlated pair, refit the model, and check whether its accuracy holds up.
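The elimination step can be sketched as follows. This is only an illustration under stated assumptions: it uses the built-in mtcars data set, a cutoff of 0.8 chosen for the example (not a fixed rule), and adjusted R-squared as the accuracy measure.

```r
# mtcars ships with R; mpg is treated as the response, the rest as predictors.
predictors <- mtcars[, c("wt", "hp", "disp", "qsec")]
r <- cor(predictors)

# List each predictor pair once (upper triangle) whose |r| exceeds the cutoff.
cutoff <- 0.8   # illustrative threshold, not a fixed rule
idx <- which(abs(r) > cutoff & upper.tri(r), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(r)[idx[, "row"]],
  var2 = colnames(r)[idx[, "col"]],
  r    = r[idx]
)
print(high_pairs)

# Fit the model with and without one variable from a flagged pair and compare.
full    <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
reduced <- lm(mpg ~ wt + hp + qsec, data = mtcars)   # disp dropped
adj_r2  <- c(full    = summary(full)$adj.r.squared,
             reduced = summary(reduced)$adj.r.squared)
print(adj_r2)
```

If the reduced model performs about as well as the full one, the dropped variable was carrying mostly redundant information.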