Multiple regression - week3
We now have evidence that breast cancer rate is significantly associated with urban rate and income per person. But what if urban rate is responsible for the association and not income per person?
We used multiple regression to evaluate both predictors in the same model.
Looking at the confidence intervals, the interval for urban rate does not include 0, so we can rule out the possibility that the association between urban rate and breast cancer rate is 0.
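As a rough illustration of that check, here is a minimal sketch that fits a two-predictor regression and tests whether each interval excludes 0. It uses NumPy on synthetic data, not the data from this post; the variable names and the 1.96 normal approximation for the interval are assumptions made for the example.

```python
import numpy as np

# Synthetic stand-in data (NOT the post's dataset); names are placeholders.
rng = np.random.default_rng(0)
n = 200
urban_rate = rng.uniform(10, 100, n)
income = rng.uniform(1, 40, n)
breast_cancer = 10 + 0.4 * urban_rate + 0.8 * income + rng.normal(0, 8, n)

# Design matrix: intercept column plus both predictors.
X = np.column_stack([np.ones(n), urban_rate, income])
beta, *_ = np.linalg.lstsq(X, breast_cancer, rcond=None)

# Standard errors from the residual variance and (X'X)^-1.
resid = breast_cancer - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Approximate 95% intervals (1.96 is a normal approximation to the t quantile).
ci_low = beta - 1.96 * se
ci_high = beta + 1.96 * se

for name, lo, hi in zip(["intercept", "urban_rate", "income"], ci_low, ci_high):
    excludes_zero = bool(lo > 0 or hi < 0)
    print(f"{name}: [{lo:.3f}, {hi:.3f}] excludes 0: {excludes_zero}")
```

If the interval for a predictor stays on one side of 0 while the other predictor is in the model, that predictor keeps its association after adjustment.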
To check whether the association is linear or curvilinear, we added a polynomial (quadratic) term for urban rate. From the graph below we can see that a straight line is the best fit.
From the regression result below we can see that R-squared doesn't increase much, from .325 to .357, i.e. adding the quadratic term increases the amount of variability in breast cancer rate explained by the model by just 3.2 percentage points.
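The comparison can be sketched as follows: fit the model with and without the squared term and look at the change in R-squared. This is an illustration on synthetic data with placeholder names, so the numbers will not match the ones quoted in this post; centering the predictor before squaring is a choice made for the example.

```python
import numpy as np

def r_squared(X, y):
    """Share of the variance in y explained by a least-squares fit on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data with a truly linear relationship.
rng = np.random.default_rng(1)
n = 200
urban_rate = rng.uniform(10, 100, n)
y = 15 + 0.4 * urban_rate + rng.normal(0, 10, n)

# Center the predictor so the linear and squared terms are less correlated.
x_c = urban_rate - urban_rate.mean()
X_lin = np.column_stack([np.ones(n), x_c])
X_quad = np.column_stack([np.ones(n), x_c, x_c ** 2])

r2_lin = r_squared(X_lin, y)
r2_quad = r_squared(X_quad, y)
print(f"linear R^2:    {r2_lin:.3f}")
print(f"quadratic R^2: {r2_quad:.3f}")
print(f"gain:          {r2_quad - r2_lin:.3f}")
```

R-squared can never go down when a term is added, so a small gain by itself is weak evidence for curvature.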
The residual plot below shows that the residuals do not follow a straight line, i.e. they are not perfectly normally distributed, which means that the association we observed earlier in the scatter plot may not be fully estimated by the quadratic term. There might be other explanatory variables to consider.
The plot below shows the outliers by plotting the standardized residuals (mean = 0 and sd = 1) for each observation. There are two values more than 3 standard deviations from the mean, which could be extreme outliers.
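A minimal sketch of this outlier check, on synthetic data with two planted outliers (the indices and sizes are made up for the example). It uses a plain rescaling of the residuals to mean 0 and sd 1; statistical packages often report studentized residuals, which also adjust for leverage, so treat this as a simplified version.

```python
import numpy as np

# Synthetic stand-in data; the two planted outliers are hypothetical.
rng = np.random.default_rng(2)
n = 100
urban_rate = rng.uniform(10, 100, n)
y = 15 + 0.4 * urban_rate + rng.normal(0, 5, n)
y[10] += 40   # planted high outlier
y[50] -= 40   # planted low outlier

# Fit the simple regression and compute the residuals.
X = np.column_stack([np.ones(n), urban_rate])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Rescale the residuals to mean 0 and standard deviation 1.
z = (resid - resid.mean()) / resid.std(ddof=1)

# Flag observations more than 3 standard deviations from 0.
outliers = np.flatnonzero(np.abs(z) > 3)
print("flagged observations:", outliers.tolist())
```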
The partial regression plot below shows the effect of adding urban rate as an additional variable. The plot shows a linear pattern for urban rate.
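For readers wondering what a partial regression plot actually draws, here is a small sketch of the underlying calculation, assuming synthetic data and placeholder names. Both the response and urban rate are regressed on the other predictor, and the two sets of residuals are the points on the plot. The slope through those points equals the urban rate coefficient from the full model.

```python
import numpy as np

def residuals(X, y):
    """Residuals of a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Synthetic stand-in data with two correlated predictors.
rng = np.random.default_rng(3)
n = 150
income = rng.uniform(1, 40, n)
urban_rate = 20 + 1.5 * income + rng.normal(0, 10, n)
y = 10 + 0.4 * urban_rate + 0.8 * income + rng.normal(0, 8, n)

# Full model: intercept, urban rate, income.
X_full = np.column_stack([np.ones(n), urban_rate, income])
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Remove the part of y and of urban rate that income already explains.
X_other = np.column_stack([np.ones(n), income])
y_resid = residuals(X_other, y)
x_resid = residuals(X_other, urban_rate)

# Slope of the partial regression line (through the origin of the residuals).
partial_slope = (x_resid @ y_resid) / (x_resid @ x_resid)
print(f"urban rate coefficient in full model: {beta_full[1]:.4f}")
print(f"slope of partial regression line:     {partial_slope:.4f}")
```

Plotting y_resid against x_resid gives the picture itself; a roughly straight cloud of points suggests a linear contribution.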
The leverage plot also shows that there are a few outliers.
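Leverage itself can be computed from the design matrix alone. The sketch below uses synthetic data with one planted extreme predictor value, and the cutoff of twice the average leverage is a common rule of thumb rather than anything taken from this post.

```python
import numpy as np

# Synthetic stand-in data; observation 0 gets a hypothetical extreme value.
rng = np.random.default_rng(4)
n = 100
urban_rate = rng.uniform(10, 100, n)
urban_rate[0] = 500.0

X = np.column_stack([np.ones(n), urban_rate])
p = X.shape[1]

# Leverage is the diagonal of the hat matrix X (X'X)^-1 X'.
# With X = QR, that diagonal is the row-wise sum of squares of Q.
Q, _ = np.linalg.qr(X)
leverage = (Q ** 2).sum(axis=1)

# Rule of thumb: flag points with more than twice the average leverage (p / n).
threshold = 2 * p / n
high_leverage = np.flatnonzero(leverage > threshold)
print("high-leverage observations:", high_leverage.tolist())
```

A point with high leverage is only a problem if its residual is also large, which is why leverage plots show both quantities together.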