September 11, 2023

The Federal Information Processing Standards (FIPS) codes common to all data sets yield 354 rows that contain information on all 3 variables: %obesity, %inactivity and %diabetes. Both the diabetes and inactivity data sets have a relatively large number of data points. The %diabetes data has a kurtosis of approximately 4, slightly higher than the value of 3 for a normal distribution, and a quantile-quantile plot reveals a significant departure from normality. The %inactivity data has a kurtosis of about 2, somewhat lower than the value of 3 for a normal distribution. Kurtosis matters here because it measures how heavy the tails of a distribution are compared with a normal distribution.

Linear least squares is a technique for finding the line that best describes the relationship between the variables by minimizing the sum of the squared differences between the predicted and observed values of the data points. In any linear model it is very important to examine the residuals, which represent the error, or unexplained variation, in the data. The residuals here have a kurtosis of 4.07, higher than 3, and together with the quantile plot this indicates that the residuals deviate from normality. This can create issues when testing for heteroscedasticity.

Checking for heteroscedasticity is important to ensure that the assumptions of the model are met and that the results are valid. The Breusch-Pagan test checks for heteroscedasticity analytically. Graphically, when we plot the residuals against the predicted values from the linear model, the fanning out of the residuals as the fitted values get larger indicates heteroscedasticity, which means the linear model is not reliable.
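The checks described above (kurtosis, a least squares fit, residuals, and a Breusch-Pagan statistic) can be sketched in plain Python. This is only an illustration under stated assumptions: the data is synthetic and generated by a hypothetical helper `make_synthetic`, not the obesity, inactivity and diabetes data; all function names are made up for this example; and the Breusch-Pagan statistic is computed in its n times R-squared (studentized) form with a single predictor, which may differ from the variant a statistics package reports.

```python
import math
import random


def pearson_kurtosis(values):
    """Fourth central moment divided by squared variance (about 3 for normal data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Observed minus fitted values from the least squares line."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def breusch_pagan(x, resid):
    """Regress squared residuals on x; return (n * R^2, p-value)."""
    squared = [e * e for e in resid]
    aux_resid = residuals(x, squared)  # residuals of the auxiliary regression
    mean_sq = sum(squared) / len(squared)
    ss_total = sum((s - mean_sq) ** 2 for s in squared)
    ss_resid = sum(a * a for a in aux_resid)
    lm = max(len(squared) * (1.0 - ss_resid / ss_total), 0.0)
    # Chi-square survival function with 1 degree of freedom (one predictor).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


def make_synthetic(n=354, seed=0):
    """Hypothetical data whose noise grows with x, so residuals fan out."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 10.0) for _ in range(n)]
    y = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = make_synthetic()
    resid = residuals(x, y)
    lm, p_value = breusch_pagan(x, resid)
    print("residual kurtosis:", pearson_kurtosis(resid))
    print("Breusch-Pagan LM:", lm, "p-value:", p_value)
```

A small p-value suggests that the residual variance changes with the predictor, which is the analytical counterpart of the fanning pattern in the residual plot. Because the synthetic noise is built to grow with x, the sketch should flag heteroscedasticity; real data needs the same checks run on the actual residuals.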
