secondassignment
secondassignment · 3 years ago
Second assignment
The explanatory variable is how often the participants did hobbies (H1DA2), recoded into two categories: 0 = not at all, and 1 = 5 or more times a week.
The response variable is self-rated general health (H1GH1), with five categories: 1 = excellent, 2 = very good, 3 = good, 4 = fair, and 5 = poor. Lower scores therefore mean better health.
1) The program:
import pandas
import statsmodels.formula.api as smf

pandas.set_option('display.float_format', lambda x: '%.2f' % x)

data = pandas.read_csv('addhealth_pds.csv')

# Keep H1DA2 = 0 (not at all) and 3, then recode 3 as 1 (5 or more times a week)
data = data[(data.H1DA2 != 1) & (data.H1DA2 != 2) & (data.H1DA2 != 6) & (data.H1DA2 != 8)].copy()
data['H1DA2'] = data['H1DA2'].replace([3], 1)
print(data.pivot_table(columns=['H1DA2'], aggfunc='size'))

# Drop H1GH1 code 6, which is not one of the five health categories
data = data[(data.H1GH1 != 6)].copy()
print(data.pivot_table(columns=['H1GH1'], aggfunc='size'))

data['H1DA2'] = pandas.to_numeric(data['H1DA2'], errors='coerce')
data['H1GH1'] = pandas.to_numeric(data['H1GH1'], errors='coerce')

# Note: this label names other variables; the model regresses H1GH1 on H1DA2
print("OLS regression model for the association between urban rate and internet use rate")
reg1 = smf.ols('H1GH1 ~ H1DA2', data=data).fit()
print(reg1.summary())
The output:
OLS regression model for the association between urban rate and internet use rate
                            OLS Regression Results
==============================================================================
Dep. Variable:                  H1GH1   R-squared:                       0.016
Model:                            OLS   Adj. R-squared:                  0.015
Method:                 Least Squares   F-statistic:                     46.27
Date:                Wed, 16 Mar 2022   Prob (F-statistic):           1.25e-11
Time:                        19:07:35   Log-Likelihood:                -3848.3
No. Observations:                2894   AIC:                             7701.
Df Residuals:                    2892   BIC:                             7713.
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.2429      0.024     92.242      0.000       2.195       2.291
H1DA2         -0.2314      0.034     -6.802      0.000      -0.298      -0.165
==============================================================================
Omnibus:                      131.759   Durbin-Watson:                   1.937
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              114.587
Skew:                           0.420   Prob(JB):                     1.31e-25
Kurtosis:                       2.506   Cond. No.                         2.64
==============================================================================
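One way to read the coefficient table: when the explanatory variable is coded 0/1, the OLS intercept equals the mean response in the 0 group and the slope equals the difference between the two group means. The sketch below checks this on a small toy dataset that is invented purely for illustration (the values are not Add Health data), using the closed-form formulas for simple linear regression.

```python
# Toy data, invented for illustration only (not Add Health values).
x = [0, 0, 0, 1, 1, 1]   # binary explanatory variable
y = [2, 3, 2, 2, 1, 2]   # response on a 1-5 scale

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form estimates for simple linear regression.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Group means, for comparison with the regression estimates.
mean_group0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
mean_group1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)

print("intercept:", intercept, "mean of group 0:", mean_group0)
print("slope:", slope, "difference in means:", mean_group1 - mean_group0)
```

On the toy data the intercept matches the group 0 mean and the slope matches the difference in means, which is why a negative slope on a scale where 1 = excellent points to better health in the group coded 1.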
2) Frequency table for explanatory variable:
H1DA2 (how many times did the participants do hobbies?)
0    1416
1    1479
dtype: int64
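3) Based on the linear regression, doing hobbies 5 or more times a week was significantly associated with better general health (coefficient = -0.23, p < 0.001). Because lower H1GH1 scores mean better health, the negative coefficient indicates that participants who did hobbies 5 or more times a week rated their health about 0.23 points better than those who did none. The association is weak, however: the model explains only about 1.6% of the variation in general health (R-squared = 0.016).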
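To make the size of the effect concrete, the predicted general health score for each group can be worked out from the two coefficients reported in the regression output. This is a minimal sketch: the numbers are copied from the output above, and the function name `predicted_health` is just an illustrative helper, not part of the original program.

```python
# Coefficients copied from the regression output.
intercept = 2.2429   # Intercept
slope = -0.2314      # H1DA2

def predicted_health(h1da2):
    """Predicted H1GH1 score for a given hobby code (0 or 1)."""
    return intercept + slope * h1da2

pred_none = predicted_health(0)       # no hobbies
pred_frequent = predicted_health(1)   # hobbies 5 or more times a week

print(f"{pred_none:.4f}")      # 2.2429
print(f"{pred_frequent:.4f}")  # 2.0115
```

Both predictions fall between 2 (very good) and 3 (good), with the frequent-hobby group closer to "very good".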