dtastlyxt-blog · 7 years
Week 4: Moderating Variables with SAS
Variables: urban rate, income per person, CO2 emissions
To examine the association between urbanization and national economic development, I run three tests: ANOVA, chi-square, and correlation. For the first two, I group the quantitative variables into categories. Urban rate is the explanatory variable, income per person is the response variable, and CO2 emissions is the assumed moderator.
1. ANOVA
H0: All urban groups have the same mean income per person among the low CO2 emissions countries.
Ha: Not all urban groups have the same mean income per person among the low CO2 emissions countries.
To use ANOVA, I divide urban rate into four levels: low, normal, above average, and high. Similarly, CO2 emissions are divided into two groups: low and high.
For the countries with low CO2 emissions, the F value is large and the associated p-value is significant, so we reject H0 and conclude that not all urban groups have the same mean income per person.
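The post ran this in SAS; as a rough illustration of the same idea, here is a minimal Python sketch that computes a one-way ANOVA F statistic separately within each CO2 group. The function names and the row layout `(co2_group, urban_level, income)` are assumptions made for the example, not taken from the original analysis.

```python
from collections import defaultdict
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations
    # Between-group sum of squares: group size times squared gap to the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def f_by_moderator(rows):
    """rows: iterable of (co2_group, urban_level, income). Returns {co2_group: F}."""
    strata = defaultdict(lambda: defaultdict(list))
    for co2_group, urban_level, income in rows:
        strata[co2_group][urban_level].append(income)
    return {co2: one_way_f(list(levels.values())) for co2, levels in strata.items()}
```

Running the test once per moderator level, rather than once on the pooled data, is what lets the two F values be compared across the low and high CO2 groups.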
For the countries with high CO2 emissions, the hypotheses are the same apart from the emissions group. The result again shows a large F value and a small p-value, so we reject H0 and conclude that not all urban groups have the same mean income per person among the high CO2 emissions countries.
Furthermore, the means tables show that, in both the low and the high CO2 emissions countries, higher urban rates are associated with higher GDP per capita. GDP per capita also rises more slowly with urban rate in the low CO2 emissions countries than in the high CO2 emissions countries. For example, in the low CO2 emissions countries, the mean GDP per capita is 2424.4965 for the normal urban group and 3948.8431 for the above average group, while in the high CO2 emissions countries the corresponding means are 2611.8148 and 9905.1248.
2.  Chi-Square
In this test, I group all variables into categories. In the countries with low CO2 emissions, there is a significant relationship between urbanization and GDP per capita. The relationship is also significant in the countries with high CO2 emissions, where the p-value is less than 0.0001.
According to the result, the distribution of urban levels within each GDP per capita group differs slightly between the low and the high CO2 emissions countries. For example, among the normal GDP per capita countries with low CO2 emissions, the above average urbanization group has the highest percentage, 42.86. Among the normal GDP per capita countries with high CO2 emissions, that percentage drops to 21.88, and the normal urbanization group has the highest percentage, 56.52. So the moderator, CO2 emissions, does appear to affect the relationship between urbanization and GDP per capita.
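For readers who want to see what the chi-square test above computes, here is a minimal Python sketch of the Pearson chi-square statistic for a table of counts. The function name is an assumption for illustration, and the sketch returns only the statistic and degrees of freedom; the p-value would still come from a chi-square distribution table or from statistical software such as the SAS output used in this post.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up counts, only to show the call; these are not the Gapminder results.
example_stat, example_dof = chi_square([[10, 20], [20, 10]])
```

To check a moderator, the same function would be called once on the urban level by GDP level table for the low CO2 countries and once on the table for the high CO2 countries.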
Overall, there is a positive relationship between urbanization and GDP per capita as shown in the bar chart.
3. Correlation 
Since all variables are quantitative, we can use them directly. Based on the result, there is a statistically significant relationship between urbanization and GDP per capita in both the low and the high CO2 emissions countries, because the p-values are less than 0.0001. The correlation coefficients in the two groups are also similar: 0.58758 and 0.57198. In this test, CO2 emissions do not appear to moderate the relationship.
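The comparison above amounts to computing Pearson's r separately in each CO2 group and looking at the two coefficients side by side. A minimal Python sketch of that step follows; the function names and the `(group, x, y)` row layout are assumptions for illustration, and the original analysis was done in SAS.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_by_group(rows):
    """rows: iterable of (group, x, y). Returns {group: Pearson r within that group}."""
    groups = {}
    for group, x, y in rows:
        xs, ys = groups.setdefault(group, ([], []))
        xs.append(x)
        ys.append(y)
    return {g: pearson_r(xs, ys) for g, (xs, ys) in groups.items()}
```

Similar coefficients across the groups, as reported here, suggest the strength of the linear relationship does not depend much on the moderator; very different coefficients would suggest that it does.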
dtastlyxt-blog · 8 years
Variables: urban rate (explanatory variable), CO2 emissions (response variable), income per person (response variable)
Measurement: Pearson correlation
In this assignment, I want to examine two relationships: urban rate and CO2 emissions, and urban rate and income per person. Since all three are quantitative variables, Pearson correlation can be used.
For urban rate and CO2 emissions, the correlation coefficient is 0.13555 with a p-value greater than 0.05. The correlation is not statistically significant, so there is no evidence of a linear relationship. For urban rate and income per person, the correlation coefficient is 0.49009 with a p-value less than 0.0001, which indicates a statistically significant relationship between these two variables. Moreover, R squared is 0.49^2 = 0.24, which suggests that urban rate accounts for about 24% of the variability in income per person.
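The R squared step can be written out with the unrounded coefficient reported above. This is only the arithmetic behind the 24% figure, under the usual reading of $r^2$ as the share of variance explained by a simple linear fit:

```latex
r = 0.49009
\quad\Longrightarrow\quad
r^2 = (0.49009)^2 \approx 0.2402 \approx 24\%
```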
dtastlyxt-blog · 8 years
Variables: Urbanlevel (level of urbanization), Gdpclevel (level of Gross Domestic Product per capita)
First, I divided the degree of urbanization into two levels, low and high, and Gross Domestic Product (GDP) per capita into four levels: low, normal, above average, and high income.
To see whether these two variables are independent or not, I conducted the chi-square test. Here is the result.
From the table, low urbanization countries are more likely to have a low GDP per capita: 95.74% of the low GDP per capita countries are low urbanization countries, while only 4.26% are high urbanization countries. The p-value is also small enough to reject the null hypothesis and accept the alternative, that there is a relationship between level of urbanization and level of GDP per capita.
However, the overall test does not tell us which GDP per capita groups differ in urbanization level, so I used a post hoc test to look at the differences. With the Bonferroni adjustment, the significance threshold becomes 0.0083 (0.05/6 = 0.0083), which lowers the probability of a Type I error. In the pairwise comparisons, the p-values are all smaller than 0.0083, so the degree of urbanization is not independent of GDP per capita for any pair of GDP per capita levels. As a result, in every comparison the degree of urbanization is likely to be related to GDP per capita.
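The divisor of 6 in the Bonferroni adjustment is the number of distinct pairs that can be formed from the four GDP per capita levels. A minimal Python sketch of that count and the resulting threshold follows; the function name is an assumption for illustration.

```python
from itertools import combinations


def bonferroni_threshold(levels, alpha=0.05):
    """Return all pairwise comparisons and the Bonferroni-adjusted threshold."""
    pairs = list(combinations(levels, 2))
    return pairs, alpha / len(pairs)


gdp_levels = ["low", "normal", "above average", "high"]
pairs, threshold = bonferroni_threshold(gdp_levels)
# Each pairwise chi-square p-value is then compared with `threshold`
# instead of the unadjusted 0.05.
```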
dtastlyxt-blog · 8 years
Explanatory variable: Urban level
Response variable: Income per person, CO2 emission.
First, since these variables are from Gapminder and are all quantitative, I grouped “urban rate” into a new variable, “Urban level”, which contains four levels. Then I conducted an ANOVA test.
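The grouping step can be sketched in Python as follows. The cut points below are hypothetical, since the post does not state the ones actually used, and the level names are borrowed from the Week 4 post; treat this as an illustration of the technique rather than the original SAS code.

```python
from bisect import bisect_right

# Hypothetical cut points (percent urban); the post does not give the real ones.
CUTS = [25, 50, 75]
LABELS = ["low", "normal", "above average", "high"]


def urban_level(urban_rate, cuts=CUTS, labels=LABELS):
    """Map a quantitative urban rate to one of four ordered levels."""
    if urban_rate is None:
        return None  # keep missing values missing instead of forcing a level
    # bisect_right counts how many cut points are <= urban_rate,
    # so a value equal to a cut point falls into the higher level.
    return labels[bisect_right(cuts, urban_rate)]
```

Keeping the cut points in one list makes it easy to swap in quartiles or any other boundaries without touching the mapping logic.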
H0: All urban groups have the same mean of income per person.
Ha: Not all urban groups have the same mean of income per person.
According to the ANOVA table, the F value equals 26.20 and the p-value is much smaller than 0.05. So we can reject H0 and accept Ha, which means that not all urban groups have the same mean income per person. The boxplot shows that group 4 has a very different spread from the others, but the boxplot alone is not enough to say that group 4 has a different mean. The post hoc test confirms that group 4 has a different mean ($21988) from the others, and that group 3 ($7967) also has a different mean from group 2 ($2489). The result suggests that income per person in the most urbanized countries is much higher than in less urbanized countries.
Second, I conducted a similar ANOVA test on Urban level and CO2 emissions.
H0: All urban groups have the same mean of CO2 emissions.
Ha: Not all urban groups have the same mean of CO2 emissions.
In this ANOVA table, the F value is only 1.96 with a p-value of 0.1213. Hence, we cannot reject H0: the data are consistent with all urban groups having the same mean CO2 emissions, around 1.3316309E21 metric tons. In the box plot, CO2 emissions have similar spreads and means across the four urban groups, which is consistent with the small F value.