Text
Week 4 assignment: Creating graphs for your data
In my assignment, I am looking at two variables from the Gapminder dataset - income per person and polity score.
The first chart shows the distribution of income per person for all countries. There is substantial bunching in the low range, with a few countries far higher. The distribution is unimodal and right-skewed.
The second chart shows the frequency count of polity score, which has a bimodal distribution, peaking at scores 10 and -7.
Bringing both variables together in a scatter plot shows a weak positive relationship between income per person and polity score. Without looking at the p-value, I would posit that the relationship is too weak to be statistically significant.
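The hunch about significance can be checked rather than guessed. Below is a minimal sketch, assuming SciPy is installed. The small `toy` frame is made-up data standing in for gapminder.csv, so the printed numbers say nothing about the real dataset.

```python
import pandas as pd
from scipy import stats

# Made-up stand-in for the two Gapminder columns (NOT real values).
toy = pd.DataFrame({
    "incomeperperson": [300.0, 750.0, 2500.0, 9000.0, 26000.0, None, 52000.0],
    "polityscore": [-7.0, 6.0, 8.0, -3.0, 10.0, 9.0, 10.0],
})

# pearsonr does not accept missing values, so drop incomplete rows first.
clean = toy[["incomeperperson", "polityscore"]].dropna()

# r is the correlation coefficient, p_value tests the null hypothesis r = 0.
r, p_value = stats.pearsonr(clean["incomeperperson"], clean["polityscore"])
print(f"r = {r:.3f}, p = {p_value:.3f}, n = {len(clean)}")
```

With the real data, the same two lines after `dropna()` would give the r and p-value for the scatter plot described above.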
The program code is appended below:
import pandas as pd
import numpy
import seaborn
import matplotlib.pyplot as plt
data = pd.read_csv("gapminder.csv")
# convert_objects() has been removed from pandas; pd.to_numeric with errors='coerce'
# turns unparseable entries into NaN in the same way
data['incomeperperson'] = pd.to_numeric(data['incomeperperson'], errors='coerce')
data['polityscore'] = pd.to_numeric(data['polityscore'], errors='coerce')
data['employrate'] = pd.to_numeric(data['employrate'], errors='coerce')
# Chart 1: Histogram of income per person
# Chart 2: Graph of frequency of polity scores
# Chart 3: Scatterplot of polity score against income per person
print('-----POLITY SCORE-----')
print()
# Including this as a better way of understanding the distribution than frequency
polityd = data.polityscore.describe()
polityf = data['polityscore'].value_counts(sort=True, normalize=True).sort_index()
print('Frequency count of Polity score for all countries')
print(polityf)
print()
print('Broad statistics of Polity score for all countries')
print(polityd)
print()
print('-----INCOME PER PERSON-----')
print()
# Including this as a better way of understanding the distribution than frequency
incomed = data.incomeperperson.describe()
print('Broad statistics of Income per Person for all countries')
print(incomed)
print()
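# Bivariate scatterplot with a fitted regression line
scat1 = seaborn.regplot(x="incomeperperson", y="polityscore", fit_reg=True, data=data)
plt.xlabel('Income per person')
plt.ylabel('Polity Score')
plt.title('Scatterplot for the Association Between Income per person and Polity Score')
plt.show()  # finish this figure so the next chart is not drawn on the same axes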
data["polityscore"] = data["polityscore"].astype('category')
# Univariate histogram for quantitative variable:
# (distplot is deprecated since seaborn 0.11; histplot is its replacement)
seaborn.distplot(data["polityscore"].dropna(), kde=False, bins=21,
                 hist_kws=dict(edgecolor="k", linewidth=2))
plt.xlabel('Polity Score')
plt.title('Frequency count of Polity score of all countries')
plt.show()  # finish this figure so the next chart is not drawn on the same axes
data["incomeperperson"] = data["incomeperperson"].astype('category')
# Univariate histogram for quantitative variable:
# (distplot is deprecated since seaborn 0.11; histplot is its replacement)
seaborn.distplot(data["incomeperperson"].dropna(), kde=False,
                 hist_kws=dict(edgecolor="k", linewidth=2))
plt.xlabel('Income per person')
plt.title('Histogram of Income per person for all countries')
plt.show()
Text
Week 3 Assignment: Difference between Employment Rate and Female Employment Rate
For this assignment, I created a secondary variable, the difference between Employment Rate and Female Employment Rate, as a benchmark for the gender gap in the workforce. Instead of displaying the raw calculation, I created a numerical classification from 0 to 5. This classification is displayed as a frequency table of counts and percentages.
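The classification can be sketched with `pd.cut`. This is only an illustration, not the original script: the four rows are copied from the output table on this page, and the bin edges are an assumption inferred from the legend and from the codes those rows received.

```python
import numpy as np
import pandas as pd

# Four rows copied from the output table on this page, for illustration.
df = pd.DataFrame({
    "country": ["Albania", "Algeria", "Bahrain", "Burundi"],
    "employrate": [51.400002, 50.500000, 60.400002, 83.199997],
    "femaleemployrate": [42.099998, 31.700001, 30.200001, 83.300003],
})

# Raw gap in percentage points (negative when the female rate is higher).
gap = df["employrate"] - df["femaleemployrate"]

# Assumed bin edges: gap <= 0 -> 5, (0, 10] -> 4, (10, 20] -> 3,
# (20, 30] -> 2, (30, 40] -> 1, above 40 -> 0.
edges = [-np.inf, 0, 10, 20, 30, 40, np.inf]
codes = [5, 4, 3, 2, 1, 0]
df["Difference"] = pd.cut(gap, bins=edges, labels=codes).astype(float)

print(df)
print(df["Difference"].value_counts(sort=False))                  # counts
print(df["Difference"].value_counts(sort=False, normalize=True))  # percentages
```

Under these assumed edges the four sample rows get the same codes as in the table (4, 3, 1 and 5). How a gap of exactly 0 or exactly 10 should be coded is not stated in the post, so the edge handling here is a guess.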
What’s notable is that more than half of the countries (54.5%) have a gender gap of at most 10 percentage points. Three countries, Burundi among them, actually have a higher Female Employment Rate than Employment Rate. Burundi, however, is in the bottom quarter for Income per person. I used another classification to give each country a quick GDP label: top quarter, middle 50% or bottom quarter. This provides an easy way to understand the economic context for labor force participation.
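The GDP label can be sketched as follows. The incomes are made up (not Gapminder values), `gdp_rank` is a hypothetical helper, and whether a country sitting exactly on a quartile counts as top or bottom is an assumption.

```python
import pandas as pd

# Made-up incomes (NOT Gapminder values); None marks a missing income.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F", "G", "H", "I"],
    "incomeperperson": [100.0, 200.0, 300.0, 400.0, 500.0,
                        600.0, 700.0, 800.0, None],
})

# Quartile cutoffs; quantile() skips missing values.
q1 = df["incomeperperson"].quantile(0.25)
q3 = df["incomeperperson"].quantile(0.75)


def gdp_rank(income):
    """Hypothetical helper: map an income to one of the three labels."""
    if pd.isna(income):
        return None  # no income data, so no label
    if income >= q3:
        return "Top quarter"
    if income <= q1:
        return "Bottom quarter"
    return "Middle 50%"


df["GDP"] = df["incomeperperson"].apply(gdp_rank)
print(df)
print(df["GDP"].value_counts())  # unlabeled countries are left out of the count
```

Countries without an income figure stay unlabeled, which matches the `None` entries for Afghanistan and Taiwan in the table.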
The output from Python is appended below: first the data subset with the secondary variable, then the three frequency tables at the bottom of the page.
OUTPUT FROM PYTHON
-- Countries' GDP ranking, Employment rate, Female Employment rate, and Difference between Employment Rate and Female Employment Rate --
Legend for Difference
5: Female Employment Rate is higher than Employment Rate
4: Employment Rate is higher than Female Employment Rate by 10%
3: Employment Rate is higher than Female Employment Rate by 20%
2: Employment Rate is higher than Female Employment Rate by 30%
1: Employment Rate is higher than Female Employment Rate by 40%
0: Employment Rate is higher than Female Employment Rate by more than 40%

Legend for GDP ranking
Top quarter refers to countries with Income per person in the top 25%
Bottom quarter refers to countries with Income per person in the bottom 25%
Middle 50% refers to countries with Income per person in the middle 50%
                    country             GDP  employrate  femaleemployrate  Difference
0               Afghanistan            None   55.700001         25.600000         1.0
1                   Albania      Middle 50%   51.400002         42.099998         4.0
2                   Algeria      Middle 50%   50.500000         31.700001         3.0
4                    Angola      Middle 50%   75.699997         69.400002         4.0
6                 Argentina     Top quarter   58.400002         45.900002         3.0
7                   Armenia      Middle 50%   40.099998         34.200001         4.0
9                 Australia     Top quarter   61.500000         54.599998         4.0
10                  Austria     Top quarter   57.099998         49.700001         4.0
11               Azerbaijan      Middle 50%   60.900002         56.200001         4.0
12                  Bahamas     Top quarter   66.599998         60.700001         4.0
13                  Bahrain     Top quarter   60.400002         30.200001         1.0
14               Bangladesh  Bottom quarter   68.099998         53.599998         3.0
15                 Barbados      Middle 50%   66.900002         60.299999         4.0
16                  Belarus      Middle 50%   53.400002         48.599998         4.0
17                  Belgium     Top quarter   48.599998         41.700001         4.0
18                   Belize      Middle 50%   56.799999         38.799999         3.0
19                    Benin  Bottom quarter   71.599998         58.200001         3.0
21                   Bhutan      Middle 50%   58.400002         39.900002         3.0
22                  Bolivia      Middle 50%   70.400002         61.599998         4.0
23   Bosnia and Herzegovina      Middle 50%   41.200001         34.900002         4.0
24                 Botswana      Middle 50%   46.000000         38.700001         4.0
25                   Brazil      Middle 50%   64.500000         53.299999         3.0
26                   Brunei     Top quarter   63.799999         55.500000         4.0
27                 Bulgaria      Middle 50%   47.299999         42.099998         4.0
28             Burkina Faso  Bottom quarter   81.300003         75.800003         4.0
29                  Burundi  Bottom quarter   83.199997         83.300003         5.0
30                 Cambodia  Bottom quarter   78.900002         73.400002         4.0
31                 Cameroon  Bottom quarter   59.099998         49.000000         3.0
32                   Canada     Top quarter   63.500000         58.900002         4.0
33               Cape Verde      Middle 50%   55.900002         43.599998         3.0
..                      ...             ...         ...               ...         ...
180               Sri Lanka      Middle 50%   55.099998         39.200001         3.0
181                   Sudan  Bottom quarter   47.299999         27.900000         3.0
182                Suriname      Middle 50%   44.700001         30.400000         3.0
183               Swaziland      Middle 50%   50.900002         47.099998         4.0
184                  Sweden     Top quarter   60.700001         56.700001         4.0
185             Switzerland     Top quarter   64.300003         57.000000         4.0
186                   Syria      Middle 50%   44.799999         16.700001         2.0
187                  Taiwan            None   54.500000         47.099998         4.0
188              Tajikistan  Bottom quarter   54.599998         50.099998         4.0
189                Tanzania  Bottom quarter   78.199997         76.099998         4.0
190                Thailand      Middle 50%   72.000000         65.000000         4.0
191             Timor-Leste  Bottom quarter   67.300003         54.700001         3.0
192                    Togo  Bottom quarter   63.900002         48.400002         3.0
194     Trinidad and Tobago     Top quarter   61.500000         50.500000         3.0
195                 Tunisia      Middle 50%   41.599998         21.400000         2.0
196                  Turkey      Middle 50%   42.799999         21.900000         2.0
197            Turkmenistan      Middle 50%   58.500000         53.900002         4.0
199                  Uganda  Bottom quarter   83.199997         80.000000         4.0
200                 Ukraine      Middle 50%   54.400002         49.400002         4.0
201    United Arab Emirates     Top quarter   75.199997         37.299999         1.0
202          United Kingdom     Top quarter   59.299999         53.099998         4.0
203           United States     Top quarter   62.299999         56.000000         4.0
204                 Uruguay      Middle 50%   57.500000         46.000000         3.0
205              Uzbekistan      Middle 50%   57.500000         52.599998         4.0
207               Venezuela      Middle 50%   59.900002         45.799999         3.0
208                 Vietnam  Bottom quarter   71.000000         67.599998         4.0
209      West Bank and Gaza            None   32.000000         11.300000         2.0
210             Yemen, Rep.  Bottom quarter   39.000000         20.299999         3.0
211                  Zambia  Bottom quarter   61.000000         53.500000         4.0
212                Zimbabwe  Bottom quarter   66.800003         58.099998         4.0
[178 rows x 5 columns]
Counts for Difference
1.0     6
2.0    20
3.0    52
4.0    97
5.0     3
Name: Difference, dtype: int64

Percentages for Difference
1.0    0.033708
2.0    0.112360
3.0    0.292135
4.0    0.544944
5.0    0.016854
Name: Difference, dtype: float64

Counts for GDP ranking
Bottom quarter    48
Middle 50%        77
Top quarter       41
Name: GDP, dtype: int64
Text
Submission for week 2
For this assignment, I am displaying the frequency tables for three variables: Polity Score, Urbanization Rate and Employment Rate.
Instead of looking at the entire dataset, I have carved out two groups to focus on: countries with GDP in the top quartile and countries with GDP in the bottom quartile. To do this, I used the .describe() method in pandas to find the cutoff GDP values. These were then used to segment the groups.
Next I ran value_counts() on both groups for each of the three variables. As a bonus, I added .describe() for these variables, to help show how different the two groups are.
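The steps above can be sketched as follows. This is a minimal illustration on made-up rows (not Gapminder values), and treating countries exactly on a cutoff as part of the group is an assumption.

```python
import pandas as pd

# Made-up rows (NOT Gapminder values), used only to show the workflow.
data = pd.DataFrame({
    "incomeperperson": [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0],
    "polityscore": [-7.0, 1.0, 1.0, 6.0, 8.0, 10.0, 10.0, 10.0],
})

# describe() reports the quartiles under the labels '25%' and '75%'.
summary = data["incomeperperson"].describe()
low_cut, high_cut = summary["25%"], summary["75%"]

# Segment the two groups by the cutoffs.
top = data[data["incomeperperson"] >= high_cut]
btm = data[data["incomeperperson"] <= low_cut]

for name, group in [("top", top), ("btm", btm)]:
    print(f"Frequency count of Polity score for {name} GDP countries")
    print(group["polityscore"].value_counts(sort=False, normalize=True))
    print(f"Broad statistics of Polity score for {name} GDP countries")
    print(group["polityscore"].describe())
```

The same loop can be repeated for the urbanization and employment columns by swapping the column name.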
The difference between the two groups is noticeable for Polity Score and Urbanization Rate. It is less clear-cut for Employment Rate, which may suggest that some countries’ GDP does not depend on human production activities.
Click for link to python file
OUTPUT DATA:
Analysing the Income per person variable:
count       190.000000
mean       8740.966076
std       14262.809083
min         103.775857
25%         748.245151
50%        2553.496056
75%        9379.891165
max      105147.437697
Name: incomeperperson, dtype: float64
-----POLITY SCORE-----
Frequency count of Polity score for top GDP countries
 8.0     0.090909
 10.0    0.696970
-7.0     0.030303
 9.0     0.030303
-8.0     0.060606
-10.0    0.060606
-2.0     0.030303
Name: polityscore, dtype: float64

Broad statistics of Polity score for top GDP countries
count    33.000000
mean      6.606061
std       6.878133
min     -10.000000
25%       8.000000
50%      10.000000
75%      10.000000
max      10.000000
Name: polityscore, dtype: float64

Frequency count of Polity score for btm GDP countries
 5.0    0.104167
 7.0    0.145833
 0.0    0.083333
 6.0    0.104167
 2.0    0.020833
-4.0    0.062500
-1.0    0.083333
-2.0    0.062500
 9.0    0.020833
-7.0    0.062500
 1.0    0.062500
-5.0    0.020833
 8.0    0.062500
-3.0    0.062500
 4.0    0.041667
Name: polityscore, dtype: float64

Broad statistics of Polity score for btm GDP countries
count    48.000000
mean      1.937500
std       4.710361
min      -7.000000
25%      -2.000000
50%       1.500000
75%       6.000000
max       9.000000
Name: polityscore, dtype: float64
-----URBANIZATION RATE-----
Frequency count of Urbanization Rate for top GDP countries
92.00     0.020833
100.00    0.104167
88.52     0.020833
61.00     0.020833
59.46     0.020833
81.46     0.020833
77.48     0.020833
82.44     0.020833
77.88     0.020833
68.08     0.020833
88.92     0.020833
14.32     0.020833
98.32     0.020833
97.36     0.020833
73.64     0.020833
67.16     0.020833
63.30     0.020833
80.40     0.020833
30.46     0.020833
73.48     0.020833
94.22     0.020833
89.94     0.020833
83.52     0.020833
71.62     0.020833
92.26     0.020833
88.74     0.020833
61.34     0.020833
86.56     0.020833
66.48     0.020833
95.64     0.020833
74.82     0.020833
94.26     0.020833
83.70     0.020833
81.70     0.020833
77.36     0.020833
86.68     0.020833
69.90     0.020833
13.22     0.020833
77.12     0.020833
81.82     0.020833
91.66     0.020833
48.60     0.020833
84.54     0.020833
82.42     0.020833
Name: urbanrate, dtype: float64

Broad statistics of Urbanization Rate for top GDP countries
count     48.000000
mean      78.204167
std       19.821586
min       13.220000
25%       71.190000
50%       82.120000
75%       91.745000
max      100.000000
Name: urbanrate, dtype: float64

Frequency count of Urbanization Rate for btm GDP countries
17.00    0.020833
41.00    0.020833
56.76    0.020833
12.54    0.020833
50.02    0.020833
17.24    0.020833
42.00    0.020833
18.34    0.020833
43.44    0.020833
29.52    0.020833
33.96    0.020833
27.84    0.020833
25.46    0.020833
21.56    0.020833
27.14    0.020833
20.72    0.020833
29.84    0.020833
36.28    0.020833
38.58    0.020833
66.60    0.020833
41.20    0.020833
56.42    0.020833
36.84    0.020833
48.36    0.020833
30.88    0.020833
27.30    0.020833
48.78    0.020833
19.56    0.020833
34.44    0.020833
26.68    0.020833
37.76    0.020833
60.14    0.020833
28.08    0.020833
42.38    0.020833
10.40    0.020833
37.34    0.020833
32.18    0.020833
46.84    0.020833
35.42    0.020833
21.60    0.020833
41.76    0.020833
30.64    0.020833
36.16    0.020833
16.54    0.020833
18.80    0.020833
12.98    0.020833
25.52    0.020833
26.46    0.020833
Name: urbanrate, dtype: float64

Broad statistics of Urbanization Rate for btm GDP countries
count    48.000000
mean     33.068750
std      13.052879
min      10.400000
25%      24.495000
50%      31.530000
75%      41.340000
max      66.600000
Name: urbanrate, dtype: float64
-----EMPLOYMENT RATE-----
Frequency count of Employment Rate for top GDP countries
61.500000    0.04878
63.500000    0.02439
51.200001    0.04878
53.500000    0.04878
59.000000    0.02439
65.000000    0.04878
63.099998    0.02439
76.000000    0.02439
52.500000    0.02439
50.700001    0.02439
64.300003    0.02439
55.900002    0.02439
57.599998    0.02439
60.700001    0.02439
59.099998    0.02439
62.400002    0.02439
42.400002    0.02439
58.900002    0.02439
48.599998    0.02439
46.400002    0.02439
59.900002    0.02439
60.400002    0.02439
63.599998    0.02439
58.400002    0.02439
57.099998    0.02439
75.199997    0.02439
49.599998    0.02439
66.599998    0.02439
73.599998    0.02439
57.200001    0.02439
62.299999    0.02439
59.299999    0.02439
61.299999    0.02439
46.799999    0.02439
57.299999    0.02439
51.299999    0.02439
63.799999    0.02439
Name: employrate, dtype: float64

Broad statistics of Employment Rate for top GDP countries
count    41.000000
mean     58.712195
std       7.416677
min      42.400002
25%      53.500000
50%      59.099998
75%      63.099998
max      76.000000
Name: employrate, dtype: float64

Frequency count of Employment Rate for btm GDP countries
68.000000    0.020833
81.500000    0.020833
66.000000    0.020833
83.000000    0.020833
77.000000    0.020833
51.000000    0.020833
80.699997    0.020833
71.000000    0.020833
50.900002    0.020833
39.000000    0.020833
61.000000    0.020833
60.400002    0.020833
78.900002    0.020833
46.900002    0.020833
78.199997    0.041667
66.199997    0.020833
71.800003    0.020833
58.900002    0.020833
37.400002    0.020833
59.099998    0.020833
55.900002    0.020833
59.900002    0.020833
71.300003    0.020833
66.800003    0.020833
73.199997    0.020833
81.300003    0.020833
83.199997    0.041667
65.099998    0.020833
63.900002    0.020833
67.300003    0.020833
54.599998    0.020833
65.900002    0.020833
64.900002    0.020833
68.900002    0.020833
68.099998    0.020833
71.599998    0.020833
45.700001    0.020833
47.299999    0.020833
63.799999    0.020833
61.799999    0.020833
44.299999    0.020833
56.299999    0.020833
71.699997    0.020833
65.599998    0.020833
70.400002    0.020833
79.800003    0.020833
Name: employrate, dtype: float64

Broad statistics of Employment Rate for btm GDP countries
count    48.000000
mean     65.352083
std      11.949930
min      37.400002
25%      59.049999
50%      66.099998
75%      72.150002
max      83.199997
Name: employrate, dtype: float64
Text
Submission for week 1
Peer-graded assignments for Data Management and Visualisation
For my project, I have chosen to look at the data set from Gapminder, stemming from my interest in current affairs. In particular, I would like to know what indicators suggest that a country has the propensity to adopt civic tech and to develop vibrant digital democracy movements.
Civic tech and digital democracy movements have gained momentum in several countries, notably Taiwan, Brazil, Spain and Australia. Through the use of open platforms, citizens are engaged digitally in decision making for policy implementations. This requires wide tech adoption by a politically aware and sufficiently motivated populace, often triggered by a failure of the incumbent political system, such as corruption in a long-ruling party.
Existing literature on civic tech tends to focus on the success of each platform, whether it is the technology, user interface design, or implementation strategies. In the paper “The design of civic technology: factors that influence public participation and impact” by Andrew May & Tracy Ross, the analysis focuses on how an easy-to-use design increases civic participation.
At the recent g0v summit in Taiwan that I attended, Pablo Aragón from Spain presented his ongoing PhD research, which showed practical examples of how interfaces, technical features and algorithms might have influenced behaviour in civic technology platforms like Decide Madrid, Decidim Barcelona, and Menéame.
The closest research literature that looks beyond the platform and technology to focus on the human is “From Civic Tech to Civic Capacity: The Case of Citizen Audits” by K. Sabeel Rahman, which looks at the civic capacity within the population, rather than the tech.
The Gapminder data set allows a broader view of countries and their potential adoption of civic tech, using select variables - GDP, Employment Rate, Internet Use, and Democracy score by Polity. I believe these variables may be indicative of potential within the population for civic participation. For instance, a country which might have low GDP, low employment rate, coupled with high internet use and high democracy score, is likely to adopt civic tech. This seems to be the case anecdotally for Brazil.
Through this exercise, I will look at other countries’ data, examine if the hypothesis holds true and explore other influencing factors in recent global history. This is especially important because of the dated data set (2008) and more recent civic tech developments such as Occupy Wall Street and the Sunflower Movement.
Video
tumblr
Now looking back at my old videos, it's amazing how far I have come in under 2 years.
Text
I like traveling alone because I step out of my social self in the absence of familiar people. This trip to BKK was spent pretty much by myself. While I did meet up and hang out with friends, I had the luxury of exploring the city by myself for quite a fair bit. It's a confusing city and with the street signs in Thai, it's challenging getting around. I've taken a harrowing tuk tuk ride through a rain-flooded back alley. It felt like a boat ride at one point. I've also taken a scooter ride with no helmet on a 4 lane highway. Scary. If not for the driver, I would have been bursting out in panic laughter the whole time. Then again, I wouldn't be there without the driver. And finally, I've had street food. Beef noodles and roasted mushrooms. Pretty good chow and thankfully, my tummy is feeling fine :)
Text
I am grateful for being debt-free. I remember worrying about every other purchase when I had my tuition fee loan after graduating. It was depressing to have to think about how every small saving I accrued might mitigate the ever growing interest rate. I am glad the frugality paid off. I like the freedom I am enjoying now. Not that I'm loaded, sadly, but not having a chain to my wallet does make a big difference in how I live my life. Drinks anyone?
Text
I am grateful for MY BODY
Being able to do gymnastics at this age is truly a blessing. My body remembers the dreams of a child, all the wondrous and amazing moves which I could only look upon in awe.
Last Saturday I found out that a friend of mine used to have a tumor. It makes me even more thankful that my body has been so very kind to me. Today he is able to do many amazing stunts.
We honor our bodies, we honor the faith we have in them. In return, I am blessed many times over, every single day.
Video
tumblr
Not an ideal example of a front tuck, didn't jump high enough