marc0sop
Coursera:Data Management and Visualization
marc0sop · 4 years ago
Week 04: Creating graphs for your data
Week 04 Assignment 01
Variables: ['femaleemployrate', 'urbanrate', 'lifeexpectancy']
Code: https://github.com/ankurRangi/Data-Analysis-and-Interpretation-COURSERA/blob/main/Course%2001:%20Data%20Management%20and%20Visualization/Week%2004/FreqDistribution_W4.py
Output: (Check out the Graphs folder and Output file)
https://github.com/ankurRangi/Data-Analysis-and-Interpretation-COURSERA/tree/main/Course%2001:%20Data%20Management%20and%20Visualization/Week%2004/Graphs
Graph 1 : Figure_1 (Fe_num) 2.png
This graph is unimodal, with its highest peak in the middle category, 40 to 60% female employment rate (x-axis label: 60). It appears to be skewed to the right, as the frequencies are higher in the lower categories than in the higher ones.
Graph 2: Figure_1 (lyf) 2.png
This graph is unimodal, with its highest peak in the 65 to 75 years category (x-axis label: 75). It appears to be skewed to the left, as the frequencies are higher in the higher age ranges.
Graph 3: Figure_1 (urb) 2.png
This graph is unimodal, with its single peak in the 25 to 75% urban rate category (x-axis label: 75).
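The skew direction read off the histograms above can also be checked numerically. Below is a minimal sketch, assuming the values of a variable are available as a plain list of numbers. The function name `sample_skewness` and the sample values are made up for illustration and are not taken from the Gapminder data or from the linked script. A positive result suggests a longer right tail, a negative one a longer left tail.

```python
def sample_skewness(values):
    """Return the moment coefficient of skewness (m3 / m2**1.5) of `values`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Invented illustrative values, NOT taken from the Gapminder dataset.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]
left_tailed = [-v for v in right_tailed]

print(sample_skewness(right_tailed) > 0)  # longer right tail
print(sample_skewness(left_tailed) < 0)   # longer left tail
```

This only quantifies the asymmetry; whether a distribution is unimodal still has to be judged from the graph.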
Graph 4: regplot.png
The graph above plots each country's urban population rate against its life expectancy. The scatter plot does not show a clear relationship or trend between the two variables.
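One way to put a number on how strong a linear trend in a scatter plot is would be the Pearson correlation coefficient. The sketch below is an assumption-laden illustration and not the code from the linked script: the helper name `pearson_r` is hypothetical, missing values are assumed to be represented as `None`, and the sample numbers are invented rather than taken from Gapminder.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of paired values, skipping pairs with a missing entry."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    # Keep only the pairs where both values are present.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("one variable is constant; correlation is undefined")
    return cov / math.sqrt(var_x * var_y)


# Invented illustrative values, NOT taken from the Gapminder dataset.
urban_rate = [30.0, 45.0, None, 60.0, 75.0, 90.0]
life_expectancy = [58.0, 71.0, 66.0, 63.0, None, 79.0]

print(round(pearson_r(urban_rate, life_expectancy), 3))
```

A value near 0 would be consistent with the absence of a clear linear trend, while values near 1 or -1 would indicate a strong one.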
marc0sop · 4 years ago
Week 03: Making Data Management Decisions
Assignment 01.
Adding the file: https://github.com/ankurRangi/Data-Analysis-and-Interpretation-COURSERA/blob/main/Course%2001:%20Data%20Management%20and%20Visualization/Week%2003/FreqDis_W3.py
Output: https://github.com/ankurRangi/Data-Analysis-and-Interpretation-COURSERA/blob/main/Course%2001:%20Data%20Management%20and%20Visualization/Week%2003/Output_A1.pdf
I have used three variables, 'Femaleemployrate', 'Urbanrate', and 'Lifeexpectancy'.
For Female Employment Rate, the most common response was 3 (45.07%), which means that around 45% of the countries have a female employment rate between 40% and 60%.
For Urban Rate, the most common response was 2 (62.44%), which means that more than half of the countries, around 62%, have an urban rate between 25% and 75%.
For Life Expectancy, the most common response was 4 (35.68%), which means that roughly 36% of the countries have a life expectancy of more than 65 and less than 75 years.
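The percentages above come from grouping a continuous variable into numbered categories and then counting how many countries fall into each one. The following is a minimal sketch of that idea, not the code from the linked file: the helper names and the sample values are invented, and the bin edges `[20, 40, 60, 80, 100]` are an assumption chosen only so that category 3 covers 40% to 60%, as described above.

```python
from bisect import bisect_left
from collections import Counter


def bin_value(value, upper_edges):
    """Return the 1-based category of `value`.

    Category k covers values greater than upper_edges[k-2] and up to
    upper_edges[k-1] inclusive; values above the last edge get one extra category.
    """
    return bisect_left(upper_edges, value) + 1


def percent_distribution(values, upper_edges):
    """Map each category to the percentage of values that fall into it."""
    if not values:
        raise ValueError("no values to summarise")
    counts = Counter(bin_value(v, upper_edges) for v in values)
    total = len(values)
    return {cat: round(100 * counts[cat] / total, 2) for cat in sorted(counts)}


# Assumed bin edges (hypothetical): category 3 is the 40-60% range.
edges = [20, 40, 60, 80, 100]
# Invented illustrative values, NOT taken from the Gapminder dataset.
female_employ = [35.0, 45.5, 52.1, 58.9, 61.0, 18.2, 47.3, 70.4]

print(percent_distribution(female_employ, edges))
# {1: 12.5, 2: 12.5, 3: 50.0, 4: 25.0}
```

The category with the largest percentage is the "most common response" in the sense used above.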
marc0sop · 4 years ago
Getting Started with Python
Week02_Assignment01
Creating the file FrequencyDistribution_01.py. (GitHub link for the code)
Output of the code. (Link)
Now, the variables used for computing the frequency are “hivrate”, “polityscore”, and “employrate”.
HIVRate - 2009 estimated HIV prevalence % (ages 15-49): estimated number of people living with HIV per 100 population in the age group 15-49.
PolityScore - 2009 democracy score (Polity): overall polity score from the Polity IV dataset, calculated by subtracting an autocracy score from a democracy score. A summary measure of a country's democratic and free nature.
EmployRate - 2007 total employees age 15+ (% of population): percentage of the total population, age 15 and above, that was employed during the given year.
Why only these variables out of the whole dataset?
The Gapminder dataset contains country-level data from all over the world, so most columns hold highly distinct values, sometimes with more than five decimal places. Such columns are not a good choice for frequency tables, so I have demonstrated the approach with the three variables above and focused more on the process.
The frequency counts use only the values available in each column; cells with missing data are ignored.
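To illustrate counting only the available values, here is a minimal sketch of a frequency table that skips missing cells. It is not the code from FrequencyDistribution_01.py: the function name `frequency_table` is hypothetical, missing cells are assumed to appear as blank strings or `None` (as they might after reading a CSV file), and the sample column is invented.

```python
from collections import Counter


def frequency_table(raw_values):
    """Return (value, count, percent) rows for the non-missing entries.

    Blank or whitespace-only strings and None are treated as missing.
    Percentages are relative to the number of non-missing entries.
    """
    present = [v.strip() for v in raw_values if v is not None and v.strip() != ""]
    total = len(present)
    if total == 0:
        return []
    counts = Counter(present)
    # Most frequent first; ties are broken by the value itself for stable output.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(value, n, round(100 * n / total, 2)) for value, n in ordered]


# Invented illustrative column, NOT taken from the Gapminder dataset.
polity_column = ["10", "", "-7", "10", " ", "6", "10", None, "-7"]

for value, count, percent in frequency_table(polity_column):
    print(value, count, percent)
# 10 3 50.0
# -7 2 33.33
# 6 1 16.67
```

Because the three missing cells are dropped before counting, the percentages are based on six entries rather than nine.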
marc0sop · 4 years ago
Data Analysis Experience with Coursera
Getting started with the Gapminder dataset, with the research question of “CO2 emission with increasing urban population”, country by country.
Here I have the data for CO2 emission, population, electricity consumption, and other variables, which will help in determining the correlation between these variables.
Research Question: What impact does increasing urban population have on CO2 emission by country?
Hypothesis: An increasing urban population will increase the CO2 emission in the country.
Variables to be considered: 
1. CO2 emission, in metric tons (data based on the year 2006)
2. Urban population, as a share of total population (data based on the year 2008)
3. Residential electricity consumption per person, in kWh (data based on the year 2008)
4. Employment rate, as a share of total population (data based on the year 2007)
5. Gross Domestic Product per capita, in constant 2000 US$ (data based on the year 2010)
6. Oil consumption per capita, in tonnes per year per person (data based on the year 2010)
I looked through various articles, but found no other research that considered all of these variables on a nationwide scale. The references below provide some examples of how similar research was conducted, so that it could be scaled up into a larger project.
Another source suggested in the feedback, the National Institutes of Health, might also be helpful in the future for finding datasets and literature to take the research on this topic forward.
References:
Cui, P., Xia, S., & Hao, L. (2019, August 28). Do different sizes of urban population matter differently to CO2 emission in different regions? Evidence from electricity consumption behavior of urban residents in China. Retrieved April 14, 2021, from https://www.sciencedirect.com/science/article/abs/pii/S095965261933077X
CO2 emission of urban passenger transportation in China from 2000 to 2014. (n.d.). ScienceDirect. https://www.sciencedirect.com/science/article/pii/S1674927818301138