dsabtblog
DSA-BT Training
dsabtblog · 5 years ago
Week 4 - Bivariate graph
Graph showing the association between explanatory and response variables
Median age is the explanatory (predictor) variable. Cellphones per 100 people is the response (outcome) variable.
[Graph: bar chart of cellphones per 100 people by median age]
Code
# -*- coding: utf-8 -*-
import pandas
import numpy
import seaborn
import matplotlib.pyplot as plt
# any additional libraries would be imported here

data = pandas.read_csv("./gapminder2018year2.csv", low_memory=False)

# set pandas to show all columns in the DataFrame
pandas.set_option("display.max_columns", None)
# set pandas to show all rows in the DataFrame
pandas.set_option("display.max_rows", None)

print(len(data))  # number of observations (rows)
print(len(data.columns))  # number of variables (columns)

data.info()
# setting variables you will be working with to numeric
data['Cellphones100'] = pandas.to_numeric(data['Cellphones100'])
data['IncomePPP'] = pandas.to_numeric(data['IncomePPP'])
data['Population'] = pandas.to_numeric(data['Population'])
data['Lifeexpect'] = pandas.to_numeric(data['Lifeexpect'])
data['median_age'] = pandas.to_numeric(data['median_age'])

# counts and percentages (i.e. frequency distributions) for each variable
print("counts for Cellphones100 - Number of countries with this number of Cell phones per 100 people")
c1 = data['Cellphones100'].value_counts(sort=False, dropna=False)
print(c1)
print("percentage for Cellphones100 - Percentage of countries with this number of Cell phones per 100 people")
p1 = data['Cellphones100'].value_counts(sort=False, normalize=True)
print(p1)
print("counts for IncomePPP - Number of countries with this Income per Person per Year")
c2 = data['IncomePPP'].value_counts(sort=False, dropna=False)
print(c2)
print("percentage for IncomePPP - Percentage of countries with this Income per Person per Year")
p2 = data['IncomePPP'].value_counts(sort=False, normalize=True)
print(p2)
print("counts for Population - Number of countries with this Population")
c3 = data['Population'].value_counts(sort=False, dropna=False)
print(c3)
print("percentage for Population - Percentage of countries with this Population")
p3 = data['Population'].value_counts(sort=False, normalize=True)
print(p3)
print("counts for Life Expectancy- Number of countries with this Life expectancy")
c4 = data['Lifeexpect'].value_counts(sort=False, dropna=False)
print(c4)
print("percentage for Life Expectancy- Percentage of countries with this Life expectancy")
p4 = data['Lifeexpect'].value_counts(sort=False, normalize=True)
print(p4)
print("counts for Median Age - Number of countries with this Median Age")
c5 = data['median_age'].value_counts(sort=False, dropna=False)
print(c5)
print("percentage for Median Age - Percentage of countries with this Median Age")
p5 = data['median_age'].value_counts(sort=False, normalize=True)
print(p5)

# keep countries with a median age between 14 and 49 (rows with a missing median age are dropped)
sub1 = data[(data["median_age"]>=14) & (data["median_age"]<=49)]
sub2 = sub1.copy()
print("median_age - 4 categories - quartiles")
sub2["median_agegroup4"] = pandas.qcut(sub2.median_age, 4, labels=["1=25%tile","2=50%tile","3=75%tile","4=100%tile"])
c9 = sub2["median_agegroup4"].value_counts(sort=False, dropna=True)
print(c9)
# group median age (the explanatory variable) into 5-year bins;
# countries with a median age outside these bins are left out of the graph
sub2["median_agegroup"] = pandas.cut(sub2.median_age, [15, 20, 25, 30, 35, 40])
c6 = sub2.groupby("median_agegroup").size()
print(c6)
# change format from numeric to categorical
#sub2['Lifeexpect'] = sub2['Lifeexpect'].astype("category")
print("Describe median age")
desc3 = sub2['median_age'].describe()
print(desc3)
sub2['median_age'] = pandas.to_numeric(sub2['median_age'])
# bivariate bar graph: mean cellphones per 100 people within each median age group
seaborn.catplot(x="median_agegroup", y="Cellphones100", data=sub2, kind="bar", ci=None)
plt.xlabel("Median Age")
plt.ylabel("Cellphones per 100")
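As a rough check on what the bar graph summarizes: a bar-type catplot draws the mean of the y variable within each x category, so the same numbers can be reproduced with a groupby. The sketch below uses made-up values, hypothetical bin edges and a hypothetical age_group column; none of it comes from the Gapminder file.

```python
import pandas

# Hypothetical values for illustration only; not taken from the Gapminder file.
toy = pandas.DataFrame({
    "median_age": [17, 19, 24, 28, 33, 38, 42, 46],
    "Cellphones100": [60.0, 80.0, 95.0, 105.0, 110.0, 120.0, 125.0, 135.0],
})

# Bin the explanatory variable into labeled age groups (right-closed intervals).
toy["age_group"] = pandas.cut(
    toy["median_age"],
    bins=[14, 20, 30, 40, 49],
    labels=["15-20", "21-30", "31-40", "41-49"],
)

# Mean of the response variable within each group: these are the bar heights
# that a bar-type catplot with x="age_group", y="Cellphones100" would draw.
group_means = toy.groupby("age_group", observed=True)["Cellphones100"].mean()
print(group_means)
```

Printing the group means next to the graph makes it easier to describe the association in words, since the exact bar heights are hard to read off an image.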
Week 4 - Univariate graph
Univariate graph displaying the distribution of life expectancy across countries. The distribution is skewed left and bimodal. Because the graph shows a single variable, it cannot on its own show whether countries with a higher life expectancy also have more cellphones per 100 people; that question needs a bivariate graph.
Graph
[Graph: histogram of life expectancy]
Code
# -*- coding: utf-8 -*-
import pandas
import numpy
import seaborn
import matplotlib.pyplot as plt
# any additional libraries would be imported here

data = pandas.read_csv("./gapminder2018year2.csv", low_memory=False)

# set pandas to show all columns in the DataFrame
pandas.set_option("display.max_columns", None)
# set pandas to show all rows in the DataFrame
pandas.set_option("display.max_rows", None)

print(len(data))  # number of observations (rows)
print(len(data.columns))  # number of variables (columns)

data.info()
# setting variables you will be working with to numeric
data['Cellphones100'] = pandas.to_numeric(data['Cellphones100'])
data['IncomePPP'] = pandas.to_numeric(data['IncomePPP'])
data['Population'] = pandas.to_numeric(data['Population'])
data['Lifeexpect'] = pandas.to_numeric(data['Lifeexpect'])
data['median_age'] = pandas.to_numeric(data['median_age'])

# counts and percentages (i.e. frequency distributions) for each variable
print("counts for Cellphones100 - Number of countries with this number of Cell phones per 100 people")
c1 = data['Cellphones100'].value_counts(sort=False, dropna=False)
print(c1)
print("percentage for Cellphones100 - Percentage of countries with this number of Cell phones per 100 people")
p1 = data['Cellphones100'].value_counts(sort=False, normalize=True)
print(p1)
print("counts for IncomePPP - Number of countries with this Income per Person per Year")
c2 = data['IncomePPP'].value_counts(sort=False, dropna=False)
print(c2)
print("percentage for IncomePPP - Percentage of countries with this Income per Person per Year")
p2 = data['IncomePPP'].value_counts(sort=False, normalize=True)
print(p2)
print("counts for Population - Number of countries with this Population")
c3 = data['Population'].value_counts(sort=False, dropna=False)
print(c3)
print("percentage for Population - Percentage of countries with this Population")
p3 = data['Population'].value_counts(sort=False, normalize=True)
print(p3)
print("counts for Life Expectancy- Number of countries with this Life expectancy")
c4 = data['Lifeexpect'].value_counts(sort=False, dropna=False)
print(c4)
print("percentage for Life Expectancy- Percentage of countries with this Life expectancy")
p4 = data['Lifeexpect'].value_counts(sort=False, normalize=True)
print(p4)
print("counts for Median Age - Number of countries with this Median Age")
c5 = data['median_age'].value_counts(sort=False, dropna=False)
print(c5)
print("percentage for Median Age - Percentage of countries with this Median Age")
p5 = data['median_age'].value_counts(sort=False, normalize=True)
print(p5)

# keep countries with a median age between 14 and 49 (rows with a missing median age are dropped)
sub1 = data[(data["median_age"]>=14) & (data["median_age"]<=49)]
sub2 = sub1.copy()
#sub2.info()
# quartile split (use the qcut function and ask for 4 groups - gives you a quartile split)
print("median_age - 4 categories - quartiles")
sub2["median_agegroup4"] = pandas.qcut(sub2.median_age, 4, labels=["1=25%tile","2=50%tile","3=75%tile","4=100%tile"])
c9 = sub2["median_agegroup4"].value_counts(sort=False, dropna=True)
print(c9)

# univariate histogram of life expectancy
seaborn.distplot(sub2["Lifeexpect"].dropna(), kde=False)
plt.xlabel("Life expectancy")
plt.ylabel("Number of countries")
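The assignment asks for center and spread, which a histogram only shows visually. A minimal sketch of how those could be put into numbers with pandas is below; the life expectancy values are invented for illustration and are not the Gapminder data.

```python
import pandas

# Hypothetical life expectancy values (years), for illustration only.
life = pandas.Series(
    [52, 60, 63, 65, 69, 71, 74, 75, 77, 78, 80, 82, 83], name="Lifeexpect"
)

summary = life.describe()   # count, mean, std, min, quartiles, max
skewness = life.skew()      # negative values indicate a longer left tail

print(summary)
print("median:", life.median())
print("skewness:", round(skewness, 3))
```

A mean below the median together with a negative skewness is the numeric counterpart of the "skewed left" shape described for the graph.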
Week 4 - Visualizing Data
Instructions
STEP 1: Create univariate graphs. Show center and spread.
STEP 2: Create a bivariate graph. Show the association between your explanatory and response variables.
*Output should be organized and labeled.
What to Submit:
• Create a blog entry
• Post program and graphs
• Write a few sentences describing what your graphs reveal in terms of your individual variables and the relationship between them.
Review criteria
• Was a univariate graph created for each of the selected variables? (2 points)
• Was a bivariate graph created for the selected variables? (2 points)
• Did the summary describe what the graphs revealed in terms of the individual variables and the relationship between them? (2 points)
Week 3 - Program, Output and Summary
Program
# -*- coding: utf-8 -*-
import pandas
import numpy
# any additional libraries would be imported here

data = pandas.read_csv("./gapminder2018year2.csv", low_memory=False)

print(len(data))  # number of observations (rows)
print(len(data.columns))  # number of variables (columns)

data.info()
# setting variables you will be working with to numeric
data['Cellphones100'] = pandas.to_numeric(data['Cellphones100'])
data['IncomePPP'] = pandas.to_numeric(data['IncomePPP'])
data['Population'] = pandas.to_numeric(data['Population'])
data['Lifeexpect'] = pandas.to_numeric(data['Lifeexpect'])
data['median_age'] = pandas.to_numeric(data['median_age'])

# counts and percentages (i.e. frequency distributions) for each variable
print("counts for Cellphones100 - Number of countries with this number of Cell phones per 100 people")
c1 = data['Cellphones100'].value_counts(sort=False, dropna=False)
print(c1)
print("percentage for Cellphones100 - Percentage of countries with this number of Cell phones per 100 people")
p1 = data['Cellphones100'].value_counts(sort=False, normalize=True)
print(p1)
print("counts for IncomePPP - Number of countries with this Income per Person per Year")
c2 = data['IncomePPP'].value_counts(sort=False, dropna=False)
print(c2)
print("percentage for IncomePPP - Percentage of countries with this Income per Person per Year")
p2 = data['IncomePPP'].value_counts(sort=False, normalize=True)
print(p2)
print("counts for Population - Number of countries with this Population")
c3 = data['Population'].value_counts(sort=False, dropna=False)
print(c3)
print("percentage for Population - Percentage of countries with this Population")
p3 = data['Population'].value_counts(sort=False, normalize=True)
print(p3)
print("counts for Life Expectancy- Number of countries with this Life expectancy")
c4 = data['Lifeexpect'].value_counts(sort=False, dropna=False)
print(c4)
print("percentage for Life Expectancy- Percentage of countries with this Life expectancy")
p4 = data['Lifeexpect'].value_counts(sort=False, normalize=True)
print(p4)
print("counts for Median Age - Number of countries with this Median Age")
c5 = data['median_age'].value_counts(sort=False, dropna=False)
print(c5)
print("percentage for Median Age - Percentage of countries with this Median Age")
p5 = data['median_age'].value_counts(sort=False, normalize=True)
print(p5)

# keep countries with a median age between 14 and 49 (rows with a missing median age are dropped)
sub1 = data[(data["median_age"]>=14) & (data["median_age"]<=49)]
sub2 = sub1.copy()
#sub2.info()
# quartile split (use the qcut function and ask for 4 groups - gives you a quartile split)
print("median_age - 4 categories - quartiles")
sub2["median_agegroup4"] = pandas.qcut(sub2.median_age, 4, labels=["1=25%tile","2=50%tile","3=75%tile","4=100%tile"])
c9 = sub2["median_agegroup4"].value_counts(sort=False, dropna=True)
print(c9)
Output 
RangeIndex: 175 entries, 0 to 174
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Country_2018   175 non-null    object
 1   median_age     174 non-null    float64
 2   Lifeexpect     173 non-null    float64
 3   Population     175 non-null    int64
 4   IncomePPP      175 non-null    int64
 5   Cellphones100  174 non-null    float64
dtypes: float64(3), int64(2), object(1)
memory usage: 8.3+ KB
counts for Cellphones100 - Number of countries with this number of Cell phones per 100 people
66.0     1
126.0    4
111.0    1
45.0     2
141.0    1
        ..
140.0    1
163.0    1
61.0     1
209.0    1
80.0     2
Name: Cellphones100, Length: 97, dtype: int64
percentage for Cellphones100 - Percentage of countries with this number of Cell phones per 100 people
66.0     0.005747
126.0    0.022989
111.0    0.005747
45.0     0.011494
141.0    0.005747
           ..
140.0    0.005747
163.0    0.005747
61.0     0.005747
209.0    0.005747
80.0     0.011494
Name: Cellphones100, Length: 96, dtype: float64
counts for IncomePPP - Number of countries with this Income per Person per Year
10500    1
24100    1
2570     1
12300    2
2830     1
        ..
2810     1
7420     1
47300    1
6040     1
28700    1
Name: IncomePPP, Length: 163, dtype: int64
percentage for IncomePPP - Percentage of countries with this Income per Person per Year
10500    0.005714
24100    0.005714
2570     0.005714
12300    0.011429
2830     0.005714
           ..
2810     0.005714
7420     0.005714
47300    0.005714
6040     0.005714
28700    0.005714
Name: IncomePPP, Length: 163, dtype: float64
counts for Population - Number of countries with this Population
2880000      1
160000000    1
19200000     1
128000000    1
25600000     1
            ..
58100        1
9790000      1
15000000     1
10300000     1
33100000     1
Name: Population, Length: 163, dtype: int64
percentage for Population - Percentage of countries with this Population
2880000      0.005714
160000000    0.005714
19200000     0.005714
128000000    0.005714
25600000     0.005714
               ..
58100        0.005714
9790000      0.005714
15000000     0.005714
10300000     0.005714
33100000     0.005714
Name: Population, Length: 163, dtype: float64
counts for Life Expectancy- Number of countries with this Life expectancy
64.0     4
78.0    10
65.0     7
77.0    12
76.0     7
83.0     9
82.0    12
71.0    10
74.0     8
80.0     5
73.0     8
70.0     4
75.0    10
62.0     4
63.0     7
52.0     1
60.0     2
69.0     9
79.0     8
81.0     7
68.0     3
66.0     5
61.0     2
72.0     6
84.0     2
55.0     1
67.0     5
59.0     4
NaN      2
85.0     1
Name: Lifeexpect, dtype: int64
percentage for Life Expectancy- Percentage of countries with this Life expectancy
64.0    0.023121
78.0    0.057803
65.0    0.040462
77.0    0.069364
76.0    0.040462
83.0    0.052023
82.0    0.069364
71.0    0.057803
74.0    0.046243
80.0    0.028902
73.0    0.046243
70.0    0.023121
75.0    0.057803
62.0    0.023121
63.0    0.040462
52.0    0.005780
60.0    0.011561
69.0    0.052023
79.0    0.046243
81.0    0.040462
68.0    0.017341
66.0    0.028902
61.0    0.011561
72.0    0.034682
84.0    0.011561
55.0    0.005780
67.0    0.028902
59.0    0.023121
85.0    0.005780
Name: Lifeexpect, dtype: float64
counts for Median Age - Number of countries with this Median Age
18.0    10
36.0     2
29.0     5
17.0     6
32.0     8
35.0     4
38.0     8
44.0     4
33.0     5
28.0    10
41.0     6
40.0     5
42.0     8
26.0     7
19.0    11
43.0     8
24.0     7
34.0     4
45.0     4
31.0     6
20.0     8
37.0     2
27.0     4
25.0     2
22.0     6
23.0     4
46.0     3
30.0     6
21.0     5
47.0     1
48.0     1
16.0     1
NaN      1
39.0     2
15.0     1
Name: median_age, dtype: int64
percentage for Median Age - Percentage of countries with this Median Age
18.0    0.057471
36.0    0.011494
29.0    0.028736
17.0    0.034483
32.0    0.045977
35.0    0.022989
38.0    0.045977
44.0    0.022989
33.0    0.028736
28.0    0.057471
41.0    0.034483
40.0    0.028736
42.0    0.045977
26.0    0.040230
19.0    0.063218
43.0    0.045977
24.0    0.040230
34.0    0.022989
45.0    0.022989
31.0    0.034483
20.0    0.045977
37.0    0.011494
27.0    0.022989
25.0    0.011494
22.0    0.034483
23.0    0.022989
46.0    0.017241
30.0    0.034483
21.0    0.028736
47.0    0.005747
48.0    0.005747
16.0    0.005747
39.0    0.011494
15.0    0.005747
Name: median_age, dtype: float64
median_age - 4 categories - quartiles
1=25%tile     48
2=50%tile     39
3=75%tile     45
4=100%tile    42
Name: median_agegroup4, dtype: int64
Summary
1. Missing data were handled with the dropna=True argument of value_counts, so the one country without a median age value is left out of the quartile frequency table.
2. The frequency distribution divides median age into four quartile groups for the 174 countries selected. The age ranges below follow from the median age counts in the output above.
48 countries with a median age of 15 - 22
39 countries with a median age of 23 - 29
45 countries with a median age of 30 - 38
42 countries with a median age of 39 - 48
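The age range of each quartile group does not have to be worked out by hand. The sketch below is one way to read it off, assuming made-up median ages and hypothetical Q1 to Q4 labels: qcut can return its bin edges through retbins=True, and a groupby on the resulting groups gives the observed minimum and maximum in each.

```python
import pandas

# Hypothetical median ages, for illustration only.
ages = pandas.Series([15, 17, 18, 19, 22, 24, 26, 28, 30, 33, 36, 38, 40, 42, 45, 48])

# qcut splits on quantiles, so each group holds roughly the same number of rows.
# retbins=True also returns the bin edges that define each group.
groups, edges = pandas.qcut(ages, 4, labels=["Q1", "Q2", "Q3", "Q4"], retbins=True)

print(groups.value_counts(sort=False))
print(edges)

# Observed minimum and maximum age inside each quartile group.
ranges = ages.groupby(groups, observed=True).agg(["min", "max"])
print(ranges)
```

The groups are only roughly equal in size when many rows share the same value, which is why the real output shows 48, 39, 45 and 42 countries rather than four identical counts.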
Week 3 - Managing Data
In this session, make and implement decisions with data.
Tasks
STEP 1: Make and implement data management decisions for the variables you selected.
STEP 2: Run frequency distributions for your chosen variables and select columns, and possibly rows. Your output should be interpretable (i.e. organized and labeled).
Review Criteria
Was the program output interpretable (i.e. organized and labeled)? (1 point)
Does the program output display three data managed variables as frequency tables? (1 point)
Did the summary describe the frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.? (2 points)
Python Program
Program
# -*- coding: utf-8 -*-
"""
Created on Tue Aug  4 12:59:56 2020
"""
import pandas
import numpy
# any additional libraries would be imported here
data = pandas.read_csv("./gapminder2018year.csv", low_memory=False)
print (len(data)) #number of observations (rows)
print (len(data.columns)) # number of variables (columns)
data.info()
#setting variables you will be working with to numeric
data['Cellphones100'] = pandas.to_numeric(data['Cellphones100'])
data['IncomePPP'] = pandas.to_numeric(data['IncomePPP'])
data['Population'] = pandas.to_numeric(data['Population'])
data['Lifeexpect'] = pandas.to_numeric(data['Lifeexpect'])
data['median_age'] = pandas.to_numeric(data['median_age'])
#counts and percentages (i.e. frequency distributions) for each variable
print("counts for Cellphones100 - Number of countries with this number of Cell phones per 100 people")
c1 = data['Cellphones100'].value_counts(sort=False, dropna=False)
print (c1)
print("percentage for Cellphones100 - Percentage of countries with this number of Cell phones per 100 people")
p1 = data['Cellphones100'].value_counts(sort=False, normalize=True)
print (p1)
print("counts for IncomePPP - Number of countries with this Income per Person per Year")
c2 = data['IncomePPP'].value_counts(sort=False, dropna=False)
print (c2)
print("percentage for IncomePPP - Percentage of countries with this Income per Person per Year")
p2 = data['IncomePPP'].value_counts(sort=False, normalize=True)
print (p2)
print("counts for Population - Number of countries with this Population")
c3 = data['Population'].value_counts(sort=False, dropna=False)
print (c3)
print("percentage for Population - Percentage of countries with this Population")
p3 = data['Population'].value_counts(sort=False, normalize=True)
print (p3)
print("counts for Life Expectancy- Number of countries with this Life expectancy")
c4 = data['Lifeexpect'].value_counts(sort=False, dropna=False)
print (c4)
print("percentage for Life Expectancy- Percentage of countries with this Life expectancy")
p4 = data['Lifeexpect'].value_counts(sort=False, normalize=True)
print (p4)
print("counts for Median Age - Number of countries with this Median Age")
c5 = data['median_age'].value_counts(sort=False, dropna=False)
print (c5)
print("percentage for Median Age - Percentage of countries with this Median Age")
p5 = data['median_age'].value_counts(sort=False, normalize=True)
print (p5)
Output
runfile('C:/Downloadsf/Training/Data Science Academy/CodeSet/PythonCode/Week2code2.py', wdir='C:/Downloads/Training/Data Science Academy/CodeSet/PythonCode')
175
12
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 175 entries, 0 to 174
Data columns (total 12 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   Country_2018       175 non-null    object
 1   median_age         174 non-null    float64
 2   Lifeexpect         173 non-null    float64
 3   Population_growth  174 non-null    float64
 4   Population         175 non-null    object
 5   IncomePPP          175 non-null    int64
 6   Inflation          170 non-null    float64
 7   Investments        158 non-null    float64
 8   Fixedline100       171 non-null    float64
 9   BroadbandS100      172 non-null    float64
 10  Cellphones100      174 non-null    float64
 11  internetuser       174 non-null    float64
dtypes: float64(9), int64(1), object(2)
memory usage: 16.5+ KB
counts for Cellphones100 - Number of countries with this number of Cell phones per 100 people
126.0    4
111.0    1
141.0    1
118.0    3
112.0    3
        ..
83.3     1
79.9     1
84.2     1
98.9     1
47.6     1
Name: Cellphones100, Length: 116, dtype: int64
percentage for Cellphones100 - Percentage of countries with this number of Cell phones per 100 people
126.0    0.022989
111.0    0.005747
141.0    0.005747
118.0    0.017241
112.0    0.017241
           ..
83.3     0.005747
79.9     0.005747
84.2     0.005747
98.9     0.005747
47.6     0.005747
Name: Cellphones100, Length: 116, dtype: float64
counts for IncomePPP - Number of countries with this Income per Person per Year
10500    1
24100    1
2570     1
12300    2
2830     1
        ..
2810     1
7420     1
47300    1
6040     1
28700    1
Name: IncomePPP, Length: 163, dtype: int64
percentage for IncomePPP - Percentage of countries with this Income per Person per Year
10500    0.005714
24100    0.005714
2570     0.005714
12300    0.011429
2830     0.005714
           ..
2810     0.005714
7420     0.005714
47300    0.005714
6040     0.005714
28700    0.005714
Name: IncomePPP, Length: 163, dtype: float64
counts for Population - Number of countries with this Population
3,440,000     1
424,000       1
5,710,000     1
37,600,000    1
285,000       1
             ..
1,260,000     2
38,000,000    1
9,850,000     1
5,510,000     1
60,700,000    1
Name: Population, Length: 163, dtype: int64
percentage for Population - Percentage of countries with this Population
3,440,000     0.005714
424,000       0.005714
5,710,000     0.005714
37,600,000    0.005714
285,000       0.005714
                ..
1,260,000     0.011429
38,000,000    0.005714
9,850,000     0.005714
5,510,000     0.005714
60,700,000    0.005714
Name: Population, Length: 163, dtype: float64
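The output above lists Population as an object column whose values contain thousands separators, which pandas.to_numeric cannot parse directly. One possible cleanup, shown here on a few made-up rows rather than the real file, is to strip the commas before converting:

```python
import pandas

# Hypothetical rows that mimic the comma-formatted Population strings above.
raw = pandas.DataFrame({"Population": ["3,440,000", "424,000", "5,710,000", None]})

# Remove the thousands separators, then convert. errors="coerce" turns any
# value that still cannot be parsed into NaN instead of raising an error.
cleaned = raw["Population"].str.replace(",", "", regex=False)
raw["Population"] = pandas.to_numeric(cleaned, errors="coerce")

print(raw["Population"].dtype)
print(raw["Population"].tolist())
```

An alternative that avoids the extra step is to pass thousands="," to pandas.read_csv so the column is parsed as a number when the file is loaded.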
Description Summary
The number of cell phones per 100 people was compared across 175 countries alongside their population, income, life expectancy and median age for 2018. The frequency distributions varied with the presence of missing data and with how often countries shared the same value for each variable.
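To make the point about missing data concrete, here is a small sketch of how the two value_counts calls in the program treat a missing value differently. The numbers are invented for illustration.

```python
import pandas

# Hypothetical values with one missing entry, for illustration only.
phones = pandas.Series([126.0, 126.0, 111.0, 45.0, None], name="Cellphones100")

# dropna=False gives the missing value its own row in the counts.
counts = phones.value_counts(sort=False, dropna=False)
# normalize=True returns proportions; missing values are excluded by default.
shares = phones.value_counts(sort=False, normalize=True)

print(counts)
print(shares)
print("missing values:", phones.isna().sum())
```

Because the percentages leave out missing values while the counts include them, the two tables can differ in length by one row for any variable that has missing data.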
Week 2 - Writing your first program - SAS or Python
In this session, select a programming tool and write your first program.
Tasks
Selection: Python, because it is a free, open-source, general-purpose programming language that is useful for all types of programming work.
STEP 1: Run your first program
STEP 2: Run frequency distributions for chosen variables and select columns, and possibly rows. Output should be interpretable (i.e. organized and labeled).
STEP 3: Post program to blog
STEP 4: Post the output that displays three of your variables as frequency tables
STEP 5: Post a few sentences describing your frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.
Review Criteria
Was the program output interpretable (i.e. organized and labeled)? (1 point)
Does the program output display three data managed variables as frequency tables? (1 point)
Did the summary describe the frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.? (2 points)
Research Project
Selecting a research question:  
Is a country’s cell phone usage primarily based on income per person, population or median age?
After looking through the codebooks provided, I decided to use a custom Gapminder data set and codebook.
I am particularly interested in how communication usage shifts over time in different countries. I selected a variety of variables I expect to be related to a country's communications usage type and will include all of the relevant variables in my personal codebook.
STEP 1: Choose a data set: 
Custom Gapminder data set (https://www.gapminder.org/data/)
STEP 2. Identify a specific topic of interest: 
cell phone usage in a country
STEP 3. Prepare a codebook (completed)
STEP 4. Identify a second topic to explore in terms of its association with your original topic.
While general cell phone usage is a good starting point, I'm interested in whether it is related to life expectancy and income. I believe countries will differ depending on life expectancy and income.
STEP 5. Add questions/items/variables documenting this second topic to your personal codebook.
Questions
Does cell phone usage change based on life expectancy?
Does income per person impact cell phone usage?
Personal Codebook Variables
Broadband subscribers (per 100 people)
Cell phones (per 100 people)
Internet user
Fixedline (per 100 people)
Inflation (annual %)
Investments (% of GDP)
Population
Population growth (annual %)
Life expectancy (years)
Income per person (GDP/capita, PPP$ inflation-adjusted)
STEP 6. Perform a literature review for research previously done on topic.
Title of Article: Information and Communication Technology Use and Economic Growth
Title of Journal: US National Library of Medicine National Institutes of Health
Author: Maryam Farhadi, Rahmah Ismail, Masood Fooladi
Date of Publication: November 12, 2012
URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3495961/
Title of Article: Mobile Divides in Emerging Economies
Title of Journal: Pew Research Center
Author: Laura Silver, Emily A. Vogels, Mara Mordecai, Jeremiah Cha, Raea Rasmussen and Lee Rainie
Date of Publication: November 20, 2019
URL: https://www.pewresearch.org/internet/2019/11/20/mobile-divides-in-emerging-economies/
Title of Article: Conceptualizing and Researching the Adoption of ICT and the Impact on Socioeconomic Development
Title of Journal: Information Technology for Development
Author: Narcyz Roztocki, Roland Weistroffer
Volume and Issue Number: Volume 22, 2016 - Issue 4
Date of Publication: November 23, 2016
URL: https://www.tandfonline.com/doi/full/10.1080/02681102.2016.1196097
STEP 7. Based on literature review, develop a hypothesis about what you believe the association might be between these topics. Integrate the specific variables you selected into the hypothesis.
Hypothesis: A country's information and communication technology (ICT) usage type will vary based on income, population and life expectancy.
Data Science Academy Pilot
Data Management and Visualization
Wesleyan University
Instructor: Lisa Dierker, Professor of Psychology
Cohort Leader: Scott Severance
WEEK 1 - Selecting a research question
In this session discuss the basics of data analysis. 
Tasks:
1. Select a data set to work with and create a codebook to develop a research question.
2. Set up a Tumblr blog to reflect on experiences, submit assignments and share your work with others throughout the course.
Review Criteria
Has the learner selected a data set and indicated that selection? (1 point)
Has the learner clearly stated a research question and hypothesis? (2 points)
Does the literature review include clear information about search terms used? (1 point)
Does the literature review clearly identify references used? (2 points)
Does the literature review clearly present a summary of findings (e.g., variables considered, patterns of findings, etc.)? (2 points)