braincourse
Data Management and Visualization
braincourse · 1 year ago
Week 4 Assignment - Creating Graphs for Your Data
STEP 1: Create graphs of your variables one at a time (univariate graphs). Examine both their center and spread. STEP 2: Create a graph showing the association between your explanatory and response variables (bivariate graph). Your output should be interpretable (i.e. organized and labeled).
WHAT TO SUBMIT: Once you have written a successful program that creates univariate and bivariate graphs, create a blog entry where you post your program and the graphs that you have created. Write a few sentences describing what your graphs reveal in terms of your individual variables and the relationship between them.
Submission:
Program
[Image: screenshot of the program]
1. Univariate graphs generated separately for the % distributions of "BLOOD/NATURAL MOTHER EVER DEPRESSED" and "SEX" - categorical (discrete) variables, each visualized with a single-variable bar chart.
-> BLOOD/NATURAL MOTHER EVER DEPRESSED: 73% reported NO maternal family history of major depression; within that group there were 15% more Females than Males.
2. Univariate graph generated for the % distribution of NUMBER OF EPISODES - quantitative variable visualized with a single-variable bar chart.
-> NUMBER OF EPISODES:
To show a more realistic distribution, I excluded code 101, which I had created manually to count the valid information recovered for N/A responses.
Results revealed a high concentration (~50%) of individuals who experienced only one episode (lasting more than 2 weeks) in their lifetime; ~16% and ~10% reported 2 and 3 episodes; the percentages then decreased, with some fluctuation, for adults who reported 4-98 episodes.
Modality/Peakedness: Skewed-right distribution - the majority of reported cases were individuals with only 1-3 episodes.
Spread/Variability:
Approximate Minimum: 1 (1 episode)
Approximate Maximum: 98 (98 episodes)
Approximate Range: (98 - 1) = 97
Center: the midrange is (98 + 1) / 2 = 49.5, but in a skewed-right distribution the midrange overstates the typical value; the most common value (mode) is 1 episode.
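A minimal sketch of how the minimum, maximum, range, and the midpoint between the extremes can be computed, next to the median for comparison. It is written in Python rather than the SAS used for this assignment, and the `episodes` list is made-up illustrative data, not survey values.

```python
from statistics import median

# Hypothetical episode counts (illustrative only, not survey data).
episodes = [1, 1, 1, 1, 2, 2, 3, 5, 12, 98]

minimum = min(episodes)
maximum = max(episodes)
value_range = maximum - minimum        # spread: distance between the extremes
midrange = (maximum + minimum) / 2     # midpoint between the extremes
center = median(episodes)              # middle value, robust to the long right tail

print(minimum, maximum, value_range, midrange, center)
```

With a long right tail, the midrange is pulled toward the single largest value, while the median stays near where most responses sit.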
3. Calculated standard deviations using PROC UNIVARIATE - only appropriate for quantitative variables.
A. Examined AGE AT ONSET OF FIRST EPISODE as the quantitative variable, so that SAS provides tables of univariate statistics for the variable.
e.g. Average AGE AT ONSET OF FIRST EPISODE = Mean (31) +/- Std Deviation (15), so one standard deviation either side of the mean spans onset ages 16-46; the standard deviation is roughly 50% of the mean.
A lot of Variability (Variance = 271)   
Other values: Median, Mode, Range, Cut points for specific %s on the variable, Extreme and missing values.
B. Examined NUMBER OF EPISODES as the quantitative variable, so that SAS provides tables of univariate statistics for the variable.
e.g. Average NUMBER OF EPISODES = Mean (4.8) +/- Std Deviation (12.5); the standard deviation is more than twice the mean, which indicates a wide spread of NUMBER OF EPISODES around the average.
A lot of Variability (Variance = 156.5)   
Other values: Median, Mode, Range, Cut points for specific %s on the variable, Extreme and missing values.
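As a rough illustration of how the mean, variance, and standard deviation relate, here is a small Python sketch (not the SAS program used above). The `onset_ages` values are invented for the example, and the sample (n - 1) formulas are used on the assumption that this matches the default behaviour of PROC UNIVARIATE.

```python
from statistics import mean, stdev, variance

# Hypothetical ages at onset (illustrative only, not survey data).
onset_ages = [14, 18, 22, 27, 31, 35, 40, 52, 60]

avg = mean(onset_ages)
var = variance(onset_ages)   # sample variance (divides by n - 1)
sd = stdev(onset_ages)       # sample standard deviation = square root of the variance

# One standard deviation either side of the mean.
low, high = avg - sd, avg + sd
print(f"mean={avg:.1f} sd={sd:.1f} variance={var:.1f} interval=({low:.1f}, {high:.1f})")
```

Because the variance is the square of the standard deviation, the two reported values can be checked against each other.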
4. I chose to create a bivariate graph showing the association between "NUMBER OF EPISODES" (quantitative explanatory variable) and "BLOOD/NATURAL MOTHER EVER DEPRESSED" (categorical response variable).
Similar percentages were found for Females with and without a maternal history of depression.
5. I chose to create a bivariate graph showing the association between "NUMBER OF EPISODES" (quantitative explanatory variable) and "BLOOD/NATURAL MOTHER EVER DEPRESSED" (categorical response variable).
A. Binning/collapsing the quantitative explanatory variable into 2 categories: quantitative variable (NUMBER OF EPISODES) -> derived quantitative variable (NOEPISPERYR = NUMBER OF EPISODES PER YEAR) -> categorical variable (MAJDEPRESSCAT = MAJOR DEPRESSION CATEGORY/CLASSIFICATION).
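The collapsing step could look something like the Python sketch below. Everything specific in it is an assumption made for illustration: the way the per-year rate is derived, the `cutoff` value, the function names, and the sample respondents are not taken from the original SAS program.

```python
def episodes_per_year(num_episodes, years_since_onset):
    """Hypothetical rate: lifetime episodes divided by years since first onset."""
    return num_episodes / max(years_since_onset, 1)  # guard against dividing by zero

def depression_category(rate, cutoff=1.0):
    """Collapse the rate into 2 categories; the cutoff is an assumed value."""
    return "higher-risk" if rate >= cutoff else "minimal to none/mild/moderate"

# (num_episodes, years_since_onset) pairs, made up for illustration.
respondents = [(1, 10), (3, 12), (12, 6), (40, 8)]
categories = [depression_category(episodes_per_year(n, y)) for n, y in respondents]
print(categories)
```

Once every respondent carries a category label, the explanatory variable can be plotted on the x-axis of a bar chart like any other categorical variable.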
B. Make C -> C (Category -> Category) Bar Chart
Among those in the "minimal to none/Mild/Moderate" depression category, 2%-2.5% have a maternal family history of major depression, compared with 1.8% of individuals in the higher-risk depression category.
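For a category-to-category bar chart, each bar height is the share of "yes" responses within one explanatory category. A minimal Python sketch of that calculation follows; the rows are made-up pairs and the category labels are placeholders, not the survey data.

```python
from collections import defaultdict

# (depression category, mother ever depressed?) pairs, made up for illustration.
rows = [
    ("lower", True), ("lower", False), ("lower", False), ("lower", False),
    ("higher", True), ("higher", False),
]

counts = defaultdict(lambda: [0, 0])   # category -> [yes count, total count]
for category, mother_depressed in rows:
    counts[category][0] += int(mother_depressed)
    counts[category][1] += 1

# Bar height = proportion of "yes" responses within each explanatory category.
bar_heights = {cat: yes / total for cat, (yes, total) in counts.items()}
print(bar_heights)
```

Comparing proportions rather than raw counts keeps the bars comparable even when the categories contain very different numbers of respondents.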
braincourse · 1 year ago
Week 3 Assignment - Making Data Management Decisions
1) Program
[Image: screenshot of the program]
2) Results/output that display at least 3 data managed variables as frequency distributions.
[Images: frequency distribution output]
3) Write a few sentences describing these frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.
I applied the following data management steps:
Subsetting the data to include only adults aged 18-40 years (applies across all tables).
Grouping AGE into categories (quantitative variable -> categorical variable with 4 groups).
Setting aside missing data (Unknown - coded as 99):
a- S4AQ6A (AGE AT ONSET OF FIRST EPISODE)
b- S4AQ7 (NUMBER OF EPISODES)
Recovering valid information for "NA" responses, previously set as blank (BL = . in SAS), by creating dummy codes (98 and 97).
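The steps listed above can be sketched in Python as follows. This is not the SAS program from the screenshot: the function names are hypothetical, which dummy code belongs to which variable is assumed, and the split of ages 21-30 into two groups is a guess, since the exact boundaries are not stated in the text.

```python
UNKNOWN = 99  # "Unknown" code, set aside as missing

def recode(raw, na_dummy):
    """Recode one raw value.

    raw is None for a blank ("NA") response; na_dummy is the dummy code
    that keeps those responses countable (assumed to be 98 or 97).
    """
    if raw is None:
        return na_dummy      # recovered "NA" response
    if raw == UNKNOWN:
        return None          # unknown -> treated as missing
    return raw

def age_group(age):
    """Assign one of 4 age groups; the 21-25 / 26-30 split is an assumption."""
    if not 18 <= age <= 40:
        return None          # outside the 18-40 subset
    if age <= 20:
        return 1
    if age <= 25:
        return 2
    if age <= 30:
        return 3
    return 4

raw_episodes = [1, None, 99, 3]   # made-up values
print([recode(v, na_dummy=98) for v in raw_episodes])
print([age_group(a) for a in (17, 18, 24, 30, 40)])
```

Keeping "NA" and "Unknown" as distinct outcomes means the frequency tables can report them separately instead of lumping both into missing data.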
Findings:
AGE AT ONSET OF FIRST EPISODE: Close to 77% of the values were recovered valid information for "NA" responses; 8% of responses were clustered in the 15-21 yrs onset-age range. Only 7.7% of responses were missing due to "Unknown" data.
NUMBER OF EPISODES: Close to 78% of the values were recovered valid information for "NA" responses; 11% of respondents experienced only 1 episode and approximately 8% reported 2-5 episodes (separate times lasting at least 2 weeks in their lifetime). Only 8% of responses were missing due to "Unknown" data.
BLOOD/NATURAL MOTHER EVER DEPRESSED: 20% reported a maternal family history of depression; within that group there were 13% more Females than Males (SEX).
AGE and AGEGROUP: Almost 50% of the responses fell in AGEGROUP 4 (31-40 year olds), about 20% each in AGEGROUP 2 and AGEGROUP 3 (together covering 21-30 year olds), and 12% in AGEGROUP 1 (18-20 year olds).
braincourse · 1 year ago
Project Title:
Major Depression in Adults - Symptoms, Genetic Influences and Familial Aggregation 
Research Statement / Description:
I have reviewed the codebook and data set of the 'National Epidemiologic Survey on Alcohol and Related Conditions' (NESARC). I am particularly interested in researching Major Depression (Low Mood I). I am also interested in introducing Family History (III) of Major Depression as a second concept, to understand the association of heredity and genetics with Major Depression (Low Mood I).
Research Question:
To what extent is family history associated with major depression?
Literature Review:
I used Google and Google Scholar to search for detailed information on major depression and genetics, and reviewed the findings of multiple studies.
Hypothesis:
Major depressive disorder (MDD) can be triggered by genetics as well as environmental factors.
MDD symptoms can be identified as depressed mood, lack of interests, impaired cognitive function and vegetative symptoms.
Genetic similarities/clustering within a given family are the most significant contributors to familial aggregation of major depressive disorder.
The heritability of major depression ranges from 31% to considerably higher, depending on the accuracy of the diagnosis and on whether the case is a subtype such as recurrent major depression.
In most cases, heritability could be higher than 40-50% for severe depression, meaning that around half of the cause is genetic.
Early-onset patients are more likely to have a family history of mood disorders.
MDD, including cases with a positive family history of major depression, appears to be about twice as common in women as in men, and a positive family history is associated with a younger age at onset of first symptoms in women.
Depression is a complex trait. Identifying its genetic triggers demands rigorous study of genes and depressive disorders.
Sources:
https://www.nature.com/articles/nrdp201665
https://onlinelibrary.wiley.com/doi/abs/10.1002/da.20613
Genetic Epidemiology of Major Depression: Review and Meta-Analysis
Family history of mood disorder and characteristics of major depressive disorder
Major Depression and Genetics
Genetics and genomics of depression