To test a basic linear regression, the Titanic data set was used. The response variable is ‘Survived’, a categorical variable indicating whether the person survived (=1) or did not survive (=0). The primary explanatory variable is ‘Gender’, also a categorical variable, coded male (=1) and female (=0). The aim is to test how gender affected the survival of the people on the Titanic.
From the tables and scatter plot (mentioned in the earlier link) of the regression model, the impact of gender on survival is statistically significant, since the p-values are reported as 0.000 (that is, below 0.001). Gender also explains approximately 30% of the variation in survival. From the coefficients, the regression equation is:
ŷ = 0.74 - 0.55x
The plotted regression line slopes downward as Gender goes from 0 (female) to 1 (male), so the predicted survival rate is lower for males. Hence a larger proportion of females survived than males.
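As a quick check, the equation can be evaluated at both codes of Gender. This is only a sketch using the rounded coefficients reported above; the function name `predict_survival` is made up for illustration.

```python
# Rounded coefficients taken from the regression equation in this post
INTERCEPT = 0.74
SLOPE = -0.55

def predict_survival(gender):
    """Predicted survival rate for Gender coded 0 (female) or 1 (male)."""
    return INTERCEPT + SLOPE * gender

female_rate = predict_survival(0)
male_rate = predict_survival(1)
print(f"female: {female_rate:.2f}, male: {male_rate:.2f}")
```

With Gender as the only explanatory variable, the intercept is the predicted rate for females and the intercept plus the slope is the predicted rate for males.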
The Python code for testing a basic linear regression is available here.
Test for Basic Linear Regression
# Load the libraries used below
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
# Read the Titanic data and inspect it
df = pd.read_csv('titanic.csv')
print(df.head())
print(df.describe())
# Code Gender as 1 for male and 0 for female
df['Gender'] = 0
df.loc[df['Sex'] == 'male', 'Gender'] = 1
print(df.head())
# Fit the basic linear regression of Survived on Gender
modelname = smf.ols('Survived ~ Gender', data=df).fit()
print(modelname.summary())
# Scatter plot with the fitted regression line
sns.regplot(x="Gender", y="Survived", data=df)
plt.show()
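To see what the fitted coefficients mean when the only explanatory variable is a 0/1 code, here is a small sketch that fits the same kind of least squares line by hand. It uses made-up toy data, not the Titanic file, and the helper name `ols_fit` is hypothetical; it is not how statsmodels computes the fit internally.

```python
from statistics import mean

# Toy data, made up for illustration only (NOT the Titanic file):
# gender coded 0 = female, 1 = male; survived coded 0/1.
gender = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
survived = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]

def ols_fit(x, y):
    """Closed-form simple least squares: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = ols_fit(gender, survived)

# Group survival rates computed directly from the data
female_mean = mean([s for g, s in zip(gender, survived) if g == 0])
male_mean = mean([s for g, s in zip(gender, survived) if g == 1])

print("intercept:", intercept, "female rate:", female_mean)
print("intercept + slope:", intercept + slope, "male rate:", male_mean)
```

The intercept matches the survival rate of the group coded 0 and the slope is the difference between the two group rates, which is why the line in the scatter plot simply joins the two group averages.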
Sample
The sample is drawn from NIFTY 50, the leading index of the National Stock Exchange, chosen because it is followed by most investors in India. Of the 50 constituent companies, 41 were studied because they have consistently been part of NIFTY 50 over the years. The observations (N = 1978) are the daily OHLC (open, high, low and close) stock prices of these 41 companies from 1st April 2008 to 31st March 2016, comprising 1025 days of Bullish periods and 953 days of Bearish periods.
Procedure
Data on the daily stock prices of 41 of the NIFTY 50 companies, from 1st April 2008 to 31st March 2016, were collected from the website of the National Stock Exchange of India Ltd. (https://www1.nseindia.com/products/content/equities/equities/eq_security.htm).
Measures
The response variable is Herd Mentality in the Indian stock market, and the explanatory variable is Cross-Sectional Absolute Deviation (CSAD). Herd Mentality was assessed from the daily stock market data of NIFTY 50 companies during the Bearish and Bullish periods within the study window. The least squares technique was applied, using statsmodels in Python, to find the corresponding p-values for the Bearish period, the Bullish period and the whole period. In addition, a scatter plot drawn with matplotlib in Python showed a non-increasing, non-linear relationship between CSAD and the absolute value of market return.
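For readers new to the measure, CSAD on a given day is commonly defined as the average absolute gap between each stock's return and the market return on that day. The sketch below assumes that definition and an equal-weighted market return (the plain average of the stock returns); neither detail is stated in this post, the function name `csad` is hypothetical, and the returns are made up.

```python
def csad(stock_returns):
    """Cross-sectional absolute deviation for one day.

    Assumes the market return is the equal-weighted average of the
    stock returns (an assumption, not confirmed by the study).
    """
    n = len(stock_returns)
    market_return = sum(stock_returns) / n
    return sum(abs(r - market_return) for r in stock_returns) / n

# Made-up daily returns for four stocks, for illustration only
day_returns = [0.01, -0.02, 0.03, 0.00]
value = csad(day_returns)
print("CSAD:", value)

# When all stocks move together, the dispersion is zero
flat = csad([0.01, 0.01, 0.01])
print("CSAD when all returns are equal:", flat)
```

A low CSAD on days with large market moves means individual stocks are clustering around the market return, which is the pattern usually read as herding.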