bad-statistics
Bad Statistics Blog
17 posts
The purpose of this blog is to humiliate powerful liars.
bad-statistics · 8 years ago
Photo
how I read figure 10 (N of around 1000 or 2000)
I don't believe the voucher/tax-credit differences are generally big enough to mean much.
(Especially with this small sample size --- by which I mean a low likelihood of having caught different groups, not ability to estimate one Gaussian population.)
The differences between 8, 6, and 2 are probably real (especially the difference between 8 and 2). So
religious/moral instruction is more often the reason Hoosier voucher parents say they sent their kids elsewhere than extracurriculars or neighborhood convenience
and
academic quality and school safety are probably less of the reason those parents say they sent their kids elsewhere
The religious/moral > extracurricular/convenience result could be overturned by sampling wider across Indiana and catching more groups---I don't know what the distribution of counties / races / etc is there or where their respondents came from w.r.t. those differences.
0 notes
bad-statistics · 8 years ago
Text
30±5   −   20±5      ≠      10
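Spelled out as a rough worked equation. The post does not say what the ± means, so both readings here are assumptions: either independent standard errors (which add in quadrature) or hard worst-case bounds (which add directly).

```latex
% Assumption: the two \pm 5 values are independent standard errors
(30 \pm 5) - (20 \pm 5) = 10 \pm \sqrt{5^2 + 5^2} \approx 10 \pm 7.1

% Assumption: the \pm values are hard worst-case bounds instead
(30 \pm 5) - (20 \pm 5) = 10 \pm 10 \quad \text{(anywhere from 0 to 20)}
```

Either way the difference carries more uncertainty than either input, so "10" on its own overstates what is known.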
0 notes
bad-statistics · 8 years ago
Link
The data is
emailing 1500 people
The analysis is
40 pages with 20 bar charts.
The authors:
Evan Rhinesmith is a Ph.D. candidate and Doctoral Academy Fellow in the University of Arkansas’s Department of Education Reform.
Andrew D. Catt is the director of state research and policy analysis for the Friedman Foundation for Educational Choice. In that role, Drew conducts analyses on private school choice programs, conducts surveys of private school leaders, and supports quality control as the Foundation’s research and data verifier. Prior to joining the Friedman Foundation in May 2013, Drew served as the program associate for The Clowes Fund, a private family foundation located in Indianapolis that awards grants to nonprofits in Seattle, Greater Indianapolis, and Northern New England. Drew graduated from Vanderbilt University in 2008 with a bachelor’s degree in Human and Organizational Development, specializing in Leadership and Organizational Effectiveness. While at Vanderbilt, Drew served as research assistant for North Star Destination Strategies, a community branding organization. During that time, Drew also researched the effects of homeschooling on socialization. Drew received his Master of Public Affairs in Nonprofit Management at Indiana University’s School of Public and Environmental Affairs in Indianapolis. He also received his Master of Arts in Philanthropic Studies through the Lilly Family School of Philanthropy. While in graduate school, Drew’s research focused on teacher performance incentives and cross-sector collaboration. Drew is currently pursuing a graduate certificate in Geographic Information Science (GIS) at IUPUI. Drew is a native of central Indiana and currently resides in downtown Indianapolis with his wife Elizabeth.
0 notes
bad-statistics · 8 years ago
Text
Wasserman writes with too many symbols. Both in his bootstrap paper and All of Statistics.
0 notes
bad-statistics · 8 years ago
Text
A professor who wrote a bullshit statistics paper successfully prevents his critics from criticizing him in a scholarly journal.
Bollen: 1; Worm: 0.
19 notes · View notes
bad-statistics · 8 years ago
Link
Besides the degeneracy of calling something "inevitable" after the fact,
an N=1000 YouGov poll is not sufficient to conclude much of anything.
In order to understand the Trump phenomenon, we commissioned YouGov to carry out a Web-based survey of a national sample of 1,000 Republicans and independents. Of this initial sample, 688 respondents identified themselves as certain to vote in a Republican primary. The surveys were carried out during the two weeks surrounding the Iowa caucuses in early February but results from those interviewed before and after the caucuses were almost identical.
We asked our respondents to rank the eleven major candidates in the GOP race at the time of the Iowa caucuses: Jeb Bush, Ben Carson, Chris Christie, Ted Cruz, Carly Fiorina, Mike Huckabee, John Kasich, Rand Paul, Marco Rubio, Rick Santorum, and Trump. Remarkably almost everyone ranked all eleven candidates,
At least Ronald Rapoport
a) discusses methodology
b) admits his small N
c) can think critically about numbers.
But, NYRB, you are all fucking professors or some nonsense. One of you, at least one of you, has to know a stats professor who can tell you that
an N=1000 YouGov poll does not give you small enough error bars for 48% ± error to actually be different from 42% ± error.
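A crude way to check this, sketched in Python. Everything here is an assumption layered on the post: it treats 48% and 42% as two proportions from a simple random sample of size n (a web panel is not one, which only makes things worse), uses the textbook normal-approximation margin of error, and the function names are made up for illustration. Overlapping intervals are not a formal significance test, just the eyeball comparison the article invites.

```python
from statistics import NormalDist


def margin_of_error(p: float, n: int, confidence: float = 0.95) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * (p * (1 - p) / n) ** 0.5


def interval(p: float, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Confidence interval (low, high) around a sample proportion."""
    moe = margin_of_error(p, n, confidence)
    return p - moe, p + moe


def intervals_overlap(p1: float, p2: float, n: int, confidence: float = 0.95) -> bool:
    """True if the two proportions' intervals share any ground."""
    lo1, hi1 = interval(p1, n, confidence)
    lo2, hi2 = interval(p2, n, confidence)
    return lo1 <= hi2 and lo2 <= hi1


# n=1000 is the full sample; n=688 is the "certain to vote" subsample.
for n in (1000, 688):
    moe = margin_of_error(0.48, n)
    print(f"n={n}: 48% ± {100 * moe:.1f} points, "
          f"overlaps 42%: {intervals_overlap(0.48, 0.42, n)}")
```

Under these assumptions the margin is roughly ±3 points at n=1000 and wider at n=688, so the two intervals touch in both cases.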
0 notes
bad-statistics · 8 years ago
Link
Wharton's examples of "successful applications" of analytics are:
nielsen
google analytics
moneyball
an analytics company that calls itself "the leading analytics company [such-and-such vertical]"
One of the speakers was a recent Wharton MBA.
Previous years had
Wal-Mart
that one other thing that everybody has heard about
This is the best Wharton can do, and they are doing their best to sell you on data science / analytics.
Everyone has the opportunity to choose to live a life of integrity.
0 notes
bad-statistics · 8 years ago
Link
Wharton graduated Raj Rajaratnam and Donald J Trump.
0 notes
bad-statistics · 8 years ago
Video
A regional dairy whose chief data scientist has an MBA. This is the best ad SAS could come up with proving the value-added of analytics? "We looked at stuff like weather."
OK.
I hope leveraging those analytics drove sales growth and shareholder value.
0 notes
bad-statistics · 8 years ago
Link
Cathy went to Harvard. She wants you to know this. She also reads the New York Times, and thinks thoughts about it. Her thoughts are important because she went to Berkeley, then Harvard, then worked at D. E. Shaw for at least 12 months.
Cathy is a Professor of Data Science who gives speeches for money.
Aaron Brown gave her book 1 star.
0 notes
bad-statistics · 8 years ago
Link
One of very few popular articles I've seen that goes into the data sources. Although it's much more convenient and fun to simply make shit up that you think is true, collecting data means looking at the world and writing down things about it systematically.
There is no perfect way to do this. Doing it well may cost lots of money. Also importantly, it will never be convenient or obvious how to combine different data sources which asked different questions of different people.
However, that is how we get quantitative knowledge about the world.
1 note · View note
bad-statistics · 8 years ago
Text
Selling out
Most people will sell out for very cheap. Give them $50k, $100k and---crucially---a pat on the back, and they will say whatever you want them to.
1 note · View note
bad-statistics · 8 years ago
Text
Graph Models
I owe you a post on the rhetoric of PGM's.
In a 1980's volume, Connectionist Symbol Processing, Geoff Hinton drew a graphical model with convenient variable names and cute pictures, e.g. rained yesterday → rains today. That is not the same as showing that "whatever A and B I stick into a graph shaped like this, it will make sense".
In chapter 19 of All of Statistics Larry Wasserman does the same thing.
iirc L S Paul's book on causality and Judea Pearl's do the same thing.
Deborah Mayo may or may not be guilty. (I'm guessing not.)
Elliott Sober Parsimony and Prediction draws a convenient looking curve. It hides the difficulties you would knock into if you considered parameterized families of curves.
You can't just conveniently title your variables. That is not an argument.
The structure itself needs to actually work for what you are saying it will. Be it Fourier, Markov, Lagrange, Fulton, etc.
0 notes
bad-statistics · 8 years ago
Link
Other probabilities are obtained by experiment and are thus approximations which are typically expressed to three significant digits unless there are compelling reasons for more or less precision.
Who the fuck says this is "typical"? More importantly, there are usually good reasons to use fewer significant figures. As in, most studies cited in the news (Pew, Gallup, etc) have N ≈ 1000. That number was chosen by Gallup to minimize cost, giving a just-barely-reportable number.
http://www.gallup.com/178685/methodology-center.aspx
Just-barely-reportable numbers cannot be added and subtracted like regular numbers.
The distance between 30±5 and 35±10 is not 5. It's anywhere between 20 and −10, where a negative number means the gap runs the other way (=opposite conclusion).
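A small sketch of that subtraction in Python. Two readings of ± are shown, and both are assumptions since the post does not pick one: hard bounds (take the extremes) and independent standard errors (combine in quadrature). The function names are invented for illustration.

```python
import math


def difference_worst_case(a: float, a_err: float, b: float, b_err: float) -> tuple[float, float]:
    """Bounds on b - a if each value may sit anywhere inside its ± band."""
    low = (b - b_err) - (a + a_err)
    high = (b + b_err) - (a - a_err)
    return low, high


def difference_quadrature(a: float, a_err: float, b: float, b_err: float) -> tuple[float, float]:
    """b - a and its combined error, assuming the two errors are independent."""
    return b - a, math.hypot(a_err, b_err)


low, high = difference_worst_case(30, 5, 35, 10)
diff, err = difference_quadrature(30, 5, 35, 10)
print(f"worst case: {low} to {high}")
print(f"quadrature: {diff} ± {err:.1f}")
```

Both readings leave zero, and a gap in the opposite direction, well inside the plausible range.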
When I read the New York Times I mentally add ±half of the first sigfig or more. Even ±1 to the first sigfig.
That wipes out most comparisons I have seen in the NYT.
Also note that ±5 in the statistical sense means ±5, ±10, or ±15. That ±sampling error refers to the stdev of a Gaussian or student distribution. (We use Gaussians because the sampling distribution of the mean might be Gaussian even when the data are not.) Gaussians with stdev=5 have about a 68% chance of being inside ±5, 95% of being inside ±10, and 99.7% of being inside ±15. (You can check with R's pnorm and qnorm. You can also check with pt and qt, but since part of my critique is imperfect sampling, I would always be far, far more pessimistic than 99.7% certainty inside ±3σ.)
http://www.r-fiddle.org/#/fiddle?id=pA2BoEHg&version=4
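For readers without R, the same check can be sketched with Python's standard library. This is not the fiddle linked above, just an equivalent under the plain Gaussian assumption; `coverage` is a made-up helper name. The standard-normal values are about 68%, 95%, and 99.7% for ±1σ, ±2σ, and ±3σ.

```python
from statistics import NormalDist


def coverage(k: float) -> float:
    """Probability that a Gaussian draw lands within ±k standard deviations."""
    z = NormalDist()  # standard normal: mean 0, stdev 1
    return z.cdf(k) - z.cdf(-k)


stdev = 5
for k in (1, 2, 3):
    print(f"within ±{k * stdev}: {coverage(k):.1%}")
```

These are best-case figures: they assume the sampling itself was perfect, which is the part the post is most skeptical about.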
0 notes
bad-statistics · 8 years ago
Text
Can only Ivy League grads with **international experience** get paid to write this?
Writing in ARTSY.net, Anna Louie Sussman (self-titled “outstanding writer” with degrees from Brown, NYU, and the LSE) asks:
Can only rich kids afford to work in the art world?
Every Brooklyn resident instinctively knows this is true, so the answer was and will be “yes” before and after the evidence, which will never be examined.
Unlike some of her peers, Guerrero wasn’t able to fall back on a crucial resource: help from Mom and Dad.
But it’s the age of Big Data so let’s ask a statistician. Or actually, the New York Times Upshot.
A recent report in the New York Times showed 22-, 23-, and 24-year-olds aspiring to work art and design are the most likely to receive financial assistance from their parents, with 53% reporting some help, compared with 40% of twenty-somethings overall. They also received the most money, an average of $3,600 a year, compared with an average of $3,000 for their peers in other fields.
First of all notice that $3600 per year is $300 per month. This is important because later Quoctrung Bui will compare $3300 to $3600, which sounds like a more meaningful difference than comparing $300 to $275.
No sampling errors are provided.
I believe the sampling error will easily swamp the visual comparisons we are invited to make here.
How Much Do They Receive? Average annual amount of parental support, by desired field:
Art and Design: $3.6k
Professional Services: $3.5k
Health: $3.3k
Computer science: $3.3k
Education and Social Work: $3.0k
Personal Services: $2.2k
Blue Collar and Military: $1.4k
Notice that
Kids who majored in computer science receive almost as much money as kids who majored in art.
We don’t have a convenient stereotype about this though. (OK, I do, but the authors might not.)
These numbers are also small in the face of NYC rents. Maybe Mom and Dad (the villains of the piece?) paid two months’ rent, a Christmas & birthday present / paid for travel home / etc.
This doesn’t demonstrate the hoped-for
urban/rural divide
spoiled art brat
spoiled millennial vis-a-vis 1970’s PSID sample (which was who, anyway?)
How big is that error bandwidth, maybe?
https://psidonline.isr.umich.edu/CDS/TAS13_UserGuide.pdf
Results of Data Collection Effort The TAS-2013 sample of 2,156 individuals was released to the field for interviewing. During data collection, 34 cases were determined to be ineligible (including 3 completed interviews), bringing the total eligible sample to 2,122. Of these, 1,804 provided complete interviews, yielding a 90% response rate for the TAS-2013 fieldwork effort. Table 1 provides the final dispositions for the total sample of 2,122 cases.
Table 1: Sample Disposition
Sample Count   Description
2,122   Total TAS-2013 sample
1,804   Completed interview with an eligible TAS-2013 sample individual
30   Sample individual incarcerated or in a youth, group, or detention home/center: ineligible for interview contact
5   Sample individual away on military leave, in job corps, or in a non-detention facility
5   Sample individual incapacitated, had a permanent health condition, or institutionalized for health or psychological reasons
4   Sample individual deceased after PSID interview completed but before TAS interview: ineligible for interview contact
85   Refusal by the sample individual; partial/passive refusal; deliberate avoidance of interviewer (e.g., always too busy, repeated broken appointments, or failure to return calls)
56   Refusal by someone other than the sample individual
21   Sample individual lost; tracking efforts exhausted
43   Some household member contacted, but eligible respondent not available to do interview; appointment broken, but no evidence of deliberately avoiding interview
10   Sample individual resided outside of US or in a remote area and uncontactable (e.g., no telephone)
58   Sample individual was initially thought to be ineligible because of nonresponse but discovered to be a resident in a response sample family after the interviewing period had ended
1   Office error – study ended, insufficient or inappropriate calls made, no mention of refusal
• Average interview length: 63.62 minutes
• Completed interviews: 1,807* of 2,122 released sample cases
* Three cases were found to be ineligible after completion of the interview
o Sample members who still resided with core PSID family as an “other family unit member” but lived at college: 207
o Sample members who still resided with core PSID family as an “other family unit member” living with parents: 817
o Sample members who had formed independent PSID family units as Head/Wife/“Wife”: 783
• Data collection response rate: 90%
Chapter 5 – The TAS-2013 Sample Weight To account for differential probabilities of selection due to the original CDS sample design and subsequent attrition, the TAS-2013 data are provided with a sample weight. The construction of the TAS-2013 sample weight is described in this chapter.
In other words:
A statistical survey run out of the University of Michigan asked 1000-2000 people some questions.
NYT Upshot wrote an article stating that this proves a stereotype.
And a self-styled "outstanding writer" with lots of degrees who has lived in Morocco wrote a long article complaining about Cooper Union†, NYC rent, how little Sotheby’s pays interns, and the general unfairness of it all.
†Cooper Union used to be free. They were endowed by a super rich guy a century ago. They lost their endowment in the crash of 2008. Cooper Union is no longer free. This has nothing to do with privileged art kids.
The 2013 supplement to “transition to adulthood” is here: https://psidonline.isr.umich.edu/CDS/TAS13_UserGuide.pdf
user guide here: https://psidonline.isr.umich.edu/CDS/TA05-UserGuide.pdf
Quoctrung does link to http://www.psc.isr.umich.edu/pubs/pdf/rr13-801.pdf.
Where do I come down on the original question? I don’t know. I certainly have the stereotype. I personally hang out with a lot of art students and art management students, as well as artists. The main thing I noticed when I first met a group of sculpture BFA friends
The usual honest answer you give as a statistician is I dunno. In fact this is why statistics was invented: people didn’t know (the default state) and wanted to know when they could stop researching because they had found out enough to draw a conclusion.
My opinion at the end of these two articles is that credentialed people get nice jobs. (Also a stereotype I held beforehand.)
1 note · View note