A Note About Statistical Significance

Before your eyes glaze over (or you sharpen your claws, depending on your orientation towards such things), I’ll ask for a little indulgence.
Two incidents involving statistical methods have happened in my small corner of the world in the past three days. They have forced me to articulate some of my issues with how we, academics and lay people alike, employ maths and statistics in our daily discourse.

The first is an excellent exchange among Sara Goldrick-Rab, Matt Chingos, and Stuart Buck. The exchange takes up Sara’s critiques of the statistical methods used in a randomized controlled study of school vouchers (again, stay with me here!). It’s interesting on many levels – from the scholarly to the political – but there’s a turn in the exchange towards the end that is of particular interest to me.

The second incident involved a discussion at Mother Jones about the extent of public mass shootings in the U.S. In the comments (I know, I know) there was a definite trend of statistical bombing. The majority of those doing the bombing were attempting to show that mass shootings are actually rare because they are statistically insignificant once our population and gun ownership are accounted for.

Between the two incidents I blew a small vestigial gasket.

I’d like to talk about how we use statistics as weapons.

In the first exchange the conversation devolves into some mild academic mud throwing over the rigor of the statisticians proffered during the debate. There is, I think, a decidedly patriarchal tinge to this turn. Here’s a woman calling out the methodology of a report being used in some pretty politically tinged debates about the privatization of primary education. Challenges are always uncomfortable, but I suspect they are even more uncomfortable when there are gendered beliefs about who does real stats and who doesn’t.

You’ll see it in graduate school, where it is assumed that all the women do qualitative methods. Or at academic conferences, when it becomes clear that there is some gender imbalance in methods workshops. And let’s not even add race. I once spent a whole semester in a quant class having all of my correct answers credited to the sole male Asian student in the class. I mean, it was so obvious it became a class joke. I could answer a question about multicollinearity and the professor would thank the Asian guy.

There’s some definite bias about who does real statistics.

And in that bias — or perhaps as a result of it — there’s a bias about the rational superiority of a statistic.

Since rational men, often white, dominate statistics, it must be the superior knowledge.

And there’s the next incident.

If 250 people die this year from mass shootings, that is, as several commenters pointed out, statistically insignificant in light of our population.

But statistical significance doesn’t have jack, er, nothing to do with practical significance.

A significance level in a statistic is just a determination about how much uncertainty we’re willing to accept while simultaneously accepting the validity of the statistic. That’s all. It’s a JUDGEMENT CALL. And like all judgement calls it is made by people. And people are, not to slay your unicorn here, sometimes motivated by prejudices and biases!

So, yes, there’s a convention about acceptable significance in a statistic — if we’ll accept a .05 or a .01, for instance. But conventions are shaped over time by people and, in the case of statistics, most often by men in privileged positions of academe and society.
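
To make that judgement call concrete, here is a toy sketch in Python (the coin-flip numbers are invented purely for illustration and have nothing to do with the studies or shootings above): the exact same data clears the conventional .05 bar but fails the stricter .01 bar. Nothing about the data changes; only the convention we agreed to does.

    # A toy illustration with made-up numbers: the same data, two different
    # verdicts, depending only on which significance level we agree to accept.
    from math import comb

    def binom_pmf(k, n, p):
        """Probability of exactly k successes in n independent trials with success rate p."""
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    def two_sided_p_value(k, n):
        """Two-sided p-value for k successes in n trials under a fair (p = 0.5) null.
        Because that null is symmetric, we can simply double the upper-tail probability."""
        tail = sum(binom_pmf(i, n, 0.5) for i in range(k, n + 1))
        return min(1.0, 2 * tail)

    # Hypothetical data: 63 heads in 100 flips of a coin we suspect is loaded.
    p = two_sided_p_value(63, 100)
    print(f"p-value: {p:.3f}")              # roughly 0.012
    print("significant at .05?", p < 0.05)  # True
    print("significant at .01?", p < 0.01)  # False

Whether you declare that coin loaded depends entirely on which cutoff you committed to beforehand; the data never changed.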

That’s not to say it’s useless or in itself politically motivated. But neither is it a bastion of neutrality and superiority.

It’s just a statistic. No more, no less.

That’s why careful, honest researchers will spend time telling you the limitations of their study, their data, their methods and their statistics. We assume our audience has the training to judge all of that and the quality of our overall findings.

On the flip side, we know that most of our society isn’t trained to be good readers of data. There are too many reasons for that to get into now. The fact is that some of us try to use that ignorance (or unawareness) as a weapon. And that is when I start to blow steam.

If you are willing to accept 250 dead from mass shootings because they amount to less than 5% of the U.S. population? Fine. That’s between you and your god or moral compass.

But you can’t tell me that you are superior and unassailably right to make that determination because of an arbitrary level of “significance”.

I happen to be uncomfortable with 250 people shot dead in public spaces by gunmen. For me, the practical significance of 250 dead people is sufficient for discourse on gun control. I want a .00 significance level on mass shootings. That is certainly as rational and logical as any other level of significance.

Similarly, I am uncomfortable with the idea that a randomized trial with flaws beats the best observational studies. A hierarchy of knowledge is always political. There is rigor and there is lack of rigor. But, the minute you start telling me that your method is superior because it gets closer to some ultimate truth I am going to respectfully call BS.

Hiding in numbers is a way to evade the consequences of what you are really saying.

You’re saying that your math is better because you are the one doing it and you are superior.

You are saying that your controlled experiment is superior because it reduces complexities to rational numbers.

You are saying that 250 dead people is not enough for you to consider giving up your annual hunting trip.

Say what you really mean, but don’t try to bludgeon people with a statistic you assume they are too untrained or too stupid to challenge. That’s employing an otherwise innocuous statistic as a weapon, designed to shut down opposing viewpoints rather than move closer to understanding.

I’ve got about a .00 significance level set for that, too…and I did OK in econometrics. Ask the Asian guy.

7 thoughts on “A Note About Statistical Significance”

  1. It seems to me that what the bit about “significance” in shootings is really (poorly) trying to do is invoke cost/benefit or diminishing-returns language, which here is awful. They are saying, as you point out, that their hunting trip is worth more than the deaths from shootings. On the other hand, despite how awful the Breivik shooting was, I can see Norwegians saying that shootings are rare enough that more measures to reduce them would be ineffective or problematic. I certainly feel that US foreign security actions are inflicting more harm (both on foreigners and US citizens) than the benefits they provide in increased security.

    1. Agreed. Although cost-benefit actuarial language is more of the same false neutrality. And the idea that we shouldn’t move the ball on gun control at all because of some arbitrary significance level is indeed very different from a country like Norway, where stricter control would really be marginal. What more could they do?! Compare that with how much we could do. We have so much room on the gun control spectrum that applying the same metric as other countries is actually quite absurd.

      Thanks for commenting!

  2. Whoever wrote that 250 was a statistically insignificant number is completely misusing the term. If I took a random sample of 300 Americans and 1 of them had died in a mass shooting that year, I could infer that 1,000,000 Americans died in mass shootings that year. The stats program would then spit out a significance level that told me how certain I could be of that one million deaths “statistic.” It could say 60%, 10% or 96%. We generally consider the last figure to be significant (“statistically significant”) because we are more than 95% sure that million is right. Being only 60% sure of that number means that I can’t use the one million figure in publications, or even conversation, because there’s a 40% chance that it’s wrong. I am only sure that, of the 300 people I studied, one died in a mass shooting.

    But if 250 people have died in mass shootings this year (that seems REALLY high) then 250 people have died in mass shootings – that’s a number, not a statistic.

    To Tressie’s point, that number SHOULD be morally significant. The idea that we can arbitrarily use the word “statistic” to try and make it insignificant is hogwash. 3,000 people died on 9/11. That’s only .001% of the US population. But I have to take my shoes off at the airport, can’t privately take a book out of the library and have to submit to an identity check to open a bank account. And conservatives wrote those laws. Incidentally, that number is also “statistically insignificant” if I am trying to use 2001 as a random sample of how many people die in terrorist attacks every year. But no one would argue that the event was, itself, insignificant.

    1. LOL Thank you, Nati. I wanted to talk about what a real statistic would look like but I thought that was getting too complicated. You do a great summation. 🙂

  3. This is tangential to the main point of your post, but…

    The person you replied to on Mother Jones seems to be saying that the number of shootings per year is “statistically insignificant” because it’s less than some percentage of the total number of murders. That’s not how statistical significance works AT ALL. It’s meaningless to talk about whether an observation is significant except in reference to some particular hypothesis, and that hypothesis has to be factored into the calculation. The 0.05 (or 0.01) cutoff is supposed to apply not to the observed percentage, but to the p-value, which is the probability that you’d observe an effect at least as big as what you’ve seen if the hypothesis were false. In this case, the hypothesis seems to be that shootings are a problem in our culture – and if any of them have occurred, we can reasonably conclude that the hypothesis is true with certainty. p = 0.00.

    Someone could dispute that conclusion, I’m sure. But to do so using statistical significance, they’d need to know the probability that 250 shootings would occur if it’s true that shootings are not a problem. That is not the same as setting a cutoff for how many shootings constitute a “problem,” and not something that anyone could compute without totally making things up.

    1. “That’s not how statistical significance works AT ALL.”

      No, it is not. It’s conflating significance in the general sense with the statistical definition in an attempt to take some high intellectual ground.

  4. “A significance level in a statistic is just a determination about how much uncertainty we’re willing to accept while simultaneously accepting the validity of the statistic. That’s all. It’s a JUDGEMENT CALL. And like all judgement calls it is made by people. And people are, not to slay your unicorn here, sometimes motivated by prejudices and biases!”

    I think you’re conceding too much. It’s not even a judgment call about uncertainty in general, but a particular type of uncertainty. It tells you nothing about proper specification, proper measurement, asking the right questions, etc. (Pointing this out repeatedly in my graduate classes did not make me very popular, FWIW). “Significance” is a term that ought to be sent to the dustbin. It won’t solve the many problems you mention, but it might help.
