The difference between models, drive-time vs fatality edition

Easily one of the most common critiques I make when reviewing peer reviewed papers is the concept, the difference between statistically significant and not statistically significant is not itself statistically significant (Gelman & Stern, 2006).

If you cannot parse that sentence, the idea is simple to illustrate. Imagine you have two models:

Model     Coef  (SE)  p-value
  A        0.5  0.2     0.01
  B        0.3  0.2     0.13

So often social scientists will say “well, the effect in model B is different” and then post-hoc make up some reason why the effect in Model B is different than Model A. This is a waste of time, as comparing the effects directly, they are quite similar. We have an estimate of their difference (assuming 0 covariance between the effects), as

Effect difference = 0.5 - 0.3 = 0.2
SE of effect difference = sqrt(0.2^2 + 0.2^2) = 0.28

So when you compare the models directly (which is probably what you want to do when you are describing comparisons between your work and prior work), this is a bit of a nothing burger. It does not matter that Model B is not statistically significant, a coefficient of 0.3 is totally consistent with the prior work given the standard errors of both models.

Reminded again about this concept, as Arredondo et al. (2025) do a replication of my paper with Gio on drive time fatalities and driving distance (Circo & Wheeler, 2021). They find that distance (whether Euclidean or drive time) is not statistically significant in their models. Here is the abstract:

Gunshot fatality rates vary considerably between cities with Baltimore, Maryland experiencing the highest rate in the U.S.. Previous research suggests that proximity to trauma care influences such survival rates. Using binomial logistic regression models, we assessed whether proximity to trauma centers impacted the survivability of gunshot wound victims in Baltimore for the years 2015-2019, considering three types of distance measurements: Euclidean, driving distance, and driving time. Distance to a hospital was not found to be statistically associated with survivability, regardless of measure. These results reinforce previous findings on Baltimore’s anomalous gunshot survivability and indicate broader social forces’ influence on outcomes.

This ends up being a clear example of the error I describe above. To make it simple, here is a comparison between their effects and the effects in my and Gio’s paper (in the format Coef (SE)):

Paper     Euclid          Network       Drive Time
Philly     0.042 (0.021)  0.030 (0.016)    0.022 (0.010)
Baltimore  0.034 (0.022)  0.032 (0.020)    0.013 (0.006)

At least for these coefficients, there is literally nothing anomalous at all compared to the work me and Gio did in Philadelphia.

To translate these coefficients to something meaningful, Gio and I estimate marginal effects – basically a reduction of 2 minutes results in a decrease of 1 percentage point in the probability of death. So if you compare someone who is shot 10 minutes from the hospital and has a 20% chance of death, if you could wave a wand and get them to the ER 2 minutes faster, we would guess their probability of death goes down to 19%. Tiny, but over many such cases makes a difference.

I went through some power analysis simulations in the past for a paper comparing longer drive time distances as well (Sierra-Arévalo et al. 2022). So the (very minor) differences could also be due to omitted variable bias (in logit models, even if not confounded with the other X, can bias towards 0). The Baltimore paper does not include where a person was shot, which was easily the most important factor in my research for the Philly work.

To wrap up – we as researchers cannot really change broader social forces (nor can we likely change the location of level 1 emergency rooms). What we can change however are different methods to get gun shot victims to the ER faster. These include things like scoop-and-run (Winter et al., 2022), or even gun shot detection tech to get people to scenes faster (Piza et al., 2023).

References

Leave a comment

1 Comment

  1. SemihCebraiL's avatar

    the first book i read while learning data science discussed this misconception and the difference between regression and correlation. also, GSD technology reminded me of an episode in the tv series “person of interest” (:

    Reply

Leave a comment