# The Failed Idea Bin: Temporal Aggregation and the Crime/Stop Relationship

A recent paper by the Hipp/Kim/Wo trio analyzing robbery at very fine temporal scales in NYC reminded me on a failed project I never quite worked out to completion. This project was about temporal aggregation bias. We talk about spatial aggregation bias quite a bit, which I actually don’t think is that big of deal for many projects (for reasons discussed in my dissertation).

I think it is actually a bigger deal though when dealing with temporal relationships, especially when we are considering endogenous relationships between crime and police action in response to crime. This is because they are a countervailing endogenous relationship – most endogenous relationships are positively correlated, but here we think police do more stuff (like arrests and stops) in areas with more crime, and that crime falls in response.

I remember the first time I thought about the topic was when I was working with the now late Dennis Smith and Robert Purtell as a consultant for the SQF litigation in NYC. Jeff Fagan had some models predicting the number of stops in an area, conditional on crime and demographic factors at the quarterly level. Dennis and Bob critiqued this as not being at the right temporal aggregation – police respond to crime patterns much faster than at the quarterly level. So Jeff redid his models at the monthly level and found the exact same thing as he did at the quarterly level. This however just begs the question of whether monthly is the appropriate temporal resolution.

So to try to tackle the problem I took the same approach as I did for my dissertation – I pretend I know what the micro level equation looks like, and then aggregate it up, and see what happens. So I start with two endogenous equations:

``````crime_t1 = -0.5*(stops_t0) + e_c
stops_t1 =  0.5*(crime_t0) + e_s``````

And then aggregation is just a sum of the micro level units:

``````Crime_T = (crime_t1 + crime_t0)
Stops_T = (stops_t1 + stops_t0)``````

And then what happens when we look at the aggregate relationship?

``Crime_T = Beta*(Stops_T)``

Intuitively here you may see where this is going. Since crime and stops have the exact same countervailing effects on each other, they cancel out if you aggregate up one step. I however show in the paper if you aggregate up more than two temporal units in this situation the positive effect wins. The reason is that back substitution for prior negative time series relationships oscillates (so a negative covariance at t-1 is a positive covariance at t-2). And in the aggregate the positive swamps the negative relationship. Even estimating `Crime_T = Beta*(Stops_T-1)` does not solve the problem. These endogenous auto-regressive relationships actually turn into an integrated series quite quickly (a point that cannot be credited to me, Clive Granger did a bunch of related work).

So this presented a few hypotheses. One, since I think short run effects for stops and crime are more realistic (think the crackdown literature), the covariance between them at higher resolutions (say monthly) should be positive. You should only be able to recover the deterrent effect of stops at very short temporal aggregations I think. Also crime and stops should be co-integrated at large temporal aggregations of a month or more.

Real life was not so convenient for me though. Here I have the project data and code saved. I have the rough draft of the theoretical aggregation junk here for those interested. Part of the reason this is in the failed idea bin is that neither of my hypotheses appears to be true with the actual crime and stop data. For the NYC citywide data I broke up stops into radio-runs and not-radio-runs (less discretion for radio runs, but still should have similar deterrent effects), and crimes as Part 1 Violent, Part 1 Non-Violent, and Part 2. More recently I handed it off to Zach Powell, and he ran various vector auto-regression models at the monthly/weekly/daily/hourly levels. IIRC it was pretty weak sauce evidence that stops at the lower temporal aggregations showed greater evidence of reducing crime.

There of course is a lot going on that could explain the results. Others have found deterrent effects using instrumental variable approaches (such as David Greenberg’s work using Arellano-Bond, or Wooditch/Weisburd using Bartik instruments). So maybe my idea that spatial aggregation does not matter is wrong.

Also there is plenty of stuff going on specifically in NYC. We had the dramatic drop in stops due to the same litigation. Further work by MacDonald/Fagan/Geller have shown stops that met a higher reasonable suspicion standard based on the reported data have greater effects than others (essentially using Impact zones as an instrument there).

So it was a question I was never able to figure out – how to correctly identify the right temporal unit to examine crime and deterrence from police action.