Co-author networks in Criminology

In my bin of things I will never finish at this point, I started a manuscript looking at co-author networks in criminology using web of science data. I recruited several folks over the years (grad students at the time Jen Laprade and Richard Hernendez, and Marie Oullet), but I was never able to put in the last bit of time to finish it off. Exploratory work is hard, as there is no end goal to work towards. So I was never able to get it to a point I was happy with.

The shambles of the current paper is here, which contains more details than this post. But basically I downloaded all of the Web of Science data that had the CJ/Crim label attached up to 2016, then turned that into a co-author network.

So the way it works is if I co-authored an article with Rob Worden & Sarah McLean, and Rob Worden & Sarah McLean co-authored a paper with Chris Harris, me and Chris are not directly connected, but are only separated by one degree. After doing this, I wanted to see if we clustered into different groups. The answer to that is yes, I can get the computer to spit out clusters (colored below), but we are still definitely small world (everyone is connected to everyone else within only a few hops).
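
For a sense of the mechanics, here is a minimal sketch in Python of building such a co-author network and spitting out clusters. The paper list is a made-up toy (the real pipeline parses the Web of Science records), and greedy modularity is just one of many community detection options:

import networkx as nx
from itertools import combinations
from networkx.algorithms.community import greedy_modularity_communities

# toy author lists, one per article (the real data comes from parsed WOS records)
papers = [["Wheeler", "Worden", "McLean"],
          ["Worden", "McLean", "Harris"]]

G = nx.Graph()
for authors in papers:
    for a, b in combinations(authors, 2):  # co-authors on the same article share an edge
        G.add_edge(a, b)

# Wheeler and Harris never co-authored, but are connected through Worden/McLean
print(nx.shortest_path_length(G, "Wheeler", "Harris"))  # 2 hops

# ask the computer to spit out clusters
print(list(greedy_modularity_communities(G)))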

I had a really hard go of it getting the networks to lay out nicely (a typical problem with big, interconnected networks). I’ve posted an interactive version here. You can zoom in, look at the clusters, and look yourself up.

Here is a GIF showing surfing the network. I look up Beth Huebner (I would say Beth is part of the Michigan State/CJ folks cluster), see she is attached to Scott Decker (who is in another blue cluster with a pretty big array of folks – many Arizona folks, but also Alex Piquero, Dan Nagin, and Shawn Bushway), then go on to Scott Wolfe, etc.

I figured the clusters would be by topical area (which is true to a certain extent), but they were also university clusters. Here was my attempt to give some meaning to the clusters, pulling out the top 3 authors/journals in each. There are some 40 clusters in the excel file in the paper folder shared earlier. (There are more clusters than that even, but those are the 40 biggest in terms of authors/articles.)

So that gives some face validity to the clusters, but like I said it is a small world, so maybe that isn’t worth noting at all anyway. One of the things I noticed was that the clusters had a big separation between USA folks and international folks.

So if someone wants to take this over, let me know. I didn’t share a link to the data directly (I imagine that violates the Web of Science terms of service), but I will share it offline, plus my code, if someone wants it. (The data is already 3+ years old; I don’t even want to think about updating the work. Jen and Richard did a bunch of grunt work to clean the names for me to make the network.)

Coauthorship over time

One thing I noted was the change in co-authorship over time. How to evaluate folks given solo- vs co-authorship is a perpetual question. I can’t answer that question, but we can observe how it is changing over time. Here are graphs of the proportion of solo-authored papers over time, as well as the mean number of authors over time (with error intervals – there is much more data in recent years than in the past).
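
If the article data is in a table, the trend itself is a quick computation. Below is a minimal pandas sketch, assuming a hypothetical DataFrame with a year and an author-count column (the numbers are made up):

import pandas as pd

# hypothetical article-level data: one row per paper
df = pd.DataFrame({"year":      [2000, 2000, 2010, 2010, 2016, 2016],
                   "n_authors": [1, 2, 1, 3, 2, 4]})

by_year = df.groupby("year")["n_authors"]
trend = pd.DataFrame({
    "prop_solo":    by_year.apply(lambda s: (s == 1).mean()),  # proportion solo authored
    "mean_authors": by_year.mean(),                            # mean number of authors
    "sem":          by_year.sem(),   # for error intervals, wider in sparse years
})
print(trend)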

The same holds true for our top journals (the WOS data is quite a hodgepodge, including forensic psych, some trade magazines, etc.).

Citations Over Time

Another example bit of analysis you can do with this dataset is look at citations over time. Here is the mean of citations in well known crim/cj journals over time.

And here is a scatterplot of the individual papers. I’ve posted an interactive version of this as well.

So there is more stuff zipping around in this data than I can handle. (I tried to make some sense of keywords for articles at one point, but that would take some more serious semantic reduction of like words.)

The Failed Idea Bin: Temporal Aggregation and the Crime/Stop Relationship

A recent paper by the Hipp/Kim/Wo trio analyzing robbery at very fine temporal scales in NYC reminded me of a failed project I never quite worked out to completion. This project was about temporal aggregation bias. We talk about spatial aggregation bias quite a bit, which I actually don’t think is that big of a deal for many projects (for reasons discussed in my dissertation).

I think it is actually a bigger deal though when dealing with temporal relationships, especially when we are considering endogenous relationships between crime and police action in response to crime. This is because they are a countervailing endogenous relationship – most endogenous relationships are positively correlated, but here we think police do more stuff (like arrests and stops) in areas with more crime, and that crime falls in response.

I remember the first time I thought about the topic was when I was working with the now late Dennis Smith and Robert Purtell as a consultant for the SQF litigation in NYC. Jeff Fagan had some models predicting the number of stops in an area, conditional on crime and demographic factors at the quarterly level. Dennis and Bob critiqued this as not being at the right temporal aggregation – police respond to crime patterns much faster than at the quarterly level. So Jeff redid his models at the monthly level and found the exact same thing as he did at the quarterly level. This however just begs the question of whether monthly is the appropriate temporal resolution.

So to try to tackle the problem I took the same approach as I did for my dissertation – I pretend I know what the micro level equation looks like, and then aggregate it up, and see what happens. So I start with two endogenous equations:

crime_t1 = -0.5*(stops_t0) + e_c
stops_t1 =  0.5*(crime_t0) + e_s

And then aggregation is just a sum of the micro level units:

Crime_T = (crime_t1 + crime_t0)
Stops_T = (stops_t1 + stops_t0)

And then what happens when we look at the aggregate relationship?

Crime_T = Beta*(Stops_T)

Intuitively here you may see where this is going. Since crime and stops have the exact same countervailing effects on each other, they cancel out if you aggregate up one step. I show in the paper, however, that if you aggregate up more than two temporal units in this situation, the positive effect wins. The reason is that back substitution for prior negative time series relationships oscillates (so a negative covariance at t-1 is a positive covariance at t-2), and in the aggregate the positive swamps the negative relationship. Even estimating Crime_T = Beta*(Stops_T-1) does not solve the problem. These endogenous auto-regressive relationships actually turn into an integrated series quite quickly (a point that cannot be credited to me, Clive Granger did a bunch of related work).
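
To play with the aggregation yourself, here is a minimal simulation sketch of those two endogenous equations, summed into non-overlapping blocks of k periods before regressing aggregate crime on aggregate stops. The ±0.5 coefficients and standard normal errors are just the toy values from above; swapping in asymmetric effects shows how the aggregate slope moves around:

import numpy as np

rng = np.random.default_rng(10)
T = 200_000
crime = np.zeros(T)
stops = np.zeros(T)
for t in range(1, T):
    crime[t] = -0.5 * stops[t - 1] + rng.normal()  # crime falls in response to stops
    stops[t] = 0.5 * crime[t - 1] + rng.normal()   # stops rise in response to crime

def agg_beta(k):
    # sum the micro series in blocks of k, then take the OLS slope of crime on stops
    n = T // k * k
    C = crime[:n].reshape(-1, k).sum(axis=1)
    S = stops[:n].reshape(-1, k).sum(axis=1)
    return np.polyfit(S, C, 1)[0]

for k in (1, 2, 3, 6, 12):
    print(f"aggregating {k:2d} periods: beta = {agg_beta(k):+.3f}")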

So this presented a few hypotheses. One, since I think short run effects for stops and crime are more realistic (think the crackdown literature), the covariance between them at coarser aggregations (say monthly) should be positive. You should only be able to recover the deterrent effect of stops at very short temporal aggregations, I think. Also, crime and stops should be co-integrated at large temporal aggregations of a month or more.

Real life was not so convenient for me though. Here I have the project data and code saved. I have the rough draft of the theoretical aggregation junk here for those interested. Part of the reason this is in the failed idea bin is that neither of my hypotheses appears to be true with the actual crime and stop data. For the NYC citywide data I broke up stops into radio-runs and not-radio-runs (less discretion for radio runs, but they should still have similar deterrent effects), and crimes into Part 1 Violent, Part 1 Non-Violent, and Part 2. More recently I handed it off to Zach Powell, and he ran various vector auto-regression models at the monthly/weekly/daily/hourly levels. IIRC it was pretty weak sauce evidence that stops at the lower temporal aggregations showed greater evidence of reducing crime.

There of course is a lot going on that could explain the results. Others have found deterrent effects using instrumental variable approaches (such as David Greenberg’s work using Arellano-Bond, or Wooditch/Weisburd using Bartik instruments). So maybe my idea that spatial aggregation does not matter is wrong.

Also there is plenty of stuff going on specifically in NYC. We had the dramatic drop in stops due to the same litigation. Further work by MacDonald/Fagan/Geller has shown that stops which met a higher reasonable suspicion standard based on the reported data have greater effects than others (essentially using Impact zones as an instrument there).

So it was a question I was never able to figure out – how to correctly identify the right temporal unit to examine crime and deterrence from police action.

Statement on recent officer involved shooting research

Several recent studies (Johnson et al., 2019; Jetelina et al., 2020) use a similar study design to assess racial bias in officer involved shootings (OIS). In short, critiques of this work by Jon Mummolo (JM) are correct – they make a fundamental error in the analysis that renders the results mostly meaningless (Knox and Mummolo, 2020). JM critiques the work as switching conditional probabilities: this recent OIS work estimates the probability of the race of someone shot by police conditional on other characteristics, e.g. it tests the hypothesis P(White | Other Stuff, Being Shot) = P(Minority | Other Stuff, Being Shot). Whereas we want Being Shot on the left hand side, e.g. P(Being Shot | Race), and switching these probabilities results in a mostly meaningless estimate in terms of inferring police behavior. You ultimately need to look at some cases in which folks were not shot to have a meaningful research design.

I’ve been having similar conversations with folks since publishing my work on officer involved shootings (Wheeler et al., 2017). Most folks don’t understand the critique, and unfortunately most folks also don’t take critiques very well. So this post is probably a waste of time, but here it is anyway.

The Road

I’m likely to get some of the timing wrong in how I came to be interested in this area – but here is what I remember. David Klinger and Richard Rosenfeld published a piece in Criminology & Public Policy (CPP) examining the count of OIS’s in neighborhoods in St. Louis, conditional on demographic and violent crime counts in those neighborhoods (Klinger et al., 2016). So in quantoid speak they estimated the expected number of OIS in neighborhoods, E[OIS_n | Demographic_n, Crime_n].

I thought this work was mostly meaningless, mainly because it really only makes sense to look at rates of behavior. You could stick a count of anything police do on the left hand side of this regression and the violent crime coefficient will be the largest positive effect. So you could, say, estimate the counts of officers helping old ladies cross the street, and you would make the same inferences as you would about OIS. It is basically just saying where officers spend more of their time (in violent crime areas), and subsequently have more interactions with individuals. It doesn’t say anything fundamental about police behavior in regards to racial bias.

So sometime in 2016 me and Scott Phillips came up with the study design using when officers draw their firearm as the denominator. (Before I moved to Dallas I knew about their open data.) It was the observational analogue to the shoot/don’t shoot lab experiments Lois James did (James et al., 2014). Also sometime during that period Roland Fryer came out with his pre-print, in which he used Taser uses as the counter-factual don’t shoot cases (Fryer, 2019). I thought drawing the firearm made more sense as a counterfactual, but both are subject to the same potential selection effect. (Police may be quicker to draw their firearms with minorities, which I readily admit in my paper.)

Also in that span Justin Nix came out with the birds-eye view CPP paper using the national level crowd sourced data (Nix et al., 2017) to estimate racial bias. They make what to me is a similar conditional probability mistake as the papers that motivated this post. Using the crowdsourced national level data, they estimate the probability of being unarmed, conditional on race (in the sample of just folks who were killed by the police). So they test whether P(Unarmed | White, Shot) = P(Unarmed | Minority, Shot).

Since, like I said, folks don’t really understand the conditional probability argument, at this point I basically just say folks get causality backwards. The police shooting at someone does not make them armed or unarmed, the same way police shooting at someone does not change their race. You cannot estimate a regression of X ~ beta*Y and then interpret beta as how much X causes Y. The stuff on the right hand side of the conditional probability statement works mostly the same way: we want to say stuff on the right hand side of the condition causes some change in the outcome.

I have this table I made in Wheeler et al. (2017) to illustrate various research designs – you can see that Ross (2015) made the same estimate of P(Unarmed | Race, Shot) as Justin did.

At this point you typically get a series of different retorts to the “you estimated the wrong conditional probability complaint”. The ones I’ve repeatedly seen are:

  1. No data is perfect. We should work with what we have.
  2. We ask a different research question.
  3. Our analysis are just descriptive, not causal.
  4. Our findings are consistent with a bunch of other work.

For (3) I would be OK if the results were described correctly; pretty much all of these articles, though, are clearly interested in making inferences about police behavior (which you cannot do just by looking at these negative encounters). It isn’t just a slip of mistaking conditional probabilities (like a common p-value mishap that doesn’t really impact the overall conclusions) – the articles are directly motivated to make inferences about police behavior that they cannot with this study design.

For (2) it is useful to consider how the descriptive conditional probabilities might actually be interpreted in a reasonable manner. So if we estimate P(Offender Race | Shot), you can think of a game where, if you see a news headline about an OIS and want to guess the race of the person shot by police, what would be your best guess? Ditto for P(Unarmed | Shot) – what is the probability of someone being unarmed conditional on them being shot? This game is clearly a superficial type of thing to estimate. Those probabilities don’t say anything about behavior in terms of things police officers can control; they are all just a function of how often police get into interactions with those different races (or armed statuses) of individuals.

Consider a different hypothetical, the probability a human is shot by police versus an animal. P(Human | Shot) is waay larger than P(Animal | Shot), are police biased against humans? No, the police just don’t deal with animals they need to shoot on a regular basis.
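
You can make the base rate problem concrete with a few lines of arithmetic. In this toy example both groups have an identical probability of being shot in any given encounter (zero behavioral difference by construction), yet the switched conditional still comes out lopsided purely because encounter rates differ (all numbers are made up):

# hypothetical encounter counts -- the only thing that differs between groups
encounters = {"Minority": 90_000, "White": 10_000}
p_shot_given_encounter = 0.001  # identical for both groups by construction

shot = {race: n * p_shot_given_encounter for race, n in encounters.items()}
p_minority_given_shot = shot["Minority"] / sum(shot.values())
print(p_minority_given_shot)  # 0.9, despite zero difference in officer behavior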

For (1) I will follow up below with some examples of how I think using this OIS data could actually be effective for shaping police behavior in practice, but suffice to say just collecting OIS you can’t really say anything about racial bias in terms of officer decision making.

I will say that a bunch of the individuals I am critiquing here I consider friends. Steve Bishopp was one of the co-authors on my OIS work with Dallas data. If I go to a conference, Justin is one of the people I would prefer to sit down and have a drink with. I’ve been schmoozing up folks with good R programming skills to come to Dallas to work for Jenn Reingle-Gonzalez. They have all done other work I think is good. See Tregle et al. (2019) or Jetelina et al. (2017) or Cesario et al. (2019) for examples I think are more legitimate research articles amongst the same people I am critiquing here.

So in response to (4), I think you all made the same mistake – the conditional probability mistake is an easy one to make. So sorry to my friends whom I think are wrong about this. That being said, most of the vitriol in public forums, often accusing people of ad-hominem attacks on their motivations, is pretty much always out of line. I think basically everyone on Twitter is being a jerk, to be frank. I’ve seen it all around on both sides in the most recent Twitter back and forth (both folks calling Jenn racist and JM biased against the police). None of them are racist or biased for/against the police. I suppose to expect any different though is setting myself up for disappointment. I was called racist by academic reviewers for Wheeler et al. (2017) (it took 4 rejections before my OIS paper was published). I’ve seen Justin get critiqued on Twitter for being white in the past when doing work in this area.

I think CJ folks questioning JM’s motivation miss the point of his critique though. He isn’t saying police are biased and these papers are wrong, he is just saying these research papers are wrong because they can’t tell whether police are biased one way or another.

Who gives a shit

So better research could be conducted in this area – JM has his work on bounding estimates (Knox et al., 2019), and I imagine someone can come up with a reasonable instrumental variable strategy to address the selection bias in the same vein as my shoot/don’t shoot design (say officer instruments, or exogenous incidents that make officers more on edge and more likely to draw their firearm). But I think the question of whether “the police” are racially biased is a facile one. Globally labelling all police (or a single department) as racist is mostly a waste of time. Good for academic papers and for getting people riled up on Twitter, not so much for anything else.

The police are simply a cross section of the general public. So in terms of whether some officers are racist, this is true (as it is for the general public). Or maybe even we are all a little racist (a la the implicit bias hypothesis). We can only observe behavior; we cannot peer into the hearts and minds of men. But to say racism is still a part of our society in some capacity is, I believe, a pretty tame statement.

Subsequently, if you gather enough data you will be able to get some estimate of the police being racist (the null is for sure wrong). But if people can’t reasonably understand conditional probabilities, imagine trying to have a conversation about what is a reasonable amount of racial bias for monitoring purposes (inferiority bounds). Or about the fact that this racial bias estimate is not for all police, but for some mixture of police officers and actions. Hard pass on either of those from me.

Subsequently, this work has no bearing on actual police practice (including my own work). These studies are of very limited utility – at best a stick or shield in civil litigation. They don’t help police departments change behavior in response to discovering (or not discovering) racial bias. And OIS are so rare they are worthless as a monitoring metric for all but the biggest police departments (the metric won’t be sensitive enough to say whether a police department as a whole is doing well or badly).

So what do I think is potentially useful way to use this data? I’ve used the term “monitoring metric” a few times – what I mean by that is using the information to actually inform some response. Internally for police departments, shootings should be part of an early intervention system used to monitor individual officers for problematic behavior. From a state or federal government perspective, they could actively monitor overall levels of force used to identify outlier agencies (see this blog post example of mine). For the latter think proactively identifying problematic departments, instead of the typical current approach of wait for some major incident and then the Department of Justice assigns a federal monitor.

In either of those strategies just looking at shootings won’t be enough; they would need to use all levels of use of force to effectively identify either bad individual cops or problematic departments as a whole. Hence why I suggested adding all levels of force to, say, NIBRS, rather than having a stand-alone national level OIS database. And individual agencies already have all the data they need to run an effective early intervention system.

I’m not totally opposed to having a national level OIS database just based on normative arguments – e.g. you think it is a travesty we can’t say how many folks were killed by police in the prior year. It is not a totally hollow gesture, as making people record the information does provide a level of oversight, so it may make a small difference. But that data won’t be able to say anything about racial bias in individual police officer decision making.

References

Cesario, J., Johnson, D. J., & Terrill, W. (2019). Is there evidence of racial disparity in police use of deadly force? Analyses of officer-involved fatal shootings in 2015–2016. Social Psychological and Personality Science, 10(5), 586-595.

Fryer Jr, R. G. (2019). An empirical analysis of racial differences in police use of force. Journal of Political Economy, 127(3), 1210-1261.

James, L., Klinger, D., & Vila, B. (2014). Racial and ethnic bias in decisions to shoot seen through a stronger lens: Experimental results from high-fidelity laboratory simulations. Journal of Experimental Criminology, 10(3), 323-340.

Jetelina, K. K., Bishopp, S. A., Wiegand, J. G., & Gonzalez, J. M. R. (2020). Race/ethnicity composition of police officers in officer-involved shootings. Policing: An International Journal.

Jetelina, K. K., Jennings, W. G., Bishopp, S. A., Piquero, A. R., & Reingle Gonzalez, J. M. (2017). Dissecting the complexities of the relationship between police officer–civilian race/ethnicity dyads and less-than-lethal use of force. American Journal of Public Health, 107(7), 1164-1170.

Johnson, D. J., Tress, T., Burkel, N., Taylor, C., & Cesario, J. (2019). Officer characteristics and racial disparities in fatal officer-involved shootings. Proceedings of the National Academy of Sciences, 116(32), 15877-15882.

Klinger, D., Rosenfeld, R., Isom, D., & Deckard, M. (2016). Race, crime, and the micro-ecology of deadly force. Criminology & Public Policy, 15(1), 193-222.

Knox, D., Lowe, W., & Mummolo, J. (2019). The bias is built in: How administrative records mask racially biased policing. Available at SSRN.

Knox, D., & Mummolo, J. (2020). Making inferences about racial disparities in police violence. Proceedings of the National Academy of Sciences, 117(3), 1261-1262.

Nix, J., Campbell, B. A., Byers, E. H., & Alpert, G. P. (2017). A bird’s eye view of civilians killed by police in 2015: Further evidence of implicit bias. Criminology & Public Policy, 16(1), 309-340.

Ross, C. T. (2015). A multi-level Bayesian analysis of racial bias in police shootings at the county-level in the United States, 2011–2014. PLoS ONE, 10(11).

Tregle, B., Nix, J., & Alpert, G. P. (2019). Disparity does not mean bias: Making sense of observed racial disparities in fatal officer-involved shootings with multiple benchmarks. Journal of Crime and Justice, 42(1), 18-31.

Wheeler, A. P., Phillips, S. W., Worrall, J. L., & Bishopp, S. A. (2017). What factors influence an officer’s decision to shoot? The promise and limitations of using public data. Justice Research and Policy, 18(1), 48-76.

Knowing when to fold them: A quantitative approach to ending investigations

The recent work on investigations in the criminal justice field has my head turning about potential quantitative applications in this area (check out the John Eck & Kim Rossmo podcasts on Jerry’s site first, then check out the recent papers in Criminology and Public Policy on the topic for a start). One particular problem that was presented to me was detective case loads — detectives are humans, so they can only handle so many cases at once. Triage typically takes place at the initial crime reporting stage, with inputs such as the seriousness of the offense, the overall probability of the case being solved, and the future dangerousness of the folks involved going into the calculus of whether to assign a case.

Here I wanted to focus on a different problem though — how long to keep cases open? There are diminishing returns to keeping cases open indefinitely, and so PDs should be able to right size the backend of detective open cases as well as the front end triaging. Here my suggested solution is to estimate a survival model of the probability of a case being solved, and then you can estimate an expected return on investment given the time you put in.

Here is a simplified example. Say the table below shows the (instantaneous) probability of a case being solved per weeks put into the investigation.

Week 1  20%
Week 2  10%
Week 3   5%
Week 4   3%
Week 5   1%

In survival model parlance, this would be the hazard function in discrete time increments. The hazard here diminishes over time, which should also be true in practice (a higher probability of being solved right away, getting lower over time). The expected return of investigating this crime at time t is the cumulative probability of the crime being solved at time t, multiplied by whatever value you assign to the case being solved. The costs of investigating are fixed (based on the detective salary), so are just a multiple of t*invest_costs.

So just to fill in some numbers, let’s say that it costs the police department $1,000 a week to keep an investigation going. Also say a crime has a return of $10,000 if it is solved (the latter number will be harder to figure out in practice, as cost of crime estimates are not a perfect fit). So filling in our table, below are our detective return on investment estimates (note that the cumulative probability of being solved is not simply the sum of the instantaneous probabilities, else it would eventually go over 100%). Return on investment (ROI) at week 1 is 10,000*0.2 = 2,000, at week 2 is 10,000*0.28 = 2,800, etc. (A code sketch following the table walks through the same arithmetic.)

        h(t) solved%  cum-costs   ROI   
Week 1  20%    20%     1,000     2,000
Week 2  10%    28%     2,000     2,800
Week 3   5%    32%     3,000     3,200
Week 4   3%    33%     4,000     3,300
Week 5   1%    34%     5,000     3,400
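
Here is a minimal sketch of the table’s arithmetic, treating the weekly hazards, the $10,000 return, and the $1,000 weekly cost as the toy inputs from above. The cumulative solve probability chains the weekly survival probabilities, which is why it never passes 100%:

import numpy as np

hazard = np.array([0.20, 0.10, 0.05, 0.03, 0.01])  # weekly P(solved | still open)
value_solved = 10_000  # toy return if the case is solved
weekly_cost = 1_000    # toy detective cost per week

cum_solved = 1 - np.cumprod(1 - hazard)  # cumulative P(solved by week t)
roi = value_solved * cum_solved
cum_cost = weekly_cost * np.arange(1, len(hazard) + 1)

for wk, (p, r, c) in enumerate(zip(cum_solved, roi, cum_cost), start=1):
    advice = "close" if c > r else "keep open"
    print(f"Week {wk}: solved {p:.1%}, ROI {r:,.0f}, cum-cost {c:,.0f} -> {advice}")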

So the cumulative costs outweigh the expected return by Week 4 here. So in practice (in this hypothetical example) you may say to a detective: you get 4 weeks to figure it out, and if it is not solved by then it should be closed (but not cleared), and you should move on to other things. In the long run (I think) this strategy will make sure detective resources are balanced against actual cases solved.

This right-sizes investigation lengths from a global perspective, but you might also consider whether to close a case on an individual, case-by-case basis. In that case you wouldn’t count the sunk costs of the investigation so far; it is just the probability of the case being solved going forward relative to the future necessary resources. (You do the same table, just restart the cum-costs and solved% columns from scratch whenever you are making that decision.)

In an actual applied setting, you can estimate the survival function however you want (e.g. you may want a cure mixture-model, so not all cases will result in 100% being solved given infinite time). It is also the case that different crimes will not only have different survival curves, but also will have different costs of crime (e.g. a murder has a greater cost to society than a theft) and probably different investigative resources needed (detective costs may also get lower over time, so are not constant). You can bake that all right into this estimate. So you may say the cost of a murder is infinite, and you should forever keep that case open investigating it. A burglary though may be a very short time interval before it should be dropped (but still have some initial investment).

Another neat application of this is that if you can generate reasonable returns to solving crimes, you can right size your overall detective bureau. That is you can make a quantitative argument I need X more detectives, and they will help solve Y more crimes resulting in Z return on investment. It may be we should greatly expand detective bureaus, but have them only keep many cases open a short time period. I’m thinking of the recent officer shortages in Dallas, where very few cases are assigned at all. (Some PDs have patrol officers take initial detective duties on the crime scene as well.)

There are definitely difficulties with applying this approach. One is that the cost of solving a crime is going to be tough to estimate – it bridges quantitative cost of crime estimates (although many of those are sunk costs after the crime has been perpetrated; arresting someone does not undo the bullet wound), the likelihood of future reoffending, and ethical boundaries as well. If we are thinking about a detective bureau that is over-booked to begin with, we aren’t deciding on assigning individual cases at that point, but will need to consider pre-empting current investigations for new ones (e.g. if you drop case A and pick up case B, we have a better ROI). And that is ignoring the problem of estimating the survival curves for different cases, which is tricky using observational data as well (selection biases in what cases are currently assigned could certainly make our survival curve estimates too low or too high).

This problem has to have been tackled in different contexts before (either by actuaries or in other business/medical contexts). I don’t know the best terms to google though to figure it out — so let me know in the comments if there is related work I should look into on solving this problem.

David Bayley

David Bayley is best known in my research area, policing interventions to reduce crime, for this opening paragraph in Police for the Future:

The police do not prevent crime. This is one of the best kept secrets of modern life. Experts know it, the police know it, but the public does not know it. Yet the police pretend that they are society’s best defense against crime and continually argue that if they are given more resources, especially personnel, they will be able to protect communities against crime. This is a myth.

This quote is now paraded as backwards thinking, often presented before discussing the overall success of hot spots policing. If you didn’t read the book, you might come to the conclusion that this quote is a parallel to the nothing works mantra in corrections research. That take is not totally off-base: Police for the Future was published in 1994, so it was just at the start of the CompStat revolution and hot spots policing. The evidence base was no doubt much thinner at that point and deserving of skepticism.

I don’t take the contents of David’s book as so hardline on the stance that police cannot reduce crime, at least at the margins, as his opening quote suggests. He has a chapter devoted to traditional police responses (crackdowns, asset forfeiture, stings, tracking chronic offenders), where he mostly expresses scientific skepticism of their effectiveness given their cost. He also discusses problem oriented approaches to solving crime problems, how to effectively measure police performance (outputs vs outcomes), and promotes evaluation research to see what works. Still all totally relevant twenty-plus years later.

The greater context of David’s quote comes from his work examining police forces internationally. David was more concerned about professionalization of police forces. Part of this is better record keeping of crimes, and in the short term crime rates will often increase because of this. In class he mocked metrics used to score international police departments on professionalization that used crime as a measure that went into their final grade. He thought the function of the police was broader than reducing crime to zero.

I was in David’s last class he taught at Albany. The last day he sat on the desk at the front of the room and expressed doubt about whether he accomplished anything tangible in his career. This is the fate of most academics. Very few of us can point to direct changes anyone implemented in response to our work. Whether something works is independent of an evaluation I conduct to show it works. Even if a police department takes my advice about implementing some strategy, I am still only at best indirectly responsible for any crime reductions that follow. Nothing I could write would ever compete with pulling a single person from a burning car.

While David was being humble he was right. If I had to make a guess, I would say David’s greatest impact likely came about through his training of international police forces — which I believe spanned multiple continents and included doing work with the United Nations. (As opposed to saying something he wrote had some greater, tangible impact.) But even there if we went and tried to find direct evidence of David’s impact it would be really hard to put a finger on any specific outcome.

If a police department wanted to hire me, but I would be fired if I did not reduce crimes by a certain number within that first year, I would not take that job. I am confident that I can crunch numbers with the best of them, but given real constraints of police departments I would not take that bet. Despite devoting most of my career to studying policing interventions to reduce crime, even with the benefit of an additional twenty years of research, I’m not sure if David’s quote is as laughable as many of my peers frame it to be.

Why I publish preprints

I encourage peers to publish preprint articles — journal articles before they go through the whole peer review process and are published. It isn’t normative in our field, and I’ve gotten some pushback from colleagues, so figured I would put on paper why I think it is a good idea. In short, the benefits (increased exposure) outweigh the minimal costs of doing so.

The good — getting your work out there

The main benefit of posting preprints is getting your work more exposure. This occurs in two ways. One is that traditional peer-reviewed work is often behind paywalls. This prevents the majority of non-academics from accessing your work. The paywall point applies just the same to other academics in some cases. So while the prior blog post I linked by Laura Huey notes that you can get access to some journals through your local library, it takes several steps. Adding in steps, you are basically losing out on some folks who don’t want to spend the time. Even through my university it is not uncommon for me to not be able to access a journal article. I can technically take the step of getting the article through inter-library loan, but that takes more time – time I am not going to spend unless I really want to see the contents of the article.

This I consider a minor benefit. Ultimately if you want your academic work to be more influential in the field you need to write about your work in non-academic outlets (like magazines and newspapers) and present it directly to CJ practitioner audiences. But there are a few CJ folks who read journal articles you are missing, as well as a few academics who are missing your work because of that paywall.

A bigger benefit is actually that you get your work out much quicker. The academic publishing cycle makes it impossible to publish your work in a timely fashion. If you are lucky, once your paper is finished, it will be published in six months. More realistically it will be a year before it is published online in our field (my linked article only considers when it is accepted, tack on another month or two to go through copy-editing).

Honestly, I publish preprints because I get really frustrated with waiting on peer review. No offense to my peers, but I do good work that I want others to read — I do not need a stamp from three anonymous reviewers to validate my work. I would need to do an experiment to know for sure (having a preprint might displace some views/downloads from the published version) but I believe the earlier and open versions on average doubles the amount of exposure my papers would have had compared to just publishing in traditional journals. It is likely a much different audience than traditional academic crim people, but that is a good thing.

But even without that extra exposure I would still post preprints, because it makes me happy to self-publish my work when it is at the finish line, in what can be a miserably long and very much delayed gratification process otherwise.

The potential downsides

Besides the actual time cost of posting a preprint (in the next section I detail that more precisely – it isn’t much work), I will go through several common arguments for why posting preprints is a bad idea. I don’t believe they carry much weight, and I have not personally experienced any of them.

What if I am wrong — Typically I only post papers either when I am doing a talk, or when it is ready to go out for peer review. So I don’t encourage posting really early versions of work. While even at this stage there is never any guarantee you did not make a big mistake (I make mistakes all the time!), the sky will not fall down if you post a preprint that is wrong. Just take it down if you feel it is a net negative to the scholarly literature (which is very hard to do — the results of hypothesis tests do not make the work a net positive/negative). If you think it is good enough to send out for peer review it is definitely at the stage where you can share the preprint.

What if the content changes after peer review — My experience with peer review is mostly pedantic stuff — lit. review/framing complaints, do some robustness checks for analysis, beef up the discussion. I have never had a substantive interpretation change after peer-review. Even if you did, you can just update the preprint with the new results. While this could be bad (an early finding gets picked up that is later invalidated) this is again something very rare and a risk I am willing to take.

Note peer review is not infallible, and so hedging that peer review will catch your mistakes is mostly a false expectation. Peer review does not spin your work into gold, you have to do that yourself.

My ideas may get scooped — This I have never personally had happen to me. Posting a preprint can actually prevent this in terms of more direct plagiarism, as you have a time-stamped example of your work. In terms of someone taking your idea and rewriting it, this is a potential risk (the same risk as presenting at a conference) — really only applicable for folks working on secondary data analysis. Having the preprint posted, the other person should at least cite your work. But sorry, neither presenting some work nor posting a preprint gives you sole ownership of an idea.

Journals will view preprints negatively — Or journals do not allow preprints. I haven’t come across a journal in our field that forbids preprints. I’ve had one reviewer (out of likely 100+ at this point) note the preprint as a negative (suggesting I was double publishing or plagiarizing my own work). An editor who actually reads reviews should know that is not a substantive critique. That was likely just a dinosaur reviewer who wasn’t familiar with the idea of preprints (and they gave an overall positive review in that one case, so it did not get the paper axed). If you are concerned about this, just email the editor for feedback, but I’ve never had a problem from editors.

Peer reviewers will know who I am — This I admit is a known unknown. Peer review in our crim/cj journals is mostly double blind (most geography and statistics journals I have reviewed for are not — I know who the authors are). If you presented the work at a conference you have already given up anonymity, and the field is small enough that for a good chunk of work the reviewers can guess who the author is anyway. So your anonymity is often a moot point at the peer review stage.

So I don’t know how much reviewers are biased if they know who you are (it can work both ways, if you get a friend they may be more apt to give a nicer review). It likely can make a small difference at the margins, but again I personally don’t think the minor risk/cost outweighs the benefits.

These negatives are no doubt real, but again I personally find them minor enough risks to not outweigh the benefits of posting preprints.

The not hard work of actually posting preprints

All posting a preprint involves is uploading a PDF file of your work to either your website or a public hosting service. In my current workflow I have the different components of a journal article in several Word documents (I don’t use LaTeX very often, and Word doesn’t work so well with one big file, especially with many pictures). I export those components to PDF files, and stitch them together using the freeware tool PDFtk. It has a GUI and a command line, so I just have a bat file in my paper directory that lists something like:

pdftk.exe TitlePage.pdf MainPaper.pdf TablesGraphs.pdf Appendix.pdf cat output CombinedPaper.pdf

So just a double click to update the combined pdf when I edit the different components.

Public hosting services I have used in the past to post preprints are Academia.edu, SSRN, and SocArXiv, although again you could just post the PDF on your webpage (and Google Scholar will eventually pick it up). I use SocArXiv now, as SSRN currently makes you sign up for an account to download PDFs (again a hurdle, the same as going through inter-library loan). Academia.edu also makes you sign up for an account, and has weird terms of service.

Here is an example paper of mine on SocArXiv. (Note the total downloads, most of my published journal articles have fewer than half that many downloads.) SocArXiv also does not bother my co-authors to create an account when I upload a paper. If we had a more criminal justice focused depository I would use that, but SocArXiv is fine.

There are other components of open science I should write about — such as replication materials/sharing data, and open peer reviewed journals, but I will leave those to another blog post. Posting preprints takes very little extra work compared to what academics are currently doing, so I hope more people in our field start doing it.

My Year Blogging in Review – 2018

The blog continues to grow in site views. I had a little north of 90,000 site views over the entire year. (If you find that impressive don’t be, a very large proportion are likely bots.)

The trend on the original count scale looks linear, but on the log scale the variance is much nicer. So I’m not sure what the best forecast would be.

I thought the demise had already started earlier in the year, as I actually saw the first year-over-year decreases in June and July. But the views recovered in the following months.

So based on that, I think the slowdown in growth is a better bet than the linear projection.

For those interested in extending their reach, you should not only consider social media and creating a website/blog, but also writing up your work for a more general newspaper. I wrote an article for The Conversation about some of my work on officer involved shootings in Dallas, and that accumulated nearly 7,000 views within a week of it being published.

Engagement in a greater audience is very bursty. Looking at my statistics for particular articles, it doesn’t make much sense to report average views per day. I tend to get a ton of views on the first few days, and then basically nothing after that. So if I do the top posts by average views per day it is dominated by my more recent posts.

This is partly due to shares on Twitter, which drive short term views but do not impact longer term views as far as I can tell. That is, a popular post on Twitter does not appear to predict consistent views being referred via Google searches. In the past year I got a ratio of about 50:1 referrals from Google vs Twitter, and I did not have any posts that sustained a consistent number of views (most settle in at under 3 views per day after the initial wave). So basically all of my most viewed posts are the same as in prior years.

Since I joined Twitter this year, I actually have made fewer blog posts. Not including this post, I’ve made 29 posts in 2018.

2011  5
2012 30
2013 40
2014 45
2015 50
2016 40
2017 35
2018 29

Some examples of substitution are tweets when a paper is published. I typically do a short write up when I post a working paper — there is not much point of doing another one when it is published online. (To date I have not had a working paper greatly change from the published version in content.) I generally just like sharing nice graphs I am working on. Here is an example of citations over time I just quickly published to Twitter, which was simpler than doing a whole blog post.

Since it is difficult to determine how much engagement I will get for any particular post, it is important to just keep plugging away. Twitter can help a particular post take off (see these examples I wrote about for the Cross Validated Blog), but any one tweet or blog post is more likely to be a dud than anything.

Reasons Police Departments Should Consider Collaborating with Me

Much of my academic work involves collaborating and consulting with police departments on quantitative problems. Most of the work I’ve done so far is very ad-hoc, through either the network of other academics asking for help on some project or police departments cold contacting me directly.

In an effort to advertise a bit more clearly, I wrote a page that describes examples of prior work I have done in collaboration with police departments. That discusses what I have previously done, but doesn’t describe why a police department would bother to collaborate with me or hire me as a consultant. In fact, it probably makes more sense to contact me for things no one (including myself) has done before.

So here is a more general way to think about (from a police department’s or criminal justice agency’s perspective) whether it would be beneficial to reach out to me.

Should I do X?

So no one is going to be against different evidence-based policing practices, but not all strategies make sense for all jurisdictions. For example, while focused deterrence has been successfully applied in many different cities, if you do not have much of a gang violence problem it probably does not make sense to apply that strategy in your jurisdiction. Implementing any particular strategy should take into consideration the costs as well as the potential benefits of the program.

Should I do X may involve more open-ended questions. I’ve previously conducted in-person training for crime analysts that goes over various evidence-based practices. It also may involve something more specific, such as should I redistrict my police beats? Or, I have a theft-from-vehicle problem, what strategies should I implement to reduce it?

I can suggest strategies to implement, or conduct cost-benefit analysis as to whether a specific program is worth it for your jurisdiction.

I want to do X, how do I do it?

This is actually the best scenario for me. It is much easier to design a program up front that allows a police department to evaluate its efficacy (such as designing a randomized trial and collecting key measures). I also enjoy tackling some of the nitty-gritty problems of implementing particular strategies more efficiently or developing predictive instruments.

So you want to do hotspots policing? What strategies do you want to do at the hotspots? How many hotspots do you want to target? Those are examples of where it would make sense to collaborate with me. Pretty much all police departments should be doing some type of hot spots policing strategy, but depending on your particular problems (and budget constraints), it will change how you do your hot spots. No budget doesn’t mean you can’t do anything — many strategies can be implemented by shifting your current resources around in particular ways, as opposed to paying for a special unit.

If you are a police department at this stage I can often help identify potential grant funding sources, such as the Smart Policing grants, that can be used to pay for particular elements of the strategy (that have a research component).

I’ve done X, should I continue to do it?

Have you done something innovative and want to see if it was effective? Or are you putting a bunch of money into some strategy and are skeptical it works? It is always preferable to design a study up front, but often you can conduct pretty effective post-hoc analysis using quasi-experimental methods to see if some crime reduction strategy works.

If I don’t think you can do a fair evaluation I will say so. For example I don’t think you can do a fair evaluation of chronic offender strategies that use officer intel with matching methods. In that case I would suggest how you can do an experiment going forward to evaluate the efficacy of the program.

Mutual Benefits of Academic-Practitioner Collaboration

Often I collaborate with police departments pro bono — so you may ask, what is in it for me then? As an academic I get evaluated mostly on my research productivity, which involves writing peer reviewed papers and getting research grants. So money is not the main factor from my perspective. It is typically easier to write papers about innovative problems or programs. If it involves applying for a grant (on a project I am interested in) I will volunteer my services to help write the grant and design the study.

I could go through my career writing papers without collaborating with police departments. But my work with police departments is more meaningful. It is not zero-sum, I tend to get better ideas when understanding specific agencies problems.

So get in touch if you think I can help your agency!

New preprint: Allocating police resources while limiting racial inequality

I have a new working paper out, Allocating police resources while limiting racial inequality. In this work I tackle the problem that a hot spots policing strategy likely exacerbates disproportionate minority contact (DMC). This is because of the pretty simple fact that hot spots of crime tend to be in disadvantaged/minority neighborhoods.

Here is a graph illustrating the problem. The X axis is the proportion of minorities stopped by the police in 500 by 500 meter grid cells (NYPD data). The Y axis is the number of violent crimes over a long time period (12 years). So a typical hot spots strategy would choose the top N areas to target (here I do the top 20). These are all very high proportion minority areas. So the inevitable extra police contact in those hot spots (in the form of either stops or arrests) will increase DMC.

I’d note that the majority of critiques of predictive policing focus on whether reported crime data is biased or not. I think that is a bit of a red herring though, you could use totally objective crime data (say swap out acoustic gun shot sensors with reported crime) and you still have the same problem.

The proportion of stops by the NYPD of minorities has consistently hovered around 90%, so doing a bunch of extra stuff in those hot spots will increase DMC, as those 20 hot spots tend to have 95%+ stops of minorities (with the exception of one location). Also note this 90% has not changed even with the dramatic decrease in stops overall by the NYPD.

So to illustrate my suggested solution here is a simple example. Consider you have a hot spot with predicted 30 crimes vs a hot spot with predicted 28 crimes. Also imagine that the 30 crime hot spot results in around 90% stops of minorities, whereas the 28 crime hot spot only results in around 50% stops of minorities. If you agree reducing DMC is a reasonable goal for the police in-and-of-itself, you may say choosing the 28 crime area is a good idea, even though it is a less efficient choice than the 30 crime hot spot.

I show in the paper how to codify this trade-off into a linear program that says choose X hot spots, but has a constraint based on the expected number of minorities likely to be stopped. Here is an example graph that shows it doesn’t always choose the highest crime areas to meet that racial equity constraint.
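
To give a flavor of what that looks like, below is a minimal sketch of the selection problem using the open source PuLP solver. The crime and minority-stop numbers are made up stand-ins (this is a simplified version, not the exact formulation in the paper), but it mirrors the 30 vs 28 crime hypothetical above:

import pulp

crime = [30, 28, 25]     # hypothetical predicted crimes per candidate hot spot
minority = [27, 14, 22]  # hypothetical expected minority stops per hot spot
n_select = 1             # number of hot spots to choose
max_minority = 20        # equity cap on total expected minority stops

prob = pulp.LpProblem("hotspots", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(len(crime))]
prob += pulp.lpSum(c * xi for c, xi in zip(crime, x))                     # maximize crime covered
prob += pulp.lpSum(x) == n_select                                         # choose exactly N areas
prob += pulp.lpSum(m * xi for m, xi in zip(minority, x)) <= max_minority  # equity constraint
prob.solve(pulp.PULP_CBC_CMD(msg=0))
print([i for i, xi in enumerate(x) if xi.value() == 1])

With the cap in place the solver passes over the 30 crime hot spot for the 28 crime hot spot with far fewer expected minority stops.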

This results in a trade-off of efficiency though. Going back to the original hypothetical, trading off a 28 crime vs 30 crime area is not a big deal. But if the trade off was 3 crimes vs 30 that is a bigger deal. In this example I show that getting to 80% stops of minorities (NYC is around 70% minorities) results in hot spots with around 55% of the crime compared to the no constraint hot spots. So in the hypothetical it would go from 30 crimes to 17 crimes.

There won’t be a uniform formula to calculate the expected decrease in efficiency, but I think getting to perfect equality with the residential population will typically result in similarly large decreases in many scenarios. A recent paper by George Mohler and company showed similarly steep declines. (That uses a totally different method, but I think it will give pretty similar outputs in practice — you can tune the penalty factor in a similar way to changing the linear program constraint.)

So basically the trade-off to get perfect equity will be steep, but I think the best case scenario is that a PD can say "this predictive policing strategy will not make current levels of DMC worse" by applying this algorithm on-top-of your predictive policing forecasts.

I will be presenting this work at ASC, so stop on by! Feedback always appreciated.

New paper: A simple weighted displacement difference test to evaluate place based crime interventions

At the ECCA conference this past spring Jerry Ratcliffe asked if I could apply some of my prior work on evaluating changes in crime patterns over time to make a set of confidence intervals for the weighted displacement quotient statistic (WDQ). The answer to that is no, you can’t, but in its stead I created another statistic in which you can do that, the weighted displacement difference (WDD). The work is published in the open access journal Crime Science.

The main idea is we wanted a simple statistic folks can use to evaluate place based interventions to reduce crime. All you need is pre and post crime counts for your treated and control areas of interest. Here is an excel spreadsheet to calculate the statistic, and below is a screen shot. You just need to fill in the pre and post counts for the treated and control locations, and the spreadsheet will spit out the statistic, along with a p-value and a 95% confidence interval for the number of crimes reduced.

What is different compared to the WDQ statistic is that you need a control area for the displacement area too. But if you are not worried about displacement, you can just put in zeros for the displacement area and still calculate the statistic for the local area (and its control area). In this way you can actually do two estimates, one for the local effects and one for the displacement — just put in zeros for the other values.

While you don’t really need to read the paper to be able to use the statistic, we do have some discussion on choosing control areas. In general the control areas should have similar counts of crime; you shouldn’t have a treatment area that has 100 crimes and a control area that only has 10 crimes. We also have this graph, which is basically a way to conduct a simple power analysis — asking “could you reasonably detect whether the intervention reduced crime” before you actually conduct the analysis.

So the way to read this graph is: if you have a set of treated and control areas that average 100 crimes in each period (so the cumulative total is around 800 crimes), the intervention needs to have prevented around 30 crimes to even show weak evidence of a crime reduction (a one-tailed p-value of less than 0.1). Many interventions just aren’t set up to show strong evidence of crime reductions. For example, if you have a baseline of 20 crimes, you need to prevent 15 of them to find weak evidence of effectiveness. Interventions in areas with fewer baseline crimes basically cannot be verified as effective using this simple a design.

For those more mathy, I created a test statistic based on the differences in the changes of the counts over time, making the assumption that the counts are Poisson distributed. This is basically just a combination of two difference-in-difference estimates (for the local and the displacement areas) using counts instead of means. For researchers with the technical capabilities, it probably makes more sense to use a data based approach to identify control areas (such as the synthetic control method or propensity score matching) — this is of course assuming an actual randomized experiment is not feasible. But that is too much of a burden for many crime analysts, so if you can construct a reasonable control area by hand you can use this statistic.
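
Here is a minimal sketch of that local-only calculation under the Poisson assumption (the counts are toy numbers; the full version adds the displacement difference-in-difference the same way):

from math import sqrt, erfc

def wdd(treat_pre, treat_post, ctrl_pre, ctrl_post):
    est = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)   # negative = crimes prevented
    se = sqrt(treat_pre + treat_post + ctrl_pre + ctrl_post)  # Poisson variances just add
    z = est / se
    p = erfc(abs(z) / sqrt(2))  # two-tailed p-value via the normal approximation
    return est, se, p, (est - 1.96 * se, est + 1.96 * se)

print(wdd(treat_pre=100, treat_post=60, ctrl_pre=100, ctrl_post=95))
# -35 crimes, se ~18.8, p ~0.06, 95% CI ~(-72, 2)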