Dashboards are often not worth the effort

When end users see dashboards, they often think “this is really whiz-bang cool, let’s do that for our data”. There are two issues, though, that I commonly see with dashboards. One is that the task to be accomplished with the dashboard is not well defined, so even a visually well done dashboard mostly goes unused. The second is that there are a ton of headaches in deploying dashboards with real data – and the effort to do it right is often not worth it.

For the first part, consider a small agency that has a simple crime trends dashboard, intended to identify anomalous upticks in crime. This requires someone to log into the dashboard, click around the different crime trends, and visually check whether they are higher than expected. It would be easier to either have an automated alert when some threshold is met, or a standardized report (e.g. once a week) that is emailed out for review.
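
As a rough sketch of what such an automated alert could look like – assuming weekly counts per crime type in a pandas DataFrame, and using the Poisson Z-score I discuss later in this post roundup; the threshold and delivery mechanism are placeholders, not any specific agency’s setup:

import pandas as pd

def weekly_alerts(weekly_counts, baseline_weeks=52, z_cut=3):
    """Flag crime types whose current week is anomalously high, using the
    Poisson Z-score 2*(sqrt(current) - sqrt(expected)) against the
    trailing baseline average."""
    alerts = []
    for crime in weekly_counts.columns:
        series = weekly_counts[crime]
        expected = series.iloc[-(baseline_weeks + 1):-1].mean()
        current = series.iloc[-1]
        z = 2 * (current**0.5 - expected**0.5)
        if z > z_cut:
            alerts.append((crime, int(current), round(z, 2)))
    return alerts

# run on a weekly cron schedule and email the result (e.g. via smtplib),
# so no one has to log in and eyeball a dashboard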

This post is not even going to be about ‘most dashboards show stupid data’ or ‘the charts show the data in totally inappropriate ways’. Even in cases in which you can build a nice dashboard, the complexity level is IMO not worth it in many situations I have encountered. For the vast majority of situations, automating alerts and building regular standardized reports is in my opinion a better solution for data products. The whiz-bang ability to interactively click stuff will only be used by a very small number of people (who can often build stuff like that for themselves anyway).

So onto the second part, deploying dashboards with real data that others can interact with. Many of the cool example dashboards you see online use tricks that would be inappropriate for a production dashboard. So even if someone says ‘yeah, I can do a demo dashboard in a day on my laptop’, there is still a ton more work to expose that dashboard to different individuals, which is probably the ultimate goal.

Dashboards have two parts: A) connecting to a source database, then B) exposing the user-interface to outside parties. Seems simple right? Wrong.

Let’s start with part A, connecting to a source database. Many dashboard examples you see online cheat at this stage – they embed a static version of the data in the dashboard itself. If Gary needs to re-upload the data to the dashboard every week, what happens when Gary goes on vacation or takes a day off? You now have an out-of-date dashboard.

It is quite hard to manage connections between live source data and a dashboard. If you have a public-facing dashboard, I would just host a public file somewhere on the internet (so anyone can download the source data), point the dashboard to that source, and then have some automated process update the source data. This solves the ‘Gary went on vacation’ factor at least, and there is no security risk, since you intentionally only upload data that can be disseminated to the public.
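
The automated refresh can be a small scheduled script (cron, Windows Task Scheduler, GitHub Actions). A minimal sketch, where the file names and columns are placeholders for your own records export and public host:

import pandas as pd

# columns that are safe to disseminate publicly (placeholder names)
PUBLIC_COLS = ['incident_number', 'offense', 'date', 'block_address']

def refresh_public_extract():
    # pull from wherever the authoritative records export lives
    df = pd.read_csv('internal_export.csv')
    # keep only the public-safe fields, then write the file the
    # dashboard points at (e.g. an S3 bucket or static web host)
    df[PUBLIC_COLS].to_csv('public_extract.csv', index=False)

if __name__ == '__main__':
    refresh_public_extract()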

One potential alternative for a public-facing dashboard is to make it serverless. This often embeds the dashboard (and maybe the data) into the webpage itself (e.g. pyscript/WASM), so it may be slow to start up, but it will work reasonably well if you can handle a one minute lag. You don’t need to worry about malicious actors in that scenario, as the heavy computation is done on the client’s computer, not your server. I have an example on CRIME De-Coder (note it is currently not up to date – my code is running fine, but Dallas, post the cyber-attack, has not been updating their public data).

Managing a direct connection to your source data in a public-facing dashboard is not a good idea, as malicious actors can spam your dashboard. A denial of service attack will not only make your dashboard unresponsive, but will also eat up your database processing. (In the best case scenario, big companies often have a reporting database server separate from the production database server, but this requires resources most public sector agencies do not have.)

The solution to this is to limit who can access the dashboard, part B above. Unfortunately, with the majority of dashboard software, once you want to make a live connection and/or limit who can see the information, you are in ‘need to pay for this’ territory. A very unfortunate aspect of the paid tiers is that most of these vendors charge per viewer – it isn’t a flat fee. PowerBI is maybe OK if your organization already pays for SharePoint; Tableau licensing is so painful I wouldn’t suggest it.

So what is the alternative? You are now in charge of spinning up your own server so people can click a dropdown menu and generate a line graph. Again you need to worry about security, but at least if you can hide behind a local network or VPN it is probably doable for most police departments.

I don’t want to say dashboards are never worth it. I think public facing dashboards are a good thing for police transparency, and if done right are easier to implement than private ones. Private ones are doable as well (especially if limiting to intranet applications). But like I said, this level of effort is over the top compared to just doing a regular report.

Updates on CRIME De-Coder and ASEBP

So I have various updates on my CRIME De-Coder consulting site, as well as new posts on the American Society of Evidence Based Policing Criminal Justician series.

CRIME De-Coder

Blog post, Don’t use percent change for crime data, use this stat instead. I have written a bunch here about using Poisson Z-scores, so if you are reading this it is probably old news. Do us all a favor and drop the ridiculous percent change metrics with low baselines from your Compstat reports, and use 2*(sqrt(Current) - sqrt(Past)) instead.

Blog post, Dashboards should be up to date. I will have a more techy blog post here on my love/hate relationship with dashboards (most of the time static reports are a better solution). One scenario where they do make sense is public-facing dashboards, but they should be up to date. The “free” versions of popular tools (Tableau, PowerBI) don’t allow you to link to a source dataset and get auto-updated, so you see many out-of-date dashboards online. If you contract with me, I can automate it so the dashboard is up to date and doesn’t rely on an analyst manually updating the information.

Demo page – that page currently includes demonstrations for:

The WDD Tool is pure javascript – I am picking up more of that slowly (the Folium map has a few javascript listener hacks to get it to look the way I want). As a reference for web development, I like Jon Duckett’s three books (HTML, javascript, PHP).

Ultimately there is too much stuff to learn, but on the agenda is figuring out google cloud compute + cloud databases a bit more thoroughly. Then maybe I will add some PHP to my CRIME De-Coder site (a nicer contact-me form, auto-updated sitemap, and RSS feed). I also want to learn how to make ArcGIS dashboards.

Criminal Justician

The newest post is Situational crime prevention and offender planning – discussing one of my favorite examples of crime prevention through environmental design (on suicide prevention) and how it is a signal about offender behavior that goes beyond simplistic notions of impulsive behavior. I then relate this back to current discussions of preventing mass shootings.

If you have ideas about potential posts for the society (or this blog, or CRIME De-Coder’s blog), always feel free to make a pitch.

A statistical perspective on year-to-date metrics

Jerry Ratcliffe, and more recently Jeff Asher, have written about how volatile early-year projections of year-to-date (YTD) percent change are. I am going to write about why this is not the right way to frame the problem, in my opinion – I will present a better behaved estimate that is less volatile, but that clearly doesn’t give police departments what they want.

Going to the end advice first – people find the suggestion irksome, but you shouldn’t be using percent changes at all. A simple alternative I have advocated for low count crime data is a Poisson Z-score, which is simply 2*(sqrt(Current) - sqrt(Past)) – a value greater than 3 or 4 is a signal the two processes are significantly different (under the null hypothesis that the counts have a Poisson distribution).
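
In code, with a worked example of why low baselines make percent change silly:

from math import sqrt

def poisson_z(current, past):
    """Poisson Z-score comparing two counts over equal time periods."""
    return 2 * (sqrt(current) - sqrt(past))

# going from 2 to 4 homicides is a +100% change, but is well within
# what you would expect from Poisson noise
print(poisson_z(4, 2))  # ~1.17, under the 3-4 signal threshold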

A Better YTD estimate

So here I am going to present a more accurate YTD percent change metric – but don’t take that as advice that you should be using YTD percent change. It is more of an exercise to show why you shouldn’t be using this metric to begin with. Year end percent change is defined as:

(Current - Past)/Past = % Change

Note that you can rewrite this as:

Current/Past - Past/Past  = % Change
Current/Past - 1          = % Change

So really it is only the ratio Current/Past that we care about estimating; the translation to a percent doesn’t matter. In the above equations, I am writing these as cumulative totals for the whole year. So let’s do breakdowns via subscripts, and shorten Current and Past to C and P respectively. Say we have data through January; people typically estimate the YTD percent change as:

(C_January - P_January)/P_January = % Change January

To make it easier, I am going to use an e subscript for early and an l subscript for later. So if we estimate YTD change through February, we have C_January + C_February = C_e. Also note that C_e + C_l = Current – the early observed values plus the later unobserved values equal the year total. This identifies a clear error when people use only subsets of the data to do YTD year end projections (what both Jerry and Jeff did in their posts to argue against early YTD estimates). You should not just use P_e in your estimate, you should use the full prior year counts.

Let’s go back to our year end estimate, written in early/later form:

[C_e + C_l - (P_e + P_l)]/(P_e + P_l) = % Change

This has only one unknown in the equation – C_l, the unobserved rest-of-year count. You should not use (C_e - P_e)/P_e, as this introduces several stochastic elements where none are needed – P_e is not necessarily a good estimate of P_e + P_l. So let’s do a simple example; imagine we had these homicide totals:

     Past Current
Jan    2     1
Feb    0      
Mar    1      
Apr    1      
May    1      
Jun    1      
Jul    1      
Aug    1      
Sep    1      
Oct    1      
Nov    1      
Dec    1      
---
Tot   12

Doing YTD estimates the naive way, we would say our January YTD estimate is (1 - 2)/2 = -50%. Whereas I am saying you should use (1 + C_l - 12)/12 – filling in whatever value you project for the rest-of-year total C_l. Simple fill-ins you can do in a spreadsheet are ‘no change’, just using the prior year remainder, which here would be C_l = 10, giving a YTD percent change estimate of (11 - 12)/12 ≈ -8%; or ‘extrapolate’, scaling up the observed counts, Current = C_e/year_proportion = 1*12 = 12, so (12 - 12)/12 = 0%. (You would really want to fit a model with seasonal and trend components and project out the remaining part of the year, which will often land somewhere between these two simpler methods.)
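
Here is the toy example worked in python (the same arithmetic you could do in a spreadsheet):

past_total = 12   # full prior year
P_e = 2           # prior January
C_e = 1           # current January, the only observed month
year_prop = 1/12

naive = (C_e - P_e) / P_e                                        # -50%
# 'no change': fill in C_l with the prior year remainder
running = (C_e + (past_total - P_e) - past_total) / past_total   # ~ -8%
# 'extrapolate': scale observed months up to a full year
extrapolate = (C_e / year_prop - past_total) / past_total        # 0%
print(naive, running, extrapolate)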

So far this is just a theoretical “should be a better estimator” – let’s show it with some actual data. Python code to replicate is here, but I took open data from Cary, NC, which goes back to 2000, so we have a sample of 22 years. Estimates of the error, broken down by month and version, are below. The naive estimate is how it is typically done (equivalent to Jeff/Jerry’s blog posts), the running estimate uses the prior year to fill in C_l, and extrapolate uses the current months to scale up. The error metric is |(estimated % change) - (actual year end % change)|, and the stats show the mean (standard deviation) over the sample (n=22). Here are the metrics for larceny, which averages 123 incidents per month over the sample:

       Naive   Running  Extrapolate
Jan   12 (7)    6 (4)     10 (7)
Feb    8 (6)    6 (4)     11 (7)
Mar    9 (6)    5 (3)      8 (6)
Apr    9 (7)    5 (3)      8 (5)
May    7 (6)    5 (3)      6 (4)
Jun    6 (4)    4 (3)      4 (3)
Jul    5 (3)    4 (3)      4 (3)
Aug    4 (3)    3 (2)      3 (2)
Sep    3 (2)    3 (2)      2 (2)
Oct    2 (1)    2 (1)      2 (1)
Nov    1 (1)    1 (1)      1 (1)
Dec    0 (0)    0 (0)      0 (0)

And here are the metrics for burglary, which averages 28 incidents per month over the sample. Although these have higher error metrics (due to lower/more volatile baseline counts), my estimator is still better than the naive one for the majority of the year.

       Naive   Running  Extrapolate
Jan   34 (25)   12 (8)    24 (23)
Feb   15 (14)   11 (7)    16 (13)
Mar   15 (14)   12 (7)    15 (11)
Apr   15 (11)   10 (7)    13 ( 8)
May   14 (10)   10 (7)    10 ( 7)
Jun   11 ( 8)   10 (7)     8 ( 6)
Jul    9 ( 7)    9 (7)     7 ( 5)
Aug    7 ( 5)    8 (5)     6 ( 3)
Sep    6 ( 4)    6 (5)     4 ( 3)
Oct    6 ( 4)    5 (4)     3 ( 3)
Nov    3 ( 3)    3 (3)     2 ( 2)
Dec    0 ( 0)    0 (0)     0 ( 0)

Running tends to do better earlier in the year (and for smaller N samples). Both the running and extrapolate estimates are closer to the true year end percent change than the naive estimate in around 70% of the observations in this sample. (This tends to be even more pronounced in the smaller crime count categories – closer to 80% to 90% of the time better.)

In Jerry’s and Jeff’s posts, they use a +/- 5 cutoff to say “it is close” – this corresponds in my tables to absolute errors of around 5 percentage points. You meet that criterion on average in this sample for my estimator in March for larcenies (running) and September for burglaries (extrapolate).

To be clear though, even with the more accurate projections, you should not use this estimate.

What do police departments want?

So Jeff may literally want an end-of-year projection when he writes a Times article – similar to how a government might give a year end projection of GDP growth. But this is not what most police departments want when they calculate YTD metrics. So saying “you shouldn’t use YTD because the error is high” to me misses the boat a bit. I can give a metric that has lower error rates, and you still shouldn’t use YTD percent change.

What police departments want to examine is the more general question “are my numbers high?” – you can further parse this into “are my numbers consistently high over the past date range” (of which the past year is just a convenient demarcation) and “are my numbers anomalously high right now”. The former is asking about long term trends, the latter about short term increases. Part of why I don’t like YTD is that it masks these two metrics – a spike early in the year can look like a perpetual long term upward trend later in the year.

I have training material showing off two different types of charts I like to use in lieu of YTD metrics. These can identify anomalous short term and long term trends. Here is an example weekly chart showing the trend (black line) and short term spikes (points outside the error intervals):
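
A minimal sketch of how a chart along those lines could be built – the 8-week smoothing window is an arbitrary choice here, and the ±1.5 on the square root scale corresponds to the Z > 3 cutoff from the Poisson Z-score above:

import pandas as pd
import matplotlib.pyplot as plt

def weekly_chart(weekly):
    """weekly: pd.Series of counts indexed by week."""
    trend = weekly.rolling(8, center=True).mean()
    # interval bounds back-transformed from the square root scale
    low = (trend**0.5 - 1.5).clip(lower=0)**2
    high = (trend**0.5 + 1.5)**2
    spikes = weekly > high
    fig, ax = plt.subplots()
    ax.fill_between(weekly.index, low, high, alpha=0.2, label='error interval')
    ax.plot(weekly.index, trend, color='black', label='trend')
    ax.scatter(weekly.index[spikes], weekly[spikes], color='red', label='spike')
    ax.legend()
    return fig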

So this is an uber nerd post, but I hope it has general lessons. One is that if you need to estimate Y, and you can write Y as a function of other quantities, some unknown and some already observed, e.g. Y = f(x1, c), then maybe you should just focus on estimating x1 in that scenario, not modeling Y directly.

In terms of more general statistical modeling of crime trends, I have debated in the past examining more thoroughly seasonal-trend decomposition techniques, but I think the examples I give above are quite sufficient for most analysis (and can be implemented in a spreadsheet).

ASEBP blog posts, and auto screenshotting websites

I wanted to give an update here on the Criminal Justician series of blogs I have posted on the American Society of Evidence Based Policing (ASEBP) website. These include:

  • Denver’s STAR Program and Disorder Crime Reductions
    • Assessing whether Denver’s STAR alternative mental health responders can be expected to decrease a large number of low-level disorder crimes.
  • Violent crime interventions that are worth it
    • Two well-vetted methods – hot spots policing and focused deterrence – are worth the cost for police to implement to reduce violent crime.
  • Evidence Based Oversight on Police Use of Force
    • There is strong evidence that collecting use of force data, in conjunction with clear administrative policies, reduces officer use of force overall.
  • We don’t know what causes widespread crime trends
    • While we can identify whether crime is rising or falling, retrospectively identifying what caused those ups and downs is much more difficult.
  • I think scoop and run is a good idea
    • Keeping your options open is typically better than restricting them. Police should have the option to take gun shot wound victims directly to the emergency room when appropriate.
  • One (well done) intervention is likely better than many
    • Piling on multiple interventions at once makes it impossible to tell if a single component is working, and is likely to have diminishing returns.

Going forward I will post a snippet here and refer folks to the ASEBP website. You need to sign up to read that content – but it is an organization worth joining (besides just reading my takes on science around policing topics).


So my CRIME De-Coder LLC has a focus on the merger of data science and policing, but the skills have wider potential application. Besides statistical analysis in different subject areas, one area I think will be of wider interest to public and private sector agencies is my experience in process automation. These often look like boring things – automatically generating a report, sending an email, updating a dashboard, etc. But they can take substantial human labor, and automating them has the added benefit of making a process more robust.

As an example, I needed to submit my website as a PDF file to obtain a copyright. To do this, you need to take screenshots of your website and all its subsequent pages. Googling for selenium and python solutions, the majority of the current results are out of date (due to changes in the Chrome driver in selenium over time). So here is the solution I scripted up the morning I wanted to submit the copyright – it took about 2 hours total, including debugging. Note that this produces real screenshots of the website, not a print-to-PDF (which looks different).

It is short enough for me to just post the entire script here in a blog post:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time
from PIL import Image
import os

home = 'https://crimede-coder.com/'

url_list = [home,
            home + 'about',
            home + 'blog',
            home + 'contact',
            home + 'services/ProgramAnalysis',
            home + 'services/PredictiveAnalytics',
            home + 'services/ProcessAutomation',
            home + 'services/WorkloadAnalysis',
            home + 'services/CrimeAnalysisTraining',
            home + 'services/CivilLitigation',
            home + 'blogposts/2023/ServicesComparisons']

res_png = []

def save_screenshot(driver, url, path, width):
    driver.get(url)
    # Ref: https://stackoverflow.com/a/52572919/
    original_size = driver.get_window_size()
    #required_width = driver.execute_script('return document.body.parentNode.scrollWidth')
    required_width = width
    required_height = driver.execute_script('return document.body.parentNode.scrollHeight')
    driver.set_window_size(required_width,required_height)
    #driver.save_screenshot(path)  # has scrollbar
    driver.find_element(By.TAG_NAME, 'body').screenshot(path)  # avoids scrollbar
    driver.set_window_size(original_size['width'], original_size['height'])

options = Options()
# note: newer Selenium versions (4.10+) removed this attribute; there you
# may need options.add_argument('--headless=new') instead
options.headless = True
driver = webdriver.Chrome(options=options)

for url in url_list:
    driver.get(url)
    if url == home:
        name = "index.png"
    else:
        # e.g. 'services/WorkloadAnalysis' -> 'services_WorkloadAnalysis.png'
        res_url = url.replace(home,"").replace("/","_")
        name = res_url + ".png"
    time.sleep(1)  # give the page a moment to render
    res_png.append(name)
    save_screenshot(driver,url,name,width=1400)

driver.quit()

# Now appending to PDF file
images = [Image.open(f).convert('RGB') for f in res_png if f[-3:] == 'png']
i1 = images.pop(0)
i1.save(r'Website.pdf', save_all=True, append_images=images)

# Now removing old PNG files
for f in res_png:
    os.remove(f)

One of the reasons I want to expand knowledge of coding practices into policing (as well as other public sector fields) is that something this simple doesn’t make sense for me to package up and try to monetize – the IP involved in a 2 hour script is not worth that much. I realize most police departments won’t be able to take the code above and actually use it – it is better for your agency to simply do a small contract with me to help you automate the boring stuff.

I believe this is in large part a better path forward for many public sector agencies, as opposed to buying very expensive Software-as-a-Service solutions. It is better to have a consultant provide a custom solution for your specific agency than to spend money on some big tool and hope your specific problems fit its mold.

Crime De-Coder LLC Website

So I have created CRIME De-Coder LLC, a firm to do my consulting work with police departments. Check out my website, crimede-coder.com.

Feedback is welcome. In particular check out the services pages, and my first blog post on what distinguishes my services from most firms. Providing computer code to generate the end product is “teaching a man to fish”, whereas most firms just drop a final report and leave.

And of course feel free to reach out to consult@crimede-coder.com if you are interested in pursuing a project. Going forward I plan on making a new post around once a month, so sign up in your feed reader or using a service like IFTTT.


Setting up a stand-alone website is not that hard in the end. Currently it is a static site with some custom javascript (hosted on Hostinger). I should do a PHP server for the new blog posts and RSS feed eventually, but for now this is fine. For those interested in doing the same, I suggest the Jon Duckett books (HTML/Javascript/PHP) for an overview of the tech, and then check out Dani Kross’s youtube tutorials (for random things like editing the htaccess file).

I am not doing a newsletter for the blog posts, as I am concerned it will get my email on random block lists. But if there is demand for it in the future, I will figure out some other service to do that.

I wanted a more bare-metal setup (not a hosted wordpress like this site), as in the future I will likely do demos of dashboards, host some pyscript, make a sign-in for paid content, etc. I just wanted flexibility from the start. So stay tuned for more content from CRIME De-Coder!

Scorpion was probably not doing hot spots policing

So the Wall Street Journal had a recent article describing how crackdowns in hot spots of crime may not be the best policing tactic, Tyre Nichols Case Prompts Questions About Police Tactics in Crime Hot Spots. This is actually an OK article, but to be clear, “hot spots” policing isn’t really defined by police tactics – hot spots are just a method to identify the small areas with the most crime in a city. Identifying the hot spots does not explicitly determine the policing (or non-policing) tactic one should use to reduce crime in those areas. The Washington Post had a recent article in a similar vein critiquing the work of Tamara Herold in Breonna Taylor’s death. The WaPo article even prompted a response by a group of well known criminologists arguing it was inappropriate to blame Herold’s strategy.

Hot spots policing has always gone along with a mix of different tactics – the most common strategies I would say are problem oriented policing (Braga et al., 1999), increased street or traffic stops (MacDonald et al., 2016; Sherman & Rogan, 1995), or simply patrolling/hanging out in the area (Groff et al., 2015; Koper, 1995). The WSJ article talks about Joel Caplan’s RTM group (which I think does good work), and they are really just doing a version of problem oriented policing. (POP has always had a component of working in tandem with the community and different public/private sector agencies.)

One of the reasons I wanted to write this post, though, is that often in my career I see a disconnect between purportedly hot spots policing (or similar tactics, such as DDACTS) on paper and what is actually happening on the ground. So using the Memphis open crime data, I identified the top 100 street segments in terms of violent crime (code on github to replicate). As I suspected, the place where Nichols was pulled over is not a hot spot of crime, making the connection between the Scorpion unit’s behavior and hot spots policing tactics a bit suspect.

If the embedded google map does not work, here is a screenshot showing that none of the top 100 street midpoints are around the location where Nichols was initially stopped:

It happens to be the case that officers often have misperceptions of where hot spots are (Macbeth & Ariel, 2019; Ratcliffe & McCullagh, 2001). And if left with no oversight, there tends to be a mismatch between where police proactivity occurs and where the most serious crime is spatially concentrated (Wheeler et al., 2018). That is why a system to feed information back to officers on whether they are making high quality stops is so important (Worden et al., 2018).

To be clear, this is not me making excuses for researchers or crime analysts who do not know what is actually occurring in their jurisdictions, or who ignore the secondary harms that can come with intensive policing. But in my experience, taking the time to do hot spots policing right – which at its most basic means actually identifying hot spots using data – is a good sign that a police department takes seriously the tactics it uses and thinks seriously about mitigating some of those secondary harms. Hot spots policing does not intrinsically result in unequal outcomes; harms can be mitigated via tactics like problem oriented policing, or by constructing a hot spots policy that promotes racial equity in outcomes from the start (Wheeler, 2020).

References

  • Braga, A.A., Weisburd, D.L., Waring, E.J., Mazerolle, L.G., Spelman, W., & Gajewski, F. (1999). Problem‐oriented policing in violent crime places: A randomized controlled experiment. Criminology, 37(3), 541-580.
  • Groff, E. R., Ratcliffe, J. H., Haberman, C. P., Sorg, E. T., Joyce, N. M., & Taylor, R. B. (2015). Does what police do at hot spots matter? The Philadelphia policing tactics experiment. Criminology, 53(1), 23-53.
  • Koper, C.S. (1995). Just enough police presence: Reducing crime and disorderly behavior by optimizing patrol time in crime hot spots. Justice Quarterly, 12(4), 649-672.
  • Macbeth, E., & Ariel, B. (2019). Place-based statistical versus clinical predictions of crime hot spots and harm locations in Northern Ireland. Justice Quarterly, 36(1), 93-126.
  • MacDonald, J., Fagan, J., & Geller, A. (2016). The effects of local police surges on crime and arrests in New York City. PLoS one, 11(6), e0157223.
  • Ratcliffe, J.H., & McCullagh, M.J. (2001). Chasing ghosts? Police perception of high crime areas. British Journal of Criminology, 41(2), 330-341.
  • Sherman, L.W., & Rogan, D.P. (1995). Effects of gun seizures on gun violence: “Hot spots” patrol in Kansas City. Justice Quarterly, 12(4), 673-693.
  • Wheeler, A.P. (2020). Allocating police resources while limiting racial inequality. Justice Quarterly, 37(5), 842-868.
  • Wheeler, A. P., Steenbeek, W., & Andresen, M. A. (2018). Testing for similarity in area‐based spatial patterns: Alternative methods to Andresen’s spatial point pattern test. Transactions in GIS, 22(3), 760-774.
  • Worden, R.E., McLean, S.J., Wheeler, A.P., Reynolds, D.L., Dole, C., & Cochran, H. (2018). Smart Stops: An Inquiry into Proactive Policing. Summary Report to the National Institute of Justice, Award No. 2013-MU-CX-0012.

ptools R package update

So as an update to my R package ptools, I have bumped a major version change to 2.0, which is now up on CRAN.

There is no new functionality, but I wanted to bump versions because I swapped out rgdal/rgeos for sf (rgdal and rgeos are being deprecated). All the functions currently still take sp objects as inputs/outputs. If I ever get around to it, I will convert the functions to take either. They are somewhat inefficient with the conversions, but if you are doing something where that matters, you should likely switch your data engineering to another system entirely (such as SQL in postgis directly). Generating hexagons should actually be faster now, as the sf version I swapped in is vectorized (whereas the way I was using sp before was a loop).

I debate every now and then just letting this go. I can see on cranlogs I have just over 1k downloads grand total (as of 2/7/2023), and 200 some last month.

Time is finite, so I have debated dropping this and just porting most of the functions over to python. Those cumulative downloads are partially bots (I may have racked up 100 of those downloads in my own CI/CD actions). Let me know if you actually use this, as that gives me feedback on whether to bother continuing to develop/update it.

Using quantile regression to evaluate police response times

Jeff Asher recently had a post on analysis of response times across many agencies. One nitpick though (and ditto for prior analyses I have seen, such as those by Scott Mourtgos and company): you should not use linear models (or means in general) to describe response time distributions. They are very heavily right-skewed, so the mean tends not to be representative of the bulk of the data.

When evaluating changes in response time, imagine two simplistic scenarios. One, every single call increases by 5 minutes, so what used to be 5 is now 10, 20 is now 25, 60 is now 65, etc. That is probably not realistic for response times – more likely it is calls in the tail (ones that wait a very long time for an opening in the queue) that change. E.g. 5 is still 5, 20 is still 20, but 60 is now 120. In the latter scenario, the left tail of the distribution does not change, only the right tail. In both scenarios the mean shifts.
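
A toy example makes the point – the mean shifts in both scenarios, but only the second moves the upper quantiles:

import numpy as np

base = np.array([5., 10., 20., 30., 60.])
uniform = base + 5          # scenario one, every call 5 minutes slower
tail = base.copy()
tail[-1] = 120.             # scenario two, only the slow call blows up

for name, x in [('base', base), ('uniform +5', uniform), ('tail only', tail)]:
    print(f"{name:>10}: mean {x.mean():5.1f}, "
          f"10th {np.percentile(x, 10):5.1f}, 90th {np.percentile(x, 90):5.1f}")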

I think a natural way to model the problem, instead of focusing on means, is to use quantile regression. It allows you to characterize the entire distribution (look at the left and right tails) and still control for individual level circumstances. Additionally, emergency agencies often set goals along the lines of “I want to respond to 90% of emergency events within X minutes”. Quantile regression is a great tool to describe that 90% mark. Here I am going to show an example using the New Orleans calls for service data and python.

First, we can download the data right inside of python without saving it directly to disk. I am going to be showing off estimating quantile regression with the statsmodels library. I do the analysis for 2019 through 2022, but NOLA has calls for service going back to the early 2010s if folks are interested.

import pandas as pd
import statsmodels.formula.api as smf

# Download data, combo 19/20/21/22
y19 = 'https://data.nola.gov/api/views/qf6q-pp4b/rows.csv?accessType=DOWNLOAD'
y20 = 'https://data.nola.gov/api/views/hp7u-i9hf/rows.csv?accessType=DOWNLOAD'
y21 = 'https://data.nola.gov/api/views/3pha-hum9/rows.csv?accessType=DOWNLOAD'
y22 = 'https://data.nola.gov/api/views/nci8-thrr/rows.csv?accessType=DOWNLOAD'
yr_url = [y19,y20,y21,y22]
res_pd = [pd.read_csv(url) for url in yr_url]
data = pd.concat(res_pd,axis=0) # a little over 1.7 million rows

Now we do some data munging. Here I eliminate self-initiated events, as well as those with missing data. There are then just a handful of cases with 0 minute arrival times, which, to be consistent with Jeff’s post, I also eliminate. I create a variable, minutes, that is the minutes between the time created and the time arrived on scene (not cleared).

# Prepping data
data = data[data['SelfInitiated'] == 'N'].copy() # no self init
data = data[~data['TimeArrive'].isna()].copy()   # some missing arrive
data['begin'] = pd.to_datetime(data['TimeCreate'])
data['end'] = pd.to_datetime(data['TimeArrive'])
dif = data['end'] - data['begin']
# caveat: dt.seconds returns only the seconds component and wraps for
# spans over a day; dif.dt.total_seconds() avoids that edge case
data['minutes'] = dif.dt.seconds/60
data = data[data['minutes'] > 0].copy() # just a few left over 0s

# Lets look at the distribution
data['minutes'].quantile([0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9])

Looking at quantiles for the entire sample, the median time is around 20 minutes, the 10th percentile is under 3 minutes, and the 90th percentile is around 5 hours. Using the mean here (which in Jeff’s post varies from 50 to 146 minutes over the same 4 year period) can be somewhat misleading.

An important component of response times is differentiating between different priority calls. In NOLA’s data, higher numbers are higher priority. Zero priority are things NOLA says don’t necessarily need an officer at all. So it could be that those “0 priority” calls are really just dragging the overall average down over time, although they may have little to do with clearance rates or public safety overall. The priority field also has sub-categories, e.g. 1A is higher priority than 1B. To keep the post simple, I just break down by the leading integer, not the sub letter-categories.

# Priority just do 1/2/3
# 3 is highest priority
data['PriorCat'] = data['Priority'].str[0]
# Only 5 cases of 3s, will eliminate these as well
data.groupby('PriorCat')['minutes'].describe()

Here you can really see the right-skewness – for priority 2 calls the mean is 25 minutes, but the median is under 10 minutes over the entire sample. A benefit of quantile regression I will use in a bit: the few outlying cases (beyond the quantiles of interest) really don’t impact the analysis. So those cases that take almost 24 hours (I imagine they are just auto-filled in like that in the data) really don’t impact estimates of the smaller quantiles, but they can have a noticeable influence on mean estimates.
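
A tiny demo of that robustness:

import numpy as np

x = np.array([5., 8., 10., 15., 25.])
y = x.copy()
y[-1] = 1440.  # one call auto-filled at ~24 hours

print(np.mean(x), np.median(x))  # 12.6 10.0
print(np.mean(y), np.median(y))  # 295.6 10.0 -- the median is unmoved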

Some final data munging: to further simplify, I drop the 16 cases of priority 3s and 4s, add in a few more categorical covariates for hour of the day, and look at months over time as categorical. (These decisions are mostly to make the results easier to parse in simple tables in a blog post – it would take more work to model a non-linear continuous over-time variable, say via a spline, and to make a reasonable ordinal encoding for the sub-priority categories.)

# only worry about 0/1/2s
data = data[data['PriorCat'].isin(['0','1','2'])].copy()
# Total in the end almost 600k cases

# Some factor date variables
def dummy_stats(vdate,begin_date):
    bd = pd.to_datetime(begin_date)
    year = vdate.dt.year
    month = vdate.dt.month
    week_day = vdate.dt.dayofweek
    hour = vdate.dt.hour
    diff_days = (vdate - bd).dt.days
    # if binary, turn week/month into dummy variables
    return diff_days, week_day, hour, month, year

dn, wd, hr, mo, yr = dummy_stats(data['begin'],'1/1/2022')
data['Hour'] = hr
data['Month'] = mo
data['Year'] = yr

# Let's just look at months over time
data['MoYr'] = data['Year'] + data['Month']/100

Now finally onto the modeling stuff. For those familiar with regression: quantile regression, instead of predicting the mean, predicts a quantile of the distribution. Here I show predicting the 50th quantile (the median). For those not familiar with regression, this is not all that different from doing a pivot table/group by, but aggregating by quantiles instead of means. Regression is somewhat different from the simpler pivot table, since you “condition on” other factors (here I “control for” hour of day), but in broad strokes it is similar.
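
For intuition, the unconditional group-by version of the median model is a one-liner (using the data frame prepped above):

# median minutes by priority, ignoring all other covariates
data.groupby('PriorCat')['minutes'].median()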

Here I use a patsy “R style” formula, and fit categorical covariates for the 0/1/2 priority categories, hour of day, and months over time (to see the general trends). The subsequent regression table is big, so I will show it in parts:

# Quantile regression for median
mod = smf.quantreg("minutes ~ C(PriorCat, Treatment(reference='2')) + C(Hour) + C(MoYr)", data)
res50 = mod.fit(q=0.5)
res50.summary()

First, I use priority 2 events as the referent category, so you can see (in predicting the median of the distribution) that priority 1 events have a median 24 minutes longer than priority 2, and priority 0 events have a median two hours longer. You can see some interesting patterns in the hour of the day effects (which are overall effects, not broken down by priority). There are likely shift changes at 06:00, 14:00, and 22:00 that result in longer wait times.

But of most interest are the patterns over time. Here is the latter half of that table, showing the median estimates over the months in this sample.

You could of course make the model more complicated, e.g. look at spatial effects or incorporate other direct measures of capacity/people on duty. But here it is complicated enough for an illustrative blog post. January 2019 is the referent category month, and we can see slight decreases of a few minutes around the start of the pandemic, but median times have clearly been increasing fairly noticeably starting later in 2021.

As opposed to interpreting regression coefficients, I think it is easier to see model predictions. We can just make sample data points, here at noon over the different months, and do predictions over each different priority category:

# Predictions for different categories, at noon over the months
oos = data.groupby(['PriorCat','MoYr'],as_index=False)['Hour'].size()
oos['Hour'] = 12  # predict for a call at noon
oos['Q50'] = res50.predict(oos)

print(oos[oos['PriorCat'] == '0'])
print(oos[oos['PriorCat'] == '1'])
print(oos[oos['PriorCat'] == '2'])

So here for priority 0, the median has crept up from 130 to 143 minutes.

And for priority 1, median times went from 35 to 49 minutes.

Note that the way I estimated the regression equation, the increase/decrease per month is forced to be the same across the different priority calls. So, the increase among priority 2 calls is again around 13 minutes according to the model.

But this assumption is clearly wrong. Remember my earlier “fast” and “slow” example, with only the slow calls increasing – that would suggest the distributions for the different priority calls will likely have different changes over time. E.g. priority 0 may increase by a lot, but priority 2 may be almost the same. You could model this in the formula via an interaction effect, e.g. something like "minutes ~ C(PriorCat)*C(MoYr) + C(Hour)", but to make the computer spit out a solution a bit faster, I will subset the data to just priority 2 calls.

Here the power of quantile regression is that we can look at different parts of the distribution. Estimating extreme quantiles is tough, but looking at the 10th/90th (as well as the median) is pretty typical. I do those three quantiles, and generate model predictions over the months (again assuming a call at noon).

# To save time, I am only going to analyze
# Priority 2 calls now
p2 = data[data['PriorCat'] == '2'].copy()
m2 = smf.quantreg("minutes ~ C(MoYr) + C(Hour)", p2)
oos2 = oos[oos['PriorCat'] == '2'].copy()

# loop over different quantiles
qlist = [0.1, 0.5, 0.9]
for q in qlist:
    res = m2.fit(q=q)
    oos2[f'Q_{q}'] = res.predict(oos2)

oos2

So you can see my story about fast and slow calls plays out, even when restricted to purportedly high risk calls. When looking at just priority 2 calls in New Orleans, the 10th percentile stays very similar over the period, although it does have a slight increase from under 4 to almost 5 minutes. The 50th percentile has slightly more growth, but goes from 10 minutes to 13 minutes. The 90th percentile though has more volatility – it grew from 30 to 60 in small increases into 2022, and in late 2022 has fairly dramatic further growth to 70/90 minutes. And you can see how the prior model, which did not break out priority 0/1 calls, shifted this estimate of the left tail for priority 2 calls as well. (So those groups likely also had large shifts across the entire distribution.)

So my earlier scenario was overly simplistic – we can see some increase in the left tail of the distribution as well in this analysis. But the majority of the increase is due to changes in the long right tail – calls that used to take less than 30 minutes are now taking 90 minutes to arrive. That still likely has implications for satisfaction with police and reporting behavior, though maybe not so much for clearance rates or direct public safety.

No easy answers here in terms of giving internet advice to New Orleans. If working with NOLA, I would like to get estimates of officer capacity per shift, so I could incorporate that factor directly into the quantile regression model. That would allow you to precisely quantify how officer capacity impacts the distribution of response times. So not just “response times are going up” but “the decrease in capacity from A to B resulted in X increase in the 90th percentile of response times”. Then, if NOLA had goals set, they could precisely state where officer capacity needed to be to have a shot at obtaining that goal.
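
In model terms that would just mean merging a staffing measure into the data and adding it to the formula. Purely hypothetical here, since an officers_on_duty field is not in the open data:

# hypothetical -- assumes an officers_on_duty column merged in per shift
# m_cap = smf.quantreg("minutes ~ officers_on_duty + C(PriorCat) + C(Hour)", data)
# res90 = m_cap.fit(q=0.9)  # how capacity shifts the 90th percentile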

Preprint: Analysis of LED street light conversions on firearm crimes in Dallas, Texas

I have a new pre-print out, Analysis of LED street light conversions on firearm crimes in Dallas, Texas. This work was conducted in collaboration with the Child Poverty Action Lab (CPAL), in reference to the Dallas Taskforce report. Instead of installing new lights at the hot spots CPAL suggested, though, Dallas stepped up the conversion of street lamps to LED. Here is the number of conversions over time:

And here is an aggregated quadrat map at quarter square mile grid cells (of the total number of LED conversions):

I use a diff-in-diff design (comparing firearm crimes in daytime vs nighttime) to test whether the cumulative LED conversions led to reduced firearm crimes at nighttime. Overall I don’t find any compelling evidence that firearm crimes were reduced post LED installs (either for a single effect or looking at spatial heterogeneity). This graph shows that in the aggregate the DiD parallel trends assumption holds citywide (on the log scale), but the identification strategy really relies on the DiD assumption within each grid cell (if anyone has good advice for graphically showing that with noisy low count data for many units, I am all ears!).
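
To give a sense of the design, here is a hedged sketch of the day/night diff-in-diff in a Poisson regression – the synthetic panel and variable names are stand-ins, not the exact pre-print specification:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# synthetic stand-in panel: grid cell x period x day/night crime counts
rng = np.random.default_rng(0)
panel = pd.DataFrame([(c, p, n) for c in range(20)
                      for p in range(8) for n in (0, 1)],
                     columns=['cell', 'period', 'night'])
panel['led'] = panel['period'] * (panel['cell'] % 3)  # cumulative conversions
panel['crimes'] = rng.poisson(2 + panel['night'])

# the night:led interaction is the diff-in-diff effect of interest
mod = smf.poisson("crimes ~ night*led + C(cell) + C(period)", panel)
res = mod.fit()
print(res.params['night:led'])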

For now I just wanted to share the pre-print. To publish in peer review I would need to do a bunch more work to get the lit review where most CJ reviewers would want it. I also want to work on spatial covariance adjustments (similar to here, but for GLM models). I have some R code started for that, but it needs much more work/testing before it is ready for primetime. (Although, as I say in the pre-print, these adjustments should just make the standard errors larger; they won’t impact the point estimates.)

So no guarantees that will be done anytime in the near future. But there is no reason not to share the pre-print in the meantime.

New paper: An Open Source Replication of a Winning Recidivism Prediction Model

Our paper on the NIJ forecasting competition (Gio Circo is the first author) is now out online first in the International Journal of Offender Therapy and Comparative Criminology (Circo & Wheeler, 2022). (Eventually it will be in a special issue on replications and open science organized by Chad Posick, Michael Rocque, and Eric Connolly.)

We ended up doing the same type of biasing as Mohler and Porter (2022) to meet the fairness constraints. Essentially we biased results to say no one was high risk, and this resulted in “fair” predictions. With fairness constraints or penalties, you sometimes have to be careful what you wish for. And because not enough students signed up, Gio and I had more winnings distributed to us in the fairness competition (although we did quite well in the round 2 competition even with the biasing).

So while that paper is locked down, we have the NIJ tech paper on CrimRXiv, and our ugly code on github. But you can always email for a copy of the actual published paper as well.

Of course, since I am not an academic anymore, I am not uber focused on potential future work. I would like to learn more about survival-type machine learning forecasts and apply them to recidivism data (instead of doing discrete 1, 2, 3 year predictions). But in my experience machine learning models need very large datasets, and even the 20k rows here are on the fringe where regression is close to equivalent to non-linear and tree based models.

Another potential application is simple models. Cynthia Rudin has quite a bit of recent work on interpretable trees for this (e.g. Liu et al., 2022), and my linked post has examples of simple regression weights. I suspect simple regression weights would work reasonably well on this data. Likely not well enough to place on the scoreboard of the competition, but well enough in practice that they would be totally reasonable to swap in given the simpler results (Wheeler et al., 2019).

But for this paper, the main takeaway Gio and I want to leave folks with is that creating a (good) model using open source data is totally within the capabilities of PhD criminal justice researchers and data scientists working for these state agencies. These are quantitative skills I wish more students in our field would pursue, as it makes it easier for me to hire you as a data scientist!
