The sausage making behind peer review

Even though I am not on Twitter, I still lurk every now and then. In particular, I can see web traffic referrals to the blog, so when I get new traffic I will go and use nitter to look up the source.

Recently my post about why I publish preprints was referenced in a thread. That blog post argued from the perspective of individual scholars – why I think they should post preprints. The thread it was tagged in was not taking the perspective of an individual writer – it was saying the whole idea of preprints is “a BIG problem” (Twitter thread, Nitter Thread).

That is, Dan thinks it is a problem that other people post preprints before they have been peer reviewed.

Dan’s point is one held by multiple scholars in the field (I had similar interactions with Travis Pratt back when I was on Twitter). Dan does not explicitly say it in that thread, but I take this as a pretty strong indication Dan thinks posting preprints without peer review is unethical (Dan thinks postprints are OK). In the prior conversations I had with Pratt on Twitter, he explicitly said it was unethical.

The logic goes like this – you can make errors, so you should wait until colleagues have peer reviewed your work to make sure it is “OK” to publish. Otherwise, it is misleading to readers of the work. In particular people often mention the media uncritically reporting preprint articles.

There are several reasons I think this opinion is misguided.

One, the peer review system itself is quite fallible. Having received, delivered, and read hundreds of peer review reports, I can confidently say that the entire peer review system is horribly unreliable. It has both a false negative and a false positive problem – in that things that should be published get rejected, and things that should not be published get through. Both happen all the time.

Now, it may be the case that the average preprint is lower quality than a peer reviewed journal article (given selection of who posts preprints I am actually not sure this is the case!) In the end though, you need to read the article and judge the article for yourself – you cannot just assume an article is valid simply because it was published in peer review. Nor can you assume the opposite – something not peer reviewed is not valid.

Two, the current peer review system is vast. To dramatically oversimplify, there are “low quality” journals (paid-for journals, some humanities journals, whatever journals publish the “a square of chocolate and a glass of red wine a day increases your life expectancy” garbage), and “high quality” journals. The people Dan wants to protect from preprints are exactly the people who are unlikely to know the difference.

I use scare quotes around low and high quality in that paragraph on purpose, because really those superficial labels are not fair. BMC probably publishes plenty of high quality articles, it just happened to also publish a paper that used a ridiculous methodology that dramatically overestimated vaccine adverse effects (where the peer reviewers just phoned in superficial reviews). Simultaneously, high quality journals publish junk all the time (see Crim, Psych, Econ, Medical examples).

Part of the issue is that the peer review system is a black box. From a journalist’s perspective you don’t know which papers had reviewers phone it in (or had their buddies give it a thumbs up) versus which had rigorous reviews. The only way to know is to judge the paper yourself (even having the reviews is not informative relative to just reading the paper directly).

To me the answer is not “journalists should only report on peer reviewed papers” (or the same, no academic should post preprints without peer review) – all consumers need to read the work for themselves to understand its quality. Suggesting that something that is peer reviewed is intrinsically higher quality is bad advice. Even if on average this is true (relative to non-peer reviewed work), any particular paper you pick up may be junk. There is no difference from the consumer perspective in evaluating the quality of a preprint vs a peer reviewed article.

The final point I want to make, three, is that people publish things that are not peer reviewed all the time. This blog is not peer reviewed. I would actually argue the content I post here is often higher quality than many journal articles in criminology (due to transparent, reproducible code I often share). But you don’t need to take my word for it, you can read the posts and judge that for yourself. Ditto for many other popular blogs. I find it pretty absurd for someone to think me publishing a blog is unethical – ditto for preprints.

There is no point in arguing with people’s personal opinions about what is ethical vs what is not though. But thinking that you are protecting the public by only allowing peer reviewed articles to be reported on is incredibly naive, as well as paternalistic.

We would be better off, not worse, if more academics posted preprints, peer review be damned.

Survey duplicates and other stuff

So, various updates. First, I have started an AltAc newsletter – see the first email here. It will include examples of jobs, highlights of criminologists who are in the private sector, and random beginner tech advice (this week was getting started in NLP using simpletransformers; next week I will give a bit of SQL). Email me if you want to be added to the list (open to whomever).

Second, over on CRIME De-Coder I have a post on more common examples of NLP in crime analysis: named entity recognition, semantic similarity, and supervised learning. I have been on an NLP kick lately – everyone thinks genAI (e.g. ChatGPT) is all the rage, but most applications really don’t want genAI, people just think genAI is cool.

Third, Andrew Gelman blogged about the Hogan/Kaplan back and forth. I do like Gelman’s hot takes, even if he is not super in-the-know on current flavors of different micro-econ identification strategies.

Fourth, I have pushed mine and Gio’s whitepaper on using every door direct mail and web based push surveys + MRP to CrimRXiv. Using Every Door Direct Mail Web Push Surveys and Multi-level modelling with Post Stratification to estimate Perceptions of Police at Small Geographies (Circo & Wheeler, 2023). This is our solution to measuring spatially varying attitudes towards police with a reasonable budget (the NIJ community perceptions challenge). Check it out and get in touch if you want to deploy something like that to your jurisdiction.

For a bit of background, we (me and Gio) intentionally did not submit to the other categories in the competition. I don’t think you can reasonably measure micro place community attitudes using any of the other methods in the competition (with the exception of boots on the ground survey takers, which is cost prohibitive and a non-starter for most cities). So we could be like ‘uSiNG TwiTTeR aNd SenTimeNT aNalYSis tO mEasUre HoW mUCh PeoPLe hAtE pOLIcE’, but this is bad both from a measurement perspective (sentiment analyses are not good, even ignoring selection biases in those public social media sources) and from the perspective of tying results to a particular spatial area. The difficulty of tying responses to a small spatial area also makes me very hesitant to suggest purely web based adverts to generate survey responses.

Analyzing Survey Duplicates

The last, and biggest, thing I wanted to share: Jake Day and Jon Brauer’s blog, Reluctant Criminologists, has a series of posts on analyzing near survey duplicates (also with a contribution by Maja Kotlaja). Apparently a group with SurveyMonkey/Princeton gives a recommendation that matches over 85% are cause for concern – so if two survey responses agree on over 85% of their items, the SurveyMonkey advice says that is symptomatic of fraud.

This is theoretically caused by a malicious actor (such as a survey taker not wanting to do work). So they take an existing survey response, duplicate it, but to be sneaky change a few answers to make it look less suspicious (this is not about making up responses out of whole cloth).

The 85% rule is bad advice and will result in chasing the noise. Many real world criminology surveys will have many chance matches over that threshold, so people will falsely flag legitimate surveys as fraudulent. RC do EDA on one of their own surveys and explain why they don’t think their over-85% matches are likely to be fraudulent. (And sign up for their blog and check out their other work while you are there.)

So I give python code showing how I would analyze survey duplicates, taking into account the nature of the baseline data: Survey Forensics: Identifying Near Duplicate Responses. I show in simulations how you can get more than 85% duplicate matches, even in random data, depending on the marginal distribution of the survey responses.
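To give a flavor of the simulation argument, here is a minimal sketch (not the exact code from that post) showing that with skewed marginal distributions, pairs of totally independent respondents regularly clear the 85% cutoff:

import numpy as np

rng = np.random.default_rng(10)
n_resp, n_items = 500, 10

# heavily skewed Likert marginals, e.g. most respondents answer 5
probs = [0.02, 0.03, 0.05, 0.20, 0.70]
surveys = rng.choice(5, size=(n_resp, n_items), p=probs)

# percent agreement for every pair of independent respondents
match = np.array([(surveys[i] == surveys[j]).mean()
                  for i in range(n_resp) for j in range(i + 1, n_resp)])

# even in pure noise, a non-trivial share of pairs clear the cutoff
print(f'{(match > 0.85).mean():.1%} of pairs exceed 85% agreement')

With 500 respondents that is over 100,000 pairs, so even a 1-2% chance rate means thousands of “suspicious” pairs in data with no fraud at all.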

I also show how to use statistics to identify outliers using false discovery rate corrections, and then cluster like responses together for easier analysis of the identified problem responses. Using data in Raleigh, I show several groups of outlying survey near duplicates are people just doing run responses (all 5s, 4s, or missings). But I do identify 3 pairs that are somewhat suspicious.
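Continuing the sketch above, the false discovery rate step can be done with off-the-shelf tools. Given an upper tail p-value for each pair’s match count under a baseline binomial model, something like this (illustrative, not my exact post code) flags the pairs that survive a Benjamini-Hochberg correction:

from scipy.stats import binom
from statsmodels.stats.multitest import multipletests

# baseline chance two independent respondents agree on a single item,
# given the marginal distribution (probs, match, n_items from above)
p_agree = sum(p**2 for p in probs)

# upper tail binomial p-value for each pair's count of matching items
n_match = (match * n_items).round().astype(int)
pvals = binom.sf(n_match - 1, n_items, p_agree)

# Benjamini-Hochberg false discovery rate correction
reject, _, _, _ = multipletests(pvals, alpha=0.05, method='fdr_bh')
print(f'{reject.sum()} pairs flagged after the FDR correction')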

While this particular type of fraud is not so much a problem in web based surveys, it is not totally irrelevant – you can have people retake web based push surveys by accident. Urban talks about this in terms of analyzing IP addresses. With cell phones and WiFi though, some of those could slip through, so this idea is not totally irrelevant even for web surveys. And for shared WiFi (universities, or people in the same home/apt taking the survey) IP addresses won’t necessarily discriminate.


Synthetic control in python: Opioid death increases in Oregon and Washington

So Charles Fain Lehman has a recent post on how decriminalization of opioids in Oregon and Washington (in the name of harm reduction) appear to have resulted in increased overdose deaths. Two recent papers, both using synthetic controls, have come to different conclusions, with Joshi et al. (2023) having null results, and Spencer (2023) having significant results.

I have been doing synth analyses for several groups recently, and have published some on micro-synth in the past (Piza et al., 2020). The more I do, the more I am concerned about the default methods. Three main points to discuss here:

  • I think the default synth fitting mechanism is not so great, so I have suggested using Lasso regression (if you want a “real” peer-reviewed citation, check out De Biasi & Circo (2021) for an application of this technique). Also see this post on crime counts/rates and synth problems, which using an intercept in a Lasso regression avoids (see the sketch after this list).
  • The fitting mechanism + placebo approach to generate inference can be very noisy, resulting in low powered state level designs. Hence I suggest a conformal inference approach to generate the null distribution.
  • You should be looking at cumulative effects, not just instant effects in these designs.
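To make the first bullet concrete, here is a minimal sketch of the Lasso-as-synthetic-control idea (my LassoSynth package used below wraps this up; the function here is just illustrative):

import numpy as np
from sklearn.linear_model import Lasso

def lasso_synth(y_pre, X_pre, X_post, alpha=1.0):
    # y_pre: treated unit in the pre-period, X_pre/X_post: donor units
    # positive weights plus an intercept, but no sum-to-1 constraint
    mod = Lasso(alpha=alpha, positive=True, fit_intercept=True)
    mod.fit(X_pre, y_pre)
    return mod, mod.predict(X_post)  # counterfactual for the post period

Unlike the Abadie estimator this is a soft constraint approach – the penalty shrinks the donor weights toward zero instead of forcing them onto a simplex.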

I have posted code on Github, and you can see the notebook with the results. I will walk through here quickly. I initially mentioned this technique in a blog post a few years ago (with R code). Here I spent some time to script it up in python.

So first, we load in the data, and go on to conduct the Oregon analysis (you drop Washington as a potential control). Now, a difference between the Abadie estimator (just a stochastic gradient descent optimizer with hard constraints) and a lasso estimator (soft constraints) is that you need to specify how much to penalize the coefficients. There is no good default for how much – it depends on the scale of your data (doing death rates per 1,000,000 vs per 100,000 will change the amount of penalization), how many rows of data you have, and how many predictor variables you have. So I use an approach to suggest the alpha coefficient for the penalization in a separate step:

import LassoSynth
import pandas as pd

opioid = pd.read_csv('OpioidDeathRates.csv')
wide = LassoSynth.prep_longdata(opioid,'Period','Rate','State')

# Oregon Analysis
or_data = wide.drop('Washington', axis=1)
oregon = LassoSynth.Synth(or_data,'Oregon',38)

oregon.suggest_alpha() # default alpha is 1

This ends up suggesting an alpha value of 0.17 (instead of default 1). Now you can fit (I passed in the data already to prep it for synth on the init, so no need to re-submit the data):

oregon.fit()
oregon.weights_table()

The fit prints out some metrics, root mean square error and R-squared, {'RMSE': 0.11589514406988406, 'RSquare': 0.7555976595776881}, here for this data. Which offhand looks pretty similar to the other papers (including Charles). And for the weights table, Oregon ends up being very sparse, just DC and West Virginia for controls (plus the intercept):

Group                       Coef
Intercept               0.156239
West Virginia           0.122256
District of Columbia    0.027378

The Lasso model here does constrain the coefficients to be positive, but does not force them to sum to 1 (plus it has an intercept). I think these are all good things (based on personal experience fitting functions). We can graph the fit for the historical data, plus the standard error of the lasso counterfactual forecasts in the post period:

# Default alpha level is 95% prediction intervals for counterfactual
oregon.graph('Opioid Death Rates per 100,000, Oregon Synthetic Estimate')

So you can see the pre-intervention fit is smoother than the monthly data in Oregon, but by eye seems quite reasonable – it matches the recent increase and spikes after period 20 (the data starts in Jan-2018, so that is around August-2019). (Perfect fits are good evidence of over-fitting in machine learning.)

Post intervention, after period 37, I do my graph a bit differently. Sometimes people are confused when the intervention starts in the graph, so here I literally split pre/post data lines, so there should be no confusion. I use the conformal inference approach to generate 95% prediction intervals around the counterfactual trend. You can see the counterfactual trend has slightly decreased, whereas Oregon increased and is volatile. Some of the periods are covered by the upper most intervals, but the majority are clearly outside.

Now, besides the fitting function, one point I want to make is people should be looking at cumulative effects, not just instant effects. So Abadie has a global test, using placebos, that looks at the ratio of the pre-fit to post-fit (squared errors), then computes the placebo p-value based on that stat. This doesn’t have any consideration though for consistent above/below effects.

So pretend the Oregon observed was always within the 95% counterfactual error bar, but consistently at the top, around a 0.1 increase in overdose deaths. Any single point-wise monthly inference fails to reject the null, but that always-high pattern is not regular. You want to look at the entire curve, not just a single point. Random data won’t always be high or low, it should fluctuate around the counterfactual estimate.
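To put rough numbers on that hypothetical (my own illustration, assuming independent forecast errors): if the monthly effect is 0.1 and the forecast standard error is also 0.1, every pointwise z-score is 0.1/0.1 = 1, nowhere near significant. But after 24 months the cumulative difference is 24 × 0.1 = 2.4, with a standard error of 0.1 × √24 ≈ 0.49 – a z-score near 5. The signal grows linearly in time while the noise only grows with its square root.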

To do this you look at the cumulative differences between the counterfactual and the observed (and take into account the error distribution for the counterfactuals).

# again default is 95% prediction intervals
oregon.cumgraph('Oregon Cumulative Effects, [Observed - Predicted]')
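Under the hood the idea is simple. A minimal sketch (pretending the forecast errors are independent with a common standard error – the conformal machinery in the package is more careful than this):

import numpy as np

def cum_effect(observed, predicted, sigma, z=1.96):
    # running sum of observed minus counterfactual in the post period
    diff = np.cumsum(observed - predicted)
    # SE of a sum of independent errors grows with sqrt(t)
    se = sigma * np.sqrt(np.arange(1, len(diff) + 1))
    return diff, diff - z*se, diff + z*se  # estimate plus 95% band

The global test mentioned below is then just whether the band at the final time period covers 0.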

Accumulated over time, this is a total of over 7 per 100,000. With Oregon having a population of around 4.1 million, I estimate that the cumulative increased number of overdose deaths is around 290 in Oregon. This is pretty consistent with the results in Spencer (2023) as well (182 increased deaths over fewer months).

To do a global test with this approach, you just look at the very final time period and whether it covers 0. This is what I suggest in-place of the Abadie permutation test, as this has a point estimate and standard error, not just a discrete p-value.

We can do the same analysis for Washington as we did for Oregon. It shows increases, but many of the time periods are covered by the counter-factual 95% prediction interval.

But like I mentioned before, they are consistently high. So when you do the cumulative effects for Washington, they more clearly show increases over time (the last date in this data is March 2022).

At an accumulated 2.5 per 100,000, with a state population of around 7.7 million, it is around 190 additional overdose deaths in Washington. You can check out the notebook for more stats; Washington has a smaller suggested alpha, so the matched weights have several more states. But the pre-fit is better, and so it has smaller counterfactual intervals. All again good things compared to the default (with the placebo approach Washington and Oregon will pretty much have the same error distribution, so Washington being less volatile does not matter using that technique).

I get that Abadie is an MIT professor, published a bunch in JASA and well known econ journals, and that his approach is standard in how people do synthetic control analyses. My experience though over time has made me think the default approaches here are not very good – and the placebo approach where you fit many alternative analyses just compounds the issue. (If the fit is bad, it makes the placebo results more variable, causing outlier placebos. People don’t go and do a deep dive of the 49 placebos though to make sure they are well behaved.)

The lasso + conformal approach is how I would approach the problem from my experience fitting machine learning models. I can’t give perfect proof this is a better technique than the SGD + placebo approach by Abadie, but I can release code to at least make it easier for folks to use this technique.

References

  • De Biasi, A., & Circo, G. (2021). Capturing crime at the micro-place: a spatial approach to inform buffer size. Journal of Quantitative Criminology, 37, 393-418.

  • Joshi, S., Rivera, B. D., Cerdá, M., Guy, G. P., Strahan, A., Wheelock, H., & Davis, C. S. (2023). One-year association of drug possession law change with fatal drug overdose in Oregon and Washington. JAMA Psychiatry Online First.

  • Piza, E. L., Wheeler, A. P., Connealy, N. T., & Feng, S. Q. (2020). Crime control effects of a police substation within a business improvement district: A quasi‐experimental synthetic control evaluation. Criminology & Public Policy, 19(2), 653-684.

  • Spencer, N. (2023). Does drug decriminalization increase unintentional drug overdose deaths?: Early evidence from Oregon Measure 110. Journal of Health Economics, 91, 102798.

Soft launching tech recruiting

I am soft-launching a tech recruiting service. I have had conversations with people on all sides of the equation on a regular basis, so I might as well make it a formal thing I do.

If you are an agency looking to fill a role, get in touch. If you are looking for a role, get in touch at https://crimede-coder.com/contact or send an email directly to andrew.wheeler@crimede-coder.com.

Why am I doing this?

About once a month I have a discussion with a friend or second-degree friend who is a current professor asking me about making the jump to the private sector. You can read my post, Make More Money, on why I think many criminal justice professors are grossly underpaid. For PhD students you can see my advice at Flipping a CJ PhD to an Alt-Academic career.

If you are a current student or professor and want to chat, reach out and let me know you are interested. I am just going to start keeping a list of folks to help match them to current opportunities.

I have discussions with people who are trying to hire for jobs regularly as well. This includes police departments that are upping their game to hire more advanced roles, think tanks who want to hire early career individuals, and some tech companies in the CJ space who need to fill data science roles.

These are good jobs, and we have good people, so why are these agencies and businesses having a hard time filling these roles? Part of it is advertisement – these agencies don’t do a good job of getting the word out to the right audience. A second part is people have way off-base salary expectations (this is more common for academic positions, post docs I am looking at you). Part of the salary discussion is right sizing the role and expectations – you can’t ask for 10+ years of experience and offer a 90k salary to someone with an advanced degree – it doesn’t really matter what job title you are hiring for.

I can help with both of those obviously – domain knowledge and my network can help your agency right size and fill that role.

Finally, I get cold messaged by recruiters multiple times a month. The straw that finally made me put this all on paper is that I routinely encounter gross incompetence from recruiters. They do not understand the role the business is hiring for, they do not have the expertise to evaluate potential candidates, and by cold emailing they clearly do not have a good network to pull potential candidates from.

If you are an agency or company that you think my network of scholars can help fill your role, get in touch. I only get paid when you fill the position, so there is no cost to try my recruiting services. Again, I will help go over the role with you and say whether it is feasible to fill that position as is, or whether it should be tweaked.

Below is my more detailed advice for job seekers. Again, reach out if you are a job seeker – even if we have not met we can chat, I will see if you do good work, and I will put you on my list of potential applicants to pull from in the future.

Tech Job Applying Advice, P1 Tech Roles

Here is an in-depth piece of advice I gave a friend recently – I think this will be useful in general to individuals in the social sciences who are interested in making the jump to the private sector.

First is understanding what jobs are available. This blog has a focus on quantitative work, but even if you do qualitative work there are tech opportunities. Also some jobs only need basic quant skills (Business Analyst) that any PhD will have (if you know how to use Excel and PowerPoint you have the necessary tech skills to be a business analyst).

Job labels and responsibilities are fuzzy, but here is a rundown of different tech roles and some descriptions:

  • Data Scientist
    • Role, fitting models and automating processes (writing code to shift data around)
    • need to have more advanced coding/machine learning background, e.g. have examples in python/R/SQL and know machine learning concepts
  • Business Analyst
    • anyone with a PhD can do this, Excel/Powerpoint
    • domain knowledge is helpful (which can be learned)
  • Program Manager/Project Manager
    • Help manage teams, roles are similar to “managing grants”, “supervising students”
    • often overlap with various project management strategies (agile, scrum).
    • These names are all stupid though, it is just supervising and doing “non-tech” things to help teams
  • Product Owner
    • Leads longer term development of a product, e.g. we should build X & Y in the next 3-6 months
    • Mix of tech or non-tech background (typically grow into this role from other prior roles)
    • If no tech need strong domain knowledge
    • Sometimes need to “sell” product, internally or externally
  • Director
    • Leads larger team of data scientists/programmers
    • Discusses with C-level, budgets/hiring/revenue projections
    • Often internal from Data Scientist or less often Product Owner/Business Analyst
    • but it is possible to be hired directly into the role with good domain knowledge

Salaries vary, it will generally be:

Business Analyst < Project Manager < {Data Scientist, Product Owner} < Director

But not always – tech highly values writing code, so it is not crazy for a supervisory role (Director) to make less than a senior Data Scientist.

Within Business Analyst you can have Junior/Senior (JR/SR) roles (for PhDs you should come in as Senior). Data scientist can have JR/SR/{Lead,Principal} (PhD should come in as Senior). JR needs supervision, SR can be by themselves and be OK, Lead is expected to mentor and supervise JRs.

Very generic salary ranges for typical cities (you should not take a job lower than the low end on these, with enough work you can find jobs higher, but will be hard in most markets):

  • Business Analyst: 70k – 120k
  • JR Data Scientist: 100k – 130k
  • SR Data Scientist: 130k – 180k
  • Program Manager: 100k – 150k
  • Product Owner: 120k – 160k
  • Director: 150k – 250k

Note I am not going to go and update this post (so this is September 2023), just follow up with me or do your own research to figure out typical salary ranges when this gets out of date in a year from now.

So now that you are somewhat familiar with roles, you need to find roles to apply to. There are two strategies: 1) find open roles online, 2) find specific companies. The big piece of advice here is YOU SHOULD BE APPLYING TO ROLES RIGHT NOW. Too many people think “I am not good enough”. YOU ARE GOOD ENOUGH TO APPLY TO 100s OF POSITIONS RIGHT NOW BASED ON YOUR PHD. Stop second guessing yourself and apply to jobs!

Tech Job Applying Advice, P2 Finding Positions

So one job strategy is to go to online job boards, such as LinkedIn, and apply for positions. For example, if I search “Project Manager” in the Raleigh-Durham area, I get something like two dozen jobs that pop up. You may be (wrongly) thinking I don’t qualify for this job, but let’s look specifically at a job at NTT Data for a project manager; here are a few things they list:

  • Working collaboratively with product partners and chapter leaders to enable delivery of the squad’s mission through well-executed sprints
  • Accelerating overall squad performance, efficiency and value delivered by engaging within and across squads to find opportunities to improve agile maturity and metrics, and providing coaching, training and resources
  • Maintaining and updating squad performance metrics (e.g., burn-down charts) and artifacts to ensure accurate and clear feedback to the squad members and transparency to other partners
  • Managing, coordinating logistics for and participating in agile events (e.g., backlog prioritization, sprint planning, daily meetings, retrospectives and as appropriate, scrum of scrum masters)

This is all corporate gobbledygook for “managing a team to make sure people are doing their work on time” (and all the other bullet points are just more junk to say the same thing). You know who does that? Professors who supervise multiple students and manage grants.

For those with more quant programming skills, you have more potential opportunities (you can apply to data scientist jobs that require coding). But even if you do not have those skills, there are still plenty of opportunities.

Note that many of these jobs list “need to have” and “want to have” requirements. You should still apply even if you do not meet all of the “need to have” items. Very often these requirements are made up and not actually “need to have” (it is common for job adverts to have obvious copy-paste mistakes or impossible need to haves). That NTT Data one has a “Certified Scrum Master (CSM) required” – if you see a bunch of jobs and that is what is getting you cut, guess what? You can go and take a scrum master course in two days and check off that box. And have ChatGPT rewrite your cover letter, asking it to sprinkle agile buzzwords into the professor supervisory experience – people will never know that you just winged it when supervising students instead of using someone else’s made up project management philosophy.

So I cannot say that your probability of landing any particular job is high, it may only be 1%. But unlike in academia, you can go on LinkedIn, and if you live in an urban area, likely find 100+ jobs that you could apply for right now (that pay more than a starting assistant professor in criminal justice).

So apply to many jobs, and most people I talk to with this strategy will be able to land something in 6-12 months. For resume/cover letter advice, here is my data science CV, and here is an example cover letter. For the CV, make it more focused on clear outcomes you have accomplished; instead of just listing papers, say something like “won grant for 1 million dollars”, “supervised 5 students to PhD completion”, “did an RCT that reduced crime by 10%”. But you do not need to worry about making it only fit on 1 page (it can be multiple pages). Make it clear you have a PhD, people appreciate that, and people appreciate if you have a book published with a legit publisher as well (lay people find that more obviously impressive than peer reviewed publishing, because most people don’t know anything about peer reviewed publishing).

Do not bother tinkering to make different materials for every job (if the job requires cover letter, make generic and just swap out a few key words/company name). A cover letter will not make or break your job search, so don’t bother to customize it (I do not know how often they are even read).

Tech Job Applying Advice, P3 Finding Companies

The second strategy is to find companies you are interested in. Do you do work on drug abuse and victimization? There are probably healthcare companies you will be interested in. Do you do work tangentially related to fraud? There are positions at banks that need machine learning skills. Are you interested in illegal markets? I bet various social media platforms need help with solutions to prevent selling illegal contraband.

This goes as well for think tanks (many cities have local think tanks that do good work, think beyond just RAND). These and civil service jobs (e.g. working for children and family services as an analyst) typically do not pay as high as private sector, but are still often substantially better than entry level assistant professor salaries (you can get think-tank or civil service gigs in the 80-120k range).

After you have found a company that you are interested in, you can go and look at open positions and apply to them (same as above). But an additional strategy at this point is to identify potential people you want to work with, and cold email/message them on social media.

It is similar to the above advice – many people will not answer your cold emails. It may be that only 1 in 10 answer. But an email is easy – there is no harm. Do not overthink it, send an email that is “Hey I think you do cool things, I do cool things too and would like to work together. Can we talk?” People will respond to something like that more often than you think. And if they don’t, it is their loss.

Here the biggest issue is a stigma associated with particular companies – people think Meta is some big evil company and they don’t want to work for them. And people think being an academic has some special significance/greater purpose.

If you go and build something for Meta that helps reduce illegal contraband selling by some minuscule fraction, you will have prevented a very large number of crimes. I build models that incrementally do a better job of identifying health care claims that are mis-billed. These models consistently generate millions of dollars of revenue for my company (and save several state Medicaid systems many millions more).

The world is a better place with me building stuff like that for the private sector. No doubt in my mind I have generated more value for society in the past 3 years than I would have in my entire career as an academic. These tech companies touch so many people, even small improvements can have big impacts.

Sorry to burst some academic bubbles, but that paper you are writing does not matter. It only matters to the extent you can get someone outside the ivory tower to alter their behavior in response to that paper. You can just cut out the academic middle man and work for companies that want to do that work of making the world a better place, instead of just writing about it. And make more money while you are at it.

Downloading Police Employment Trends from the FBI Data Explorer

The other day on the IACA forums, an analyst asked about comparing her agency’s per-capita rate for sworn/non-sworn to other agencies. This is data available via the FBI’s Crime Data Explorer. Specifically, they have released a dataset of employment rates, broken down by agency, over time.

The Crime Data Explorer to me is a bit difficult to navigate, so this post is going to show using the API to query the data in python (maybe it is easier to get via direct downloads, I am not sure). So first, go to that link above and sign up for a free API key.

Now, in python: the API works by asking for a specific agency’s ORI, as well as date ranges. (You can do a query for national and overall state levels as well, but I would rarely want those levels of aggregation.) So first we are just going to grab all of the agencies across the 50 states (plus DC). This runs fairly fast, only takes a few minutes:

import pandas as pd
import requests

key = 'Insert your key here'

state_list = ("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
              "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
              "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
              "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI",
              "SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY","DC")

# Looping over states, getting all of the ORIs
fin_data = []
for s in state_list:
    url = f'https://api.usa.gov/crime/fbi/cde/agency/byStateAbbr/{s}?API_KEY={key}'
    data = requests.get(url)
    fin_data.append(pd.DataFrame(data.json()))

agency = pd.concat(fin_data,axis=0).reset_index(drop=True)

And the agency dataframe has just a few shy of 19k ORIs listed. Unfortunately this does not have much else associated with the agencies (such as the most recent population). It would be nice if this list had population counts (so you could compare yourself to other similar size agencies), but alas it does not. So the second part here – scraping all 18,000+ agencies – takes a bit (let it run overnight).

# Now grabbing the full employment data
ystart = 1960   # some have data going back to 1960
yend = 2022
emp_data = []

# try/except, as some of these queries can fail
for i,o in enumerate(agency['ori']):
    print(f'Getting agency {i+1} out of {agency.shape[0]}')
    url = ('https://api.usa.gov/crime/fbi/cde/pe/agency/'
          f'{o}/byYearRange?from={ystart}&to={yend}&API_KEY={key}')
    try:
        data = requests.get(url)
        emp_data.append(pd.DataFrame(data.json()))
    except:
        print(f'Failed to query {o}')

emp_pd = pd.concat(emp_data).reset_index(drop=True)
emp_pd.to_csv('EmployeePoliceData.csv',index=False)

And that will get you 100% of the employee data on the FBI data explorer, including data for 2022.

To plug my consulting firm here: this is something that takes a bit of work. I pared this code example down to be quite minimal, but for longer running scraping jobs you want to periodically save results and have the code be able to restart from the last save point. If you scrape 1000 agencies and your internet goes out, you don’t want to have to start from 0, you want to start from the last point you left off.
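A minimal sketch of what that looks like (illustrative, and it assumes the returned employment payload includes an 'ori' field to key off of): append results to disk every so often, and on restart skip the ORIs you already have:

import os
import pandas as pd
import requests

OUT = 'EmployeePoliceData.csv'

# ORIs finished in a prior run (assumes the saved data has an 'ori' column)
done = set(pd.read_csv(OUT)['ori']) if os.path.exists(OUT) else set()

buffer = []
for o in agency['ori']:  # agency/ystart/yend/key from the code above
    if o in done:
        continue
    url = ('https://api.usa.gov/crime/fbi/cde/pe/agency/'
           f'{o}/byYearRange?from={ystart}&to={yend}&API_KEY={key}')
    try:
        buffer.append(pd.DataFrame(requests.get(url).json()))
    except Exception:
        print(f'Failed to query {o}')
    if len(buffer) >= 100:  # checkpoint every 100 agencies
        pd.concat(buffer).to_csv(OUT, mode='a', index=False,
                                 header=not os.path.exists(OUT))
        buffer = []

if buffer:  # flush the remainder at the end
    pd.concat(buffer).to_csv(OUT, mode='a', index=False,
                             header=not os.path.exists(OUT))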

If that is something you need, it makes sense to send me an email to see if I can help. For that and more, check out my website, crimede-coder.com:

Too relaxed? Naive Bayes does not improve recidivism forecasting in the NIJ challenge

So the paper Improving Recidivism Forecasting With a Relaxed Naïve Bayes Classifier (Lee et al., 2023), recently published in Crime & Delinquency, has incorrect results. Note I am not sandbagging on the authors, I reviewed this paper for JQC and Journal of Criminal Justice, so I have given the authors this same feedback already (multiple times!). The authors however did not correct their results, and just journal shopped and published the wrong findings.

I have replication code here to review. (Note I initially made a mistake in my code replication – I reversed the calculation, computing p(y|x) by accident instead of p(x|y); see the older code I shared in my prior reviews. I was still correct in my assertion that Lee’s results were wrong.)

So the main thing that made me go to this effort: the authors report unbelievable results. They report Brier scores for females (Round 1) of 0.104 and for males 0.159 – these scores blow the competition out of the water. The leaderboard was 0.15 for females and 0.19 for males. Note how I don’t list those to the third decimal – to distinguish between the teams you needed to go down that low. Lee also reports unbelievably low Brier scores for the alternative logit and random forest models – their results on their face are simply not believable.
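For reference, the Brier score is just the mean squared difference between the predicted probability and the 0/1 outcome, so checking it is a one-liner:

import numpy as np

def brier(prob, outcome):
    # mean squared error of predicted probabilities vs 0/1 outcomes
    return np.mean((np.asarray(prob) - np.asarray(outcome))**2)

brier([0.2, 0.8], [0, 1])  # 0.04

Lower is better, and a useless model that always predicts 0.5 scores exactly 0.25.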

If the authors really believe their results, it kind of sucks for them that they did not participate in the NIJ challenge – they would have won more than $150,000! But I am pretty sure they are miscalculating their Brier scores somewhere. My replication code shows them in the same ballpark as everyone else, but they would not have made the leaderboard; it includes my estimates of what their Brier scores should actually be.

Folks can go and look at their paper and their set of spreadsheets in the supplemental material – I have posted barely more than 50 lines of (non-comment) python code that replicates their regression model coefficients and shows their Brier scores are wrong. (And subsequently any points Lee et al. 2023 make about fairness are wrong as well.)

NIJ probably released papers at some point, but if you want to see other folks’ discussion, there is Circo & Wheeler (2022) (for mine and Gio’s results for team MCHawks), and Mohler & Porter (2021) for team PASDA.

I may put naive Bayes (and other categorical encoding schemes) in the slate to discuss sometime. It is not a bad idea for data with many categories, but for this NIJ data there just isn’t that much to squeeze out. So any future work is unlikely to dramatically improve upon the competition results (it is difficult to overfit this data). Again, given my analysis here, I am pretty sure a valid data analysis (not peeking) will at best “beat” the competition results in the 3rd decimal place (if it can improve at all).
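To be concrete about naive Bayes as a categorical encoding scheme, the plain vanilla version (not the authors’ relaxed estimator) maps each category of a feature to a log likelihood ratio:

import numpy as np
import pandas as pd

def nb_encode(cat, y, pseudo=1.0):
    # log[ p(x=c|y=1) / p(x=c|y=0) ] per category, assuming y is coded
    # 0/1, with a pseudo-count so rare categories do not blow up
    tab = pd.crosstab(cat, y) + pseudo
    return np.log((tab[1]/tab[1].sum()) / (tab[0]/tab[0].sum()))

With many high-cardinality categorical features this can squeeze out signal that one-hot encoding plus a regularized logit struggles with – the NIJ data just does not have features like that.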

Now part of the authors’ argument is that this method (relaxed naive Bayes) results in simpler interpretations. Typically people interpret “simple” models in terms of the end results, e.g. having a simple checklist of integer weights. The more I deal with predictive models though, the more I think this is misguided. You could also interpret “simple” in terms of the code used to derive the weights (and evaluate the final metrics). This is important when auditing code that others have written, as you will ultimately take the code and apply it to your data.

I think this “simpler to estimate the same results” is probably more important for scientists and outside groups wanting to verify the integrity of any particular machine learning model than “simple end result weights”. Otherwise scientists can make up results and say my method is better. Which is simpler I suppose, but misses the boat a bit in terms of why we want simple models to begin with.


Youtube interview with Manny San Pedro on Crime Analysis and Data Science

I recently did an interview with Manny San Pedro on his YouTube channel, All About Analysis. We discuss various data science projects I conducted while either working as an analyst, or in a researcher/collaborator capacity with different police departments:

Here is an annotated breakdown of the discussion, as well as links to various resources I mention in the interview. This is not a replacement for listening to the video, but it is an easier set of notes for linking to more material on the particular items I discuss.

0:00 – 1:40, Intro

For a rundown of my career: I went to do my PhD in Albany (08-15). During that time period I worked as a crime analyst at Troy, NY, as well as a research analyst for my advisor (Rob Worden) at the Finn Institute. My research focused on quant projects with police departments (predictive modeling and operations research). In 2019 I went to the private sector, and now work as an end-to-end data scientist in the healthcare sector working with insurance claims.

You can check out my academic and my data science CV on my about page.

I discuss the workshop I did at the IACA conference in 2017 on temporal analysis in Excel.

Long story short: don’t use percent change (going from 2 homicides to 4 is “up 100%”), use other metrics and line graphs.

7:30 – 13:10, Patrol Beat Optimization

I have the paper and code available to replicate my work with Carrollton PD on patrol beat optimization with workload equality constraints.

For analysts looking to teach themselves linear programming, I suggest Hillier’s book. I also give examples on linear programming on this blog.

It is different from statistical analysis, but I believe it has as much applicability to crime analysis as your more typical statistical analysis.

13:10 – 14:15, Million Dollar Hotspots

There are hotspots of crime so concentrated that the expected labor cost reduction from having officers assigned full time likely offsets the position. E.g., if you spend a million dollars in labor addressing crime at that location, and having a full time officer reduces crime by 20%, that is $200,000 in avoided costs – the return on investment for the hotspot breaks even with paying the officer’s salary.

I call these Million dollar hotspots.

14:15 – 28:25, Prioritizing individuals in a group violence intervention

Here I discuss my work on social network algorithms to prioritize individuals to spread the message in a focused deterrence intervention. This is the opposite of how many people view “spreading” in a network – I identify something good I want to spread, and seed the network in a way to optimize that spread:

I also have a primer on SNA, which discusses how crime analysts typically define nodes and edges using administrative data.
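For instance, here is a toy sketch of the administrative-data idea (hypothetical column names): people arrested under the same incident number become nodes connected by a co-arrest edge:

import pandas as pd

arrests = pd.DataFrame({'incident': [1, 1, 2, 2, 2],
                        'person':   ['A', 'B', 'A', 'C', 'D']})

# co-arrest edges: pair up everyone arrested in the same incident
m = arrests.merge(arrests, on='incident')
edges = m[m['person_x'] < m['person_y']][['person_x', 'person_y']]
print(edges.drop_duplicates())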

Listen to the interview as I discuss more general advice – in SNA, what you want to accomplish in the end matters for how you define the network. So I discuss how you may want to define edges via victimization to prevent retaliatory violence (I think that would make sense for violence interrupters being proactive, for example).

I also give an example of how detective case allocation may make sense to base on SNA – detectives have background with an individual’s network (e.g. a rapport with a family based on prior cases worked).

28:25 – 33:15, Be proactive as an analyst and learn to code

Here Manny asked how analysts prevent their role from being turned into a more administrative one (just getting requests and running simple reports). I think the solution to this (not just in crime analysis, but also being an analyst in the private sector) is to be proactive. You shouldn’t wait for someone to ask you for specific information, you need to be defining your own role and conducting analysis on your own.

He also asked about crime analysis being under-used in policing. I think stronger computer coding skills open up so many opportunities that learning python, R, and SQL is where I would most like to see the industry improve. And this is a good career investment, as it translates to private sector roles.

33:15 – 37:00, How ChatGPT can be used by crime analysts

I discuss how ChatGPT may be used by crime analysts to summarize qualitative incident data. (Check out this example by Andreas Varotsis.)

To be clear, I think this is possible, but the tech I don’t think is quite up to that standard yet. Also do not submit LEO sensitive data to OpenAI!

Also always feel free to reach out if you want to nerd out on similar crime analysis questions!

Make more money

So I enjoy Ramit Sethi’s Netflix series on money management – fundamentally it is about money coming in and money going out and the ability to balance a budget. On occasion I see other budget coaches focus on trivial expenses (the money going out), whereas for folks like me (and I suspect the majority reading this blog, with higher degrees and technical backgrounds) the focus should almost always be on finding a higher paying job.

Let’s go with a common example people use as unnecessary discretionary spending – getting a $10 drink at Starbucks every day. If you do this, over the course of a 365 day year, you will have spent an additional $3,650. If you read my blog about coding and statistics and that expense bothers you, you are probably not making as much money as you should be.

Ramit regularly talks about asking for raises – I am guessing for most people reading this blog, a raise would be well over that Starbucks expense. But part of the motivation to write this post is in reference to formerly being a professor. I think many criminal justice (CJ) professors are underemployed, and should consider better paying jobs. I am regularly starting to see public sector jobs in CJ that have substantially better pay than being a professor. This morning someone shared a position for an entry level crime analyst at the Reno Police Department with a pay range from $84,000 to $102,000.

The low end of that starting pay range is competitive with the majority of starting assistant professor salaries in CJ. You can go check out what the CJ professors at Reno make (which is pretty par for the course for CJ departments in the US) in comparison. If I had stayed as a CJ professor, even with moving from Dallas to other universities and trying to negotiate raises, I would be lucky to be making over $100k at this point in time. Again, that Reno position is an entry level crime analyst – asking for a BA + 2 years of experience or a Masters degree.

Private sector data science jobs in comparison, in DFW area in 2019 entry level were often starting at $105k salary (based on personal experience). You can check out BLS data to examine average salaries in data science if you want to look at your particular metro area (it is good to see the total number in that category in an area as well).

While academic CJ salaries can sometimes be very high (over $200k), these are quite rare. There are a few things going against professor jobs, and CJ ones in particular, that depress CJ professor wages overall. Social scientists in general make less than STEM fields, and CJ departments are almost entirely in state schools that tend to have wage compression. Getting an offer at Harvard or Duke is probably not in the cards if you have a CJ degree.

In addition to this, with the increase in the number of PhDs being granted, competition is stiff. There are many qualified PhDs, making it very difficult to negotiate your salary as an early career professor – the university could hire 5 people who are just as qualified in your stead who aren’t asking for that raise.

So even if you are lucky enough to have negotiating power to ask for a raise as a CJ professor (which most people don’t have), you often could make more money by getting a public sector CJ job anyway. If you have quant skills, you can definitely make more money in the private sector.

At this point, most people go back to the idea that being a professor is the ultimate job in terms of freedom. Yes, you can pursue whatever research line you want, but you still need to teach courses, supervise students, and occasionally do service to the university. These responsibilities all by themselves are a job (the entry level crime analyst at Reno will work less overall than the assistant professor who needs to hustle to make tenure).

To me the trade off in freedom is worth it because you get to work directly with individuals who actually care what you do – you lose freedom because you need to make things within the constraints of the real world that real people will use. To me being able to work directly on real problems and implement my work in real life is a positive, not a negative.

Final point to make in this blog, because of the stiff competition for professor positions, I often see people suggesting there are too many PhDs. I don’t think this is the case though, you can apply the skills you learned in getting your CJ PhD to those public and private sector jobs. I think CJ PhD programs just need small tweaks to better prepare students for those roles, in addition to just letting people know different types of positions are available.

It is pretty much at the point that alt-academic jobs are better careers than the majority of CJ academic professor positions. If you had the choice to be an assistant professor in CJ at University of Nevada Reno, or be a crime analyst at Reno PD, the crime analyst is the better choice.

Javascript apps and ASEBP update

So for a quick update, my most recent post on ASEBP is This One Simple Trick Will Improve Attitudes Toward Police. (Note you need an ASEBP membership to read it.) There are several recent studies by different groups showing that following up with victims, even if you won’t solve the crime in the end, improves overall attitudes towards police. A simple thing for PDs to do. See the reference list at the end of the post for the various studies.

Besides that, no blog posts here recently as I have been working on my CRIME De-Coder site, in particular developing a few additional javascript demos. My most recent one is a social network app applying my dominant set algorithm (to prioritize call-ins in a group violence/focused deterrence intervention) (Wheeler et al., 2019).

The javascript apps are very nice, as they are all client side – my website just serves the text files, and your local browser does all the hard work. I don’t need to worry about dealing with LEO sensitive data in that scenario either.

I am still learning a ton of website development (will have some surveys deployed using PHP + google sheets soonish on CRIME De-Coder). I am debating whether it is worth writing up blog posts on that here. The javascript network application is almost a 1:1 translation of my python code. I don’t know much about doing vectorized stuff in javascript, but the network algorithm stuff is mostly just dictionaries, sets, and loops. If interested, you can just right click in the browser when the page is open and inspect the source.

References

  • Clark, B., Ariel, B., & Harinam, V. (2022). How Should the Police Let Victims Down? The Impact of Reassurance Call-Backs by Local Police Officers to Victims of Vehicle and Cycle Crimes: A Block Randomized Controlled Trial. Police Quarterly, Online First.
  • Curtis-Ham, S., & Cantal, C. (2022). Locks, lights, and lines of sight: an RCT evaluating the impact of a CPTED intervention on repeat burglary victimisation. Journal of Experimental Criminology, Online First.
  • Henning, K., et al. (2023). The impact of online crime reporting on community trust. Police Chief Online, April 12, 2023.
  • Wheeler, A. P., McLean, S. J., Becker, K. J., & Worden, R. E. (2019). Choosing representatives to deliver the message in a group violence intervention. Justice Evaluation Journal, 2(2), 93-117.

Criminology not on the brink

I enjoy reading Jukka Savolainen’s hot takes, most recently Give Criminology a Chance: Notes from a discipline on the brink. I think Jukka is wrong on a few points, but if you are a criminologist who goes to ASC conferences, definitely go and read it! To be specific, in addition to the title, here are two paragraphs from near the end, in full, that I mostly disagree with:

I arrived in Atlanta with a pessimistic view of academic criminology. During my 30 years in the field, the scholarship has become increasingly political and intolerant of evidence that contradicts the progressive narrative. The past few years have been particularly discouraging for those who care about scientific rigor and truth. Despite these reservations, I approached the ASC meeting with an open mind.

The situation is far from hopeless. True, criminology possesses precious little viewpoint diversity. Much of the scholarship is more interested in pursuing a political agenda than objective truth. The ASC’s outward stance as a politically neutral arbiter of scientific evidence is at odds with its recent history as an activist organization.

Although his take on a generic American Society of Criminology experience is accurate and not misleading, I am not so sure about the assessment of the trend over time, e.g. “increasingly political and intolerant”. Nor do I think criminology has too “little viewpoint diversity”.

The latter statement is, to be frank, absurd. For those who haven’t been to an ASC conference, there are no restrictions on who can become a member of the American Society of Criminology. The yearly conference is essentially open as well – you have to submit an abstract for review, but I have never heard of an abstract being turned down (let me know if you are aware of an example!). So you really get the kaleidoscope (as Jukka articulated). Policing scholars, abolitionists, quantitative, qualitative, ghost criminology – criminologists are a heterogeneous bunch.

About the only way to steelman the statement “precious little viewpoint diversity” is to say something more like certain opinions in the field are rewarded/punished, such as being in advanced positions at ASC, or limiting what gets published in the ASC journals (Criminology or Criminology and Public Policy). Or maybe that the average mix of the field slants one way or another (say between pro criminal justice or critical criminal justice).

I have not been around 30 years like Jukka, and I suppose I lost my card carrying criminologist privileges when I went to the private sector, but I haven’t seen any clear change in the nature of the field, the ASC conference, or what gets published, in the last ~10 years I have been in a reasonable position to make that judgment. I think Jukka (or anyone) would be hard pressed to quantify his perception – but I am certainly open to real evidence if I am wrong here (again, just my opinion based on fewer years of experience than Jukka).

As a side story, I have heard many of my friends who do work in policing state that they have been criticized for it by colleagues, and subsequently argue our field is “biased against cops”. I don’t doubt my friends’ personal experiences, but I have personally never been criticized for working with the police. I have been criticized by fellow policing scholars for “downloading datasets” and “not being a real policing scholar”. I know qualitative criminologists who think the field is biased against them (based on rates of qualitative publishing). I know quantitative criminologists who have given examples of bias in the field against more rigorous empirical methods. I know Europeans who think the field is biased towards Americans. I bet the ghost criminologists think the living are biased against the (un)dead.

I think saying “Much of the scholarship is more interested in pursuing a political agenda than objective truth” is a tinge strong, but sure it happens (I am not quite sure how to map “much” to a numeric value so the statement can be confirmed or refuted). I would say being critical of some work, but then uncritically sharing equally unrigorous work that confirms your pre-conceived notions is an example of this! So if you think one or more is “much”, then I guess I don’t disagree with Jukka here – to be clear though I think the majority of criminologists I have met are interested in pursuing the truth (even if I disagree with the methods they use).

So onto the last sentence of Jukka’s I disagree with, “The ASC’s outward stance as a politically neutral arbiter of scientific evidence is at odds with its recent history as an activist organization.”. But I disagree with this because I personally have a non-normative take on science – I don’t think science is wholly defined by being a neutral arbiter of truth, and doing science in the real world literally involves things that are “activist”.

I believe if you asked most people with PhDs what defines science, they would say science is defined via the scientific method. I personally think that is wrong though. I think about the only thing we share as scientists is being critique-y assholes. The way I do my work is so different from many other criminologists (both quantitative and qualitative), let alone researchers in other scientific fields (like theoretical physics or history), that I think saying we “share a common research method” is a bit of a stretch.

When my son was younger and had science fairs, they were broken into two different types of submissions: traditional science experiments, like measuring a plant’s growth with sunlight vs without, or engineering “build things” projects. The academic work I am most proud of is in the engineering “build things” camp. These modest contributions in various algorithms – a few have been implemented in major software, and I know some crime analysis units using that work as well – really have nothing to do with the scientific method. Me deriving standard errors for control charts for crime trends is only finding truth in a very tautological way – I think they are useful though.

There is no bright line between my work and “activism” – I don’t think that is a bad thing though, and it was the point of the work. You could probably say Janet Lauritsen is an activist for more useful national level CJ statistics. Jukka appears to me to be making normative claims about how Janet’s activism is more rigorously motivated than Vitale’s – which I agree with, but that doesn’t say much, if anything, about the field of criminology as a whole or recent changes in the field. (If anything it is evidence against Jukka’s opinion; I posit Janet is clearly more influential in the field than Vitale.)


To end with the note “on the brink” – the title may be unfair to Jukka (sometimes you don’t get to pick your titles in magazine articles). Part of the way I view being an academic and critiquing work – which I imagine people find irksome – involves taking the real words people say, trying to reasonably map them to statements that can be confirmed or refuted (often people say things that are quite fuzzy), and then articulating why those statements are maybe right/maybe wrong. It can seem pedantic, but I am a Popper kind-of-guy, and being able to confirm or refute statements is I think the only way we can get closer to objective truth.

To do this with “on the brink” takes more leaps than statements such as “increasingly political and intolerant”. “Criminology” is the general study of criminal behavior – which I am pretty confident will continue on as long as people commit crimes with or without the ASC yearly conference. We can probably limit the “on the brink” statement to something more specific like the American Society of Criminology on the brink. I don’t know about the ASC financials, but I am going to guess Jukka meant by this statement more of a proclamation about the legitimacy of the organization to outside groups.

I am not so sure this is the point of ASC though – it derives its value by being a social club for people who do criminology research. At least that is my impression of going to ASC conferences from my decade as a criminologist. Part of Jukka’s point is that things are getting worse more recently – you can’t lose something you never had to begin with though.