How long to conduct your experiment: Talk at ASEBP

At the upcoming American Society of Evidence Based Policing conference, I have a talk Thursday morning (9:45-10:00), How long to conduct your experiment.

The talk goes over some of the simple metrics I have created to help plan how long to conduct your intervention, such as how long to evaluate a hot spots intervention, or a purchase intended to increase arrest rates, etc.
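To give a flavor of the kind of calculation involved (this is a minimal sketch with hypothetical crime rates, not the exact formulas from the talk): for Poisson-distributed crime counts the variance equals the mean, so the precision of a simple treated-minus-control count difference only improves with the square root of time.

```python
import math

# Hypothetical inputs: treated and control areas each averaging 10 crimes/month.
lam_treat, lam_ctrl = 10, 10

for months in (3, 6, 12, 24):
    # Poisson counts have variance equal to their mean, so the standard error
    # of a difference in total counts is the square root of the summed
    # expected counts.
    se = math.sqrt(lam_treat * months + lam_ctrl * months)
    print(f"{months:2d} months: SE of the count difference ~ {se:.1f}")
```

Doubling the study length only shrinks the standard error relative to the expected counts by a factor of roughly 1.4, which is why planning the length up front matters.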

I have prepared a ton of different resources. The main one is a web-based application (a WASM-based app with R as the backend) where you can enter your inputs and generate a graph showing how precise your parameter estimates are:

The help page includes citations and additional materials, but here is a brief rundown:

  • I have the math details in this github repo, see the methodology.pdf. It also includes notes on how I used different LLM tools to produce the webpage and the method materials. Each of the applications allows you to download the R code used to generate the graphs and tables.

  • I have created a series of YouTube videos demonstrating the application (WDD, IRR, Proportion tests)

  • I have posted my slides for the ASEBP talk

See you all in DC at ASEBP in a few weeks!

xAI voice cloning API

xAI has just released an API to clone your voice. It is pretty simple: you read a script, and then you can use the API for text to speech in that voice.

Here is the Python code after you have cloned your voice.

import os

import requests

voice_id = os.environ['ANDY1_VOICE']  # my demo voice ID

text = '''this is a test demo of my voice. Be excited!
OK, how about a list of things; one, two, three.
Lets see where this takes us.'''

response = requests.post(
    "https://api.x.ai/v1/tts",
    headers={
        "Authorization": f"Bearer {os.environ['XAI_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "text": text,
        "voice_id": voice_id,
        "language": "en",
    },
)
response.raise_for_status()

with open("AndyTest1.mp3", "wb") as f:
    f.write(response.content)

I need to figure out my audio setup a bit better (my mic setup is probably not optimal and it produces some echo). But the API does a good job imitating my boring voice right out of the box!

And here is an example for longer speech from my intro to LLMs book:

# intro to llm book
llm_book = '''
Large language models (LLMs) are transforming how we work. Some of these examples include using LLMs to help write computer code, using LLMs to extract out information from irregular text sources, and creating chat-bots that can interact with various data sources and documents.
Most analysts, however, do not have any experience with these tools. This book is meant to be a general introduction to realistic examples of how individuals can use these tools; either in general software applications, or to help analysts write code to create software itself. Given the rapid pace of advancement in this area, a general introduction to help individuals who work in the knowledge economy understand the capabilities of these tools I believe is in order.
Here is a simple example of using an LLM API (*Application Programming Interface* -- just a standard way to send information and get information back on the web) using the anthropic library in python to extract key information from a free text crime narrative:
'''
response = requests.post(
    "https://api.x.ai/v1/tts",
    headers={
        "Authorization": f"Bearer {os.environ['XAI_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "text": llm_book,
        "voice_id": voice_id,
        "language": "en",
    },
)
response.raise_for_status()

with open("AndyTest_LLMIntro.mp3", "wb") as f:
    f.write(response.content)

The LLM intro messed up *Application Programming Interface* section (start listening at 50 seconds in). But otherwise it is very nice.

For those worried about security, xAI did something smart here — you need to record yourself reading their prompts live into the API. You cannot submit a pre-recorded audio file to do this. So cloning someone else's voice is pretty hard.

Costs are around $4 per million characters in the text-to-speech API. So narrating my entire book, for example, should be under $10.
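To sanity check that estimate (the character count here is my rough guess, not an exact count of the book):

```python
# Rough cost at the quoted $4 per million characters. The character count
# is an assumption: roughly 100k words at ~6 characters per word.
price_per_million_chars = 4.00
book_chars = 100_000 * 6

cost = book_chars / 1_000_000 * price_per_million_chars
print(f"${cost:.2f}")  # well under $10
```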

Took me a total of less than an hour to set up a voice, create the python code, and write this blog post!

Pangram is good

Many of the initial wave of “AI writing detectors” were quite bad. The biggest issue you need to be concerned about with an AI writing detector is false positives. If you are a professor and want to check students’ writing, it is very bad to falsely accuse a student.

The Pangram product, though, is quite good, and I suggest folks check it out.

The other main competitor on the market, GPTZero, is clearly lower quality (such as saying the Constitution is AI generated).

GPTZero's documentation says they are the most accurate AI detector. But accuracy is not the metric you care about: you cannot know the underlying rate of AI writing in any corpus except when the corpus is artificially constructed, and that is the only scenario in which you can know accuracy for sure. What you care about are specifically the false positive rate and the false negative rate.
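A quick hypothetical shows why accuracy is misleading here (the prevalence and error rates below are assumptions for illustration, not measured numbers):

```python
# Hypothetical corpus: 1000 essays, 5% actually AI-written, scored by a
# detector with an assumed 2% false positive and 10% false negative rate.
n, prevalence = 1000, 0.05
fpr, fnr = 0.02, 0.10

ai = n * prevalence              # 50 AI-written essays
human = n - ai                   # 950 human-written essays
true_flags = ai * (1 - fnr)      # 45 AI essays correctly flagged
false_flags = human * fpr        # 19 humans falsely accused

accuracy = (true_flags + (human - false_flags)) / n
false_accusation_share = false_flags / (true_flags + false_flags)

print(f"accuracy = {accuracy:.1%}")
print(f"share of flags that are false accusations = {false_accusation_share:.1%}")
```

Accuracy comes out near 98%, yet almost 30% of the flagged essays are false accusations — exactly the outcome a professor cannot tolerate.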

Unlike GPTZero, Pangram appears to have very low false positive rates. A simple way to estimate the false positive rate is to just submit writing prior to 2022 to the tool and see how many it flags as AI. ChatGPT came out in late 2022, the tools to generate writing before that were just not even close for people to use in any serious way. So any writing flagged in the older corpus as AI is a false positive.

Here is an example examining legal briefs.

It is an independent assessment. We cannot really know the capture rate (were there more than 66 briefs generated via LLMs in that sample?), but we can know the false positive rate. For Pangram it is 1/800 in this sample.
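As a sketch of how much uncertainty a 1-out-of-800 estimate carries, here is a Wilson score interval for the observed proportion (standard binomial math, nothing specific to Pangram):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

lo, hi = wilson_interval(1, 800)
print(f"Observed FP rate 1/800 = {1/800:.4%}, 95% CI ({lo:.4%}, {hi:.4%})")
```

Even the upper end of that interval is under 1%, comfortably below GPTZero's reported 2% false positive rate.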

Pangram says it has a 1 in 10,000 false positive rate across a wide array of writing samples. They even report in their own internal tests that GPTZero has a 2% false positive rate (I am pretty sure GPTZero’s false positive rate is much higher than 2%, hence the Constitution error.)

Many other checks for false negative rates involve people having various models generate writing and then classifying it. It is hard to know if those are very good benchmarks for estimating the false negative rate. But we can easily estimate the false positive rate, and in that respect Pangram is clearly better than other AI writing detectors on the market.

Should we care if writing is AI?

I have used AI tools to help me write. I promise to be forthcoming if I use AI to help me write any substantive sections of writing (in blog posts, books, social media posts, etc.) Currently I am almost always using the LLMs to copy-edit, which is often simply a prompt “check for spelling and grammar issues”.

I do not use it all the time for writing. This post was all written by hand (and then just copy-edited with Gemini CLI).

It is really not that hard to bring your own voice and use AI to aid your writing. Have the LLM read your prior work, then give it a detailed outline, and then iterate. See my transcript on a prior post for an example.

I’d note I have used Pangram to see if my LLM writing is too obviously AI, and it is not. To me, when the writing is clearly AI, this often signals a clear lack of care and effort in the writing. AI writing can be valuable, but it is quite frequently low value slop.

So you get people larping as tech experts.

You can trivially have Claude or whatever software write a Skill file, and then have an LLM write how it is super awesome. This does not make it so.

And you have salespeople write posts that literally make no sense.

This, to be clear, is obviously AI slop.

These individuals could actually generate useful content if they spent more than a trivial amount of time. But they don't, and it shows.

Gathering interest in tech courses

Quick post this morning — I have a survey up gathering input on interest in short, technical courses.

Think 2-3 days, potentially in person/synchronous.

If you have taken a course with Paul Allison at Statistical Horizons, or an ICPSR summer course, those are similar examples. The main difference is that these courses are meant to prepare you for pursuing private sector roles.

These will be aimed at:

  • grad level social science students
  • current professors looking to pursue private sector roles
  • current data analysts looking to get into data science
  • undergrads with some more technical background

The survey lists potential courses (Python for data analysis, intro to LLM APIs, SQL + dashboards, using agent-based tools for analysis), the course medium (in person vs video), and price points.

If you are a university or organization interested in hosting such sessions for your students, let me know as well. Happy to chat about bringing this to your campus.

LinkedIn Premium Does Not Boost your Posts

One of my connections mentioned in a post on LinkedIn that since he turned off Premium, his posts have been getting less engagement. Since LinkedIn offers a month for free, and I have been trying to promote my recent book, I figured I would try my free month trial and see how many more views I could get. (Here I am not worried about Premium for applying to new jobs, it is possible it is totally worth it for that, I was not applying to jobs in this test so I do not know.)

Long story short, LinkedIn Premium does not appear to promote my material at all above the baseline.

Post Views

In a sample of 30 posts the month before I turned on Premium (turned on 3/24 in the evening, turned off 4/22 in the morning), my posts had an average of 3600 views (with a standard deviation of 7000, median 1400). Post-Premium, I had 23 posts, and the views were on average 2200 (SD 2900, median 900). Here is the full table of posts and links (Premium=1 means it was posted when my Premium subscription was turned on):

| Premium | Views | URL |
| ----:|------:|:----- |
| 0    | 3659  | https://www.linkedin.com/posts/andrew-wheeler-46134849_llms-have-transformed-the-data-science-industry-activity-7426975341572984832-HGTA |
| 0    | 2526  | https://www.linkedin.com/posts/andrew-wheeler-46134849_no-guarantees-but-i-am-going-to-try-to-start-activity-7428418993553846272-vdjA    |
| 0    | 2290  | https://www.linkedin.com/posts/andrew-wheeler-46134849_much-of-the-hype-around-claude-code-is-having-activity-7428781380391567360-zkXr   |
| 0    | 545   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-benefits-of-my-epub-version-of-activity-7429143771109302272-_V6T       |
| 0    | 1454  | https://www.linkedin.com/posts/andrew-wheeler-46134849_claude-code-has-the-ability-to-create-hooks-activity-7429506167527088128-01Um     |
| 0    | 1326  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-prompting-flows-i-find-convenient-activity-7429868558794436609-SDgG    |
| 0    | 1278  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-main-focuses-in-the-book-is-not-activity-7430230940444057600-pqK-      |
| 0    | 5988  | https://www.linkedin.com/posts/andrew-wheeler-46134849_while-skills-in-claude-code-are-all-the-rage-activity-7430955707358818304-iqjb    |
| 0    | 1726  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-mistakes-i-see-with-agent-based-activity-7431318102212100096-rI_W      |
| 0    | 1172  | https://www.linkedin.com/posts/andrew-wheeler-46134849_from-my-experience-as-an-educator-when-presenting-activity-7431680485585580032-8h7B    |
| 0    | 1360  | https://www.linkedin.com/posts/andrew-wheeler-46134849_although-the-llm-tools-are-currently-focused-activity-7432042882770944000-AfAG    |
| 0    | 5304  | https://www.linkedin.com/posts/andrew-wheeler-46134849_when-i-was-a-professor-at-ut-dallas-i-sat-activity-7432405268874817536-qrAI       |
| 0    | 30732 | https://www.linkedin.com/posts/andrew-wheeler-46134849_i-know-a-few-stats-folks-in-my-network-that-activity-7432767666781679617-HXnk |
| 0    | 1003  | https://www.linkedin.com/posts/andrew-wheeler-46134849_claude-code-does-not-have-an-image-model-activity-7433492422111776768-2dqY        |
| 0    | 884   | https://www.linkedin.com/posts/andrew-wheeler-46134849_while-i-have-a-section-in-the-book-devoted-activity-7433854815862013952-FIl-      |
| 0    | 888   | https://www.linkedin.com/posts/andrew-wheeler-46134849_in-the-book-i-have-a-dedicated-chapter-on-activity-7434217215450681344-hk3E       |
| 0    | 1868  | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-llm-book-is-compiled-using-quarto-so-activity-7434579595095580673-RHdx        |
| 0    | 807   | https://www.linkedin.com/posts/andrew-wheeler-46134849_llms-for-mortals-how-to-view-the-epub-activity-7434945455073169409-bNDu           |
| 0    | 1243  | https://www.linkedin.com/posts/andrew-wheeler-46134849_section-on-using-gliner-for-ner-activity-7435304382931513344-NPPC                 |
| 0    | 1745  | https://www.linkedin.com/posts/andrew-wheeler-46134849_my-first-book-data-science-for-crime-analysis-activity-7436029137695416320-aRmr   |
| 0    | 914   | https://www.linkedin.com/posts/andrew-wheeler-46134849_so-the-new-book-large-language-models-for-activity-7436376426100199424-1g8E       |
| 0    | 1593  | https://www.linkedin.com/posts/andrew-wheeler-46134849_agentic-coding-apps-like-claude-code-and-activity-7436738847054512128-qiXz        |
| 0    | 3415  | https://www.linkedin.com/posts/andrew-wheeler-46134849_many-people-are-turned-off-by-ai-writing-activity-7437101213717909504-QjOo        |
| 0    | 928   | https://www.linkedin.com/posts/andrew-wheeler-46134849_pretty-much-every-day-there-is-a-new-prompt-activity-7437463609401794560-50H-     |
| 0    | 2185  | https://www.linkedin.com/posts/andrew-wheeler-46134849_much-of-the-hype-around-skills-is-imo-people-activity-7437826000958337024-t6i3    |
| 0    | 800   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-benefits-of-my-llm-for-mortals-activity-7438550758763098112-hQUw |
| 0    | 870   | https://www.linkedin.com/posts/andrew-wheeler-46134849_large-language-models-for-mortals-preview-activity-7438913140639207424-tAli       |
| 0    | 1948  | https://www.linkedin.com/posts/andrew-wheeler-46134849_new-blog-post-using-claude-code-to-help-activity-7441087469388861440-TCKq         |
| 0    | 1160  | https://www.linkedin.com/posts/andrew-wheeler-46134849_given-all-the-rage-with-generative-ai-and-activity-7441449857480933377-uPFw       |
| 0    | 27842 | https://www.linkedin.com/posts/andrew-wheeler-46134849_stop-teaching-r-teach-python-when-i-was-activity-7441812266938826753-DywF         |
| 1    | 526   | https://www.linkedin.com/posts/andrew-wheeler-46134849_forecasting-the-future-is-difficult-especially-activity-7442537064803368960-qsVO  |
| 1    | 13096 | https://www.linkedin.com/posts/andrew-wheeler-46134849_when-using-llms-to-do-structured-data-extraction-activity-7442899426471407617-CpZz |
| 1    | 2394  | https://www.linkedin.com/posts/andrew-wheeler-46134849_ive-spoken-with-many-people-who-are-concerned-activity-7443039100477145090-v_3i   |
| 1    | 646   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-main-audience-my-book-large-language-activity-7443261810511757312-R2Jc        |
| 1    | 3030  | https://www.linkedin.com/posts/andrew-wheeler-46134849_for-the-folks-that-were-not-happy-with-my-activity-7443401497444409344-q38H       |
| 1    | 437   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-current-capabilities-of-googles-activity-7443624184120754176-YHbw      |
| 1    | 5275  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-error-i-am-seeing-devs-continually-make-activity-7443986571650973696-TZ-K     |
| 1    | 3815  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-biggest-issues-with-using-generative-activity-7444348969100664832-pX0v |
| 1    | 738   | https://www.linkedin.com/posts/andrew-wheeler-46134849_reports-of-rags-demise-are-overstated-activity-7444711358421491712-6BJu           |
| 1    | 425   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-recent-litellm-distribution-attack-highlights-activity-7445073747968999424-NEmn    |
| 1    | 3752  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-responses-to-me-writing-the-book-activity-7445436130080186369-7Fvx     |
| 1    | 1670  | https://www.linkedin.com/posts/andrew-wheeler-46134849_professors-that-follow-me-i-am-happy-to-activity-7446160904217407488-CXcZ         |
| 1    | 918   | https://www.linkedin.com/posts/andrew-wheeler-46134849_i-have-used-claude-code-the-longest-probably-activity-7447248077435920385-Ym8A    |
| 1    | 876   | https://www.linkedin.com/posts/andrew-wheeler-46134849_gio-has-a-new-post-out-on-examining-confidence-activity-7448697613513740289-hObZ  |
| 1    | 1959  | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-mythos-technical-blog-post-on-its-cybersecurity-activity-7449059395071709184-UJOC  |
| 1    | 2201  | https://www.linkedin.com/posts/andrew-wheeler-46134849_for-folks-that-use-jupyter-notebooks-one-activity-7449422388691243008-Twxn        |
| 1    | 626   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-recommendations-i-have-in-the-activity-7450147165781426177-vceO        |
| 1    | 333   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-term-agent-is-almost-always-used-as-activity-7450509557677625344-42Ze         |
| 1    | 5787  | https://www.linkedin.com/posts/andrew-wheeler-46134849_agent-based-systems-require-bad-python-code-activity-7450871941458190336-gzSl     |
| 1    | 468   | https://www.linkedin.com/posts/andrew-wheeler-46134849_broadly-there-are-two-types-of-agent-based-activity-7451234329390678016-ZU1h      |
| 1    | 1267  | https://www.linkedin.com/posts/andrew-wheeler-46134849_i-get-periodically-asked-what-is-the-best-activity-7451596716354707456-aFMF       |
| 1    | 346   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-saying-a-picture-is-worth-a-1000-words-activity-7451959111132299264-Tk7o      |
| 1    | 480   | https://www.linkedin.com/posts/andrew-wheeler-46134849_it-is-important-to-have-independent-benchmark-activity-7452318150701953024-ftcM        |

I would have expected a multiplier (e.g. if you typically get 3k views, now you get 6k or 9k views per post). You could nitpick that the posts have differential timing, and that the pre-Premium posts have some contamination (if LinkedIn promoted my older posts when I activated Premium). But those issues are not large enough to matter relative to the multiplier I expected.

The posts are quite comparable in content, mostly focused on my book and LLMs. It is possible my audience is oversaturated with that content, but I think it is just as likely that LinkedIn Premium doesn’t really promote your work to any substantive extent. (I have additionally obtained more followers in this period, so that should bias the results to have more views, not less.) At least here there is no evidence I should continue to pay $20 a month to increase my reach on LinkedIn.
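As a rough check using the summary statistics above, a Welch's t-test computed from just the summary numbers shows the pre/post difference is well within noise (views are heavily skewed, so treat this as a ballpark, and note the point estimate even runs in the wrong direction for Premium):

```python
import math

# Summary stats from the post: pre-Premium vs Premium post views.
n1, m1, s1 = 30, 3600, 7000
n2, m2, s2 = 23, 2200, 2900

# Welch's t statistic from summary statistics (no raw data needed).
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t = (m1 - m2) / se

# Welch-Satterthwaite degrees of freedom.
v1, v2 = s1**2 / n1, s2**2 / n2
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"t = {t:.2f}, df = {df:.1f}")  # t is about 1, far from significance
```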

Posts are bursty, and in the end I have very little ability to forecast what will or will not be popular. In the pre-period, my most popular post was one sharing a blog post I wrote on log-probabilities (30k views). I definitely try to post more technical stuff on LinkedIn than the typical social media influencer, so that limits the reach.

I also had a rage-bait post arguing professors should teach Python and not R, with just under 30k views. (That was a bit of social media manipulation – have a controversial opinion that divides people, and you get a bunch of thumbs up and a bunch of comments.) I do not have that many potential rage-bait post topics though!

In addition to this, I also did the free month of LinkedIn Premium for my business Crime De-Coder page. The result was the same: I did not see any increase in views, followers, etc.

Profile Views

Although I have not seen LinkedIn explicitly say Premium boosts your posts (besides actually paying for advertising), I have seen LinkedIn explicitly advertise that Premium profiles get more views:

So how do profile views look? I did get more the week I signed up, but they were trending upward previously, and reverted to the trend after week one anyway. (The chart is a few days short; I cannot access it week by week since turning off Premium.)

For a bit of background, I spent most of my time posting on my LinkedIn business Crime De-Coder page, and only posted on my personal page maybe once or twice a month. But since publishing LLMs for Mortals (in February of 2026), I have posted more on my personal page, which you can see increased my profile views before I signed up for Premium.

The additional profile views are likely due to that popular rage-bait Python vs R post, not to anything Premium did.

This appears to be extremely misleading advertising on LinkedIn's part. If they are just comparing Premium vs non-Premium accounts, it is likely Premium users are simply more active. The advertisement should state the explicit "boost" profiles get, such as being ranked higher in searches.

$100 ad credit

With Premium, you get a $100 ad credit per month to boost posts. I used this to boost my original LLMs for Mortals launch post, which was stale at that point and not accumulating any additional views.

The metrics on the post were as LinkedIn said they would be. Despite having 80+ likes when I first created it, the post only had 3700 views. Spending $100 on the credits got me an additional ~3500 views and supposedly ~50 additional website clicks. (I am confused how this is calculated, as I can see the actual link in the post was clicked fewer than 10 additional times with the campaign.)

I knew going in that adverts on LinkedIn are not a net benefit given my book purchase conversion rates. For what I will call "high trust" referrals, I have something like a 1/100 purchase rate for the book; for other mediums, it is more like 1/1000. As far as I can tell, these rates seem pretty typical for a higher dollar value book purchase ($50+).
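A back-of-envelope calculation with these figures makes the point, even treating all ~50 ad clicks generously as high trust referrals:

```python
# Figures from the post: ~50 website clicks from the $100 ad credit,
# a $50 book, and conversion rates of 1/100 (high trust) to 1/1000 (other).
clicks, book_price, ad_spend = 50, 50, 100

best_case = clicks * (1 / 100) * book_price    # treat every click as high trust
worst_case = clicks * (1 / 1000) * book_price  # typical cold-traffic rate

print(best_case, worst_case)  # expected revenue well under the $100 spend
```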

I have debated setting the purchase price for the epub much lower. $50 is in line with current offerings from O'Reilly, and in my informal demand curve tests is where I think it should be. But I don't think any realistic conversion rate would make LinkedIn advertising make sense for my book.

For reference for influencers though, this gives a rough estimate comparable to LinkedIn’s direct advertising. Basically my average post is worth $100 according to LinkedIn. I only have around 3k followers currently on LinkedIn, so I imagine folks with followings 10x that can likely do direct advertisements to their audiences for more like $1k and up.

Wrap Up

I still think LinkedIn is the best social media site currently to promote my work and business. It is not just about the raw view counts, but also about conversion to people buying my book or reaching out for additional consulting gigs.

I will continue to use LinkedIn for this, but paying for a Premium LinkedIn account does not appear to be worth it for these purposes. Even if views were increased, it is possible the added viewers are not good connections for these end goals.

There are additional things you get with Premium (can send cold messages to people you are not connected to, supposedly higher priority when applying to jobs). Those are maybe worth the $20 a month for some people. But focusing on what LinkedIn advertises for “boosting” your posts and profile, I did not personally see any evidence that would justify spending even $1 a month for the Premium features.

Job Advice Resources page

Minor update, I have created a page, Job Advice Resources to cumulatively list all the materials I have written on advice for social scientists and crime analysts looking to pivot into private sector tech roles.

I still get maybe ~2 folks a month asking for advice, and I am always happy to chat. I wish PhD granting institutions took this more seriously (it only takes minor changes to better prepare students).

If you are an administrator of a PhD program and actually care about getting your students jobs, also feel free to reach out and I am happy to discuss how I can help.

The race to the bottom with AI tools

What we are seeing in the AI startup space is a perfect example of the “no moat” problem: if your core product is essentially just clever prompt engineering wrapped around someone else’s frontier model, it is trivially easy for a competitor to reverse-engineer your workflow and undercut your price. Over the last few months, this lack of a defensible moat has triggered a rapid race to the bottom in automated peer review, moving from expensive managed services to open-source “bring your own key” (BYOK) scripts.

Here I am going to look at three tools specifically designed to review academic papers: Refine, IsItCredible, and Coarse.

Overview of the Tools

Refine: Refine positions itself as a premium, rigorous option for institutions, boasting testimonials from Ivy League professors and a high price point of $49.99 per review. It uses what it calls “massive parallel compute” to make hundreds of LLM calls to stress-test every line of a document.

IsItCredible: Built on the open-source Reviewer 2 pipeline, IsItCredible offers a standardized, pay-per-use middle ground with core reports starting at $5. It employs a clever “adversarial” architecture where “Red Team” agents try to find flaws and a “Blue Team” verifies them to prevent hallucinations.
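Here is a minimal sketch of that red-team/blue-team pattern. The `call_llm` function is a hypothetical stand-in that returns canned answers so the sketch runs; the actual Reviewer 2 pipeline is surely more involved:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call. Returns canned answers
    # here so the sketch is runnable; swap in a real client in practice.
    if "find flaws" in prompt:
        return "Claim X lacks a citation; Table 2 totals do not sum to 100"
    if "Table 2" in prompt:
        return "VERIFIED: totals are 98 and 101"
    return "REJECTED: citation 12 already covers claim X"

def adversarial_review(paper_text: str) -> list[str]:
    # Red team: propose candidate flaws.
    red = call_llm(f"Read this paper and find flaws:\n{paper_text}")
    candidates = [c.strip() for c in red.split(";")]
    # Blue team: independently check each candidate against the source,
    # keeping only verifiable flaws to cut down on hallucinated critiques.
    verified = []
    for claim in candidates:
        check = call_llm(
            f"Verify against the paper, answer VERIFIED or REJECTED: "
            f"{claim}\n{paper_text}"
        )
        if check.startswith("VERIFIED"):
            verified.append(claim)
    return verified

print(adversarial_review("...paper text..."))
```

The design point is the separation of roles: the red pass is free to over-generate critiques because the blue pass filters out anything it cannot verify against the source text.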

Coarse: Coarse represents the logical endpoint of this race as an open-source “Bring Your Own Key” (BYOK) tool that lets you run complex multi-agent reviews locally or via OpenRouter. Because users pay the API costs directly instead of a markup, a comprehensive paper review is significantly cheaper.

The “LLM as a Judge” Problem

The hardest part of all this is evaluation. How do you know if the AI reviewer is actually good?

Refine relies almost entirely on anecdotal evidence. Their own FAQ essentially tells you to just try it and see the difference for yourself, claiming that general-purpose chatbots cannot match their depth even with expert prompting. This “try it yourself” approach is effective for marketing, but it isn’t a hard benchmark.

IsItCredible and Coarse are trying to be more systematic. The IsItCredible team released a paper, Yell at It: Prompt Engineering for Automated Peer Review, where they benchmarked their tool against five alternatives. They claim 15 wins out of 20 pairings. Similarly, Coarse claims to have been “blind-evaluated” against Refine and Reviewer 2, scoring higher on coverage and specificity.

However, we are still largely in the “LLM as a judge” era. These benchmarks often use another LLM to decide which review is better. It is circular logic. Until we have a “Ground Truth” dataset of known mathematical errors or logical fallacies in published papers, we are just measuring which AI writes the most convincing-sounding critique.

Because evaluation is so difficult, this software category risks becoming a classic market for lemons. It is incredibly difficult to identify substantive differences in quality between these tools without some external, hard benchmark. To truly evaluate if Refine’s expensive managed service is meaningfully better than Coarse’s open-source BYOK run, you have to verify the AI’s claims. But verifying those claims requires spending just as much time reading and reviewing the original paper as you would have spent just doing the review yourself from scratch. Without transparent benchmarks, users cannot easily distinguish high-quality rigorous analysis from convincing hallucinations, driving the market toward the cheapest option by default.

For those building AI tools, this entire space serves as a warning about the race to the bottom. I have previously written about deep research tools as another example of this phenomenon. If your only value proposition is a well-orchestrated prompt chain, open-source alternatives will inevitably compress your margins to zero. Eventually, the native GUI interfaces of the frontier models themselves may just become good enough that your specialized service isn’t even needed.

Meta

Did you like this post? Guess what, it was entirely generated via Google's API models (specifically the Gemini CLI). I have saved the chat session and a log of how long it took here. You can see for yourself: I had a broad idea, asked it to review different materials, and then had it generate a post. I then iterated – 25 minutes from start to finish in total.

The original post also is not flagged by Pangram as AI generated.

It definitely is not 100% my style (and to be clear, this meta section is 100% hand written). I also struggled to get the model to say what I wanted in the final paragraph about deep research tools – I wanted it to say "deep research tools are another example where this same situation will occur". I am keeping the original 100% AI generated post for posterity though, for folks to see what is possible with the current tools.

Policing Scholars should join ASEBP

Cross-posted on my Crime De-Coder blog.

I will be giving a talk at the upcoming American Society of Evidence Based Policing (ASEBP) conference (registration link here, May 20th-22nd in DC). My talk is How long to conduct your experiment? Check it out Thursday morning – I specifically asked for one of the short talks; 15 minutes is plenty to get the gist.

ASEBP Conference Flyer, 2026 in DC

I will be sharing a web-app to go with the talk soon (you can see my WDD tool and this blog post for background), but wanted to write a more general post about why researchers (as well as police officers who are interested in professionalization of the field) should join ASEBP.

To start, I have been involved in various ways with ASEBP for several years now, but I do not have any financial ties to ASEBP. I currently volunteer on the committee that reviews conference talks.

ASEBP is clearly the best organization for policing scholars currently in the country. The other main criminological societies (the American Society of Criminology and the Academy of Criminal Justice Sciences) are operating much as they did 30 years ago. Mostly they only exist to run journals and have a yearly conference where anyone can give a talk. They are incredibly insular, and have basically zero input from practitioners.

You can go and just look at the talks for ASC and ACJS – they are basically irrelevant to the vast majority of criminal justice operations (not only in policing, but in the CJ field as a whole). You can go look at the talks for the ASEBP conference and see they have a much clearer focus on realistic topics police departments are interested in, but presented by legitimate researchers and practitioners.

For scholars, I have developed working relationships with departments through multiple police practitioners I have met through ASEBP – and I hope to make more!

ASEBP was started by Renee Mitchell with a clear goal in mind – Renee is really the modern-day version of August Vollmer. ASEBP is intended to be a rigorous conference (unlike ASC, which allows almost anyone to present) and organization (ASEBP has training opportunities as well) to advance the use of evidence in policing operations.

If you think "I am not a policing researcher", but have anything at all to do with criminal justice, feel free to get in touch. (Crime analysts should definitely join.) I have ideas to expand the organization – nothing equivalent currently exists in other parts of the criminal justice system. Being evidence-based is really the core of what Renee and everyone else is building.

If you are going to the conference and want to meet up, feel free to send me an email, andrew.wheeler@crimede-coder.com, and I will find a time to get a coffee while we are in DC.

Interview on LEAP about LLMs for Mortals

I was recently interviewed by Jason Elder on the Law Enforcement Analysts Podcast about my new book, Large Language Models for Mortals: A Practical Guide for Analysts.

Jason does an excellent job with interviewing (and does a quality editing job with the audio), so I suggest following the podcast if you are a crime analyst or researcher working with police departments.

We basically cover large swaths of the book: the basics of APIs, structured info extraction, some high-level discussion of RAG, and how AI coding tools still need a bit of human oversight and direction. Even if you are not a coder, I think picking up a copy is a good idea to get an understanding of what is possible with the current tools.

Just to catalog the different coupon codes for the book:

  • LLMDEVS to get 50% off of the epub
  • TWOFOR1 to get $30 off when purchasing two books (can be any two books)

In the interview I give a coupon code for the paperback version of the book, so take a listen if you are interested in $20 off the paperback.

You can purchase either epub or paperback from my store worldwide.

Stop Teaching R. Teach Python.

There has been a slow transition in social science teaching over the past ~15+ years, since I have been a student and professor. In the aughts, it was still common to teach students legacy, closed source statistical software (SPSS, SAS, and Stata). When I was a PhD student in criminal justice at SUNY Albany, we had a specific class to learn SPSS, although most of the rest of the quantitative courses used Stata.

Since the aughts, though, the R programming language has largely supplanted the closed source packages in social science education. (I do not have hard data, but that is my impression from what colleagues are using and what they teach in classes.)

I am familiar with all of the major statistical programs (I have written an R package, and you can see this blog for many examples of SPSS and a few of Stata). But if the goal of coursework is to teach students skills relevant to getting a job, academics in social science institutions should teach their students Python. The current job market for quantitative work is dominated by Python positions.

To be clear, I am not fundamentally opposed to closed source software (I have seen scenarios where SPSS/SAS make more sense than Hadoop systems, and if you are a GIS analyst you should learn ESRI tools). This is purely an observation about the current private sector job market – focusing primarily on Python makes the most sense for social science students.

As an experiment, I went onto LinkedIn and did a search for “data scientist”. Your results will differ (mine are tailored to the Raleigh area and also include more senior positions), but here is a table of the positions that came up on the first page, with a quick summary of the tech stacks they require. While this is not a systematic sample, it gives a reasonable snapshot of current expectations.

| Company             | Job Title                           | Tech Stack                                           | URL                                            |
|---------------------|-------------------------------------|------------------------------------------------------|----------------------------------------------- |
| Google              | Data Scientist (Google Voice 2)     | Python, R, SQL                                       | https://www.linkedin.com/jobs/view/4387751995/ |
| Deloitte            | AI Specialist                       | None specified                                       | https://www.linkedin.com/jobs/view/4376183670/ |
| Ascensus            | Principal Analytics                 | R, Python, SQL, GenAI/LLM                            | https://www.linkedin.com/jobs/view/4380164400/ |
| EY                  | AI Lead Engineer                    | Python, C#, R, GenAI/LLM                             | https://www.linkedin.com/jobs/view/4385954762/ |
| PwC                 | GenAI Python Systems Engineer (2)   | Python, SQL, Cloud Platforms, GenAI/LLM              | https://www.linkedin.com/jobs/view/4373604638/ |
| Affirm              | Senior Machine Learning Engineer    | Python, Spark/Ray                                    | https://www.linkedin.com/jobs/view/4326673670/ |
| Lexis Nexis         | Lead Data Scientist                 | Cloud Platforms, GenAI/LLM                           | https://www.linkedin.com/jobs/view/4316327742/ |
| EY                  | AI Finance                          | SQL, Python, Azure, GenAI/LLM                        | https://www.linkedin.com/jobs/view/4385085950/ |
| Korn Ferry          | Sr. Data Scientist                  | Python, R, Spark, AWS, GenAI/LLM                     | https://www.linkedin.com/jobs/view/4387433496/ |
| Deloitte            | Data Science Manager                | Python, Cloud                                        | https://www.linkedin.com/jobs/view/4304674642/ |
| First Citizens Bank | Senior Quant Model Developer        | Python, SAS, SQL                                     | https://www.linkedin.com/jobs/view/4365378242/ |
| First Citizens Bank | Senior Manager Quant Analysis       | Python, SAS, Tableau                                 | https://www.linkedin.com/jobs/view/4388131284/ |
| Jobot               | ML Solution Architect               | Python, Scala, Spark, AWS, Snowflake                 | https://www.linkedin.com/jobs/view/4384023540/ |
| Affirm              | Analyst II                          | SQL, Python, R, CPLEX/Gurobi, Databricks/Snowflake   | https://www.linkedin.com/jobs/view/4373303038/ |
| Red Hat             | Sr Machine Learning Engineer (vLLM) | Python, GenAI/LLM                                    | https://www.linkedin.com/jobs/view/4354827922/ |
| Alliance Health     | Director AI                         | Python (TensorFlow/PyTorch), Office Products, GenAI  | https://www.linkedin.com/jobs/view/4383011480/ |
| Nubank              | ML Data Engineer                    | Python, Ray/Spark                                    | https://www.linkedin.com/jobs/view/4376815752/ |
| Target RWE          | Senior Quant Data Scientist         | R                                                    | https://www.linkedin.com/jobs/view/4385293724/ |
| Siemens             | Senior Data Analytics               | SQL, Python, R, Tableau/PowerBI                      | https://www.linkedin.com/jobs/view/4377969531/ |
| Red Hat             | Sr Machine Learning Engineer        | Python, GenAI/LLM                                    | https://www.linkedin.com/jobs/view/4302769773/ |
| Lexis Nexis         | Director Data Sciences              | Python, R, GenAI/LLM                                 | https://www.linkedin.com/jobs/view/4387335028/ |
| Cigna               | Data Science Senior Advisor         | Python, SQL                                          | https://www.linkedin.com/jobs/view/4381766145/ |
| Thermo Fisher       | Senior Manager Data Engineering     | Fabric, PowerBI, Python, Databricks, Tableau, SAS    | https://www.linkedin.com/jobs/view/4372684009/ |

Of the positions:

  • 9/25 roles included R, but only one required R exclusively. The other 8 were Python/SQL/R
  • 22/25 included Python
  • 11/25 had a focus on Generative AI or LLMs
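As a rough sketch, tallies like these can be reproduced by counting mentions across the tech stack column. The stack strings below are transcribed one per table row, so rows advertising multiple openings are tallied once and the per-row totals may differ slightly from the per-position figures above. Note the word-boundary regex, so a bare "R" does not falsely match "Ray" or "PowerBI":

```python
import re

# Tech stacks transcribed from the table above, one string per row
stacks = [
    "Python, R, SQL",
    "None specified",
    "R, Python, SQL, GenAI/LLM",
    "Python, C#, R, GenAI/LLM",
    "Python, SQL, Cloud Platforms, GenAI/LLM",
    "Python, Spark/Ray",
    "Cloud Platforms, GenAI/LLM",
    "SQL, Python, Azure, GenAI/LLM",
    "Python, R, Spark, AWS, GenAI/LLM",
    "Python, Cloud",
    "Python, SAS, SQL",
    "Python, SAS, Tableau",
    "Python, Scala, Spark, AWS, Snowflake",
    "SQL, Python, R, CPLEX/Gurobi, Databricks/Snowflake",
    "Python, GenAI/LLM",
    "Python (TensorFlow/PyTorch), Office Products, GenAI",
    "Python, Ray/Spark",
    "R",
    "SQL, Python, R, Tableau/PowerBI",
    "Python, GenAI/LLM",
    "Python, R, GenAI/LLM",
    "Python, SQL",
    "Fabric, PowerBI, Python, Databricks, Tableau, SAS",
]

# \b word boundaries prevent "R" from matching inside "Ray" or "Gurobi"
counts = {
    "Python": sum(bool(re.search(r"\bPython\b", s)) for s in stacks),
    "R": sum(bool(re.search(r"\bR\b", s)) for s in stacks),
    "GenAI/LLM": sum(bool(re.search(r"\bGenAI\b", s)) for s in stacks),
}
# Rows that list R and nothing else
r_only = sum(s.strip() == "R" for s in stacks)

print(counts, "R-only rows:", r_only)
# {'Python': 20, 'R': 8, 'GenAI/LLM': 10} R-only rows: 1
```

The same substring pitfall applies to any short language name (C, Go, etc.), which is why a plain `"R" in s` check would overcount.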

Python dominates R in the current job market for data science positions. Professors are doing their students a disservice teaching R, the same way they would be doing a disservice teaching their students to code in Fortran.

Another aspect I noticed: not all that long ago, analyst-type jobs really only expected Excel (and maybe SQL). Now the majority of the analyst jobs expect Python (even more so than dashboard tools like PowerBI in this sample).

For individuals on the job market, I suggest running your own job search experiment like this on LinkedIn to see which tech skills you need to at least get your foot in the door for an interview. I expected GenAI to be somewhat more popular (only 11/25), but a few other technologies were sprinkled in often enough that becoming familiar with them may widen your potential pool (Cloud and Spark – I am surprised Databricks was not listed more often).

If you’re looking to build Python skills from scratch, I cover this in my book: Data Science for Crime Analysis with Python (can purchase in paperback or epub at my store).

If you are also interested in learning about generative AI, see my book Large Language Models for Mortals: A Practical Guide for Analysts with Python.

You can use the coupon TWOFOR1 to get $30 off when purchasing multiple books from my store.