Pangram is good

Many of the initial wave of “AI writing detectors” were quite bad. The biggest concern with any AI writing detector is false positives: if you are a professor checking students’ writing, falsely accusing a student is very bad.

The Pangram product, though, is quite good, and I suggest folks check it out.

The other main competitor on the market, GPTZero, is clearly lower quality (such as saying the Constitution is AI generated).

GPTZero’s documentation claims it is the most accurate AI detector. But accuracy is not really the metric you care about: you cannot know the underlying rate of AI writing in any corpus unless the corpus is artificially constructed, and that is the only scenario in which you can know the accuracy for sure. What you care about are specifically the false positive rate and the false negative rate.

Unlike GPTZero, Pangram appears to have very low false positive rates. A simple way to estimate the false positive rate is to submit writing from before 2022 to the tool and see how much it flags as AI. ChatGPT came out in late 2022, and the text generation tools before that were not even close to good enough for people to use in any serious way. So any writing in the older corpus flagged as AI is a false positive.

Here is an example examining legal briefs.

It is an independent assessment. We cannot really know the capture rate (were there more than 66 briefs generated via LLMs in that sample?). We can know the false positive rate, though, and it is 1/800 in this sample with Pangram.

Pangram says it has a 1 in 10,000 false positive rate across a wide array of writing samples. They even report in their own internal tests that GPTZero has a 2% false positive rate (I am pretty sure GPTZero’s false positive rate is much higher than 2%, hence the Constitution error.)
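As an aside on how much uncertainty a small sample carries: the 1-in-800 estimate above can be given a binomial confidence interval. A quick sketch in python (the `wilson_interval` helper is my own illustrative code, not from Pangram, and the Wilson approximation is rough for counts this small):

```python
from math import sqrt

def wilson_interval(flags, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = flags / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# 1 flagged brief out of 800 pre-LLM legal briefs
lo, hi = wilson_interval(1, 800)
print(f"false positive rate 95% CI: {lo:.5f} to {hi:.5f}")
```

The interval runs from roughly 0.02% to about 0.7% – wide enough that an 800-document sample cannot confirm a rate as low as 1 in 10,000; a much larger pre-2022 corpus would be needed for that.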

Many other checks for false negative rates involve people having various models generate writing and then classifying it. It is hard to know if those are very good benchmarks for estimating the false negative rate. But we can easily estimate the false positive rate, and in that respect Pangram is clearly better than other AI writing detectors on the market.

Should we care if writing is AI?

I have used AI tools to help me write. I promise to be forthcoming if I use AI to help me write any substantive sections of writing (in blog posts, books, social media posts, etc.) Currently I am almost always using the LLMs to copy-edit, which is often simply a prompt “check for spelling and grammar issues”.

I do not use it all the time for writing. This post was all written by hand (and then just copy-edited with Gemini CLI).

It is really not that hard to bring your own voice and use AI to aid your writing. Have the LLM read your prior work, then give it a detailed outline, and then iterate. See my transcript on a prior post for an example.

I’d note I have used Pangram to see if my LLM writing is too obviously AI, and it is not. To me, when the writing is clearly AI, this often signals a clear lack of care and effort in the writing. AI writing can be valuable, but it is quite frequently low value slop.

So you get people larping as tech experts.

You can trivially have Claude or whatever software write a Skill file, and then have an LLM write how it is super awesome. This does not make it so.

And you have salespeople write posts that literally make no sense.

This, to be clear, is obviously AI slop.

These individuals could actually generate useful content if they spent more than a trivial amount of time. But they don’t, and it shows.

LinkedIn Premium Does Not Boost Your Posts

One of my connections mentioned in a post on LinkedIn that since he turned off Premium, his posts have been getting less engagement. Since LinkedIn offers a month for free, and I have been trying to promote my recent book, I figured I would try my free month trial and see how many more views I could get. (Here I am not evaluating Premium for applying to new jobs; it may well be worth it for that, but I was not applying to jobs in this test, so I do not know.)

Long story short, LinkedIn Premium does not appear to promote my material at all above the baseline.

Post Views

In a sample of 30 posts the month before I turned on Premium (turned on 3/24 in the evening, turned off 4/22 in the morning), my posts had an average of 3600 views (with a standard deviation of 7000, median 1400). Post-Premium, I had 23 posts, and the views were on average 2200 (SD 2900, median 900). Here is the full table of posts and links (Premium=1 means it was posted when my Premium subscription was turned on):

| Premium | Views | URL |
| ----:|------:|:------|
| 0    | 3659  | https://www.linkedin.com/posts/andrew-wheeler-46134849_llms-have-transformed-the-data-science-industry-activity-7426975341572984832-HGTA |
| 0    | 2526  | https://www.linkedin.com/posts/andrew-wheeler-46134849_no-guarantees-but-i-am-going-to-try-to-start-activity-7428418993553846272-vdjA    |
| 0    | 2290  | https://www.linkedin.com/posts/andrew-wheeler-46134849_much-of-the-hype-around-claude-code-is-having-activity-7428781380391567360-zkXr   |
| 0    | 545   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-benefits-of-my-epub-version-of-activity-7429143771109302272-_V6T       |
| 0    | 1454  | https://www.linkedin.com/posts/andrew-wheeler-46134849_claude-code-has-the-ability-to-create-hooks-activity-7429506167527088128-01Um     |
| 0    | 1326  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-prompting-flows-i-find-convenient-activity-7429868558794436609-SDgG    |
| 0    | 1278  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-main-focuses-in-the-book-is-not-activity-7430230940444057600-pqK-      |
| 0    | 5988  | https://www.linkedin.com/posts/andrew-wheeler-46134849_while-skills-in-claude-code-are-all-the-rage-activity-7430955707358818304-iqjb    |
| 0    | 1726  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-mistakes-i-see-with-agent-based-activity-7431318102212100096-rI_W      |
| 0    | 1172  | https://www.linkedin.com/posts/andrew-wheeler-46134849_from-my-experience-as-an-educator-when-presenting-activity-7431680485585580032-8h7B    |
| 0    | 1360  | https://www.linkedin.com/posts/andrew-wheeler-46134849_although-the-llm-tools-are-currently-focused-activity-7432042882770944000-AfAG    |
| 0    | 5304  | https://www.linkedin.com/posts/andrew-wheeler-46134849_when-i-was-a-professor-at-ut-dallas-i-sat-activity-7432405268874817536-qrAI       |
| 0    | 30732 | https://www.linkedin.com/posts/andrew-wheeler-46134849_i-know-a-few-stats-folks-in-my-network-that-activity-7432767666781679617-HXnk |
| 0    | 1003  | https://www.linkedin.com/posts/andrew-wheeler-46134849_claude-code-does-not-have-an-image-model-activity-7433492422111776768-2dqY        |
| 0    | 884   | https://www.linkedin.com/posts/andrew-wheeler-46134849_while-i-have-a-section-in-the-book-devoted-activity-7433854815862013952-FIl-      |
| 0    | 888   | https://www.linkedin.com/posts/andrew-wheeler-46134849_in-the-book-i-have-a-dedicated-chapter-on-activity-7434217215450681344-hk3E       |
| 0    | 1868  | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-llm-book-is-compiled-using-quarto-so-activity-7434579595095580673-RHdx        |
| 0    | 807   | https://www.linkedin.com/posts/andrew-wheeler-46134849_llms-for-mortals-how-to-view-the-epub-activity-7434945455073169409-bNDu           |
| 0    | 1243  | https://www.linkedin.com/posts/andrew-wheeler-46134849_section-on-using-gliner-for-ner-activity-7435304382931513344-NPPC                 |
| 0    | 1745  | https://www.linkedin.com/posts/andrew-wheeler-46134849_my-first-book-data-science-for-crime-analysis-activity-7436029137695416320-aRmr   |
| 0    | 914   | https://www.linkedin.com/posts/andrew-wheeler-46134849_so-the-new-book-large-language-models-for-activity-7436376426100199424-1g8E       |
| 0    | 1593  | https://www.linkedin.com/posts/andrew-wheeler-46134849_agentic-coding-apps-like-claude-code-and-activity-7436738847054512128-qiXz        |
| 0    | 3415  | https://www.linkedin.com/posts/andrew-wheeler-46134849_many-people-are-turned-off-by-ai-writing-activity-7437101213717909504-QjOo        |
| 0    | 928   | https://www.linkedin.com/posts/andrew-wheeler-46134849_pretty-much-every-day-there-is-a-new-prompt-activity-7437463609401794560-50H-     |
| 0    | 2185  | https://www.linkedin.com/posts/andrew-wheeler-46134849_much-of-the-hype-around-skills-is-imo-people-activity-7437826000958337024-t6i3    |
| 0    | 800   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-benefits-of-my-llm-for-mortals-activity-7438550758763098112-hQUw |
| 0    | 870   | https://www.linkedin.com/posts/andrew-wheeler-46134849_large-language-models-for-mortals-preview-activity-7438913140639207424-tAli       |
| 0    | 1948  | https://www.linkedin.com/posts/andrew-wheeler-46134849_new-blog-post-using-claude-code-to-help-activity-7441087469388861440-TCKq         |
| 0    | 1160  | https://www.linkedin.com/posts/andrew-wheeler-46134849_given-all-the-rage-with-generative-ai-and-activity-7441449857480933377-uPFw       |
| 0    | 27842 | https://www.linkedin.com/posts/andrew-wheeler-46134849_stop-teaching-r-teach-python-when-i-was-activity-7441812266938826753-DywF         |
| 1    | 526   | https://www.linkedin.com/posts/andrew-wheeler-46134849_forecasting-the-future-is-difficult-especially-activity-7442537064803368960-qsVO  |
| 1    | 13096 | https://www.linkedin.com/posts/andrew-wheeler-46134849_when-using-llms-to-do-structured-data-extraction-activity-7442899426471407617-CpZz |
| 1    | 2394  | https://www.linkedin.com/posts/andrew-wheeler-46134849_ive-spoken-with-many-people-who-are-concerned-activity-7443039100477145090-v_3i   |
| 1    | 646   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-main-audience-my-book-large-language-activity-7443261810511757312-R2Jc        |
| 1    | 3030  | https://www.linkedin.com/posts/andrew-wheeler-46134849_for-the-folks-that-were-not-happy-with-my-activity-7443401497444409344-q38H       |
| 1    | 437   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-current-capabilities-of-googles-activity-7443624184120754176-YHbw      |
| 1    | 5275  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-error-i-am-seeing-devs-continually-make-activity-7443986571650973696-TZ-K     |
| 1    | 3815  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-biggest-issues-with-using-generative-activity-7444348969100664832-pX0v |
| 1    | 738   | https://www.linkedin.com/posts/andrew-wheeler-46134849_reports-of-rags-demise-are-overstated-activity-7444711358421491712-6BJu           |
| 1    | 425   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-recent-litellm-distribution-attack-highlights-activity-7445073747968999424-NEmn    |
| 1    | 3752  | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-responses-to-me-writing-the-book-activity-7445436130080186369-7Fvx     |
| 1    | 1670  | https://www.linkedin.com/posts/andrew-wheeler-46134849_professors-that-follow-me-i-am-happy-to-activity-7446160904217407488-CXcZ         |
| 1    | 918   | https://www.linkedin.com/posts/andrew-wheeler-46134849_i-have-used-claude-code-the-longest-probably-activity-7447248077435920385-Ym8A    |
| 1    | 876   | https://www.linkedin.com/posts/andrew-wheeler-46134849_gio-has-a-new-post-out-on-examining-confidence-activity-7448697613513740289-hObZ  |
| 1    | 1959  | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-mythos-technical-blog-post-on-its-cybersecurity-activity-7449059395071709184-UJOC  |
| 1    | 2201  | https://www.linkedin.com/posts/andrew-wheeler-46134849_for-folks-that-use-jupyter-notebooks-one-activity-7449422388691243008-Twxn        |
| 1    | 626   | https://www.linkedin.com/posts/andrew-wheeler-46134849_one-of-the-recommendations-i-have-in-the-activity-7450147165781426177-vceO        |
| 1    | 333   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-term-agent-is-almost-always-used-as-activity-7450509557677625344-42Ze         |
| 1    | 5787  | https://www.linkedin.com/posts/andrew-wheeler-46134849_agent-based-systems-require-bad-python-code-activity-7450871941458190336-gzSl     |
| 1    | 468   | https://www.linkedin.com/posts/andrew-wheeler-46134849_broadly-there-are-two-types-of-agent-based-activity-7451234329390678016-ZU1h      |
| 1    | 1267  | https://www.linkedin.com/posts/andrew-wheeler-46134849_i-get-periodically-asked-what-is-the-best-activity-7451596716354707456-aFMF       |
| 1    | 346   | https://www.linkedin.com/posts/andrew-wheeler-46134849_the-saying-a-picture-is-worth-a-1000-words-activity-7451959111132299264-Tk7o      |
| 1    | 480   | https://www.linkedin.com/posts/andrew-wheeler-46134849_it-is-important-to-have-independent-benchmark-activity-7452318150701953024-ftcM        |

I would have expected a multiplier (e.g. typically 3k views, now 6k or 9k views per post). You could nitpick that the posts have differential timing, and that the pre-Premium posts have some contamination (if LinkedIn promoted my older posts when I activated Premium). But those effects are not large enough to make a difference in my findings relative to what I expected.

The posts are quite comparable in content, mostly focused on my book and LLMs. It is possible my audience is oversaturated with that content, but I think it is just as likely that LinkedIn Premium doesn’t really promote your work to any substantive extent. (I have additionally gained more followers in this period, so that should bias the results toward more views, not fewer.) At least here there is no evidence I should continue to pay $20 a month to increase my reach on LinkedIn.
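As a sanity check that the difference in average views is within noise, one can run a simple permutation test on the view counts. A sketch using just the first ten counts from each group in the table above (my own illustrative helper, not a full analysis):

```python
import random

def perm_test(a, b, n_iter=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pool = a + b
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pool)
        pa, pb = pool[:len(a)], pool[len(a):]
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            hits += 1
    return hits / n_iter

# first ten view counts in each group from the table
pre  = [3659, 2526, 2290, 545, 1454, 1326, 1278, 5988, 1726, 1172]
post = [526, 13096, 2394, 646, 3030, 437, 5275, 3815, 738, 425]
print(perm_test(pre, post))
```

With view counts this bursty and skewed, the observed difference in means is well within what random shuffling produces.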

Posts are bursty, and in the end I have very little ability to forecast what will or will not be popular. In the pre-period, my most popular post was about a blog post I wrote on log-probabilities (30k views). I definitely post more technical material on LinkedIn than the typical social media influencer, which limits the reach.

I also had a rage-bait post on how professors should teach Python and not R, with just under 30k views. (That was a bit of social media manipulation – hold a controversial opinion that divides people, and you get a bunch of thumbs up and a bunch of comments.) I do not have that many potential rage-bait post topics!

I also did the free month of LinkedIn Premium for my business Crime De-Coder page. It was the same story there: I did not see any increase in views, followers, etc.

Profile Views

Although I have not seen LinkedIn explicitly say Premium boosts your posts (besides actually paying for advertising), I have seen LinkedIn explicitly advertise that Premium profiles get more views:

So how do profile views look? I did get more the week I signed up, but they were trending upward previously, and reverted to the trend after week one anyway. (The chart runs a few days short; I cannot access the week-by-week chart since turning off Premium.)

For a bit of background, I spent most of my time posting on my LinkedIn business Crime De-Coder page, and only posted on my personal page maybe once or twice a month. But since publishing LLMs for Mortals (in February of 2026), I have posted more on my personal page, which you can see increased my profile views before I signed up for Premium.

Likely the additional profile views came from that popular rage-bait Python vs R post, not from anything Premium did.

This appears to be extremely misleading advertising on LinkedIn’s part. If they just compare Premium users vs not, it is likely Premium users are simply more active. The advertisement should state the explicit “boost” profiles get, such as being ranked higher in searches.

$100 ad credit

With Premium, you get a $100 ad credit per month for boosting posts. I used this to boost my original LLMs for Mortals launch post, which was stale at that point and not accumulating any additional views.

The metrics on the post were as LinkedIn said they would be. Despite having 80+ likes when I first created it, the post only had 3700 views. Spending $100 on the credits got me an additional ~3500 views and supposedly ~50 additional website clicks. (I am confused how this is calculated, as I can see the actual link in the post was clicked fewer than 10 additional times with the campaign.)

I knew going in that adverts on LinkedIn are not a net benefit given my book purchase conversion rates. For what I will call “high trust” referrals, I have something like a 1/100 purchase rate for the book; for other mediums, it is more like 1/1000. As far as I can tell, these rates are pretty typical for a higher dollar value book purchase ($50+).

I have debated setting the purchase price for the epub much lower. But $50 is in line with current offerings from O’Reilly, and in my informal demand curve tests is where I think it should be. Regardless, I don’t think any realistic conversion rate would make LinkedIn advertising make sense for my book.

For reference for influencers, though, this gives a rough estimate comparable to LinkedIn’s direct advertising: basically my average post is worth $100 according to LinkedIn. I only have around 3k followers currently on LinkedIn, so I imagine folks with followings 10x that can likely do direct advertisements to their audiences for more like $1k and up.

Wrap Up

I still think LinkedIn is the best social media site currently to promote my work and business. It is not just about the raw view counts, but also about conversion to people buying my book or reaching out for additional consulting gigs.

I will continue to use LinkedIn for this, but paying for a Premium account does not appear to be worth it for these purposes. Even if the views were increased, it is possible they would not be good connections for these end goals.

There are additional things you get with Premium (can send cold messages to people you are not connected to, supposedly higher priority when applying to jobs). Those are maybe worth the $20 a month for some people. But focusing on what LinkedIn advertises for “boosting” your posts and profile, I did not personally see any evidence that would justify spending even $1 a month for the Premium features.

The race to the bottom with AI tools

What we are seeing in the AI startup space is a perfect example of the “no moat” problem: if your core product is essentially just clever prompt engineering wrapped around someone else’s frontier model, it is trivially easy for a competitor to reverse-engineer your workflow and undercut your price. Over the last few months, this lack of a defensible moat has triggered a rapid race to the bottom in automated peer review, moving from expensive managed services to open-source “bring your own key” (BYOK) scripts.

Here I am going to look at three tools specifically designed to review academic papers: Refine, IsItCredible, and Coarse.

Overview of the Tools

Refine: Refine positions itself as a premium, rigorous option for institutions, boasting testimonials from Ivy League professors and a high price point of $49.99 per review. It uses what it calls “massive parallel compute” to make hundreds of LLM calls to stress-test every line of a document.

IsItCredible: Built on the open-source Reviewer 2 pipeline, IsItCredible offers a standardized, pay-per-use middle ground with core reports starting at $5. It employs a clever “adversarial” architecture where “Red Team” agents try to find flaws and a “Blue Team” verifies them to prevent hallucinations.

Coarse: Coarse represents the logical endpoint of this race as an open-source “Bring Your Own Key” (BYOK) tool that lets you run complex multi-agent reviews locally or via OpenRouter. Because users pay the API costs directly instead of a markup, a comprehensive paper review is significantly cheaper.

The “LLM as a Judge” Problem

The hardest part of all this is evaluation. How do you know if the AI reviewer is actually good?

Refine relies almost entirely on anecdotal evidence. Their own FAQ essentially tells you to just try it and see the difference for yourself, claiming that general-purpose chatbots cannot match their depth even with expert prompting. This “try it yourself” approach is effective for marketing, but it isn’t a hard benchmark.

IsItCredible and Coarse are trying to be more systematic. The IsItCredible team released a paper, Yell at It: Prompt Engineering for Automated Peer Review, where they benchmarked their tool against five alternatives. They claim 15 wins out of 20 pairings. Similarly, Coarse claims to have been “blind-evaluated” against Refine and Reviewer 2, scoring higher on coverage and specificity.

However, we are still largely in the “LLM as a judge” era. These benchmarks often use another LLM to decide which review is better. It is circular logic. Until we have a “Ground Truth” dataset of known mathematical errors or logical fallacies in published papers, we are just measuring which AI writes the most convincing-sounding critique.

Because evaluation is so difficult, this software category risks becoming a classic market for lemons. It is incredibly difficult to identify substantive differences in quality between these tools without some external, hard benchmark. To truly evaluate if Refine’s expensive managed service is meaningfully better than Coarse’s open-source BYOK run, you have to verify the AI’s claims. But verifying those claims requires spending just as much time reading and reviewing the original paper as you would have spent just doing the review yourself from scratch. Without transparent benchmarks, users cannot easily distinguish high-quality rigorous analysis from convincing hallucinations, driving the market toward the cheapest option by default.

For those building AI tools, this entire space serves as a warning about the race to the bottom. I have previously written about deep research tools as another example of this phenomenon. If your only value proposition is a well-orchestrated prompt chain, open-source alternatives will inevitably compress your margins to zero. Eventually, the native GUI interfaces of the frontier models themselves may just become good enough that your specialized service isn’t even needed.

Meta

Did you like this post? Guess what, it was entirely generated via Google’s API models (specifically the Gemini CLI). I have saved the chat session and a log of how long it took here. You can see for yourself: I had a broad idea, asked it to review different materials, and then had it generate the post. In total it took 25 minutes of iteration from start to finish.

The original post also is not flagged by Pangram as AI generated.

It definitely is not 100% my style (and to be clear, this meta section is 100% hand written). For the final paragraph about deep research tools, I also struggled to get the model to say what I wanted – I wanted it to say “deep research tools are another example where this same situation will occur”. I am keeping the original 100% AI generated post for posterity though, so folks can see what is possible with the current tools.

Interview on LEAP about LLMs for Mortals

I was recently interviewed by Jason Elder on the Law Enforcement Analysts Podcast about my new book, Large Language Models for Mortals: A Practical Guide for Analysts.

Jason does an excellent job interviewing (and does a quality job editing the audio), so I suggest following the podcast if you are a crime analyst or researcher working with police departments.

We cover large swaths of the book: the basics of APIs, structured info extraction, some high-level discussion of RAG, and how AI coding tools still need a bit of human oversight and direction. Even if you are not a coder, I think picking up a copy is a good idea to get an understanding of what is possible with the current tools.

Just to catalog the different coupon codes for the book:

  • LLMDEVS to get 50% off of the epub
  • TWOFOR1 to get $30 off when purchasing two books (can be any two books)

I also give a coupon code for the paperback version of the book in the interview. So take a listen if you are interested in $20 off the paperback.

You can purchase either epub or paperback from my store worldwide.

Using Claude Code to help me write

Using LLMs to help you write is understandably a touchy subject for many. There is quite a bit of AI slop coming out now, as it is really easy to just have the LLM tools think for you and write superficially OK but ultimately garbage prose.

For my recent book, LLMs for Mortals, I used Sonnet 4.1 to write the initial draft (for around $5). My prior book took around a year, whereas I was able to finish this book in around two months. I definitely did a ton of copy-editing (maybe 20-30 hours per chapter on average), but I believe around 50% of the book material is the original Sonnet generated prose.

LLMs are a tool – they can be used poorly, but I think they can be used quite well. Pangram, a tool used to detect AI writing, does not flag any of the passages in LLMs for Mortals as AI generated.

This blog post goes over my notes on how I used Claude Code to help me write (although it really is applicable to any of the current coding tools, like Codex or Gemini as well). As a meta-reference, this blog post is 100% written by myself directly, but I will link to a draft written using Claude Code later in the post for a frame of reference.

Copy Editing

First, even if you do not agree with having an LLM write for you directly, there is a use case that should be relatively uncontroversial – having an LLM take a copy-edit pass on your work.

Here is a recent example: the blog post on Crime De-Coder that goes over the benefits of using an API vs local LLMs. In this conversation, you can see my original draft and the suggestions that Claude’s desktop tool (the free version) gave.

Again, this is not really specific to Claude (it would have worked fine in ChatGPT as well). LLMs are good not only for spelling errors, but for grammatical issues that spell check will not catch, as well as more general copy-editing advice on the content.

One practical point – to replicate my setup, you need to write in plain text. Most of the things I write are in some form of markdown (plain markdown for blog posts, and Quarto for longer reports/books/etc.). This makes it much easier to use the tools, especially the command line interface (CLI) tools like Claude Code.

Writing New Content

There are two big issues currently with LLM writing:

  • it is potentially wrong
  • current LLM writing has a particular style that is itself becoming noticeable

For the first bullet, you need to review what it writes. It is much easier to have it write on content you are an expert in, since it is easier to review and spot errors. (It is the same problem as using the tools to help you write computer code – they are boons for senior developers but can write a ton of slop in which more neophyte coders have a hard time spotting issues.)

The second bullet, having the style mimic your own, is what I am going to discuss here. It is worth understanding at a high level how generative LLMs work – if you ask “answer question X” vs “here is a book, …, answer question X”, the LLM will generate a different response. The prefix in the latter prompt, “here is a book, …”, is what is referred to as the context. Current models have context windows (how large the potential input can be) of around 500,000 words (technically they are around 1 million tokens, but one word is often multiple tokens).

You generally do not want to fill up the context window 100%, but 500,000 words is a very large number – just in text it would be multiple books. Another common prompting technique is what are called k-shot examples. It typically goes like:

example input1: ...text... expected_output: ...blah...
example input2: ...diff text... expected_output: ...blah2...
....

This is what you place in the context window; then you submit your usual prompt and have the LLM generate the content. The prior examples help guide the LLM toward what you expect the final output to look like. This works the same way with writing – give the LLM prior examples of your writing to help it mimic your style.

To keep it simple, I have created an example on github to follow along. Basically just have your prior writing (in text!), and then ask Claude Code something like:

review my prior blog posts in folder /blogposts, I am going to have you write a new blog post on topic X given the outline *after* you review the text

Then after your prior work is in the context window, feed the LLM an outline for what you want to write. In this example, I put the outline in an actual text file and said:

In the ClaudeWritingPost folder, review the outline.txt, then create a new md 
file, called ClaudeWritingExample.md, filling in the sections based on the 
outline

Claude Code will then go review the text file with the outline and write the post. In the GitHub repo I have my original outline for this same post, so you can compare side-by-side.

You can technically write custom commands and skills with Claude Code (or the other CLI tools) to save the steps of typing two prompts, but to keep it simple for folks I am just showing the two steps manually. It is really just those two steps – get your prior examples into the context window, and then feed an outline for what else to write.

In the GitHub repo you can see some additional Claude.md files – these are files that include additional instructions. A common instruction I include is “do not include emojis”. LLM writing also tends to be verbose and have excessive lists, so I have instructions to avoid those as well.

The written blog post is not bad – I would suggest reading it as a proof of concept (I exported the session; you can see it cost around fifty cents). Part of the reason I do not typically use this flow for blog posts is that I often add or change things in the process of writing. So you can see my personally written post is longer and has a few more elements.

So when would you use it? For technical writing, like python tutorials, it works very well – hence I could have it write the first pass of my LLM book and keep 50% of the content. I may use it for blog posts in the future (if I felt compelled to write something every day), but I will not take that plunge for now.

For longer pieces, like an entire paper or a book, I suggest not only making a detailed outline, but also having the LLM write in smaller sections. This helps both with reviewing the content and with keeping the LLM on track if you make edits or changes as you go. (In longer conversations it is more likely to degrade and make repeated errors.)

An Extra Note About Citations

I am not writing academic papers much anymore, but another fundamental problem with LLM writing is hallucinating citations. If you write in text markdown files, my suggestion is fairly simple – have the papers you want to cite in a bibtex file, and in-line in markdown, only cite papers in the form:

Citation, @item1 says blah [@item1; @item2]. For a specific page quote [@item1 p. 34-35].

The way I write my outlines, a step is typically “write a paragraph about X, cite papers a, b, c”. So my personal style of progressively filling in an outline works well with LLMs.

This presumes you already have a list of papers (and are not using the LLM to dynamically write your lit review based on papers you have not read). Next time I actually need to write an academic paper, I may write an MCP tool to query Semantic Scholar’s API and create a nice bibtex file.

But the solution here, again, is that you need to review the output for accuracy. Even without these tools, people are lazy and cite things they have not read, so that will continue to happen (the tools just make it easier). Those who figure out how to use the tools appropriately, though, can be much more productive writers.
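If you follow the bibtex convention above, one mechanical guard against hallucinated citations is to scan the markdown for @keys and check each against the .bib file. A minimal sketch (the regexes are deliberately simplistic; real bibtex parsing is messier):

```python
import re

def missing_citations(markdown_text, bibtex_text):
    """Return citation keys used in the markdown but not defined in the bibtex."""
    cited = set(re.findall(r"@([A-Za-z0-9_:-]+)", markdown_text))
    defined = set(re.findall(r"@\w+\{([^,\s]+),", bibtex_text))
    return sorted(cited - defined)

md = "Citation, @item1 says blah [@item1; @item2]."
bib = "@article{item1,\n  title={Example},\n}"
print(missing_citations(md, bib))  # prints ['item2']
```

Running something like this after each LLM drafting pass catches invented keys immediately, though it cannot tell you whether a real key is cited appropriately – that still requires reading the paper.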

Large Language Models for Mortals book

I have published a new book, Large Language Models for Mortals: A Practical Guide for Analysts with Python. The book is available to purchase in my store, either as a paperback (for $59.99) or an epub (for $49.99).

The book is a tutorial on using python with all the major LLM foundation model providers (OpenAI, Anthropic, Google, and AWS Bedrock). It goes through the basics of API calls, structured outputs, RAG applications, and tool-calling/MCP/agents. The book also has a chapter on LLM coding tools, with example walkthroughs for GitHub Copilot, Claude Code (including how to set it up via AWS Bedrock), and Google’s Antigravity editor. (It also has a few examples of local models, which I discuss in Chapter 2 before going on to the APIs in Chapter 3.)

You can review the first 60 or so pages (PDF link here if on iPhone).

While many of the examples in the book are criminology focused, such as extracting crime elements from incident narratives or summarizing time series charts, the lessons are more general and relevant to anyone looking to learn the LLM APIs. I say “analyst” in the title, but this is really relevant to:

  • traditional data scientists looking to expand into LLM applications
  • PhD students (in all fields) who would like to use LLM applications in their work
  • analysts looking to process large amounts of unstructured textual data

Basically, for anyone who wants to build LLM applications, this is the book to help you get started.

I wrote this book partially out of fear – the rapid pace of LLM development has really upended my work as a data scientist. In just the past year or two, these have become the most important set of skills (more so than traditional predictive machine learning). This book is the one I wish I had several years ago, and it will give analysts a firm grounding in using LLMs in realistic applications.

Again, the book is available for purchase worldwide. Here are all the sections in the book – whether you are an AWS or Google shop, want to learn the different database alternatives for RAG, or want self-contained examples of agents with python code for OpenAI, Anthropic, or Google, this should be a resource you highly consider purchasing.

Several more blog posts are coming in the near future: how I set up Claude Code to help me write (and not sound like a robot), how to use conformal inference and logprobs to set false positive rates for classification with LLMs, and some pain points with compiling a Quarto book with stochastic outputs (and the varying reliability of each of the models).

But for now, just go and purchase the book!


Below is the table of contents to review – the print version is over 350 pages (letter paper), with over 250 python code snippets and over 80 screenshots.

Large Language Models for Mortals: A Practical Guide for Analysts with Python
by Andrew Wheeler
TABLE OF CONTENTS
Preface
Are LLMs worth all the hype?
Is this book more AI Slop?
Who this book is for
Why write this book?
What this book covers
What this book is not
My background
Materials for the book
Feedback on the book
Thank you
1 Basics of Large Language Models
1.1 What is a language model?
1.2 A simple language model in PyTorch
1.3 Defining the neural network
1.4 Training the model
1.5 Testing the model
1.6 Recapping what we just built
2 Running Local Models from Hugging Face
2.1 Installing required libraries
2.2 Downloading and using Hugging Face models
2.3 Generating embeddings with sentence transformers
2.4 Named entity recognition with GLiNER
2.5 Text Generation
2.6 Practical limitations of local models
3 Calling External APIs
3.1 GUI applications vs API access
3.2 Major API providers
3.3 Calling the OpenAI API
3.4 Controlling the Output via Temperature
3.5 Reasoning
3.6 Multi-turn conversations
3.7 Understanding the internals of responses
3.8 Embeddings
3.9 Inputting different file types
3.10 Different providers, same API
3.11 Calling the Anthropic API
3.12 Using extended thinking with Claude
3.13 Inputting Documents and Citations
3.14 Calling the Google Gemini API
3.15 Long Context with Gemini
3.16 Grounding in Google Maps
3.17 Audio Diarization
3.18 Video Understanding
3.19 Calling the AWS Bedrock API
3.20 Calculating costs
4 Structured Output Generation
4.1 Prompt Engineering
4.2 OpenAI with JSON parsing
4.3 Assistant Messages and Stop Sequences
4.4 Ensuring Schema Matching Using Pydantic
4.5 Batch Processing For Structured Data Extraction using OpenAI
4.6 Anthropic Batch API
4.7 Google Gemini Batch
4.8 AWS Bedrock Batch Inference
4.9 Testing
4.10 Confidence in Classification using LogProbs
4.11 Alternative inputs and outputs using XML and YAML
4.12 Structured Workflows with Structured Outputs
5 Retrieval-Augmented Generation (RAG)
5.1 Understanding embeddings
5.2 Generating Embeddings using OpenAI
5.3 Example Calculating Cosine similarity and L2 distance
5.4 Building a simple RAG system
5.5 Re-ranking for improved results
5.6 Semantic vs Keyword Search
5.7 In-memory vector stores
5.8 Persistent vector databases
5.9 Chunking text from PDFs
5.10 Semantic Chunking
5.11 OpenAI Vector Store
5.12 AWS S3 Vectors
5.13 Gemini and BigQuery SQL with Vectors
5.14 Evaluating retrieval quality
5.15 Do you need RAG at all?
6 Tool Calling, Model Context Protocol (MCP), and Agents
6.1 Understanding tool calling
6.2 Tool calling with OpenAI
6.3 Multiple tools and complex workflows
6.4 Tool calling with Gemini
6.5 Returning images from tools
6.6 Using the Google Maps tool
6.7 Tool calling with Anthropic
6.8 Error handling and model retry
6.9 Tool Calling with AWS Bedrock
6.10 Introduction to Model Context Protocol (MCP)
6.11 Connecting Claude Desktop to MCP servers
6.12 Examples of Using the Crime Analysis Server in Claude Desktop
6.13 What are Agents anyway?
6.14 Using Multiple Tools with the OpenAI Agents SDK
6.15 Composing and Sequencing Agents with the Google Agents SDK
6.16 MCP and file searching using the Claude Agents SDK
6.17 LLM as a Judge
7 Coding Tools and AI-Assisted Development
7.1 Keeping it real with vibe coding
7.2 VS Code and GitHub Install
7.3 GitHub Copilot
7.4 Claude Code Setup
7.5 Configuring API access
7.6 Using Claude Code to Edit Files
7.7 Project context with CLAUDE.md
7.8 Using an MCP Server
7.9 Custom Commands and Skills
7.10 Session Management
7.11 Hooks for Testing
7.12 Claude Headless Mode
7.13 Google Antigravity
7.14 Best practices for AI-assisted coding
8 Where to next?
8.1 Staying current
8.2 What to learn next?
8.3 Forecasting the near future of foundation models
8.4 Final thoughts

Part time product design positions to help with AI companies

Recently on the Crime Analysis sub-reddit an individual posted about working with an AI product company developing a tool for detectives or investigators.

The Mercor platform has many opportunities that may be of interest to my network, so I am sharing them here. These include positions not only for investigators, but also GIS analysts, writers, community health workers, etc. (For the eligibility interviewer roles, I think anyone who has held a job in government services would likely qualify; it is just reviewing questions.)

All are part time (minimum of 15 hours per week), remote, and can be based in the US, Canada, or UK. (They cannot support H1-B or OPT visas in the US.)

Additionally, for professionals looking to get into the tech job market, see these two resources:

I actually just hired my first employee at Crime De-Coder. Always feel free to reach out if you think you would be a good fit for the types of applications I am working on (python, GIS, crime analysis experience). I will put you on the list to reach out to when new opportunities are available.


Detectives and Criminal Investigators

Referral Link

$65-$115 hourly

Mercor is recruiting Detectives and Criminal Investigators to work on a research project for one of the world’s top AI companies. This project involves using your professional experience to design questions related to your occupation as a Detective and Criminal Investigator. Applicants must:

  • Have 4+ years full-time work experience in this occupation
  • Be based in the US, UK, or Canada
  • Commit to a minimum of 15 hours per week

Community Health Workers

Referral Link

$60-$80 hourly

Mercor is recruiting Community Health Workers to work on a research project for one of the world’s top AI companies. This project involves using your professional experience to design questions related to your occupation as a Community Health Worker. Applicants must:

  • Have 4+ years full-time work experience in this occupation
  • Be based in the US, UK, or Canada
  • Commit to a minimum of 15 hours per week

Writers and Authors

Referral Link

$60-$95 hourly

Mercor is recruiting Writers and Authors to work on a research project for one of the world’s top AI companies. This project involves using your professional experience to design questions related to your occupation as a Writer and Author.

Applicants must:

  • Have 4+ years full-time work experience in this occupation
  • Be based in the US, UK, or Canada
  • Commit to a minimum of 15 hours per week

Eligibility Interviewers, Government Programs

Referral Link

$60-$80 hourly

Mercor is recruiting Eligibility Interviewers, Government Programs to work on a research project for one of the world’s top AI companies. This project involves using your professional experience to design questions related to your occupation as an Eligibility Interviewer, Government Programs. Applicants must:

  • Have 4+ years full-time work experience in this occupation
  • Be based in the US, UK, or Canada
  • Commit to a minimum of 15 hours per week

Cartographers and Photogrammetrists

Referral Link

$60-$105 hourly

Mercor is recruiting Cartographers and Photogrammetrists to work on a research project for one of the world’s top AI companies. This project involves using your professional experience to design questions related to your occupation as a Cartographer and Photogrammetrist. Applicants must:

  • Have 4+ years full-time work experience in this occupation
  • Be based in the US, UK, or Canada
  • Commit to a minimum of 15 hours per week

Geoscientists, Except Hydrologists and Geographers

Referral Link

$85-$100 hourly

Mercor is recruiting Geoscientists, Except Hydrologists and Geographers to work on a research project for one of the world’s top AI companies. This project involves using your professional experience to design questions related to your occupation as a Geoscientist, Except Hydrologists and Geographers. Applicants must:

  • Have 4+ years full-time work experience in this occupation
  • Be based in the US, UK, or Canada
  • Commit to a minimum of 15 hours per week

I translated my book for $7 using openai

The other day an officer from the French Gendarmerie commented that they use my python for crime analysis book. I asked that individual about a translation, and he stated they all speak English. But given my book is written in plain text markdown and compiled using Quarto, it is not that difficult to pipe the text through a tool to translate it into other languages. (Knowing that epubs under the hood are just html, it would not surprise me if there is some epub reader that can use google translate.)

So you can see I now have four new books available in the Crime De-Coder store:

The ebook versions are normally $39.99, and print is $49.99 (both available worldwide). For the next few weeks, you can use promo code translate25 (until 11/15/2025) to purchase the epub versions for $19.99.

If you want to preview the first two chapters of the books, here are the PDFs:

And here I added a page on my crimede-coder site with testimonials.

As the title says, in the end this cost (less than) $7 to convert to French (and ditto to convert to Spanish).

Here is code demoing the conversion. It uses OpenAI’s GPT-5 model, but smaller and cheaper models would likely work just fine if you did not want to fork out $7. It ended up being a quite simple afternoon project (parsing the markdown was the bigger pain).

So the markdown for the book in plain text looks like this:

Because markdown uses line breaks to denote different sections, those end up being fairly natural breaks for the translation. These GenAI tools cannot repeat back very long sequences, but a paragraph is a good length: long enough to have additional context, but short enough for the machine not to go off the rails when trying to return just the text you input. Then I have extra logic to not parse code sections (which start/end with three backticks). I do not even bother to parse out the other sections (like LaTeX or HTML); I just include in the prompt an instruction to not modify those.

So I just read in the quarto document, split by “”, then feed the text sections into OpenAI. I did not test this very much, just using the current default gpt-5 model with medium reasoning. (It is quite possible a non-reasoning smaller model would do just as well. I suspect the open models would be fine.)
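The loop can be sketched roughly like this (the splitting logic, prompt wording, and file name are my own guesses at the approach, not the exact linked code; the API call assumes the current openai python SDK’s Responses endpoint):

```python
# Sketch of the translation loop: keep fenced code blocks verbatim,
# translate prose one paragraph at a time.
import re

def split_segments(text):
    # Split out fenced code blocks first, then split prose on blank lines.
    parts = re.split(r"(```.*?```)", text, flags=re.DOTALL)
    segments = []
    for part in parts:
        if part.startswith("```"):
            segments.append(("code", part))  # never sent for translation
        else:
            for para in part.split("\n\n"):
                if para.strip():
                    segments.append(("prose", para.strip()))
    return segments

def translate(chunk, target="French"):
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    resp = client.responses.create(
        model="gpt-5",
        input=(f"Translate the following markdown to {target}. "
               "Do not modify LaTeX, HTML, or inline code. "
               f"Return only the translation.\n\n{chunk}"),
    )
    return resp.output_text

# Usage (file name is a placeholder):
# doc = open("book.qmd").read()
# out = [c if k == "code" else translate(c) for k, c in split_segments(doc)]
```

Paragraph-sized chunks are the design choice here – they give the model enough context to translate well, but are short enough that it reliably returns only the translated text.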

You will ultimately still want someone to spot check the results and then do some light edits. For example, here is a passage where I am talking about running code in the REPL, first in English:

Running in the REPL

Now, we are going to run an interactive python session, sometimes people call this the REPL, read-eval-print-loop. Simply type python in the command prompt and hit enter. You will then be greeted with this screen, and you will be inside of a python session.

And then in French:

Exécution dans le REPL

Maintenant, nous allons lancer une session Python interactive, que certains appellent le REPL, boucle lire-évaluer-afficher. Tapez simplement python dans l’invite de commande et appuyez sur Entrée. Vous verrez alors cet écran et vous serez dans une session Python.

So the acronym is carried forward, but the expansion behind the acronym is translated, so it no longer matches the letters REPL. (And I went and edited that for the versions on my website.) But look at this section in the intro talking about GIS:

There are situations when paid for tools are appropriate as well. Statistical programs like SPSS and SAS do not store their entire dataset in memory, so can be very convenient for some large data tasks. ESRI’s GIS (Geographic Information System) tools can be more convenient for specific mapping tasks (such as calculating network distances or geocoding) than many of the open source solutions. (And ESRI’s tools you can automate by using python code as well, so it is not mutually exclusive.) But that being said, I can leverage python for nearly 100% of my day to day tasks. This is especially important for public sector crime analysts, as you may not have a budget to purchase closed source programs. Python is 100% free and open source.

And here in French:

Il existe également des situations où les outils payants sont appropriés. Les logiciels statistiques comme SPSS et SAS ne stockent pas l’intégralité de leur jeu de données en mémoire, ils peuvent donc être très pratiques pour certaines tâches impliquant de grands volumes de données. Les outils SIG d’ESRI (Système d’information géographique) peuvent être plus pratiques que de nombreuses solutions open source pour des tâches cartographiques spécifiques (comme le calcul des distances sur un réseau ou le géocodage). (Et les outils d’ESRI peuvent également être automatisés à l’aide de code Python, ce qui n’est pas mutuellement exclusif.) Cela dit, je peux m’appuyer sur Python pour près de 100 % de mes tâches quotidiennes. C’est particulièrement important pour les analystes de la criminalité du secteur public, car vous n’avez peut‑être pas de budget pour acheter des logiciels propriétaires. Python est 100 % gratuit et open source.

So it translated GIS to SIG in French (Système d’information géographique). Which seems quite reasonable to me.

I paid an individual to review the Spanish translation (if any readers are interested in giving me a quote for copy-editing the French version, I would appreciate it). She stated it is overall very readable, but has many minor issues. Here is a sample of her suggestions:

The total number of edits she suggested was 77 (out of 310 pages).

If you are interested in another language, just let me know. I am not sure about translation quality for the Asian languages, but I imagine it works OK out of the box for most languages derived from Latin. Another benefit of self-publishing: I can have the French version available now, and if I find someone to help with the copy-edits I will just update the draft after I get their feedback.

LinkedIn is the best social media site

The end goals I want for a social media site are:

  • promote my work
  • see other people’s work

Social media may have other uses for other people. I do comment and have minor interactions on these sites, but that is not my primary use, so my context is more business oriented (I do not have Facebook, and have not considered it). I also participate some on Reddit, but pretty sparingly.

LinkedIn is currently the best for both, relative to X and BlueSky. So I encourage folks with the same interests as mine to migrate to LinkedIn.

LinkedIn

So I started Crime De-Coder around 2 years ago. I first created a website, and then started a LinkedIn page.

When I first created the business page, I invited most of my criminal justice contacts to follow it. I had maybe 500 followers just from that first wave of invites. At first I posted once or twice a week; growth was very steady, and I was over 1500 followers in maybe just a month or two.

Now, LinkedIn has a reputation for spammy life-coach self promotion (for lack of a better description). I intentionally try to post somewhat technical material, but keep it brief and understandable. It is mostly things I am working on that I think will be of interest to crime analysts or the general academic community. Here is one of my recent posts on structured outputs:

The current follower count for my LinkedIn business page (which in retrospect may have been a mistake, as I think they promote business pages less than personal pages) is 3230, and I have fairly consistent growth of a few new followers per day.

I first started posting once a week, and with additional growth expanded to once every other day, and at one point once a day. I have cut back recently (mostly just due to time). I did get more engagement, around 1000+ views per day, when I was posting every day.

Probably the most important part of advertising Crime De-Coder, though, is the types of views I am getting. My followers are not just academic colleagues I was previously friends with; a decent share outside my first degree network are police officers and other non-profit related folks. I have landed several contracts where I know the individuals reached out to me based on my LinkedIn posting. It could be more, as my personal Crime De-Coder website ranks very poorly on Bing search, but my LinkedIn posts come up fairly high.

When I was first on Twitter I did have a few academic collaborations that I am not sure would have happened without it (a paper with Manne Gerell, and a paper with Gio Circo, although I had met Gio in real life before that). I do not remember getting any actual consulting work though.

I mentioned LinkedIn is not only better for advertising my work, but also for consuming other material. I did a quick experiment: I opened the home page and scrolled through the first 3 non-advertisement posts on LinkedIn, X, and BlueSky. For LinkedIn:

This is likely a person I do not want anything to do with, but I agree with their comment. Whenever I use Service Now at my day job I want to rage quit (just send a Teams chat or email and be done with it; LLMs can do smarter routing nowadays). The next two posts are from people I am directly connected with: some snark by Nick Selby (I can understand the sentiment, albeit disagree, though I will not bother to comment), and something posted by Mindy Duong I likely would be interested in:

Then another advert, and then a post by Chief Patterson of Raleigh, whom I am not directly connected with, but which was liked by Tamara Herold and Jamie Vaske (whom I am connected with).

So the adverts are annoying, but the suggested posts (the feeds are weird now; they are not chronological) are not bad. I would prefer if LinkedIn had separate “general” and “my friends” sections, but overall I am happier with the content I see on LinkedIn than on the other sites.

X & BlueSky

I first created a personal Twitter account in 2018. Nadine Connell suggested it, and it was nice then. When I first joined, I think it was Cory Haberman who tweeted to follow my work, and I had a few hundred followers that first day. Then over the next two years, just posting blog posts and papers for the most part, I grew to over 1500 followers IIRC. I also consumed quite a bit of content from criminal justice colleagues. It was much more academically focused, but it was a very good source of recent research and CJ relevant news and content.

I eventually deleted the Twitter account due to a colleague being upset that I liked a tweet. To be clear, the colleague was upset but it was not a very big deal; I just did not want to deal with it.

I started a Crime De-Coder X account last year. I made the account to watch the Trump interview, and just decided to roll with it. I tried really hard to make X work – I posted daily, the same stuff I had been sharing on LinkedIn, just shorter form. After 4 months, I have 139 followers (again, when I joined Twitter in 2018 I had more than that on day 1), and some of those followers are porn accounts or bots. The majority of my posts get <=1 like and 0 reposts. It just has not gotten my work out there the way Twitter did in 2018 or LinkedIn does now.

So in terms of sharing work, the more recent X has been a bust. In terms of viewing other work, my X feed is dominated by short form video content (a mimic of TikTok) that I do not really care about. This is after extensively blocking/muting/marking that I do not like a lot of the content. I promise I tried really hard to make X work.

So when I open up the X home feed, it is two videos by Musk:

Then a thread by Per-Olof (whom I follow), and then another short video, a Death App joke:

I thought this was satire, but clicking through that fellow’s posts, I think he may actually be involved in promoting that app. I do not know, but I do not want any part of it.

I have not been on BlueSky as long, but given my experience getting started on Twitter and X, I am not going to worry about posting as much. I have 43 followers, and posts similar to those on X have gotten basically zero interaction for the most part. The content feed is different from X, but is still not something I care that much about.

We have Jeff Asher and his football takes:

I am connected with Jeff on LinkedIn, where he only posts his technical material. So if you want to hear Jeff’s takes on football and UT-Austin stuff, go ahead and follow him on BlueSky. Then we have a promotional post by a psychologist (I likely would be interested in following his work, though this particular post is not very interesting). And a not-funny Onion-like post?

Then Gavin Hales, whom I follow and who typically shares good content. And another post I leave without comment.

My BlueSky feed is currently dominated by folks in the UK. It could be good, but it just does not have the uptake to make it worth it the way Twitter was in 2018. Though given my different goals now, advertising my consulting business, Twitter in 2018 may not have been good for that either.

So for folks who subscribe to this blog, I highly suggest giving LinkedIn a try for your social media consumption and sharing.

The story of my dissertation

My dissertation is freely available to read on my website (Wheeler, 2015). I still open up the hardcover I purchased every now and then. No one cites it, because no one reads dissertations, but it is easily the work I am most proud of.

Most of the articles I write have some motivating story behind the work that you would never know about just from reading the words. I think this is important, as the story is often tied to some more fundamental problem, and solving specific problems is the main way we make progress in science. The stifling way academics currently write peer reviewed papers does not allow that extra narrative in.

For example, my first article (and what ended up being my masters thesis; at Albany at that time you could go directly into the PhD from undergrad and get your masters on the way) was about the journey to crime after people move (Wheeler, 2012). The story behind that paper: while I was working at the Finn Institute, Syracuse PD was interested in targeted enforcement of chronic offenders, many of whom drive around without licenses. I thought, why not look at the journey to crime to see where they are likely driving. When I did that analysis, I noticed a few hundred chronic offenders had something like a 5-fold number of home addresses in the sample. (If you still want to know where they drive: they drive everywhere, chronic offenders have very wide spatial footprints.)

Part of the motivation behind that paper was: if people move all the time, how can their home matter? They do not really have a home. This is a good segue into the motivation of the dissertation.

More of my academic reading at that point had been on macro and neighborhood influences on crime. (Forgive me, as I am likely to get some of the timing wrong in my memory, but this is as best as I remember it.) I had a class with Colin Loftin (I do not remember the course name) that discussed things like the southern culture of violence, Rob Sampson’s work on neighborhoods and crime, and likely other macro work I cannot recall. Sampson’s work in Chicago made the biggest impression on me. I have a scanned copy of Shaw & McKay’s Juvenile Delinquency (2nd edition). I also took a spatial statistics class with Glenn Deane in the sociology department, where the major focus was on areal units.

When thinking about the dissertation topic, the only advice I remember receiving was about scope. Shawn Bushway at one point told me about the stapler thesis (three independent papers bundled into a single dissertation). I just wanted something big, something important. I intentionally set out to try to answer some more fundamental question.

So I had the first inkling of “how can neighborhoods matter if people do not consistently live in the same neighborhood?” The second was that in my work at the Finn Institute with police departments, hot spots were the only thing any police department cared about. It is not uncommon even now for an academic to fit a spatial model of crime and demographics at the neighborhood level, and have a throwaway paragraph in the discussion about how it would help police better allocate resources. It is comically absurd – you can just count up crimes at addresses or street segments and rank them, and that will be a much more accurate and precise system (no demographics needed).

So I wanted to do work on micro level units of analysis. But I had Glenn and Colin on my dissertation committee, people very interested in macro and neighborhood level processes, so I would need to justify looking at small units of analysis. Reading the literature, Weisburd and Sherman did not, to me, have clearly articulated reasons to be interested in micro places beyond their utility for police. Sherman had the paper counting up crimes at addresses (Sherman et al., 1989), and none of Weisburd’s work had, to me, any clear causal reasoning for looking at micro places to explain crime.

To be clear, wanting to look at small units as the only guidepost in choosing a topic is a terrible place to start. You should start from a more specific, articulable problem you wish to solve (if others pursuing PhDs are reading). But I did not have that level of clarity in my thinking at the time.

So I set out to articulate a reason to be interested in micro level areas that I thought would satisfy Glenn and Colin. I started out thinking about doing a simulation study, similar to what Stan Openshaw did (1984), which was motivated by Robinson’s (1950) ecological fallacy. While doing that I realized there was no point in the simulation; you could figure it all out in closed form (as have others before me). So I proved that random spatial aggregation would not result in the ecological fallacy, but aggregating nearby spatial areas would, assuming there is a spatial covariance between nearby areas. I thought at the time it was a novel proof – it was not (footnote 1 on page 9 lists all the things I read after this). Even now the Wikipedia page on the ecological fallacy has an unsourced overview of the issue, that cross-spatial correlations make the micro and macro equations unequal.

This in and of itself is not interesting, but the process did clearly articulate to me why you want to look at micro units. The example I like to give is as follows – imagine you have a bar you think causes crime. The bar can cause crime inside the bar, as well as diffuse risk into the nearby area. Think people getting in fights in the bar, vs people being robbed walking away after a night of drinking. If you aggregate to large units of analysis, you cannot distinguish between “inside bar crime” and “outside bar crime”. So that is a clear causal reason for choosing particular units of analysis – the ability to estimate diffusion/displacement effects is highly dependent on the spatial unit of analysis. If you have an intervention like “make the bar hire better security” (ala John Eck’s work), it should likely not have any impact outside the bar, only inside the bar. So local vs diffusion effects are not entirely academic; they can have specific real world implications.
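As a toy numerical illustration of the bar example (entirely my own numbers, not from the dissertation): simulate micro units in adjacent pairs, with a local effect of 2 and a diffusion effect of 1 onto the neighbor. The micro level regression recovers the local effect, while the pair-aggregated regression lumps the two together:

```python
# Toy simulation of aggregation mixing local and diffusion effects.
# Micro units come in adjacent pairs; each unit's crime depends on its own
# "bar exposure" (local effect = 2) plus its neighbor's (diffusion effect = 1).
# Exposures are drawn independently, so the micro slope isolates the local effect.
import random

random.seed(0)
n_pairs = 50_000

xa = [random.gauss(0, 1) for _ in range(n_pairs)]
xb = [random.gauss(0, 1) for _ in range(n_pairs)]
ya = [2 * a + 1 * b + random.gauss(0, 1) for a, b in zip(xa, xb)]
yb = [2 * b + 1 * a + random.gauss(0, 1) for a, b in zip(xa, xb)]

def slope(xs, ys):
    # OLS slope of y on x
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Micro level: regress each unit's crime on its own exposure
micro = slope(xa + xb, ya + yb)

# Aggregate level: average each pair, then regress
agg = slope([(a + b) / 2 for a, b in zip(xa, xb)],
            [(a + b) / 2 for a, b in zip(ya, yb)])

print(round(micro, 1))  # ~2.0, the local effect only
print(round(agg, 1))    # ~3.0, local + diffusion combined
```

The aggregated slope is the sum of the local and diffusion effects, so at the aggregate level you cannot tell how much of the association comes from inside vs outside the bar.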

This logic does not always explicitly favor smaller spatial units of analysis though. Another example I like to give: say you are evaluating a city wide gun buy back. You could look at more micro areas than the entire city, e.g. see if crime decreased in neighborhood A and increased in neighborhood B, but that likely does not invalidate the macro city wide analysis, which is just an aggregate estimate over the entire city, and in some cases is preferable.

Glenn Deane at some point told me that I am a reductionist, which was the first time I heard that word, but it did encapsulate my thinking. You could always go smaller; there is no atom to stop at. But often it just does not matter – you could examine the differences in crime between the front stoop and the back porch, but there are not likely meaningful causal reasons to do so. This logic works for temporal aggregation and for aggregating different crime types as well.

I would need to reread Great American City, but I did not take this to be necessarily contradictory to Sampson’s work on neighborhood processes. Rob came to SUNY Albany to give a talk at the sociology department (I do not remember the year). Glenn invited me to whatever they were doing after the talk, and being a hillbilly I said I needed to go back to work at DCJS, as I was on my lunch break. (To be clear, no one at DCJS would have cared.) I am sure I would not have been able to articulate anything of importance to him, but in retrospect I do wish I had taken that opportunity.

So with the knowledge of how aggregation bias occurs in hand, I formulated a few different empirical research projects. One was the idea behind bars and crime I have already given an example of. I had a few interesting findings, one of which is that the diffusion effects were larger than the local effects. I also estimated the bias from bars selecting into high crime areas via a non-equivalent dependent variable design – the only time I have used a DAG in any of my work.

I gave a job talk at Florida State before the dissertation was finished. I had this idea in the hotel room the night before my talk. It was a terrible idea to add it to my talk, and I did not prepare what I was going to say sufficiently, so it came out like a jumbled mess. I am not sure whether I would want to remember or forget that series of events (which include me asking Ted Chiricos at dinner whether you can fish in the Gulf of Mexico; I feel I am OK in one-on-one chats, but in group dinners I am more awkward than you can possibly imagine). There were nice discussions too, though. Dan Mears asked me a question about emergent macro phenomena that I did not have a good answer to at the time, but now I would say simple causal processes having emergent phenomena is a reason to look at the micro, not the macro. Eric Stewart asked me if there is any reason to look at the neighborhood level and I said no at the time, but I should have given my gun buy-back example.

The second empirical study I took from broken windows theory (Kelling & Wilson, 1982). For the majority of social science theories, some spatial diffusion is to be expected. Broken windows theory, though, has a very clear spatial hypothesis – you need to see disorder for it to impact your behavior. So you do not expect spatial diffusion to occur beyond someone’s line of sight. To measure disorder, I used 311 calls (I had this idea before I read Dan O’Brien’s work, see my prospectus, but Dan published his work on the topic shortly thereafter, O’Brien et al., 2015).

I confirmed this prediction to be the case, conditional on controlling for neighborhood effects. I also discuss how, if the underlying process is smooth, using discrete neighborhood boundaries can result in negative spatial autocorrelation, which I show some evidence of as well.
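That boundary mechanism can be illustrated with a stylized one-dimensional sketch (my own toy illustration, not taken from the dissertation): when a smoothly varying process is demeaned within discrete zones, the residuals sit high at one edge of a zone and low at the adjacent edge of the next zone, so pairs of units that straddle a boundary are negatively correlated.

```python
import numpy as np

# Stylized 1D example: a smoothly increasing process over 90 micro
# units, chopped into 9 discrete "neighborhoods" of 10 units each.
y = np.arange(90, dtype=float)
zone = np.arange(90) // 10

# Demean within each neighborhood (i.e., neighborhood fixed effects).
zone_means = np.array([y[zone == z].mean() for z in range(9)])
resid = y - zone_means[zone]

# Average product of residuals for adjacent unit pairs, split by
# whether the pair straddles a neighborhood boundary.
prod = resid[:-1] * resid[1:]
cross = zone[:-1] != zone[1:]
within_pairs = prod[~cross].mean()   # positive within neighborhoods
between_pairs = prod[cross].mean()   # strongly negative at boundaries
print(within_pairs, between_pairs)
```

Nothing about the underlying process is negatively autocorrelated here; the negative dependence at the boundaries is an artifact of chopping a smooth surface into arbitrary discrete pieces.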

This suggests that a smooth measure of neighborhoods, like Hipp’s idea of egohoods (Hipp et al., 2013), is probably more reasonable than discrete neighborhood boundaries (which are often quite arbitrary).
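A minimal sketch of that smooth alternative (again my own toy 1D illustration, not Hipp’s actual implementation): give each unit its own “neighborhood” by averaging over a window centered on that unit, so the neighborhood measure varies gradually instead of jumping at arbitrary boundaries.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical disorder counts for 90 micro units laid out on a line.
disorder = rng.poisson(3, 90).astype(float)

# Egohood-style measure: each unit's neighborhood is a window centered
# on that unit (clipped at the edges), so adjacent units get heavily
# overlapping, smoothly varying neighborhood values.
radius = 5
egohood = np.array([
    disorder[max(i - radius, 0): i + radius + 1].mean()
    for i in range(disorder.size)
])

# Adjacent egohood values share most of their window, so the measure
# changes gradually rather than jumping at a drawn boundary.
print(np.abs(np.diff(egohood)).max(), np.abs(np.diff(disorder)).max())
```

In two dimensions the window becomes a circular buffer around each block, but the logic is the same: overlapping neighborhoods remove the artificial discontinuities that discrete boundaries impose.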

While I ended up publishing those two empirical applications (Wheeler, 2018; 2019), which was hard, I was too defeated to even worry about posting a more specific paper on the aggregation idea. (I think I submitted that paper to Criminology, but it was not well received.) I was partially burned out from the bars and crime paper, which went through at least one R&R at Criminology and was still rejected. And then I went through four rejections for the 311 paper. At that point I had multiple other papers that took years to publish. It is a slog, and degrading to be rejected so much.

But that is really my only substantive contribution to theoretical criminology in any guise. After the dissertation, I just focused on either policy work or engineering/method applications, which are much easier to publish.

References