Testing day-of-week crime randomness paper published

My paper, Testing Serial Crime Events for Randomness in Day of Week Patterns with Small Samples, was recently published in the Journal of Investigative Psychology and Offender Profiling. Here is the pre-print version on SSRN if you can’t get access to that journal.

The main idea behind the paper is this: if you have a series of a few crime events that you know are linked to the same offender, can you tell whether those patterns are random with respect to the day of the week? We know spatial patterns are often clustered, but police responses such as surveillance are conditioned not only on a spatial location, but also on particular days and times. I wanted to know when I could go to command staff and say, yeah you should BOLO on Saturday. Or just as importantly say in response, no the observed patterns could easily happen if the offender were just randomly picking days.

In the paper I show that if you have only 3 events and they all occur on the same day, you would reject the null that crimes have an equal probability across all seven days of the week at a p-value of less than 0.05. I also show that the exact test I propose has pretty good power for as few as 8 events in the series. So if you have, say 10 events and you fail to reject the null that each day of the week has equal probability of being chosen, it is pretty good evidence that a police response should not have any preference for a particular day.
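
The three-event claim is easy to check by hand. A minimal sketch, assuming the null of equal probability across the seven days (the function name p_all_same is only for illustration): the most extreme outcome is every event landing on the same day, and there are seven such outcomes, so the exact p-value for that outcome is 7 × (1/7)^n.

```r
# Probability that all n events fall on the same weekday under the uniform null.
# p_all_same is a hypothetical helper, not part of the code posted with the paper.
n_days <- 7
p_all_same <- function(n_events) n_days * (1 / n_days)^n_events

p_all_same(2)  # 1/7, not enough to reject at 0.05
p_all_same(3)  # 1/49, about 0.0204, so three events on one day rejects at 0.05
```

So two events on the same day are not enough to reject, but three are.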

To illustrate how one would use the test, I have a simple spreadsheet posted here (the zip file also has my SPSS code to reproduce the results in the paper) in which you can type in the days of the week that the crimes occurred on, and it calculates the hypothesis test.

The spreadsheet contains both the G-test and Kuiper’s V test. If you don’t read the paper and understand the difference, just use the G-test and ignore the Kuiper’s V results. For crime analysts, this is basically the minimum of what you need to know.


For analysts who are more into the nitty gritty, I also have some R code that is a bit more flexible, and calculates the exact test for varying numbers of bins and provides some code to conduct power analysis. So you can either download the code from GitHub and insert it to define the functions, or simply copy-paste it into the console. The only library dependency is the partitions library, so make sure that is installed before following along.

So if you have downloaded the code, you can use something like below to insert the functions and load the partitions library.

library(partitions)
mydir <- "C:\\Users\\andrew.wheeler\\Dropbox\\Documents\\BLOG\\ExactTest_Weekdays"
setwd(mydir)
source("Exact_Dist.R")

Now, say you had a series of crimes that had 4 on Saturday, 3 on Tuesday, and 1 on Sunday. You can test this for randomness by simply using:

crime <- c(1,0,3,0,0,0,4)
res <- SmallSampTest(d=crime)
res

Which prints at the console:

Small Sample Test Object 
Test Type is G 
Statistic is 15.5455263389754 
p-value is:  0.0182662  
Data are:  1 0 3 0 0 0 4 
Null probabilities are:  0.14 0.14 0.14 0.14 0.14 0.14 0.14 
Total permutations are:  3003  

This defaults to using the likelihood ratio G-test, but you can also use Kuiper’s V, the chi-square test, or the Kolmogorov-Smirnov test. You can also change the null hypothesis to something other than equal probability across the bins. I default to the G-test in my paper because it is more powerful than the more typical chi-square after 8 crimes for 7 day-of-week bins, but equal in power to the chi-square for smaller sample sizes. So to do the chi-square test on the same data, use:

resChi <- SmallSampTest(d=crime, type="Chi")
resChi
chisq.test(crime) #for comparison to base R 
chisq.test(crime, simulate.p.value = TRUE, B = 10000)

You can see that the test statistic matches base R’s chisq.test, and the p-value is slightly higher than the asymptotic p-value (the exact test should always have a higher p-value than the asymptotic distribution, and here it is lower than the simulated p-value). In this situation the simulation approach would have been fine. I prefer the exact approach when feasible though, because it is exact, and you don’t need to worry about convergence for the simulation (for which most everyone simply picks a large number and hopes for the best).
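
To make concrete what "exact" means here, the G-test result printed earlier can be reproduced by brute force in base R. This is a minimal sketch and not the code in Exact_Dist.R: the function name exact_g_test is hypothetical, it enumerates outcomes with combn (stars and bars) instead of the partitions library, and it only makes sense for small samples. It lists every way to spread the events over the bins, computes each outcome's multinomial probability under the null, and sums the probabilities of outcomes whose G statistic is at least as large as the observed one.

```r
# Brute-force exact G-test (illustrative sketch, hypothetical function name)
exact_g_test <- function(obs, p_null = rep(1 / length(obs), length(obs))) {
  n <- sum(obs)
  m <- length(obs)
  expected <- n * p_null
  # G = 2 * sum(O * log(O / E)), treating 0 * log(0) as 0
  g_stat <- function(x) 2 * sum(ifelse(x > 0, x * log(x / expected), 0))
  # Stars and bars: each choice of m-1 bar positions among n+m-1 slots
  # corresponds to one way of splitting n events across m bins
  bars <- combn(n + m - 1, m - 1)
  counts <- apply(bars, 2, function(b) diff(c(0, b, n + m)) - 1)
  probs <- apply(counts, 2, dmultinom, prob = p_null)
  stats <- apply(counts, 2, g_stat)
  g_obs <- g_stat(obs)
  # Small tolerance so outcomes tied with the observed statistic are counted
  p_val <- sum(probs[stats >= g_obs - 1e-9])
  list(statistic = g_obs, p.value = p_val, n_outcomes = ncol(counts))
}

crime <- c(1, 0, 3, 0, 0, 0, 4)
res <- exact_g_test(crime)
res$statistic   # compare to the 15.5455 printed earlier
res$p.value     # compare to the 0.0182662 printed earlier
res$n_outcomes  # 3003, the same count as choose(14, 6)
```

The tolerance in the comparison matters: outcomes that are reorderings of the observed counts have the same G statistic in theory, but floating point sums can differ in the last digits.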

I’ve also made some code that allows for easy evaluation of the power of the exact test. Coding wise it was easiest to simply use the original object created with the test, so I know it invites post-hoc power analysis – forgive me for my sloth in coding practices. So say you wanted to do a priori power analysis with the Kolmogorov-Smirnov (KS) test for 10 bins and 15 observations (so over 1.3 million permutations, i.e. n <- 15; m <- 10; choose(n+m-1,m-1)). You can simply make an original object (with any observed values across the bins).

test10_data <- c(15,rep(0,9))
test10_perm <- SmallSampTest(d=test10_data, type="KS")
#takes around a minute

The default null is equal probability across the bins, and to do a power analysis you have to specify an alternative. Let’s say for the alternative there is equal probability in 5 of the bins, and zero probability in the other 5. (Most of the work is done in making the original permutation object, and the power analysis itself is quite fast, which is why I coded it to work this way.)

p_alt <- c(rep(1/5,5),rep(0,5))
Pow_test <- PowAlt(SST=test10_perm,p_alt=p_alt)
Pow_test

This prints out at the console:

Power for Small Sample Test 
Test statistic is: KS  
Power is: 0.1822815  
Null is: 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  
Alt is: 0.2 0.2 0.2 0.2 0.2   0   0   0   0   0  
Alpha is: 0.05  
Number of Bins: 10  
Number of Observations: 15  

So for this alternative there is quite low power, only 0.18. But if we change it to only have mass in four of the bins, the power goes way up to over 0.99.

> p_alt2 <- c(rep(1/4,4),rep(0,6))
> Pow_test2 <- PowAlt(SST=test10_perm,p_alt=p_alt2)
> Pow_test2
Power for Small Sample Test 
Test statistic is: KS  
Power is: 0.9902265  
Null is: 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  
Alt is: 0.25 0.25 0.25 0.25   0   0   0   0   0   0  
Alpha is: 0.05  
Number of Bins: 10  
Number of Observations: 15 

So this shows how the exact test R code can be extended beyond just 7 day-of-week bins. I have not really explored the power of the KS test or differing numbers of bins though.
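
For readers who want to see what a power calculation of this kind involves, here is a minimal base R sketch. It is not PowAlt itself: the function name exact_power is hypothetical, it uses the G statistic rather than KS, and it assumes power is defined as the probability, under the alternative, of observing an outcome whose exact p-value under the null is at or below alpha. It enumerates every outcome, so it is only practical for small problems like 7 bins and 8 events.

```r
# Brute-force power for an exact G-test (illustrative sketch, hypothetical name)
exact_power <- function(n, p_null, p_alt, alpha = 0.05) {
  m <- length(p_null)
  expected <- n * p_null
  # Enumerate all ways to split n events over m bins (stars and bars)
  bars <- combn(n + m - 1, m - 1)
  counts <- apply(bars, 2, function(b) diff(c(0, b, n + m)) - 1)
  stats <- apply(counts, 2, function(x) {
    2 * sum(ifelse(x > 0, x * log(x / expected), 0))
  })
  pr_null <- apply(counts, 2, dmultinom, prob = p_null)
  pr_alt <- apply(counts, 2, dmultinom, prob = p_alt)
  # Exact p-value of each outcome: null probability of a statistic at least as large
  p_vals <- sapply(stats, function(s) sum(pr_null[stats >= s - 1e-9]))
  # Power: probability under the alternative of landing in the rejection region
  sum(pr_alt[p_vals <= alpha])
}

p_null <- rep(1 / 7, 7)
exact_power(8, p_null, p_null)               # size of the test, at most alpha
exact_power(8, p_null, c(1, rep(0, 6)))      # all mass on one day, power is 1
exact_power(8, p_null, c(0.4, rep(0.1, 6)))  # a made-up alternative for illustration
```

Because the null distribution only has to be enumerated once, trying many alternatives is cheap, which is the same reason the power step is fast once the original permutation object exists.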

License plate readers and the trade off in privacy

As a researcher in criminal justice, tackling ethical questions is a difficult task. There are no hypotheses to test, nor models to fit, just opinions being bandied around. I figured I would take my best shot at writing some coherent thoughts on the topic of the data police collect and its impacts on personal privacy – and my blog is really the best outlet.

What prompted this is a recent Nick Selby post which suggested the use of license plate readers (LPRs) to target Johns in LA is one of the worst ideas ever and a good example of personal privacy invasion by law enforcement. (Also see this Washington Post opinion article.)

I have a bit of a different and more neutral take on the program, and will try to articulate some broader themes in personal privacy invasion and the collection/use of data by police. I think it is an important topic and will continue to be with the continual expansion of public sensor data being collected by the police (with body worn cameras, stationary cameras, cell phone data, GPS traces being some examples). Basically, much of the negative sentiment I’ve seen so far of this hypothetical intervention are for reasons that don’t have to do with privacy. I’ll articulate these points by presenting alternative, currently in use police programs that use similar means, but have different ends.

To describe the LA program in a nutshell, the police use what are called license plate readers to identify particular vehicles being driven in known prostitution areas. LPRs are just cameras that take a snapshot of a license plate, automatically code the alpha-numeric plate, and then place that [date-time-location-plate-car image] in a database. Linking up this data with registered vehicles, in LA the idea is to have the owner of the vehicle sent a letter in the mail. The letter itself won’t have any legal consequences, just a note that says the police know you have been spotted. The idea in theory is that you will think you are more likely to be caught in the future, and may have some public shaming also if your family happens to see the letter, so you will be less likely to solicit a prostitute in the future.

To start with, some of the critiques of the program focus on the possibilities of false positives. Probably no reasonable person would think this is a worthwhile idea if the false positive rate is anything but small – people will be angry with being falsely accused, there are negative externalities in terms of family relationships, and any potential crime reduction upside would be so small that it is not worthwhile. But, I don’t think that itself is damning to this idea – I think you could build a reasonable algorithm to limit false positives. Say the car is spotted multiple times at a very specific location, and at specific times, and the owner’s home address is not near the location. It would be harder to limit false positives in areas where people conduct other legitimate business, but I think it has potential with just LPR data, and would likely improve by adding in other information from police records.

If you have other video footage, like from a stationary camera, I think limiting false positives can definitely be done by incorporating things like loitering behavior and seeing the driver interact with an individual on the street. Eric Piza has done similar work on human coding/monitoring video footage in Newark to identify drug transactions, and I have had conversations with an IBM Smart City rep. and computer scientists about automatically coding audio and video to identify particular behaviors that are just as complicated. False negatives may still be high, but I would be pretty confident you could create a pretty low false positive rate for identifying Johns.

As researchers, we often limit our inquiries to just evaluating 1) whether the program works (e.g. reduces crime) and 2) if it works whether it is cost-effective. LPRs and custom notifications are an interesting case compared to say video cameras because they are so cheap. Cameras and the necessary data storage infrastructure are so expensive that, to be frank, they are unlikely to be a cost-effective return on investment in any short term time frame even given the best case scenario crime reductions (ditto for police body worn cameras). LPRs and mailing letters on the other hand are cheap (both in terms of physical capital and human labor), so even small benefits could be cost-effective.

So in short, I don’t think the idea should be dismissed outright because of false positives, and the idea of using public video/sensor footage to proactively identify criminal behavior could be expanded to other areas. I’m not saying this particular intervention would work, but I think it has better potential than some programs police departments are currently spending way more money on.

Assuming you could limit the false positives, the next question is whether it is ok for the police to intrude on the privacy of individuals who have not committed any particular crime. The answer to this I don’t know, but there are other examples of police sending letters that are similar in nature but haven’t generated much critique. One is the use of letters to trick offenders with active warrants into turning themselves in. Another more similar example though is custom notifications. These are very similar in that often the individuals aren’t identified because of specific criminal charges, but are identified using data analytics and human intelligence to place them as high risk and gang involved offenders. Intrusion to privacy is way higher for these custom notifications than the suggested Dear John letters, but individuals did much more to precipitate police action as well.

When the police stop you in the car or on the street the police are using discretion to intrude in your privacy under circumstances where you have not necessarily committed a crime. Is there any reason a cop has to take that action in person versus seeing it on a video? Automatic citations at red light cameras are similar in mechanics to what this program is suggesting.

The note about negative externalities to legitimate businesses in the areas and the cost of letters I consider hyperbole. Letters are cheap, and actual crime data is frequently available that could already be used to redline neighborhoods. But Nick’s critique of the information being collated by outside agencies and used in other actuarial aspects, such as loans and employment decisions, I think is legitimate. I have no good answers to this problem – I have mixed feelings as I think open data is important (which ironically I can’t quantify in any meaningful way), and I think perpetual online criminal histories are a problem as well. Should we not have public crime maps though because businesses are less likely to invest in high crime neighborhoods? I think doing a criminal background check for many businesses is a legitimate query as well.

I have mixed feelings about familial shaming being an explicit goal of the letters, but compared to an arrest the letter is mundane. It is even less severe than a citation (and under some state laws you could be given a citation for loitering in a high prostitution area). Is a program that intentionally tries to shame a person – which I agree could have incredible family repercussions – a legitimate goal of the criminal justice system? Fair question, but in terms of privacy issues I think it is a red herring – you can swap in different letters that would not have those repercussions but still use the same means.

What if instead of the "my eyes are on you" letter the police simply sent a PSA like post-card that talked about the blight of sex workers? Can police never send out letters? How about if police send out letters to people who have previous victimizations about ways to prevent future victimization? I have a feeling much of the initial negative reactions to the Dear John program are because of the false positive aspect and the "victimless" nature of the crime. The ethical collection and use of data is a bit more subtle though.

LPR data was initially intended to passively identify stolen cars, but it is pretty ripe for mission creep. One example is that the police could use LPR data to actively track a car’s location without a warrant. It is easy to think of both good and other bad examples of its use. For good examples, retrospectively identifying a car at the scene of a crime I think is reasonable, or to notify the police of a vehicle associated with a kidnapping.

For another example use of LPR data, what if the police did not send custom notifications, but used such LPR data to create a John list of vehicles, and then used that as information to profile the cars? If we think using LPR data to identify stolen cars is a legitimate use should we ignore the data we have for other uses? Does the potential abuse of the data outweigh the benefits – so LPR collection shouldn’t be allowed at all?

For equivalent practices, most police departments have chronic offender or gang lists that use criminal history, victimizations, where you have been stopped and who you have been stopped with to create similar databases. This is all from data the police routinely collect. The LPR data can be reasonably questioned whether it is available for such analytics use – police RMS data is often available in large swaths to the general public though.

Although you can question whether police should be allowed to collect LPR data, I am going to assume LPR data is not going to go away, and cameras definitely are not. So how do you regulate the use of such data within police departments? In New York, when you conduct an online criminal history check you have to submit a reason for doing the check. That is, a police officer or a crime analyst can’t do a check of their next door neighbor because they are curious – they are supposed to have a more relevant reason related to some criminal investigation. You could have a similar set up with LPR that prevents actively monitoring a car except in particular circumstances and purges the data after a particular time frame. It would be up to the state though to enact legislation and monitor its use. There is currently some regulation of gang databases, such as sending notifications to individuals if they are on the list and when to take people off the list.

Similar questions can be extended beyond public cameras though to other domains, such as DNA collection and cell phone data. Cell phone data is regularly collected with warrants currently. DNA searching is going beyond the individual to familial searches (imagine getting a DUI, and then the police use your DNA to tell that a close family member committed a rape).

Going forward, to frame the discussion of police behavior in terms of privacy issues, I would ask two specific questions:

  • Should the police be allowed to collect this data?
  • Assuming the police have said data, what are reasonable uses of that data?

I think the first question, should the police be allowed to collect this data, should be intertwined with how well the program works and how cost-effective the program is (or its potential, if the program has not been implemented yet). There are no bright lines, but there will always be a trade off between personal privacy and public intrusion. Higher personal intrusion would demand a higher level of potential benefits in terms of safety. Given that LPRs are passively collecting data I consider it an open question whether they meet a threshold of whether it is reasonable for the police to collect such data.

Some data police now collect, such as public video and DNA, I don’t see going away whether or not they meet a reasonable trade-off. In those cases I think it is better to ask what are reasonable uses of that data and how to prevent abuses of it. Basically any police technology can be given extreme examples where it saved a life or where a rogue agent used it in a nefarious way. Neither extreme case should be the only information individuals use to evaluate whether such data collection and use is ethical though.

Spatial analysis course in CJ (graduate) – Spring 2016 SUNY Albany

This spring I am teaching a graduate level GIS course for the school of criminal justice on the downtown SUNY campus. There are still seats available, so feel free to sign up. Here is the page with the syllabus, and I will continue to add additional info./resources to that page.

Academics tend to focus on regression of lattice/areal data (e.g. see Matt Ingram’s course over in Poli. Sci.), and in this course I tried to mix in more things I regularly encountered while working as a crime analyst that I haven’t seen coverage of in other GIS courses. For example I have a week devoted to the journey to crime and geographic offender profiling. I also have a week devoted to introducing the current most popular models used to forecast crime.

I’ve started a specific wordpress page for courses, which I will update with additional courses I prepare.

Accepted position at University of Texas at Dallas

After being on the job market 2+ years, I have finally landed a tenure-track job at the University of Texas at Dallas in their criminology department. Long story short, I’m excited for the opportunity at Dallas, and I’m glad I’m done with the market.

I will refrain from giving job advice, as I doubt I did a good job in many circumstances at any stage. But now that it is over I wanted to map all the locations I applied to. Red balloons are places I had an in person interview for a tenure track position.

In the end I applied to around 80 ads over the two year period (about 40 per wave), and I had 8 in person interviews before I landed the Dallas position. My rate is worse than all of my friends/colleagues on the market during the past few years (hence why I shouldn’t give advice), but around 4~5 interviews before getting an offer is the norm among the small sample size of my friends (SUNY Albany CJ grads that is).

So for folks soon to be on the market, this is one data point of what to expect.

Presentation at ASC 2015

Later this week I will be at the American Society of Criminology meetings in D.C. I am presenting some of the work from my dissertation on the correlation between 311 calls for service and crime as a test of the broken windows thesis. I have an updated pre-print on SSRN based on some reviewer feedback, the title is

The Effect of 311 Calls for Service on Crime in D.C. at Micro Places

and here is the structured abstract:

Objectives: This study tests the broken windows theory of crime by examining the relationship between 311 calls for service and crime at the street segment and intersection level in Washington, D.C.

Methods: Using data on 311 calls for service in 2010 and reported Part 1 crimes in 2011, this study predicts the increase in counts of crime per street unit per additional reported 311 call for service using negative binomial regression models. Neighborhood fixed effects are used to control for omitted neighborhood level variables.

Results: 311 calls for service based on detritus and infrastructure complaints both have a positive but very small effect on Part 1 crimes while controlling for unobserved neighborhood effects.

Conclusions: Results suggest that 311 calls for service are a valid indicator of physical disorder where available. The findings partially confirm the broken windows hypothesis, but reducing physical disorder is unlikely to result in appreciable declines in crime.

Not in the paper (but in my presentation), here is the marginal relationship between infrastructure related 311 complaints and crime

I am presenting the paper on Wednesday at 11 am. The panel title is Environmental Approaches to Crime Prevention and Intervention, and it is located at Hilton, E – Embassy, Terrace Level. There are two other presentations as well, all related to the spatial analysis of crime. (Kelly Edmiston has followed up and stated he can not make it.)

I will be in D.C. from Wednesday until Friday afternoon, so if you want to get together in that time frame feel free to send me an email.

Poster presentations should have a minimum font size of 25 points

A fairly generic problem I’ve been trying to do some research on is how large should fonts be for posters and PowerPoint presentations. The motivation is my diminishing eyesight over the years, and in particular default labels for statistical graphics are almost always too small in my opinion. Projected presentations just exacerbate the problem.

First, to tackle the project we need to find research about the sizes at which individuals can comfortably read letters. You don’t measure size of letters in absolute distance terms though, you measure it in the subtended angle that an object commands in your vision. That is, it is both a function of the height of the letters as well as the distance you are away from the object. I.e. in the below diagram angle A is larger than angle B.

The best guide for the size of this angle I have found for letters is an article by Sidney Smith, Letter Size and Legibility. Smith (1979) had a set of students make various labels and then have people stand too far away to be able to read them. Then the participants walked towards the labels until they could read them. Here is the histogram of those subtended angles (in radians) Smith produced:

From this Smith gives the recommendation as 0.007 radians as a good bet for pretty much everyone to be able to read the text. My research into other recommendations (eye tests, highway symbols) tends to be smaller, and between mine and Smith’s other sources tends to produce a range of 0.003 to 0.010 radians. Personal experimentation for me is that 0.007 is a good size, although up to 0.010 is not uncomfortably large. Most everyone with corrective vision can clearly see under 0.007, but we shouldn’t be making our readers strain to read the text.

For comparison, I sit approximately 22 inches away from my computer screen. A subtended angle of 0.007 produces a font size of just over 11 points at that distance. At my usual sitting distance I can read fonts down to 7 points, but I would prefer not to under usual circumstances.

This advice can readily translate to font sizes in poster presentations, since there is a limited range in which people will attempt to read them. Block’s (1996) suggestion that most people are around 4 feet away when they read a poster seems pretty reasonable to me, and so this produces a letter height of 0.34 inches needed to correspond to a 0.007 subtended angle. One point of font is 1/72 inches in letter height, so this converts to a 25 point font as the minimum to which most individuals can comfortably read the words in a poster. (R Functions at the end of the post for conversions, although it is based on relatively simple geometry.)

This advice is larger than Block’s (which is 20 point), but fits in line with Colin Purrington’s templates, which use 28 point for the smallest font. Again note that this is the minimum font for the poster; things like titles and author names should clearly be larger than the minimum to create a hierarchy. Again, a frequent problem is axis labels for statistical graphics.

It will take more work to extend this advice to projected presentations, since there is more variability in projected sizes as well as rooms. So if you see a weirdo with a measuring tape at the upcoming ASC conference, don’t be alarmed, I’m just collecting some data!


Here are some R functions. The first takes a height and distance and returns the subtended angle (in radians). The second takes the distance and radians and returns a height.

visual_angleR <- function(H,D){ 
   x <- 2*atan(H/(2*D))
   return(x)
}

visual_height <- function(D,Rad) {
  x <- 2*D*tan(Rad/2) #can use sin as well instead of tan
  return(x)
}

Since a point of font is 1/72 of an inch, the code to calculate the recommended font size is visual_height(D=48,Rad=0.007)*72 and I take the ceiling of this value for the 25 point recommendation.
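
The same geometry can be wrapped up to go straight from viewing distance to a font size. This is a small sketch under the post's assumptions (distances in inches, 72 points per inch, 0.007 radians as the default angle); the helper name min_font_points is made up for illustration and repeats the height formula so the snippet runs on its own.

```r
# Hypothetical helper: font size in points needed at a viewing distance (inches)
min_font_points <- function(dist_in, rad = 0.007) {
  height_in <- 2 * dist_in * tan(rad / 2)  # letter height subtending `rad` at dist_in
  height_in * 72                           # one point of font is 1/72 of an inch
}

min_font_points(48)           # poster read from 4 feet, roughly 24.2
ceiling(min_font_points(48))  # 25, the poster recommendation
min_font_points(22)           # monitor at 22 inches, just over 11
min_font_points(48, 0.010)    # the upper end of the 0.003 to 0.010 range
```

Changing the distance or the angle argument shows how sensitive the recommendation is to either assumption.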

Music and distractions in the workplace

I was recently re-reading Zen and the Art of Motorcycle Maintenance, and it re-reminded me of why I do not like to listen to music in the workplace. The thesis in Pirsig’s book (in regards to listening to music) is simple; you can’t concentrate entirely on the task at hand if you have music distracting you. So those who value their work tend to not have idle distractions like music playing (and be all engrossed in their work).

I have worked in various shared workspaces (cubicles and shared offices) for quite a while now, and I do have a knack for going off into space and ignoring all of the background noise around me. But I still do not like listening to music, even though I have learned to cope with the situation. At this point I prefer the open office workspace, as there at least is no illusion of privacy. When I worked at a cubicle someone coming behind me and scaring me was basically a daily thing.

Scott Adams, the artist of the Dilbert comic, had a recent blog post saying that music is the lesser evil compared to constant distractions via the internet (email, facebook, twitter, etc.). This I can understand as well, and sometimes I turn off the wi-fi to try to get work done without distraction. I don’t see how turning on music helps, but given its prevalence it may just be differences between myself and other people. I should probably turn off the wi-fi for all but an hour in the morning and an hour in the afternoon every day, but I’m pretty addicted to the internet at this point.

It partly depends on the task I am currently working on though how easily I am distracted. Sometimes I can get really engrossed in a particular problem and become obsessed with it to the point you could probably set the office on fire and I wouldn’t notice. For example this programming problem dominated my thoughts for around two days, and I ended up thinking of the general solution while I did not have access to the computer (while I was waiting for my car to get inspected). Most of the time though I can only give that type of concentration for an hour or two a day though, and the rest of the time I am working in a state of easy distraction.

Background music I don’t like, and other ambient noises I can manage to drown out, but background TV drives me crazy. My family was watching videos (on TV and tablets) the other day while I was reading Zen and ironically I became angry, because I was really into the book and wanted to give it my full concentration. I know people who watch TV in bed to go to sleep, and it is giving me a headache just thinking about it while I am writing this blog post.

I highly recommend both Zen and the Art of Motorcycle Maintenance and Scott Adams’s blog. I’m glad I revisited Zen, as it is an excellent philosophical book on the logic of science that did not make much of an impression on me as an undergrad, but I have a much better grasp of it after having my PhD and reading some other philosophy texts (like Popper).

New working paper: What We Can Learn from Small Units of Analysis

I’ve posted a new working paper, What We Can Learn from Small Units of Analysis to SSRN. This is a derivative of my dissertation (by the same title). Below is the abstract:

This article provides motivation for examining small geographic units of analysis based on a causal logic framework. Local, spatial, and contextual effects are confounded when using larger units of analysis, as well as treatment effect heterogeneity. I relate these types of confounds to all types of aggregation problems, including temporal aggregation, and aggregation of dependent or explanatory variables. Unlike prior literature critiquing the use of aggregate level data, examples are provided where aggregation is unlikely to hinder the goals of the particular research design, and how heterogeneity of measures in smaller units of analysis is not a sufficient motivation to examine small geographic units. Examples of these confounds are presented using simulation with a dataset of crime at micro place street units (i.e. street segments and intersections) in Washington, D.C.

As always, if you have comments or critiques let me know.

Tables and Graphs paper rejection/update – and on the use of personal pronouns in scientific writing

My paper, Tables and Graphs for Monitoring Temporal Crime Patterns was recently rejected from Policing: An International Journal of Police Strategies & Management. I’ve subsequently updated the SSRN draft based on feedback from the review, and here I post the reviews and my responses to those reviews (in the text file).

One of the main critiques by both reviewers was that the paper was too informal, mainly because of the use of "I" in the paper. I use personal pronouns in writing intentionally, despite typical conventions in scientific writing, so I figured a blog post about why I do this is in order. I’ve been criticized for it on other occasions as well, but this is the first time it was listed as a main reason to reject an article of mine.

My main motivation comes from Michael Billig’s book Learn to Write Badly: How to Succeed in the Social Sciences (see a prior blog post I wrote on the contents). In a nutshell, when you use personal pronouns it is clear that you, the author, are doing something. When you rewrite the sentence to avoid personal pronouns, you often obfuscate who the actor is in a particular sentence.

For an example of Billig’s point that personal pronouns can be more informative, I state in the paper:

I will refer to this metric as a Poisson z-score.

I could rewrite this sentence as:

This metric will be referred to as a Poisson z-score.

But that is ambiguous as to its source. Did someone else coin this phrase, and I am borrowing it? No – it is a phrase I made up, and using the personal pronoun clearly articulates that fact.

Pretty much all of the examples where I eliminated first person in the updated draft were of the nature,

In this article I discuss the use of percent change in tables.

which I subsequently changed to:

This article discusses the use of percent changes as a metric in tables.

Formal I suppose, but insipid. Rewriting the sentence to avoid the first person pronoun only makes the article seem like a sentient being, and forces me to use the passive voice. I don’t see how the rewritten version is better in any way, shape, or form, yet this is one of the main reasons my paper was rejected. The use of "we" in academic articles seems to be more common, but using "we" when there is only one author is just silly. So I will continue to use "I" when I am the only author.

Favorite maps and graphs in historical criminology

I was reading Charles Booth’s Life and Labour of the People in London (available entirely at Google books) and stumbled across this gem of a connected dot plot (between pages 18-19, maybe it came as a fold out in the book?)

(You will also get a surprise of the hand of the scanner in the page prior!) This reminded me I wanted to make a collection of my favorite historical examples of maps and graphs for criminology and criminal justice. If you read through Calvin Schmid’s Handbook of Graphical Presentation (available for free at the internet archive), you will see it was a royal pain to create such statistical graphics by hand before computers. It makes you appreciate the effort all that much more, and many of the good ones will rival the quality of any graphic you can make on the computer.

Calvin Schmid himself has some of my favorite example maps. See for instance this gem from Urban Crime Areas: Part II (American Sociological Review, 1960):

The most obvious source of great historical maps in criminology though is from Shaw and McKay’s Juvenile Delinquency in Urban Areas. It was filled with incredible graphs and maps throughout. Here are just a few examples. (These shots are taken from the second edition in 1969, but they are all from the first part of the book, so were likely in the 1942 edition):

Dot maps

Aggregated to grid cells

The concentric zonal model

And they even have some binned scatterplots to ease in calculating linear regression equations

Going back further, Friendly in A.-M. Guerry’s moral statistics of France: Challenges for multivariable spatial analysis has some examples of Guerry’s maps and graphs. Besides choropleth maps, Guerry has one of the first examples of a ranked bumps chart (as later coined by Edward Tufte) of the relative rankings of the counts of crime at different ages (1833):

I don’t have access to any of Quetelet’s historical maps, but Cook and Wainer in A century and a half of moral statistics in the United Kingdom: Variations on Joseph Fletcher’s thematic maps have examples of Joseph Fletcher’s choropleth maps (as of 1849):

Going to more recent mapping examples, the Brantinghams’ most notable graphic, I suspect, is their crime pattern nodes and paths diagram, but my favorites are the ASCII glyph contour maps in Crime seen through a cone of resolution (1976):

The earliest example of a journey-to-crime map I am aware of is Capone and Nichols’s Urban structure and criminal mobility (1976) (I wouldn’t be surprised, though, if there are earlier examples).

Besides maps, one other famous criminology graphic that came to mind was the age-crime curve. This is from Age and the Explanation of Crime (Hirschi and Gottfredson, 1983) (pdf here). This I presume was made with the computer – although I imagine it was still a pain in the butt to do it in 1983 compared to now! Andresen et al.’s reader Classics in Environmental Criminology in the Quetelet chapter has an age crime curve recreated in it (1842), but I will see if I can find an original scan of the image.

Edit: Was able to find an online scan of Quetelet’s original work in French. This has a fitted sine curve as one of the figures, but if you check out the tables he has binned arrest rates (page 65).


I will admit I have not read Wolfgang’s work, but I imagine he had graphs of the empirical cumulative distribution of crime offenses somewhere in Delinquency in a Birth Cohort. But William Spelman has many great examples of them for both people and places. Here is one superimposing the two from Criminal Careers of Public Places (1995):

Michael Maltz has spent much work on advocating for visual presentation as well. Here is an example from his chapter, Look Before You Analyze: Visualizing Data in Criminal Justice (pdf here) of a 2.5d kernel density estimate. Maltz discussed this in an earlier publication, Visualizing Homicide: A Research Note (1998), but the image from the book chapter is nicer.

Here is an album with all of the images in this post. I will continue to update this post and album with more maps and graphs from historical work in criminology as I find them. I have a few examples in mind: I plan on adding a multivariate scatterplot in Oscar Newman’s Defensible Space, and I think Sampson’s work in Great American City deserves to be mentioned as well, because he follows in much of the same tradition as Shaw and McKay and presents many simple maps and graphs to illustrate the patterns. I would also like to find the earliest network sociogram of crime relationships. Maltz’s book chapter has a few examples, and Papachristos’s historical work on Al Capone should be mentioned as well (I thought I remembered some nicer network graphs though in Papachristos’s book chapter in the Morselli reader).

Let me know if there are any that I am missing or that you think should be added to the list!