Update for Aoristic Macro in SPSS

I’ve substantially updated the aoristic macro for SPSS from what I previously posted. The updated code can be found here. The improvements are:

  • The code is much more modular. It is now a single function that takes an Interval parameter to determine which interval summaries you want.
  • It includes Agresti-Coull binomial error intervals (95% confidence intervals). It also returns a percentage estimate and the total number of cases the estimate is based on, besides the usual info for time period, split file, and the absolute aoristic estimate.
  • It allows an optional command to save the reshaped long dataset.

The functionality I dropped is the default plots and the saving of begin, end and middle times for the same estimates. I just didn’t find these useful (besides for academic purposes).

The main motivation was to add in error bars, as I found when I was making many of these charts it was obvious that some of the estimates were highly variable. While the Agresti-Coull binomial proportions are not entirely justified in this novel circumstance, they are better than nothing to at least illustrate the error in the estimates (it seems to me that they will likely be too small if anything, but I’m not sure).
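For readers unfamiliar with the interval, here is a minimal Python sketch of the standard Agresti-Coull calculation. It is not the macro's SPSS code. The function name `agresti_coull` and its arguments are hypothetical, and it assumes the aoristic estimate for a time bin (which can be fractional) is treated as the "success" count out of the total number of cases.

```python
import math

def agresti_coull(successes, n, z=1.959964):
    """Agresti-Coull interval for a binomial proportion.

    successes: count in the bin (assumed here to be the aoristic
               estimate, so it may be a non-integer).
    n:         total number of cases the estimate is based on.
    z:         normal quantile; 1.959964 gives a 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Add z^2 pseudo-observations, half of them successes.
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1], since the raw interval can spill past the bounds.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

# Same proportion (25%), different sample sizes.
small = agresti_coull(5, 20)
large = agresti_coull(50, 200)
print("n=20 :", small)
print("n=200:", large)
```

With the same 25% proportion, the interval for 20 cases is much wider than the one for 200 cases, which is the variability the error bars are meant to make visible.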

I think a good paper I might work on in the future, when I get the chance, would 1) show how variable the estimates are in small samples, and 2) evaluate the asymptotic coverage of various estimators (traditional binomial proportions vs. the bootstrap, I suppose). Below is an example output of the updated macro, again with the same data I used previously. I make the small multiple chart by different crime types to show the variability in the estimates for given sample sizes.

Informational Asymmetries in my role as Crime Analyst

One aspect I’ve come to realize in my job as crime analyst, and really in any technical job I’ve had, is that I face large informational asymmetries between myself and my employers (and colleagues). What exactly do I mean? A prime example of informational asymmetry is when I have a large body of knowledge about some particular topic or task I need to conduct, and the person asking for the task has relatively little.

I believe this is problematic in one major way in my job: people don’t know what is or is not reasonable to ask me to do, or similarly how long it takes me to conduct particular tasks. I believe most of the time this makes people hesitate to ask me particular questions or to ask me to conduct a particular analysis. The opposite also happens, and not entirely infrequently: I get asked nonchalantly to do something that is a considerable investment.

I’m not sure how best to solve this situation (especially the not asking part) besides developing relationships with colleagues and the boss, and showing through experience what I can (or can’t) do. To a certain extent I can’t know what people want if they don’t ask me.

The situation in which someone asks me to do something that takes more of an investment is easier, in that I can directly tell the person that the request is either unreasonable or will take a long time. A good example of tasks that may look similar in scope from the outside, but are largely different, is descriptive vs. causal analysis.

Examples of the difference are “How many calls for service occurred at this particular apartment in the last year?” (descriptive), or “Is there more crime around 15 Main St. than we would normally expect?” (causal). The first is typically just a query of the database and a table or map, and this will typically satisfy the request. The other, though, is much more difficult: I have to dream up a reasonable comparison, or else the information I provide may be out of context.

The information I produce also depends on who is asking. If someone within the PD asks for descriptive statistics, that is usually all I provide. If someone from the public asks for descriptive statistics, I frequently (at least attempt to) provide more context for those statistics (i.e. some reasonable comparisons or historical trends that form the basis for causal analysis).

This is because I assume people within the PD have the necessary external context to evaluate the information, whereas people outside the PD don’t. If I just stated how many calls for service occurred on your street block, you may think your street is crime ridden, because you don’t have a good internal baseline to judge what is a reasonable number of calls for service. In responding to such requests from the public I try to provide historical numbers over a long period (as people are often worried about newer trends) or comparisons to neighboring areas.

The informational asymmetry problem still persists though, and filters into other areas of work. In particular, it affects how I am evaluated within the PD itself.