All posts in category Crime Analysis

Online Crime Mapping for Troy PD

One of the big projects I have been working on since joining the Troy Police Department as a crime analyst last fall is producing timely geocoded data. I am happy to say that a fruit of this labor is the public crime map, via RAIDS Online, that has finally gone public (and can be viewed here). The credit for the online map mainly goes to BAIR Analytics and their free online mapping platform. I merely serve up the data for them to put on the map.

I’ve come to believe that more open data is the way of the future, and in particular that an online crime map is a way to engage the public and enlighten them about the realities of crime statistics. This does come with some potential negative externalities for the police department, such as complaints about inaccuracy, decreasing home prices, and misleading symbology and offset geocoding. I firmly believe, though, that providing this information empowers the public to be more engaged in matters of crime and safety within their communities.

I thank the Troy Police Department for supporting the project in spite of these potential negative consequences, and Chief Tedesco for his continual support of the project. I also thank Capt. Cooney for arranging for all of the media releases. Below are the current online news stories (will update with CW15 if they post a story).

Here I end with a list of reading materials I consider necessary for any other crime analyst pondering whether to publish crime statistics online. I also again thank Troy PD for allowing me to publish this data, and BAIR for providing the online service that makes it possible with a zero dollar budget.

Chainey, Spencer & Lisa Tompson. 2012. Engagement, empowerment and transparency: Publishing crime statistics using online crime mapping. Policing 6(3): 228-239.
Field, Kenneth. 2011. Reflections on a criminal crime map. The Cartographic Journal 48(1): 1-3.
Groff, Elizabeth R., Brook Kearley, Heather Fogg, Penny Beatty, Heather Couture & Julie Wartell. 2005. A randomized experimental study of sharing crime data with citizens: Do maps produce more fear? Journal of Experimental Criminology 1(1): 87-115. Online PDF Here.
Ratcliffe, Jerry H. 2002. Damned if you don’t, damned if you do: Crime mapping and its implications in the real world. Policing and Society 12(3): 211-225. Online PDF Here.
Ratcliffe, Jerry H. 2004. Geocoding crime and a first estimate of a minimum acceptable hit rate. International Journal of Geographical Information Science 18(1): 61-72. Online PDF Here.
Wartell, Julie & J. Thomas McEwen. 2001. Privacy in the information age: A guide for sharing crime maps and spatial data. U.S. Department of Justice. Office of Justice Programs. National Institute of Justice, Washington, D.C. Online PDF Here.

Let me know if I should add any papers to the list! Privacy implications (such as this work by Michael Leitner and colleagues) might be worth a read as well for those interested. See my geomasking tag at CiteUlike for various other references.

by Andy Wheeler on October 3, 2013

Posted in Crime Analysis, Crime Mapping

Tagged Crime Analysis, crime-mapping, Criminal Justice

https://andrewpwheeler.com/2013/10/03/online-crime-mapping-for-troy-pd/

Querying Graph Neighbors in SPSS

The other day I showed how one could make an edge list in SPSS, which is needed to generate network graphs. Today, I will show how one can use an edge list in long format to identify neighbors for higher degree relationships.

So to start, what do I mean by a neighbor of higher degree relationship? Let’s say I have a relationship between two nodes, A and B. Now let’s also say I have another relationship between nodes B and C. I might say that A and C don’t have a direct relationship, but they are related in that they both have a relationship to B. So A is a first degree neighbor of B, and A is a second degree neighbor of C. If I drew a graph of this network, the degree relationship between A and C would be the minimum number of edges one would have to traverse to get from the A node to the C node.

A -- B -- C
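To make the "minimum number of edges" definition concrete, here is a minimal sketch in Python (not SPSS, and not part of the original code). It uses a breadth-first search over the toy A, B, C network; the function name degree_between is hypothetical and chosen only for illustration.

```python
from collections import deque


def degree_between(edges, start, goal):
    """Return the minimum number of edges between start and goal, or None."""
    # Build an undirected adjacency list from the edge pairs.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search visits nodes in order of increasing distance,
    # so the first time we reach the goal we have the shortest path length.
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return distance[node]
        for other in neighbors.get(node, ()):
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    return None  # goal is not reachable from start


edges = [("A", "B"), ("B", "C")]
print(degree_between(edges, "A", "B"))  # 1
print(degree_between(edges, "A", "C"))  # 2
```

A pair of nodes with no path between them returns None, which corresponds to "not a neighbor at any degree".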

Why would a criminologist or crime analyst care about relationships of higher degrees? Well, here are two examples I am familiar with in criminology:

Giulia Berlusconi at the American Society of Criminology meeting (Chicago, 2012) presented an analysis in which highly connected individuals in a drug trafficking case were targeted for prosecution, in an attempt to have the maximal disruption to the drug network (ties were ID’ed by phone calls).
Some of the people at the UCLA mathematical and simulation modelling of crime group have some work predicting offenders for violent victimizations based on network ties.

For a simpler and more practical motivation for crime analysts, you may just have some particular individuals you want to target for enforcement (known chronic offenders, violent gang members), and you would like to compile a more extended network of individuals related to those particular offenders to keep an eye on, or to investigate further for possible ties to co-offending or gang activity.

So to start in SPSS, let’s say that we have an edge list in long format, where one column ID’s the incident and another column ID’s each person, so that two people who share an incident ID are related through that event. Example ties for a crime analyst may be victimizations, co-offending, or being stopped for field interviews at the same time.

*Long dataset marking people sharing same incident (ID).
data list free / IncID (F2.0) Person (A15).
begin data
1 John 
1 Mary
2 John 
2 Frank
3 John 
3 William
4 John 
4 Andrew
5 Mary 
5 Frank
6 Mary 
6 William
7 Frank 
7 Kelly
8 Andrew 
8 Penny
9 Matt 
9 Andrew
10 Kelly 
10 Andrew
end data.
dataset name long.
dataset activate long.

Now, let’s say we want to grab higher degree neighbors for Mary. First I will ID the first degree neighbors by creating a flag for Mary and then aggregating within the incident ID. That flags the cases that share an incident with Mary.


*ID Mary and then aggregate to get first degree.
compute degree1 = (Person = "Mary").
*Now aggregate to get all degree1s.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=IncID
  /degree1 = MAX(degree1).
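As a cross-check on the flag-then-aggregate logic, here is a minimal sketch of the same two steps in Python rather than SPSS. It assumes the same toy data; the variable names (rows, incident_max, first_degree) are hypothetical and only for illustration.

```python
# Same toy data as the SPSS example: (incident ID, person) rows in long format.
rows = [
    (1, "John"), (1, "Mary"), (2, "John"), (2, "Frank"),
    (3, "John"), (3, "William"), (4, "John"), (4, "Andrew"),
    (5, "Mary"), (5, "Frank"), (6, "Mary"), (6, "William"),
    (7, "Frank"), (7, "Kelly"), (8, "Andrew"), (8, "Penny"),
    (9, "Matt"), (9, "Andrew"), (10, "Kelly"), (10, "Andrew"),
]

# Step 1: flag the rows that belong to Mary.
flag = [person == "Mary" for _, person in rows]

# Step 2: take the maximum of the flag within each incident and write it
# back to every row of that incident (the role AGGREGATE plays above).
incident_max = {}
for (inc, _), f in zip(rows, flag):
    incident_max[inc] = incident_max.get(inc, False) or f
degree1 = [incident_max[inc] for inc, _ in rows]

# Everyone on a flagged row other than Mary is a first degree neighbor.
first_degree = sorted({p for (_, p), d in zip(rows, degree1) if d and p != "Mary"})
print(first_degree)  # ['Frank', 'John', 'William']
```

Note that the flag is also set on Mary's own rows, so she has to be excluded when listing her neighbors.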

To identify whether a person is a second degree neighbor of Mary, I can first aggregate within person, to ID that John, Frank and William are first degree neighbors, and then pick up their first degree neighbors, who I will then be able to tell are second degree neighbors of Mary.


*Aggregate within person, then within incident ID, to get second degrees.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=Person
  /degree2 = MAX(degree1).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=IncID
  /degree2 = MAX(degree2).
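The two aggregations above can be sketched in Python as one small helper applied twice, first within person and then within incident. This is an illustrative sketch, not the author's code; max_within and the other names are hypothetical.

```python
from operator import itemgetter


def max_within(rows, flag, key):
    """For each row, return the max of flag over all rows sharing key(row)."""
    group_max = {}
    for row, f in zip(rows, flag):
        group_max[key(row)] = group_max.get(key(row), False) or f
    return [group_max[key(row)] for row in rows]


by_incident = itemgetter(0)
by_person = itemgetter(1)

rows = [
    (1, "John"), (1, "Mary"), (2, "John"), (2, "Frank"),
    (3, "John"), (3, "William"), (4, "John"), (4, "Andrew"),
    (5, "Mary"), (5, "Frank"), (6, "Mary"), (6, "William"),
    (7, "Frank"), (7, "Kelly"), (8, "Andrew"), (8, "Penny"),
    (9, "Matt"), (9, "Andrew"), (10, "Kelly"), (10, "Andrew"),
]

# First degree: flag Mary, then spread the flag within each incident.
degree1 = max_within(rows, [p == "Mary" for _, p in rows], by_incident)

# Second degree: spread degree1 within person, then within incident.
degree2 = max_within(rows, max_within(rows, degree1, by_person), by_incident)

reached1 = {p for (_, p), d in zip(rows, degree1) if d}
reached2 = {p for (_, p), d in zip(rows, degree2) if d}
print(sorted(reached2 - reached1))  # ['Andrew', 'Kelly']
```

The degree2 flag is cumulative: it is also set for Mary and her first degree neighbors, so the people who are exactly two steps away are those flagged in degree2 but not in degree1.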

I can continue to do the same procedure for third degree neighbors.


*Aggregate within person, then within incident ID, to get third degrees.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=Person
  /degree3 = MAX(degree2).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=IncID
  /degree3 = MAX(degree3).

It should now be clear how I can make a recursive structure to identify neighbors of however many degrees I want. I end the post with a general MACRO to estimate all neighbors of a certain degree given an edge list in long format. Since this will expand to very many cases, you will likely only want to use it on a smaller list; alternatively, I provide an option in the macro to only check certain flagged individuals for neighbors.
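For readers who want to see the repeated aggregation as a loop, here is a hedged Python sketch for a single person of interest. It follows the order described in the post (within incident, then within person, once per degree) and recovers the degree as one plus the number of degrees searched, minus the count of flags set. The names neighbor_degrees and max_within are hypothetical, and this is a sketch under those assumptions rather than a translation of the SPSS macro.

```python
from operator import itemgetter


def max_within(rows, flag, key):
    """For each row, return the max of flag over all rows sharing key(row)."""
    group_max = {}
    for row, f in zip(rows, flag):
        group_max[key(row)] = group_max.get(key(row), False) or f
    return [group_max[key(row)] for row in rows]


def neighbor_degrees(rows, person, number):
    """Map each neighbor of person to its degree, searching up to number degrees."""
    flags = []
    current = [p == person for _, p in rows]
    for _ in range(number):
        current = max_within(rows, current, itemgetter(0))  # within incident
        current = max_within(rows, current, itemgetter(1))  # within person
        flags.append(current)

    degrees = {}
    for i, (_, p) in enumerate(rows):
        # Flags are cumulative, so a first degree neighbor is flagged in every
        # round and someone first reached in round k is flagged number-k+1 times.
        total = sum(f[i] for f in flags)
        if p != person and total > 0:
            degrees[p] = (number + 1) - total
    return degrees


rows = [
    (1, "John"), (1, "Mary"), (2, "John"), (2, "Frank"),
    (3, "John"), (3, "William"), (4, "John"), (4, "Andrew"),
    (5, "Mary"), (5, "Frank"), (6, "Mary"), (6, "William"),
    (7, "Frank"), (7, "Kelly"), (8, "Andrew"), (8, "Penny"),
    (9, "Matt"), (9, "Andrew"), (10, "Kelly"), (10, "Andrew"),
]
print(neighbor_degrees(rows, "Mary", 3))
```

On the toy data this puts John, Frank and William at degree 1, Andrew and Kelly at degree 2, and Penny and Matt at degree 3 from Mary. Anyone never flagged within the searched number of degrees is simply left out of the result.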

I’d love to see or hear about other applications crime analysts are using such social networks for. Learning more about graph layout algorithms is on my academic bucket list, so hopefully you will see more posts about that from me in the future.


*Current requirement - personid needs to be a string variable.
*Flag argument will return people who have a value of one for that variable and all of their
neighbors in the long list.
DEFINE !neighbor (incid = !TOKENS(1)
                           /personid = !TOKENS(1)
                           /number = !TOKENS(1) 
                           /flag = !DEFAULT ("") !TOKENS(1)   )

dataset copy neighbor.
dataset activate neighbor.
match files file = *
/keep = !incid !personid !flag.

rename variables (!incid = IncID)
(!personid = Person).

*I need to make a stacked dataset for all cases.
compute XXconstXX = 1.

*Making wide dataset of Persons in the long list.
dataset copy XXwideXX.
dataset activate XXwideXX.

*eliminating duplicate people.
sort cases by Person.
match files file = *
/first = XXkeepXX
/by Person
/drop IncID.
select if XXkeepXX = 1.

*reshaping long to wide - could use flip here but that requires numeric PersonIDs.
*flip variables = Person.
!IF (!flag <> !NULL) !THEN
select if !flag = 1.
!IFEND
casestovars
/ID = XXconstXX
/separator = ""
/drop XXkeepXX !flag.
*Similar here you could just replace with a list of all unique offender nodes - just needs to be in wide format.

*Match back to the original long dataset.
dataset activate neighbor.
match files file = *
/table = 'XXwideXX'
/by XXconstXX.
dataset close XXwideXX.

*Reshape wide to long - @ is for filler so I dont need to know how many people - it gets dropped by default in varstocases.
string @ (A1).
varstocases
/make DegreePers from Person1 to @
/drop XXconstXX !flag.

sort cases by DegreePers IncID Person.

*Make first degree.
compute degree1 = (Person = DegreePers).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=IncID DegreePers
  /degree1 = MAX(degree1).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=Person DegreePers
  /degree1 = MAX(degree1).
*dropping self checks.
select if Person <> DegreePers.

!LET !past = "degree1"
!DO !i = 2 !TO !number
!LET !current = !CONCAT("degree",!i)
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=IncID DegreePers
  /!current = MAX(!past).
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
  /BREAK=Person DegreePers
  /!current = MAX(!current).
!LET !past = !current
!DOEND
*Clean up and delete duplicates.
compute degree = (!number + 1) - SUM(degree1 to !current).
string P1 P2 (A100).
DO IF Person <= DegreePers.
    compute P1 = Person.
    compute P2 = DegreePers.
ELSE.
    compute P1 = DegreePers.
    compute P2 = Person.
END IF.
sort cases by P1 P2.
match files file = *
/first = XXkeepXX
/by P1 P2
/drop DegreePers Person.
*will be [1 + degrees searched] if not a neighbor.
select if XXkeepXX = 1 and degree <= !number.
match files file = *
/drop degree1 to !current XXkeepXX IncID.
formats degree (!CONCAT("F",!LENGTH(!number),".0")).
!ENDDEFINE.

*Example use case - uncomment to check it out.
*dataset close ALL.
*Long dataset marking people sharing same incident (ID).
*data list free / IncID (F2.0) Person (A15).
*begin data
1 John 
1 Mary
2 John 
2 Frank
3 John 
3 William
4 John 
4 Andrew
5 Mary 
5 Frank
6 Mary 
6 William
7 Frank 
7 Kelly
8 Andrew 
8 Penny
9 Matt 
9 Andrew
10 Kelly 
10 Andrew
*end data.
*dataset name long.
*dataset activate long.
*compute myFlag = 1.
*set mprint on.
*output close ALL.
*!neighbor incid = IncID personid = Person number = 3.
*set mprint off.
*dataset activate long.
*dataset close neighbor.
*compute myFlag = (Person = "Mary" or Person = "Andrew").
*set mprint on.
*output close ALL.
*!neighbor incid = IncID personid = Person number = 3 flag = myFlag.
*set mprint off.

by Andy Wheeler on July 19, 2013

Posted in Crime Analysis, SPSS

Tagged data-manipulation, MACRO, network, SPSS

https://andrewpwheeler.com/2013/07/19/querying-graph-neighbors-in-spss/

Update for Aoristic Macro in SPSS

I’ve substantially updated the aoristic macro for SPSS from what I previously posted. The updated code can be found here. The improvements are:

The code is much more modularized: it is now only one function, and it takes an Interval parameter to determine what interval summaries you want.
It includes Agresti-Coull binomial error intervals (95% confidence intervals). It also returns a percentage estimate and the total number of cases the estimate is based on, besides the usual info for time period, split file, and the absolute aoristic estimate.
It allows an optional command to save the reshaped long dataset.

The functionality I dropped is the default plots and the saving of begin, end and middle times for the same estimates. I just didn’t find these useful (besides for academic purposes).

The main motivation was to add in error bars, as I found when I was making many of these charts it was obvious that some of the estimates were highly variable. While the Agresti-Coull binomial proportions are not entirely justified in this novel circumstance, they are better than nothing to at least illustrate the error in the estimates (it seems to me that they will likely be too small if anything, but I’m not sure).
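For reference, the standard Agresti-Coull interval for x successes out of n trials can be sketched in a few lines of Python. This is the textbook formula, not the macro's SPSS code (which is not shown in this post). Passing a fractional aoristic estimate as x is an assumption made here for illustration, and the function name agresti_coull is hypothetical.

```python
import math


def agresti_coull(x, n, z=1.959963984540054):
    """Agresti-Coull interval for a proportion x / n.

    x may be fractional here (an assumption, to mimic an aoristic weight
    total); z defaults to the normal quantile for a 95% interval.
    """
    n_adj = n + z ** 2                 # adjusted number of trials
    p_adj = (x + z ** 2 / 2) / n_adj   # adjusted proportion
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1], since the unclipped interval can spill past the bounds.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


low, high = agresti_coull(2.5, 10)
print(round(low, 3), round(high, 3))
```

With only 10 cases the interval around an estimate of 25% is very wide, which is the kind of variability the error bars are meant to show.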

I think a good paper I might work on in the future, when I get a chance, would 1) show how variable the estimates are in small samples, and 2) evaluate the asymptotic coverages of various estimators (traditional binomial proportions vs. bootstrap, I suppose). Below is an example output of the updated macro, again with the same data I used previously. I make the small multiple chart by different crime types to show the variability in the estimates for given sample sizes.

by Andy Wheeler on March 28, 2013

Posted in Crime Analysis, SPSS

Tagged MACRO, SPSS

https://andrewpwheeler.com/2013/03/28/update-for-aoristic-macro-in-spss/

Informational Asymmetries in my role as Crime Analyst

One aspect I’ve come to realize in my job as crime analyst, and really in any technical job I’ve had, is that I face large informational asymmetries between myself and my employers (and colleagues). What exactly do I mean? Well, I consider a prime example of informational asymmetry when I have a large body of knowledge about some particular topic or task I need to conduct, and the person asking for the task has relatively little.

I believe this is problematic in one major way with my job: people don’t know what is or is not reasonable to ask me to do, or similarly how long it takes me to conduct particular tasks. I believe most of the time this makes people hesitate to ask me particular questions or to request particular analyses. The converse also happens, and not infrequently: I get asked nonchalantly to do something that is a considerable investment.

I’m not sure how best to solve this situation (especially the not asking part) besides developing relationships with colleagues and the boss, and through experience showing what I can (or can’t) do. To a certain extent I can’t know what people want if they don’t ask me.

The situation in which someone asks me to do something that takes more of an investment is easier, in that I can directly tell the person that the request is either unreasonable or will take a long time. A good example of tasks that from the outside may look similar in scope, but are largely different, is descriptive vs. causal analysis.

Examples of the difference are “How many calls for service occurred at this particular apartment in the last year?” (descriptive) and “Is there more crime around 15 Main St. than we would normally expect?” (causal). The first is typically just a query of the database and a table or map, and this will typically satisfy the request. The other is much more difficult: I have to dream up a reasonable comparison, or else the information I provide may be out of context.

The information I produce also depends on who is asking. If someone within the PD asks for descriptive statistics, that is usually all I provide. If someone from the public asks for descriptive statistics, I frequently (at least attempt to) provide more context for those statistics (i.e. some reasonable comparisons or historical trends that form the basis for causal analysis).

This is because I assume people within the PD have the necessary external context to evaluate the information, whereas people outside the PD don’t. If I just stated how many calls for service occurred on your street block, you may think your street is crime ridden, because you don’t have a good internal baseline to judge what is a reasonable number of calls for service. In responding to such requests from the public I try to provide historical numbers over a long period (as people are often worried about newer trends) or comparisons to neighboring areas.

The informational asymmetry problem still persists though, and filters into other areas of work. In particular, it affects how I am evaluated within the PD itself.

by Andy Wheeler on March 15, 2013

Posted in Crime Analysis, Criminal Justice

Tagged Crime Analysis, Criminal Justice

https://andrewpwheeler.com/2013/03/15/informational-asymmetries-in-my-role-as-crime-analyst/
