Odds Ratios NEED To Be Graphed On Log Scales

Andrew Gelman blogged the other day about an example of odds ratios being plotted on a linear scale. I have seen this mistake a couple of times, so I figured it would be worth the time to elaborate on it further.

Reported odds ratios almost invariably come from the output of a generalized linear model (e.g. logistic regression; Poisson regression yields the analogous rate ratios). Graphing the exponentiated coefficients and their standard errors (or confidence intervals) is certainly a reasonable thing to want to do, but unless someone wants to be misleading they need to be on a log scale. When the coefficients (and the associated intervals) are exponentiated they are no longer symmetric around the point estimate on a linear scale.

To illustrate a few nefarious examples, let’s pretend our software spit out a series of regression coefficients. The table shows the original coefficients on the log odds scale, followed by the exponentiated point estimates and the exponentiated bounds of the point estimate ± 2 standard errors.

Est.  Point  S.E.  Exp(Point)  Exp(Point - 2*S.E.)  Exp(Point + 2*S.E.)
  1   -0.7   0.1      0.5             0.4                  0.6
  2    0.7   0.1      2.0             1.6                  2.5
  3    0.2   0.1      1.2             1.0                  1.5
  4    0.1   0.8      1.1             0.2                  5.5
  5   -0.3   0.9      0.7             0.1                  4.5
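
The exponentiated columns in the table can be reproduced from the log odds estimates and standard errors. Below is a minimal Python sketch of that arithmetic; the function name `odds_ratio_interval` and the multiplier `z=2.0` (matching the ± 2 standard errors used here) are my own labels for illustration, not output from any particular stats package.

```python
import math

# (label, point estimate on the log odds scale, standard error), copied from the table
estimates = [
    ("1", -0.7, 0.1),
    ("2", 0.7, 0.1),
    ("3", 0.2, 0.1),
    ("4", 0.1, 0.8),
    ("5", -0.3, 0.9),
]

def odds_ratio_interval(point, se, z=2.0):
    """Return (odds ratio, lower bound, upper bound).

    The bounds are built on the log odds scale (point +/- z*se)
    and only then exponentiated.
    """
    return (
        math.exp(point),
        math.exp(point - z * se),
        math.exp(point + z * se),
    )

rows = [(label, *odds_ratio_interval(p, se)) for label, p, se in estimates]

for label, odds_ratio, low, high in rows:
    print(f"{label}: OR={odds_ratio:.1f}  low={low:.1f}  high={high:.1f}")
```

Note that the interval is symmetric around the point estimate before exponentiating, and asymmetric afterwards.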

Now, to start, let’s graph the exponentiated estimates (the odds ratios) for estimates 1 and 2 and their standard errors on an arithmetic scale, and see what happens.

This graph would give the impression that effect 2 is both larger and has a wider variance than effect 1. Now let’s look at the same chart on a log scale.

By construction effects 1 and 2 are exactly the same size, just in opposite directions (this is clear on the original log odds scale before the coefficients were exponentiated). Odds ratios can not go below zero, and a change in an odds ratio from 0.5 to 0.4 is the same relative change as one from 2.0 to 2.5 (a factor of 1.25 in either case). On the linear scale though the former is a difference of 0.1, and the latter a difference of 0.5.
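
To put numbers on this, here is a small Python sketch comparing the length of the error bars for estimates 1 and 2 on each scale. The helper name `bar_lengths` is hypothetical, and I again assume intervals of ± 2 standard errors.

```python
import math

def bar_lengths(point, se, z=2.0):
    """Length of the error bar on the linear scale and on the log scale."""
    low = math.exp(point - z * se)
    high = math.exp(point + z * se)
    linear_length = high - low
    log_length = math.log(high) - math.log(low)
    return linear_length, log_length

linear_1, log_1 = bar_lengths(-0.7, 0.1)  # estimate 1
linear_2, log_2 = bar_lengths(0.7, 0.1)   # estimate 2

print(f"linear scale: {linear_1:.2f} vs {linear_2:.2f}")
print(f"log scale:    {log_1:.2f} vs {log_2:.2f}")
```

On the linear scale the bar for estimate 2 comes out roughly four times as long as the bar for estimate 1, while on the log scale the two bars are the same length, which is what the standard errors of 0.1 say they should be.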

Such visual discrepancies get larger the further towards zero you go, and as what goes in the denominator and what goes in the numerator is arbitrary, displaying these values on a linear scale is very misleading. Consider a different example:

Well, what would we gather from this? Estimates 4 and 5 both have a wide variance, and the majority of their error bars are both above 1. This is an incorrect interpretation though, as the point estimate of 5 is below 1, and more of its error bar is on the below 1 side.
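
The same point can be checked numerically. The Python sketch below computes what share of each error bar sits above an odds ratio of 1 on the linear scale and on the log scale; `share_above_one` is a name I made up for illustration, and it assumes the interval straddles 1 (true for estimates 4 and 5).

```python
import math

def share_above_one(point, se, z=2.0):
    """Share of the error bar above an odds ratio of 1, on each scale.

    Assumes the interval straddles 1 (i.e. 0 on the log odds scale).
    """
    low_log = point - z * se
    high_log = point + z * se
    low, high = math.exp(low_log), math.exp(high_log)
    linear_share = (high - 1.0) / (high - low)
    log_share = high_log / (high_log - low_log)
    return linear_share, log_share

linear_4, log_4 = share_above_one(0.1, 0.8)    # estimate 4
linear_5, log_5 = share_above_one(-0.3, 0.9)   # estimate 5

print(f"estimate 4: linear {linear_4:.0%}, log {log_4:.0%}")
print(f"estimate 5: linear {linear_5:.0%}, log {log_5:.0%}")
```

For estimate 5 the linear scale places most of the bar above 1, but on the log scale less than half of it is, in line with its point estimate being below 1.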

Looking up some more examples online, this may be a problem more often than I thought (doing a Google image search for “plot odds ratios” turns up plenty of examples to support my position). I even see some forest plots of odds ratios that fail to use a log scale. A frequent critique of log scales is that they are harder to understand. Even if I concede that this is true, plotting odds ratios on a linear scale is misleading and should never be done.


To make a set of charts in SPSS with log scales for your particular data you can simply enter in the model estimates using DATA LIST and then use GGRAPH to make the plot. In particular for GGRAPH see the SCALE lines to set the base of the logarithms. Example below:

*Can input your own data.
DATA LIST FREE / Id  (A10) PointEst  SEPoint Exp_Point CIExp_L CIExp_H.
BEGIN DATA
  1   -0.7   0.1    0.5        0.4            0.6
  2    0.7   0.1    2.0        1.6            2.5
  3    0.2   0.1    1.2        1.0            1.5
  4    0.1   0.8    1.1        0.2            5.5
  5   -0.3   0.9    0.7        0.1            4.5
END DATA.
DATASET NAME OddsRat.

*Graph of Confidence intervals on log scale.
FORMATS Exp_Point CIExp_L CIExp_H (F2.1).
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=Id Exp_Point CIExp_L CIExp_H
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: Id=col(source(s), name("Id"), unit.category())
  DATA: Exp_Point=col(source(s), name("Exp_Point"))
  DATA: CIExp_L=col(source(s), name("CIExp_L"))
  DATA: CIExp_H=col(source(s), name("CIExp_H"))
  GUIDE: axis(dim(1), label("Point Estimate and 95% Confidence Interval"))
  GUIDE: axis(dim(2))
  GUIDE: form.line(position(1,*), size(size."2"), color(color.darkgrey))
  SCALE: log(dim(1), base(2), min(0.1), max(6))
  ELEMENT: edge(position((CIExp_L+CIExp_H)*Id))
  ELEMENT: point(position(Exp_Point*Id), color.interior(color.black), 
           color.exterior(color.white))
END GPL.

Defending Prospectus

The defense date for my prospectus, What we can learn from small units of analysis, is finally set, November 1st at 9:30 (location TBD). You can find an electronic copy of the prospectus here and below is the abstract. So bring your slings and arrows (and I’ll bring some hydrogen peroxide and gauze?)

What we can learn from small units of analysis
Andrew Wheeler Prospectus Defense 11/1/2013

The dissertation is aimed at advancing knowledge of the correlates of crime at small geographic units of analysis. I begin the prospectus by detailing what motivates examining crime at small places, and focus on how aggregation creates confounds that limit causal inference. Local and spatial effects are confounded when using aggregate units, so to the extent the researcher wishes to distinguish between these two types of effects it should guide what unit of analysis is chosen. To illustrate these differences, I propose data analysis to examine local, spatial and contextual effects for bars, broken windows and crime using publicly available data from Washington D.C. I also propose a second set of data analysis focusing on estimating the effects of various measures of the built environment on crime.

How art can influence info viz.

The role of art on info viz. is a tortuous topic. Frequently, renditions of infographics have clear functional shortcomings as tools to convey quantitative data, but are lauded as beautiful pieces of art in spite of this. Thus the topic gets presented in overtones of function versus aesthetic, and any scientist worried about function would surely not choose something pretty over something obviously more functional (however you define functional). Thus the topic itself has some negative contextual history that impedes its discussion. But this is a false dichotomy; beauty need not impede function.

Here I want to bring to light some examples of how art actually has positive influences on the function of information visualization. I will break up the examples into two topics: the use of color and the rendering of graphics.

Color

The use of color to visualize discrete items in information visualizations is perhaps the most regular, but one of the most arbitrary, decisions a designer makes. Here I will point to the work of Sidonie Christophe, who embraces the arbitrariness of using a color palette and uses popular pieces of artwork to create aesthetically pleasing color choices. Christophe makes the presumption that the colors in popular pieces of art provide ample contrast to effectively visualize different attributes, while also being publicly vouched for as aesthetically beautiful. Here is an example using a palette from one of Van Gogh’s paintings to apply to a street map (taken from Sidonie’s dissertation);

I won’t make any argument for Van Gogh’s palette being more functional than other potential ones, but it is better than being guided by nothing (Van Gogh does have the added benefit of being color blind safe).

Rendering

One example of artistic rendering of information I previously talked about was the logic behind the likability of XKCD graphs. There the motivation is both memorability of graphs and data reduction/simplification. Despite the minimalist straw man often painted of Tufte, in his later books he provides a variety of examples of diagrams that are artistic embellishments (e.g. the cover of Leviathan) but takes them as positive inspiration for GUI design.

Another recent example I came across is the use of curved lines in network diagrams (I have a related academic interest in this for visualizing geographic flow data), which is motivated by the work of Mark Lombardi.

The reason curved lines look nicer is not entirely aesthetic; they have functional value for displacing overlapping lines and (relatedly) making in-bound edges easier to distinguish.

Much ado is made about network layout algorithms, but some interesting work is being done on visualizing the lines themselves. Interesting applications that are often lauded as beautiful are Circos and Hive Plots. Even Ben Shneiderman, creator of the treemap graphic, is getting in on the graphs as art wave.

I’m sure many other examples exist, so feel free to let me know in the comments.

Presenting at ASC this fall

The preliminary program for the American Society of Criminology meeting (this November in Atlanta) is up and my scheduled presentation time is 3:30 on Wednesday Nov. 20th. The title of my talk is Visualization Techniques for Journey to Crime Flow Data, and the associated pre-print is available on SSRN.

The title of the panel is Spatial and Temporal Analysis (a bit of a hodgepodge I know), and it is being held in Room 8 at the international level. The other presentations are;

  • Analyzing Spatial Interactions in Homicide Research Using a Spatial Durbin Model by Matthew Ruther and John McDonald (UPenn Demography and Criminology respectively)
  • Space-time Case-control Study of Violence in Urban Landscapes by Douglas Wiebe et al. (Some more folks from UPenn but not from the Criminology dept.!)
  • Spatial and Temporal Relationships between Violence, Alcohol Outlets and Drug Markets in Boston, 2006-2010 by Robert Lipton et al. (UMich Injury Center)

So come to see the other presenters (and stay for mine)! If anyone would like to meet up during the conference, feel free to shoot me an email. If I don’t cut my hair in the meantime maybe me and Robert Lipton can start a craziest hair for ASC award.

Note, I have no idea who the panel chair is, so perhaps we are still open for volunteers for that job.

Viz. JTC Flow lines – Paper for ASC this fall

Partly because I would go crazy if I worked only on my dissertation, I started a paper about visualizing JTC flow lines a while back, and I am going to present what I have so far at the American Society of Criminology (ASC) meeting in Atlanta this fall.

My paper is still quite rough around the edges (so not quite up for posting to SSRN), but here is the current version. This actually started out as an answer I gave to a question on the GIS stackexchange site, and after I wrote it up I figured it would be a worthwhile endeavor to write an article. Alasdair Rae has a couple of viz. flow data papers currently, but I thought I could extend those papers and write for a different audience of criminologists using journey to crime (JTC) data.

As always, I would still appreciate any feedback. I’m hoping to send this out to a journal in the near future, and so far I have only goaded one of my friends into reviewing the paper.

My experience blogging in 2012

I figured I would write a brief post about my experience blogging. I created this blog and published my first post in December of 2011. Since then, in 2012, I published 30 blog posts, and totaled 7,200 views. While I thought the number was quite high (albeit a bit disappointing compared to the numbers of Larry Wasserman), it is still many more people than would have listened to what I had to say if I didn’t write a blog. When starting out I averaged under 10 views a day, but throughout the year it steadily grew, and now I average about 30 views per day. The post that had the most traffic in one day was When should we use a black background for a map?, with 73 views, and that was largely because of some Twitter traffic (a result of Steven Romalewski tweeting it and then it being re-tweeted by Kenneth Field).

I started the blog because I really loved reading a lot of other blogs, and so I hope to encourage others to start one as well. It is a nice venue for an academic to share work and opinions, as it is more flexible and can be less formal than articles. Also much of what I write about I would just consider helpful tips or generic discussion that I wouldn’t get to discuss otherwise (SPSS programming and graph tips will never make it into a publication). One of my main motivations was actually R-Bloggers and the SAS blog roll; I would like a similarly active community for SPSS, but there is none really that I have found outside of the NABBLE forum (some exceptions are Andy Field, The Analysis Factor, Jon Peck and these few posts by a Louis K I only found through the labyrinth that is the IBM developerworks site (note I think you need to be signed in to even see that site), but they certainly aren’t very active and/or don’t write much about SPSS). I assume the best way to remedy that is to lead by example! Most of my more popular posts are ones about SPSS, and I frequently get web-traffic via general Google searches of SPSS + something else I blogged about (hacking the template and comparing continuous distributions are my two top posts).

The blog is also just another place to highlight my academic work and bring more attention to it. WordPress tells me how often someone clicks a link on the blog, and someone has clicked the link to my CV close to 40 times since I made the blog. Hopefully I will have some pre-print journal articles to share on the blog in the near future (as well as my prospectus). My post on my presentation at ASC did not generate much traffic, but I would love to see a similar trend for other criminologists/criminal justicians in the future. My work isn’t perfect for sure, but why not get it out there at least for it to be judged and hopefully get feedback.

I would like to blog more, and I actively try to write something if I haven’t in a few weeks, but I don’t stress about it too much. I certainly have an infinite pool of posts to write about programming and generating graphs in SPSS. I have also thought about talking about historical graphics in criminology and criminal justice, or generally talking about some historical and contemporary crime mapping work. Other potential posts I’d like to write are a more formal treatment of why I loathe most difference-in-differences designs, and perhaps one about the silliness that can ensue when using null-hypothesis significance testing to determine racial bias. But both will take more careful elaboration, so they might not appear anytime soon.

So in short, SPSSers, crime mappers, criminologists/criminal justicians, I want you to start blogging, and I will eagerly consume your work (and in the meantime hopefully produce some more useful stuff on my end)!

Presentation at ASC – November, 2012

At the American Society of Criminology conference in Chicago in a few weeks I will be presenting (I can’t link to the actual presentation it appears, but you can search the program for Wheeler and my session will come up). Don’t take this as a final product, but I figured I would put out there the working paper/chapters of my dissertation that are the motivation for my presentation and my current set of slides.

Here is the original abstract I submitted a few months ago. The title of the talk is The Measurement of Small Place Correlates of Crime;

This presentation addresses several problems related with attempting to identify correlates of crime at small units of analysis, such as street segments. In particular the presentation will focus on articulating what we can potentially learn from smaller units of analysis compared to larger aggregations, and relating a variety of different measures of the built environment and demographic characteristics of places to theoretical constructs of interest to crime at places. Preliminary results examining the discriminant and convergent validity of theoretical constructs pertinent to theories for the causes of crime using data from Washington, D.C. will be presented.

This was certainly an over-ambitious abstract (I was still in the process of writing my prospectus when I submitted it). The bulk of the talk will be focused on “What we can learn from small units of analysis?”, and as of now, after that and as time allows, I will present some illustrations of the change of support problem. Sorry to disappoint, but nothing about convergent or discriminant validity of spatial constructs will be presented (I have done no work of interest yet, and I don’t think I would have time to present any findings in any more than a superficial manner anyway).

Note don’t be scared off by how dull the working paper is, the presentation will certainly be more visual and less mathematical (I will need to update my dissertation to incorporate some more graphical presentations).

Maps and graphics at the end of the talk demonstrating the change of support problem are still in the works (and I will continue to update the presentation on here). Here is a preview though of the first map I made that demonstrates how D.C. disseminates geo-data aggregated and snapped to street segments, making it problematic to mash up with census data.

  

 

The presentation time is on Friday at 9:30, and I’m excited to see the other presentations as well. It looks to me like Pizarro et al.’s related research was recently published in Justice Quarterly, so if you don’t care for my presentation come to see the other presenters!

JQC paper on the moving home effect finally published!

My first solo publication, The moving home effect: A quasi experiment assessing effect of home location on the offence location, after being in the online first queue for nearly a year, has finally been published in the Journal of Quantitative Criminology 28(4):587-606. It was the oldest paper in the online first section (along with the paper by Light and Harris published on the same day)!

This paper was the fruit of what was basically the equivalent of my Masters thesis, and I would like to take this opportunity to thank all of the individuals who helped me with the project, as I accidentally omitted such thanks from the paper (entirely my own fault). I would like to thank my committee members, Rob Worden, Shawn Bushway, and Janet Stamatel. I would also like to thank Robert Apel and Greg Pogarsky for useful feedback I received on in-class papers based on the same topic, as well as the folks in the Worden meeting group (for not only feedback but giving me motivation to do work so I had something to say!)

Rob Worden was the chair of my committee, and he deserves extra thanks not only for reviewing my work, but also for giving me a job at the Finn Institute, without which I would never have had access to such data or the opportunity to conduct such a project. I would also like to give thanks to the Syracuse PD and Chief Fowler for letting me use the data and reveal the PD’s identity in the publication.

I would also like to thank Alex Piquero and Cathy Widom for letting me make multiple revisions and accepting the paper for publication. For the publication itself I received three excellent and thoughtful peer reviews. The quality of the reviews was well above the norm for feedback I have otherwise encountered, and demonstrated that the reviewers not only read the paper but read it carefully. I was really happy with the improvements as well as how fair and thoughtful the reviews were. I am also very happy it was accepted for publication in JQC; it is the highest quality venue where I would expect the paper to be on topic, and if it wasn’t accepted there I was really not sure where I would send it otherwise.

In the future I will publish pre-prints online, so the version before editing can still be publicly available to everyone. But if you can not get a copy of this (or any of the other papers I have co-authored so far), don’t hesitate to shoot me an email for a copy of the off-print. Hopefully I will have some more work to share on the blog in the near future! I currently have two papers I am working on with related topics, one on visualizing journey to crime flow data, and another with Emily Owens and Matthew Freedman of Cornell comparing journey to work data with journey to crime data.

For a teaser for this paper here is the structured abstract from the paper and a graph demonstrating my estimated moving home effect.

Objectives
This study aims to test whether the home location has a causal effect on the crime location. To accomplish this the study capitalizes on the natural experiment that occurs when offenders move, and uses a unique metric, the distance between sequential offenses, to determine if when an offender moves the offense location changes.

Methods
Using a sample of over 40,000 custodial arrests from Syracuse, NY between 2003 and 2008, this quasi-experimental design uses t tests of mean differences, and fixed effects regression modeling to determine if moving has a significant effect on the distance between sequential offenses.

Results
This study finds that when offenders move they tend to commit crimes in locations farther away from past offences than would be expected without moving. The effect is rather small though, both in absolute terms (an elasticity coefficient of 0.02), and in relation to the effect of other independent variables (such as the time in between offenses).

Conclusions
This finding suggests that the home has an impact on where an offender will choose to commit a crime, independent of offence, neighborhood, or offender characteristics. The effect is small though, suggesting other factors may play a larger role in influencing where offenders choose to commit crime.

CJ blog watch! Any ones I’m missing?

I follow a lot of blogs. Although I don’t personally write a lot about criminology or criminal justice related matters (maybe in the future when I have more time or inclination), I figured I would share some of my favorites and query the crowd for more recommendations.

So a few with general discussion related to criminology and criminal justice matters are;

Both sites are by well known criminologists/criminal justicians. I am aware of a few blogs written by current/former police chiefs;

  • Tom Casady’s The Director’s Desk. Tom Casady is currently the director of public safety for Lincoln, Nebraska and was previously the Police Chief at Lincoln’s department for quite some time. Tom is also very active in a variety of criminology/criminal justice organizations (so if you go to a related conference there is a good chance he is around somewhere!)
  • Chief’s Blog by Chief Ramsay of the Duluth Police Dept in Minnesota.

There are also a few that are highly focused on crime mapping & analysis;

  • Location Based Policing by Drew Dasher. He is a crime analyst for the Lincoln Nebraska PD.
  • Saferview – crime, fear and mapping: A blog by a retired police officer who is a student at University College London.
  • Diego Valle-Jones: Although his blog has a wider variety of topics, he has a series of very detailed posts and analysis on violence in Mexico and Central American nations. I know crime stats are frequent fodder for generic statistical demonstrations, but this is really insightful analysis. My favorite is his investigation into the validity of homicide data statistics.

Are there others I am missing out on or should know about? Let me know in the comments if you have other suggestions.

FYI – the title of the blog post was motivated by Hans Toch’s new book, Cop Watch.

Dressing for success in academic interviews

On the academia stack exchange site a recent question came up about how to dress for academic interviews. The verdict was over-dressing is better than under-dressing. I had come across similar (although somewhat discordant) advice previously, but it is always good to have second opinions.

I will mention the previous blog posts I had come across, as they have good advice for giving academic job talks in general;

I think my advice is just give a talk so awesome no one cares about what you wear! But take that advice with care; it is coming from a person who not only doesn’t know what he wore yesterday, but has on occasion worn shirts backwards (in public) all day long. I would probably only notice if you showed up dressed like Dilbert in this strip;

If you have other questions related to academia, head on over and check out the Academia stack exchange site. Here are a few examples of some of my favorite discussions so far;