All posts tagged mapping

Some more on Network distances vs Geographic distances intra-city

A prior post on analyzing distances looked at geographic versus network (road) distances between zip codes in New York and one particular location. Over the large distances the correlation ended up being 0.99. But most crime analysis applications will be within one city, so restricting the distances there will the correlation be just as high? I conducted some analysis in Albany, NY to see if this was the case.

First I took a set of 2,640 street segments and intersections in Albany, defined as basically having over 1 reported crime between 2000 through 2013. (This is a pretty good proxy for places where people are actually located in the city, so places where people might actually travel from/to.) Here is a map of those points showing the coverage.

I then made the 2,640^2 pairs, and then took a random sample of 2,300 of those pairs to calculate the geographic versus the network distance (calculating the network distance using the google distance API). Here is a flow map, again showing it has pretty good coverage of the city.

In this sample the correlation between the network distance and the geographic distance is 0.94, and below is the scatterplot. The red line is the line of equality, so we can see the network distance is always larger.

Making the graph on log scales basically takes away the heteroscedasticity, and shows some short distance outliers.

I then fit a regression of equation of log(Network Distance) ~ Intercept + b_0*log(Geo Distance), and then calculate the studentized residuals. Here is a small multiple flow map of those locations categorized by the truncated studentized residuals. I plotted flows under 200 meters as a red dot, as otherwise they basically have no area on the map to visualize. There are a few notable patterns, the -1 residuals (so closer network and geo distances) are locations along what looks like Central, Washington, Western and New Scotland (running east-west) and Broadway/Pearl (running north-south). So basically straight, major thoroughfares.

It is probable that if more locations in the isthmus and the south western part of the city were selected the distances would be not so nice, but the isthmus itself is largely the Pine Bush park, and the south western part is on the periphery of residential neighborhoods. Exporting the high residuals, what happens in the google distance API is that they are short trips on one way streets, and the to and from and going against the one way. I will have to investigate if you can set the google API to use walking distances to ignore this (as this wasn’t intended as a directed flow like that). Or just learn how to use the CrimeStat or network analysis in ArcMap distance calculation tools!

So although using network distances consistently increases the distances between points, they are still highly correlated, even for shorter in city patterns. If I fixed the flows going against one way streets it would likely be an even higher correlation.

1 Comment

by Andy Wheeler on April 4, 2015 • Permalink

Posted in Mapping

Tagged data visualization, mapping, network

Posted by Andy Wheeler on April 4, 2015

https://andrewpwheeler.com/2015/04/04/some-more-on-network-distances-vs-geographic-distances-intra-city/

Using Google Fusion Tables to make some maps!

In the past to share interactive maps with others I’ve used BatchGeo and CartoDB. BatchGeo is super easy to geocode a few incidents, and CartoDB has a few more stylistic options (including some very cool animations). Both of these projects have a limit on the number of points you can map with the free service though. The new Google maps allows you make similar products to BatchGeo and CartoDB, in that you can upload a csv file or kml and then do some light editing of the points, and then embed an iframe in a website if you want (I wish Google Maps had a time slider like Google Earth does). Here is an example from my PhD of a few locations that one of my original models did a very poor job of predicting the amount of crime at the street midpoint or intersection.

But a few recent projects I wanted to place many more geographies on the map than these free versions allow. ArcGIS online is pretty slick in my few tests, but I am settling on Google Fusion tables for the ability to link the geographies and data tables (plus the ability to filter is very nice). Basically you can upload your data table and kml in seperate Fusion tables and then merge them to create your own polygons with associated data. Here is another example from my dissertation and embedded map below.

Basically what I do is make a set of units of analysis based on street mid-points and intersections. I then divide the city up based on the Thiessen polygons of those sets of points for the allocation of different areal measures. E.g. I can calculate the overlap of the Thiessen polygon with the area of sidewalks.

I’m using Google Fusion tables for some other projects in which I want people within the PD to be able to interactively explore the data. My main interest in these slippy maps are that you can pan and zoom – and with a static map it is hard to recreate all of the potential views a consumer of the map wants. I can typically make a nicer overview map of the forest or any general data patterns in a static map, but if I think the user of the map will want to zoom in to particular locations these interactive maps meet that challenge. Pop-ups allow for a brief digging into the data as well, but don’t allow for visualizing patterns. Fusion tables are very limited though it the styling of the geography. (All of these free versions are pretty limited, but the Fusion tables are especially restrictive for point symbology and creating choropleth classes).

Using these maps has a trade off when sharing with the PD though. They are what I would call semi-public, in that if you want others to be able to view the map you can share a link, but anyone with the link can see the map. This prevents sharing of intimate information on the map that might be possibly leaked. (For the ability to have access control to more sensitive information, e.g. a user has to sign on to a secure website, I know Bair analytics offers paid for products like that – probably some of the prior web map apps I mentioned do so as well.) I’ve made them in the past for Troy P.D., but I really have no idea how often they were used – so other analysts let me know in the comments if you’ve had success with maps like these disseminating info. within the police department.

I’m getting devilishly close to finishing my dissertation, and I will post an update and link when the draft is complete. My prospectus can be seen here, and the linked maps are part of some supplemental material I compiled. The supplemental info. should provide a little more details on what the maps are showing.

Leave a comment

by Andy Wheeler on July 18, 2014 • Permalink

Posted in Crime Mapping, Mapping

Tagged crime-mapping, mapping, online-maps

Posted by Andy Wheeler on July 18, 2014

https://andrewpwheeler.com/2014/07/18/using-google-fusion-tables-to-make-some-maps/

What is up with 3d graphics for book covers?

The other day in Google books I noticed Graphics for Statistics and Data Analysis with R by Kevin Keen in the related book section. What caught my eye was not the title (there have to be 100+ related R books at this point) but the really awful 3d pie chart.

Looking at the preview on google books this appears to be an unfortunate substitution. The actual cover has a much more reasonable set of surface plots and other online book stores (e.g. Amazon) appear to have the correct cover.

I suspect someone at CRC Press used some stock imagery for the cover, and unfortunately the weird 3d pie graph has been propagated to the google book preview without correction.

This reminded me of a few other book covers in cartography and data visualization though that I find less than appealing. Now, I’m not saying here to judge a book by its cover, and I have not read all of the books I will point to here. But I find the use of 3d graphics in book covers in the data visualization field to be strange and bordering cognitive dissonance with the advice most of the authors give.

First I’ll start with a book I have read, and would suggest to everyone, Thematic cartography and geographic visualization by Slocum et al. I have the 2005 version, and it is dawned by this 3d landscape. (Sorry this is the largest image I can find online – other editions I believe have different covers.)

The multivariate display of data is admirable – so for exploratory analysis you could make a reasonable argument for the use of proportional sized circles superimposed on the choropleth map. The use of 3d in this circumstance though is gratuitous, and the extreme perspective hides much of the data while highlighting the green hills in the background.

The second mapping book I have slight reservations about critiquing the cover (I am on the job market!). I have not read the book, so I can not say anything about its contents. But roaming the book displays at an ASC conference I remember seeing this cover, GIS and Spatial Analysis for the Social Sciences: Coding, Mapping, and Modeling by Nash and Asencio.

This probably should not count in the other 3d graphics I am showing. The bar columns do have shading for 3d perspective – but the map otherwise is 2d. But the spectral color scheme is an awful choice. The red in the map tends to stand out the most – which places with zero crimes I don’t think you want to make that impression. The choropleth colors appear to be displaying the same data as the point data. The point data are so clustered that the choropleth can only be described as misleading – which may be a good point in text for side by side maps – but on the cover? Bar locations seem to be unrelated (as we might expect for juvenile crime) but they are again aggregated to the (probably) census units – making me question if the aggregation obfuscates the relationship. Bars are not available from the census – so it is likely this aggregation was intentional. I have no idea about the content of the book and I will likely get it and do an overall review of all crime mapping books sometime. But the cover is unambiguously a bad map.

The last book cover with 3d graphics (related to data-visualization) that I immediately remembered was R For Dummies by Meys and de Vries.

Now this when you look close really is not bad. It is not a graph on the cover, but a set of winding, hexagon cylinder stairsteps. So the analogy of taking small steps is fine – but the visual similarity to other statistical 3d graphics is clear. Consider the SPSS For Dummies book by Griffith.

Now that is an intentional, 3d chart made up of tiny blocks, with a trend line if you look closely, shadowed by cigarette like red bars in the background. At least this is so strange (and not possible in statistical software) that this example would never be confused with an actual reasonable statistical graphic. The Dummies series has such brand recognition as well that the dominant part of the cover might be the iconic yellow and type, as opposed to the inset graphic.

Not wanting to leave other software out of the loop, I looked for examples for SAS and Stata. SAS has a reasonable 3d cover in SAS System for Statistical Graphics by Friendly.

Short sidetrack story: I first learned statistical programming using SAS back in undergrad days at Bloomsburg University. Default graphics for SAS at that point (04-08) I believe were still the ASCII art looking things (at least that is what I remember). During our last meeting for my last statistics class – one of the other students showed me you could turn on the ODS output to html – and tables and graphs were by default pretty nice. I since have not had a need to use SAS.

This 3d cover by Friendly is arguably a reasonable use of 3d. 3d graphs are hard to navigate, and the use the anchors connecting the observations to the non-linear surface more easily associate a point with below or above the surface. It is certainly difficult though to understand the fit of the function – so likely a series of bivariate graphs would be more intuitive – especially given the meager number of points. I suspect the 3d on the cover is for the same reason 3d graphics were used in the other covers – because it looks cooler to book marketers!

Stata managed to debunk the 3d graph trends – I could not find any example Stata books with 3d graphics. Nick Cox’s newer collection of his Speaking Stata series though has some interesting embellishments.

While in isolation all of the graphs are fine – I’m sure Cox would not endorse the gratuitous use of color gradients in the graphics (I don’t think svg like gradients like that are even possible in Stata graphics). The ternary diagrams show nothing but triangles as well – so I don’t think such gradients are a good idea in any case for simply the background of the plot. Such embellishments could actually decode data, but in the case of bar graphs do not likely hurt or help with understanding the plot. When such gradients are used as the background though they likely compete with the actual data in the plot. Stata apparently can do 3d graphs – so I might suggest I write a book on crime modelling (published by Stata press) and insert a 3d graph on the cover (as this is clearly a niche in the market not currently filled!) I might have to make room for Chernoff faces somewhere on the front or back cover as well.

So maybe I am just seeing things in the examples of 3d covers. If anyone has any insight into how these publishers choose the covers let me know – or if you have other examples of bad book cover examples of data vizualization! Since most of my maps and graphs are pretty dull in 2d I might just outsource the graphic design if I made a book.

1 Comment

by Andy Wheeler on June 16, 2014 • Permalink

Posted in Data Visualization, Mapping, scholarly

Tagged cartography, data visualization, mapping, scholarly

Posted by Andy Wheeler on June 16, 2014

https://andrewpwheeler.com/2014/06/16/what-is-up-with-3d-graphics-for-book-covers/

Using Python to geocode data in SPSS

This is the first time since I’ve been using SPSS that I have regular access to Python and R programmability in all of the different places I use SPSS (home and multiple work computers). So I’ve been exploring more solutions to use these tools in regular data analysis and work-flows – of course to accomplish things that can not be done directly in native SPSS code.

The example I am going to show today is using geopy, a Python library that places several geocoding API’s all in a convenient set of scripts. So first once geopy is installed you can call Python code within SPSS by placing it within a BEGIN PROGRAM and END PROGRAM blocks. Here is an example modified from geopy’s tutorial.


BEGIN PROGRAM.
from geopy import geocoders
g = geocoders.GoogleV3()
place, (lat, lng) = g.geocode("135 Western Ave. Albany, NY")  
a = [place, lat, lng]
print a
END PROGRAM.

Now what we want to do is to geocode some address data that is currently stored in SPSS case data. So here is an example dataset with some addresses in Albany.


DATA LIST LIST ("|") / Address (A100).
BEGIN DATA
135 Western Ave. Albany, NY
Western Ave. and Quail St Albany, NY
325 Western Ave. Albany, NY
END DATA.
DATASET NAME Add.

Here I will use the handy SPSSINC TRANS function (provided when installing Python programmability – and as of SPSS 22 installed by default with SPSS) to return the geocoded coordinates using the Google API. The geocode function from geopy does not return the data in an array exactly how I want it, so what I do is create my own function, named g, and it coerces the individual objects (place, lat and lng) into an array and returns that.


BEGIN PROGRAM.
from geopy import geocoders
def g(a):
  g = geocoders.GoogleV3()
  place, (lat, lng) = g.geocode(a)
  return [place, lat, lng]
print g("135 Western Ave. Albany, NY")
END PROGRAM.

Now I can use the SPSSINC TRANS function to return the associated place string, as well as the latitude and longitude coordinates from Google.


SPSSINC TRANS RESULT=Place Lat Lng TYPE=100 0 0
  /FORMULA g(Address).

Pretty easy. Note that (I believe) the Google geocoding API has a limit of 2,500 cases – so don’t go submitting a million cases to be geocoded (use an offline solution for that). Also a mandatory mention should be made of the variable reliability of online geocoding services.

2 Comments

by Andy Wheeler on March 14, 2014 • Permalink

Posted in Python, SPSS

Tagged data-manipulation, geocoding, mapping, Python, SPSS

Posted by Andy Wheeler on March 14, 2014

https://andrewpwheeler.com/2014/03/14/using-python-to-geocode-data-in-spss/

Article: Viz. techniques for JTC flow data

My publication Visualization techniques for journey to crime flow data has just been posted in the online first section of the Cartography and Geographic Information Science journal. Here is the general doi link, but Taylor and Francis gave me a limited number of free offprints to share the full version, so the first 50 visitors can get the PDF at this link.

Also note that:

The pre-print is posted to SSRN. The pre-print has more maps that were cut for space, but the final article is surely cleaner (in terms of concise text and copy editing) and has slightly different discussion in various places based on reviewer feedback.
Materials I used for the article can be downloaded from here. The SPSS code to make the vector geometries for a bunch of the maps is not terribly friendly. So if you have questions feel free – or if you just want a tutorial just ask and I will work on a blog post for it.
If you ever want an off-print for an article just send me an email (you can find it on my CV. I plan on continuing to post pre-prints to SSRN, but I realize it is often preferable to cite the final in print version (especially if you take a quote).

The article will be included in a special issue on crime mapping in the CaGIS due to be published in January 2015.

Leave a comment

by Andy Wheeler on March 7, 2014 • Permalink

Posted in Crime Mapping, Data Visualization, Papers, scholarly

Tagged cartography, crime-mapping, data visualization, mapping, Papers

Posted by Andy Wheeler on March 7, 2014

https://andrewpwheeler.com/2014/03/07/article-viz-techniques-for-jtc-flow-data/

Some more about black backgrounds for maps

I am at it again discussing black map backgrounds. I make a set of crime maps for several local community groups as part of my job as a crime analyst for Troy PD. I tend to make several maps for each group, seperating out violent, property and quality of life related crimes. Within each map I try to attempt to make a hierarchy between crime types, with more serious crimes as larger markers and less severe crimes as smaller markers.

Despite critiques, I believe the dark background can be useful, as it creates greater contrast for map elements. In particular, the small crime dots are much easier to see (and IMO in these examples the streets and street name labels are still easy to read). Below are examples of the white background, a light grey background, and a black background for the same map (only changes are the black point marker is changed to white in the black background map, streets and parks are drawn with a heavy amount of transparency to begin with so don’t need to be changed).

Surprisingly to me, ink be damned, even printing out the black background looks pretty good! (I need to disseminate paper copies at these meetings) I think if I had to place the legend on the black map background I would be less thrilled, but currently I have half the page devoted to the map and the other half devoted to a table listing the events and the time they occurred, along with the legend (ditto for the scale bar and the North arrow not looking so nice).

I could probably manipulate the markers to provide more contrast in the white background map (e.g. make them bigger, draw the lighter/smaller symbols with dark outlines, etc.) But, I was quite happy with the black background map (and the grey background may be a useful in-between the two as well). It took no changes besides changing the background in my current template (and change black circles to white ones) to produce the example maps. I also chose those sizes for markers for a reason (so the map did not appear flooded with crime dots, and more severe and less severe crimes were easily distinguished), and so I’m hesistant to think that I can do much better than what I have so far with the white background maps (and I refuse to put those cheesy crime marker symbols, like a hand gun or a body outline, on my maps).

In terms of differentiating between global and local information in the maps, I believe the high contrast dark background map is nice to identify local points, but does not aid any in identifying general patterns. I don’t think general patterns are a real concern though for the local community groups (displaying so many points on the same map in general isn’t good for distinguishing general patterns anyway).

I’m a bit hesitant to roll out the black maps as of yet (maybe if I get some good feedback on this post I will be more daring). I’m still on the fence, but I may try out the grey background maps for the next round of monthly meetings. I will have to think if I can devise a reasonable experiment to differentiate between the maps and whether they meet the community groups goals and/or expectations. But, all together, the black background maps should certainly be given serious consideration for similar tasks. Again, as I said previously, the high contrast with smaller elements makes them more obvious (brings them more to the foreground) than with the white background, which as I show here can be useful in some circumstances.

Leave a comment

by Andy Wheeler on December 20, 2012 • Permalink

Posted in Crime Mapping, Data Visualization, Mapping

Tagged cartography, crime-mapping, data visualization, mapping

Posted by Andy Wheeler on December 20, 2012

https://andrewpwheeler.com/2012/12/20/some-more-about-black-backgrounds-for-maps/

Presentation at ASC – November, 2012

At the American Society of Criminology conference in Chicago in a few weeks I will be presenting (I can’t link to the actual presentation it appears, but you can search the program for Wheeler and my session will come up). Don’t take this as a final product, but I figured I would put out there the working paper/chapters of my dissertation that are the motivation for my presentation and my current set of slides.

Here is my original abstract I submitted a few months ago, The title of the talk is The Measurement of Small Place Correlates of Crime;

This presentation addresses several problems related with attempting to identify correlates of crime at small units of analysis, such as street segments. In particular the presentation will focus on articulating what we can potentially learn from smaller units of analysis compared to larger aggregations, and relating a variety of different measures of the built environment and demographic characteristics of places to theoretical constructs of interest to crime at places. Preliminary results examining the discriminant and convergent validity of theoretical constructs pertinent to theories for the causes of crime using data from Washington, D.C. will be presented.

This was certainly an over-ambitious abstract (I was still in the process of writing my prospectus when I submitted it). The bulk of the talk will be focused on “What we can learn from small units of analysis?”, and as of now after that as time allows I will present some illustrations of the change of support problem. Sorry to dissapoint, but nothing about convergent or divergent validity of spatial constructs will be presented (I have done no work of interest yet, and I don’t think I would have time to present any findings in anymore than a superficial manner anyway).

Here is the working paper, based on two chapters of my dissertation.
Here is a copy of the presentation and here is the same copy with my notes attached.

Note don’t be scared off by how dull the working paper is, the presentation will certainly be more visual and less mathematical (I will need to update my dissertation to incorporate some more graphical presentations).

Maps and graphis at the end of the talk demonstrating the change of support problem are still in the works (and I will continue to update the presentation on here). Here is a preview though of the first map I made that demonstrates how D.C. disseminates geo-date aggregated and snapped to street segments, making it problematic to mash up with census data.

The presentation time is on Friday at 9:30, and I’m excited to see the other presentations as well. It looks like to me that Pizarro et al.’s related research was recently published in Justice Quarterly, so if you don’t care for my presentation come to see the other presenters!

Leave a comment

by Andy Wheeler on November 4, 2012 • Permalink

Posted in Mapping, Papers, scholarly

Tagged mapping, paper, presentation, scholarly

Posted by Andy Wheeler on November 4, 2012

https://andrewpwheeler.com/2012/11/04/presentation-at-asc-november-2012/

JQC paper on the moving home effect finally published!

My first solo publication, The moving home effect: A quasi experiment assessing effect of home location on the offence location, after being in the online first que nearly a year, has finally been published in the Journal of Quantitative Criminology 28(4):587-606. It was the oldest paper in the online first section (along with the paper by Light and Harris published on the same day)!

This paper was the fruits of what was basically the equivalent of my Masters thesis, and I would like to take this opportunity to thank all of the individuals whom helped me with the project, as I accidently ommitted such thanks from the paper (entirely my own fault). I would like to thank my committee members, Rob Worden, Shawn Bushway, and Janet Stamatel. I would also like to thank Robert Apel and Greg Pogarsky for useful feedback I had recieved on in class papers based on the same topic, as well as the folks in the Worden meeting group (for not only feedback but giving me motivation to do work so I had something to say!)

Rob Worden was the chair of my committee, and he deserves extra thanks not only for reviewing my work, but also for giving me a job at the Finn Institute, which otherwise I would have never had access to such data and opportunity to conduct such a project. I would also like to give thanks to the Syracuse PD and Chief Fowler for letting me use the data and reveal the PD’s identity in the publication.

I would also like to thank Alex Piquero and Cathy Widom for letting me make multiple revisions and accepting the paper for publication. For the publication itself I recieved three very excellent and thoughtful peer reviews. The excellence of the reviews were well above the norm for feedback I have otherwise encountered, and demonstrated that the reviewers not only read the paper but read it carefully. I was really happy with the improvements as well as how fair and thoughtful the reviews were. I am also very happy it was accepted for publication in JQC, it is the highest quality venue I would expect the paper to be on topic at, and if it wasn’t accepted there I was really not sure where I would send it otherwise.

In the future I will publish pre-prints online, so the publication before editing can still be publicly available to everyone. But, if you can not get a copy of this (or any of the other papers I have co-authored so far) don’t hesitate to shoot me an email for a copy of the off-print. Hopefully I have some more work to share in the new future on the blog! I currently have two papers I am working on with related topics, one with visualizing journey to crime flow data, and another paper with Emily Owens and Matthew Feedman of Cornell comparing journey to work data with journey to crime data.

For a teaser for this paper here is the structured abstract from the paper and a graph demonstrating my estimated moving home effect.

Objectives
This study aims to test whether the home location has a causal effect on the crime location. To accomplish this the study capitalizes on the natural experiment that occurs when offender’s move, and uses a unique metric, the distance between sequential offenses, to determine if when an offender moves the offense location changes.

Methods
Using a sample of over 40,000 custodial arrests from Syracuse, NY between 2003 and 2008, this quasi-experimental design uses t test’s of mean differences, and fixed effects regression modeling to determine if moving has a significant effect on the distance between sequential offenses.

Results
This study finds that when offenders move they tend to commit crimes in locations farther away from past offences than would be expected without moving. The effect is rather small though, both in absolute terms (an elasticity coefficient of 0.02), and in relation to the effect of other independent variables (such as the time in between offenses).

Conclusions
This finding suggests that the home has an impact on where an offender will choose to commit a crime, independent of offence, neighborhood, or offender characteristics. The effect is small though, suggesting other factors may play a larger role in influencing where offenders choose to commit crime.

Leave a comment

by Andy Wheeler on November 2, 2012 • Permalink

Posted in Crime Mapping, Mapping, Papers, scholarly

Tagged crime-mapping, data visualization, mapping, paper, scholarly

Posted by Andy Wheeler on November 2, 2012

https://andrewpwheeler.com/2012/11/02/jqc-paper-on-the-moving-home-effect-finally-published/

Using Bezier curves to draw flow lines

As I talked about previously, great circle lines are an effective way to visualize flow lines, as the bending of the arcs creates displacement among over-plotted lines. A frequent question that comes up though (see an example on GIS.stackexchange and on the flowing data forums) is that great circle lines don’t provide enough bend over short distances. Of course for visualizing journey to crime data (one of the topics I am interested in), one has the problem that most known journey’s are within one particular jurisdiction or otherwise short distances.

In the GIS question I linked to above I suggested to utilize half circles, although that seemed like over-kill. I have currently settled on drawing an arcing line utilizing quadratic Bezier curves. For a thorough demonstration of Bezier curves, how to calculate them, and to see one of the coolest interactive websites I have ever come across, check out A primer on Bezier curves – by Mike "Pomax" Kamermans. These are flexible enough to produce any desired amount of bend (and are simple enough for me to be able to program!) Also I think they are more aesthetically pleasing than irregular flows. I’ve seen some programs use hook like bends (see an example of this flow mapping software from the Spatial Data Mining and Visual Analytics Lab), but I’m not all that fond of that for either aesthetic reasons or for aiding the visualization.

I won’t go into too great of details here on how to calculate them, (you can see the formulas for the quadratic equations from the Mike Kamermans site I referenced), but you basically, 1) define where the control point is located at (origin and destination are already defined), 2) interpolate an arbitrary number of points along the line. My SPSS macro is set to 100, but this can be made either bigger or smaller (or conditional on other factors as well).

Below is an example diagram I produced to demonstrate quadratic Bezier curves. For my application, I suggest placing a control point perpindicular to the mid point between the origin and destination. This creates a regular arc between the two locations, and conditional on the direction one can control the direction of the arc. In the SPSS function provided the user then provides a value of a ratio of the height of the control point to the distance between the origin and destination location (so points further away from each other will be given higher arcs). Below is a diagram using Latex and the Tikz library (which has a handy function to calulate Bezier curves).

Here is a simpler demonstration of the controlling the direction and adjusting the control point to produce either a flatter arc or an arc with more eccentricity.

Here is an example displaying 200 JTC lines from the simulated data that comes with the CrimeStat program. The first image are the original straight lines, and the second image are the curved lines using a control point at a height half the distance between the origin and destination coordinate.

Of course, both are most definately still quite crowded, but what do people think? Are my curved lines suggestion benificial in this example?

Here I have provided the SPSS function (and some example data) used to calculate the lines (I then use the ET Geowizards add-on to turn the points into lines in ArcGIS). Perhaps in the future I will work on an R function to calculate Bezier curves (I’m sure they could be of some use), but hopefully for those interested this is simple enough to program your own function in whatever language of interest. I have the starting of a working paper on visualizing flow lines, and I plan on this being basically my only unique contribution (everything else is just a review of what other people have done!)

One could be more fancy as well, and make the curves different based on other factors. For instance make the control point closer to either the origin or destination is the flow amount is assymetrical, or make the control point further away (and subsequently make the arc larger) is the flow is more volumous. Ideas for the future I suppose.

Leave a comment

by Andy Wheeler on October 16, 2012 • Permalink

Posted in Crime Mapping, Data Visualization, SPSS

Tagged cartography, data visualization, flow-data, mapping, SPSS

Posted by Andy Wheeler on October 16, 2012

https://andrewpwheeler.com/2012/10/16/using-bezier-curves-to-draw-flow-lines/

Making value by alpha maps with ArcMap

I recently finished reading Cynthia Brewer’s Designing better maps: A guide for GIS users. Within the book she had an example of making a bi-variate map legend manually in ArcMap, and then the light-bulb went off in my mind that I could use that same technique to make value by alpha maps in ArcMap.

For a brief intro into what value by alpha maps are, Andy Woodruff (one of the creators) has a comprehensive blog post on them on what they are and their motivation. Briefly though, we want to visualize some variable in a choropleth map, but that variable is measured with varying levels of reliability. Value by alpha maps de-emphasize areas of low reliability in the choropleth values by increasing the transparency of that polygon. I give a few other examples of interest related to mapping reliability in this answer on the GIS site as well, How is margin of error reported on a map?. Essentially those techniques mentioned either only display certain high reliability locations, make two maps, or use technqiues to overlay multiple attributes (like hashings). But IMO the value by alpha maps looks much nicer than the maps with multiple elements, and so I was interested in how to implement them in ArcMap.

What value by alpha maps effectively do is reduce the saturation and contrast of polygons with high alpha blending, making them fade into the background and be less noticable. I presented an applied example of value by alpha maps in my question asking for examples of beautiful maps on the GIS site. You can click through to see further citations for the map and reasons for why I think the map is beautiful. But below I include an image here as well (taken from the same Andy Woodruff blog post mentioned earlier).

Here I will show to make the same maps in ArcMap, and present some discussion about their implementation, in particular suitable choices for the original choropleth colors. Much was already discussed by the value by alpha originators, but I suppose I didn’t really appreciate them until I got my hands alittle dirty and tried to make them myself. Note this question on the GIS site, How to implement value-by-alpha map in GIS? gives other resources for implementing value-by-alpha maps. But as far as I am aware this contribution about how to do them in ArcMap is novel.

Below I present an example displaying the percentage of female heads of households with children (abbreviated PFHH from here on) for 2010 census blocks within Washington, D.C. Here we can consider the reliability of the PFHH dependent on the number of households within the block itself (i.e. we would expect blocks with smaller number of households to have a higher amount of variability in the PFHH). The map below depicts blocks that have at least one household, and so the subsequent PFHH maps will only display those colored polygons (about a third, 2132 out of 6507, have no households).

I chose the example because female headed households are a typical covariate of interest to criminologists for ecological studies. I also chose blocks as they are the smallest unit available from the census, and hence I expected them to show the widest variability in estimates. Below I provide an example on how one might typically display PFHH, while simultaneously incorporating information on the baseline number the maps will be map of.

The first example seperately displays the denominator number of households on the left and the percent of female headed households with children on the right both in a sequential choropleth scheme (darker colors are a higher PFHH and Number of Households).

One can also superimpose the same information on the map. Sun & Wong, 2010 suggest one use cross hatching above the the choropleth colors to depict reliability, but here I will demonstrate using choropleth colors for the baseline number of households and a proportional point symbols for the PFHH. I supplement the map on the right with a scatterplot, that has the number of households on the X axis and the PFHH on the Y axis.

These both do an alright job (if you made me pick one, I think I would pick the side-by-side sets of maps), but lets see if we can do better with value-by-alpha maps! The following tutorial will be broken up into two sections. The first section talks about actually generating the map, and the second section talks about how to make the legend. Neither is difficult, but making the legend is more of a pain in the butt.

How to make the value by alpha map

First one can start out by making the base layer with the desired choropleth classifications and color scheme. Note here I changed what I am visualizing from a sequential color scheme of PFHH to location quotients with only four categories. I will discuss why I did this later on in the post.

Then one can make several copies of that layer (right click -> copy -> paste within hierarchy), based on however many different reliability classifications you want to display. Here I will do 4 different reliability classifications. Note after you make them for management of the TOC it is easier to group them.

Then one uses selection criteria to filter out only those polygons that fall within the specified reliability range. And then sets the transparency for the that level to the desired value.

And voila, you have your value by alpha map. Note if after you make the layers you decide you want a different classification and/or color scheme, you can make the changes to one layer and then apply the changes to all of the other layers.

How to make the legend

Now making the legend is the harder part. If one goes to the layout view, one will see that since in this example one has essentially for layers superimposed on the same map, one has four seperate legend entries. Below is what it looks like with my defaults (plus a vertical rule I have in my map).

What we want in the end is a bivariate scheme, with the PFHH dimension running up and down, and the transparency dimension running from one side to the other (the same as in the example mortality rate map at the beginning of the post). To do this, one has to convert the legends to graphics.

The ungroup the elements so each can be individualy manipulated. Note, sometimes I have to do this operation multiple times.

Then re-arrange the panels and labels into the desired format.

More tedious than making the seperate layers, but not crazy unreasonable if you only have to do it for one (or a small number of maps). If you need to do it for a larger number of maps a better workflow will be needed, like creating a seperate “fake inset” map that replicates the legend, making the legend in a seperate tool, or just making the map entirely in a program where alpha blending is more readily incorporated. For instance in statistical packages it is typically a readily available encoding that can be added to a graphic (they also will allow continous color ramps and continous levels of transparency).

And voila, here is the final map. To follow is some discussion about choosing color schemes and whether you should use a black background or not.

Some discussion about color schemes

The Roth et al. (2010) paper in the cartographic journal and Andy Woodruff’s blog post I cited at the beginning of this post initially talked about color schemes and utilizing a black background, but I didn’t really appreciate the complexity of this choice until I went and made a value-by-alpha map of my own. In the end I decided to use location quotients to display the data, as the bivariate color scheme provides further contrast. I feel weird using a bivariate color scheme for a continous scale (hence the conversion to location quotients), but I feel like I should get over that. Everything has its time and place, and set rules like that aren’t good for anyone but bureaucrats or the mindless.

I certainly picked a complex dataset to start with, and the benifits of the value by alpha map over the two side by side maps (if any) are slight. I suspect why mine don’t look quite as nice as the ones created by Roth, Woodruff and company are partially due to the greater amount of complexity. The map with the SatScan reliabilities I noted as one of my favorite maps is quite striking, but it is partly due to the relibaility having a very spatially contiguous pattern (although the underlying cancer mortality rate map is quite spatially heterogenous). Here the spatial regularity is much weaker, in either the pattern being mapped or the reliability thresholds I had chosen. It does produce a quite pretty map though, FWIW.

For reference, here is the same map utilizing a black background. The only thing different in this map is that the most transparent layer is now set to 80% transparency instead of 90% (it was practically invisible at 90% with black as the modifying background color). Also it was necessary to do the fake inset map for a legend I talked about earlier with black as the background color. This is because the legend generated by ArcGIS always has white as the modifying color. If you refer back to the map with white as the modifying color, you can tell this produces greater contrast among the purples (the location quotient 2.1 – 4 for fully opaque and 4.1 – 12.6 for 40% transparent with white as the modifying color appear very similar).

The Roth Cartographic journal article gives other bivariate and nominal color scheme suggestions, you should take their advice. Hopefully in the future it will be simpler to incorporate bivariate color schemes in ArcMap, as it would make the process much simpler (and hence more useful for exploratory data analysis).

I would love it if people point me to other examples in which value by alpha maps are useful. I think in theory it is a good idea, but the complexity intoduced in the map is a greater burden than I intially estimated until I made a few. I initially thought this would be useful for presenting the results of geographically weighted regression or perhaps cancer atlas maps in general (where sometimes people just filter out results below some population threshold). But maybe not given the greater complexity introduced.

4 Comments

by Andy Wheeler on August 24, 2012 • Permalink

Posted in Data Visualization, Mapping

Tagged choropleth, data visualization, mapping

Posted by Andy Wheeler on August 24, 2012

https://andrewpwheeler.com/2012/08/24/making-value-by-alpha-maps-with-arcmap/

Search for:
Recent Posts
Categories
Categories
Site RSS Feeds
- RSS - Posts
- RSS - Comments
Follow Blog via Email

Enter your email address to follow this blog and receive notifications of new posts by email.

Email Address:

Join 399 other subscribers
aoristic big-data cartography census choropleth citeulike color consulting cost-benefit courses crime-mapping crime-trends Crime Analysis Criminal Justice data-manipulation data visualization deep-learning ESRI excel flow-data folium geocoding github google-streetview-api grammar of graphics group-based-trajectory gun-violence healthcare homicide-rates hot spots hypothesis-testing linear programming logistic-regression machine-learning MACRO mapping matplotlib meta network NetworkX officer-involved-shooting open-science paper Papers peer-review Poisson prediction Predictive-Policing preprint presentation Python Python-programability pytorch quasi-experiment r recidivism regression resources scholarly scraping seaborn shootings simulation small-multiples social-media social-networking SPSS stackexchange Stata statistics survey time-series uncertainty wdd web-scraping
Top Posts & Pages
Stack Exchange

All posts tagged mapping

How to make the value by alpha map

How to make the legend

Some discussion about color schemes

Recent Posts

Categories

Site RSS Feeds

Follow Blog via Email

Top Posts & Pages

Stack Exchange