All posts tagged mapping

When should we use a black background for a map?

Some of my favorite maps utilize black (or dark) backgrounds. Some examples:

Several of the maps in the Examples of Beautiful Maps thread utilize dark backgrounds. These include the value-by-alpha maps and the Facebook friends connection map, and the Mapnificent London travel times map has a nighttime setting that changes the background map to dark.
I think many of James Cheshire’s maps of London flows are quite nice. For two examples see one with a darker grey background and an animation with a black background. Below is a picture taken from the second post.

Steven Romalewski recently offered a mild critique of such maps in his blog post, Mapping NYC stop and frisks: some cartographic observations:

I know that recently the terrific team at MapBox put together some maps using fluorescent colors on a black background that were highly praised on Twitter and in the blogs. To me, they look neat, but they’re less useful as maps. The WNYC fluorescent colors were jarring, and the hot pink plus dark blue on the black background made the map hard to read if you’re trying to find out where things are. It’s a powerful visual statement, but I don’t think it adds any explanatory value.

I don’t disagree with this, and about all I can articulate in their favor so far is essentially “well lit places create a stunning contrast with the dark background”, while white background maps just create a contrast and are not quite as stunning!

I think the proof of a black background’s usefulness can be seen in the example value-by-alpha maps and the flow maps of James Cheshire, where a greater amount of contrast is necessary. IMO, in the value-by-alpha maps the greater contrast is needed because of the greater complexity of the bivariate color scheme, and in Cheshire’s flow maps it is needed because lines frequently don’t have enough areal girth to be effectively distinguished from the background.

I couldn’t find any more general literature on the topic though. It doesn’t seem to be covered in any of the general cartography books that I have read. Since it is really only applicable to on-screen maps (you certainly wouldn’t want to print out a map with a black background), perhaps it just hasn’t been addressed. I may be looking in the wrong place though. Some text editors have a high contrast setting where text is white on a dark background (likely for the same reasons dark backgrounds look nice in maps), so the concept can’t be so foreign as to have no scholarly literature on it.

So in short, I guess my advice is to utilize a black background when you want to strongly focus attention on the light areas, essentially at the cost of greatly diminishing the contrast with other faded elements in the map. This is perhaps a good thing for maps intended as complex statistical summaries, and the Mapnificent travel times map is probably another good example where high focus in one area is sufficient and other background elements are not needed. I’m not sure, though, that black backgrounds are really needed (or useful) for choropleth maps, and any more complicated thematic maps certainly would not fit this bill.

To a certain extent I wonder what lessons from black backgrounds can be applied to the backgrounds of statistical graphics more generally. Leave me some comments if you have any thoughts or other examples of black background maps!

by Andy Wheeler on July 28, 2012

Posted in Data Visualization, Mapping

Tagged cartography, choropleth, data visualization, mapping

https://andrewpwheeler.com/2012/07/28/when-should-we-use-a-black-background-for-a-map/

Why great circle lines look nicer in flow maps

I got sick of working on my dissertation the other day so I started writing a review article on visualizing flow lines for journey to crime data. Here I will briefly illustrate why great circle lines tend to look nicer in flow maps than do straight lines.

Flow maps tend to be very visually complicated, and so what happens (to a large extent) is what happens in Panel B in the above graphic. Bending the lines, as is done with great circles, tends to displace the lines from one another to a greater extent. Although perfect overlap as demonstrated in the picture doesn’t necessarily happen that frequently, the same logic applies to nearly overlapping lines. One of the nicest examples of this you can find is the Facebook friends map that made the internet rounds (note there are many other aesthetic elements in the plot that make it look nice besides just the great circle lines).

Of course with great circle lines you don’t get the bending in the other direction for reciprocal flows that I demonstrate in my first figure (the great circle line is the same regardless of direction). Because of this, and because great circle lines in a local projection don’t really provide enough eccentricity in the bend to produce the desired displacement of the lines, I suggested utilizing half circles, and I discuss how to calculate them given a set of origin-destination coordinates at this question on the GIS site.

I need to test this out in the wild some more though. I suspect a half-circle is too much, but my attempts to script a version where the eccentricity is less pronounced have befuddled me so far. I will post an update on here if I come to a better solution, and when the working paper is finished I will post a copy of that as well. Preferably I would like the script to take an arbitrary parameter to control the amount of bend in the arc, so if you have suggestions feel free to shoot me an email or leave a comment here.
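As a rough illustration of what such a bend parameter could look like, here is a minimal Python sketch. It is not the half-circle calculation from the GIS site answer; it swaps in a quadratic Bezier curve, and it assumes the coordinates are already projected (planar). The function name `curved_line` and the parameters `bend` and `n_points` are hypothetical names made up for this example.

```python
def curved_line(origin, dest, bend=0.2, n_points=20):
    """Points along a quadratic Bezier curve from origin to dest.

    Assumes planar (projected) coordinates. `bend` is the offset of the
    control point from the chord midpoint as a fraction of the chord
    length, so bend=0 gives a straight line.
    """
    (x0, y0), (x1, y1) = origin, dest
    dx, dy = x1 - x0, y1 - y0
    # Control point: chord midpoint pushed along the left-hand
    # perpendicular (-dy, dx), which has the same length as the chord.
    cx = (x0 + x1) / 2 - bend * dy
    cy = (y0 + y1) / 2 + bend * dx
    points = []
    for i in range(n_points + 1):
        t = i / n_points
        a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
        points.append((a * x0 + b * cx + c * x1, a * y0 + b * cy + c * y1))
    return points


there = curved_line((0, 0), (10, 0), bend=0.2, n_points=10)
back = curved_line((10, 0), (0, 0), bend=0.2, n_points=10)
```

Because the perpendicular flips when origin and destination are swapped, `there` and `back` bow to opposite sides of the straight line, which is the displacement for reciprocal flows that great circle lines cannot give. With a quadratic Bezier the curve's midpoint sits half as far from the chord as the control point does, so `bend=0.2` displaces the middle of the line by a tenth of its length.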

For those interested in the topic, I suggest perusing one of my other answers at the GIS site. Therein I give a host of references and online mapping examples of visualizing flows.

by Andy Wheeler on July 12, 2012

Posted in Data Visualization, Mapping

Tagged data visualization, flow-data, mapping

https://andrewpwheeler.com/2012/07/12/why-great-circle-lines-look-nicer-in-flow-maps/

Beware of Mach Bands in Continuous Color Ramps

A recent post of mine on the Cross Validated statistics site addressed how to make kernel density maps more visually appealing. The answer there was basically to adjust the bandwidth until you get a reasonably smoothed surface (where reasonable means not over-smoothed to one big hill or under-smoothed to a bunch of unconnected hills).

Another problem that frequently comes along with utilizing the default types of raster gradients is that of Mach bands. Here is a replicated image I used in the Cross Validated site post (made utilizing the spatstat R library).

Even though the color ramp is continuous, you see some artifacts around the gradient where the hue changes from what our eyes see as green to blue. To be more precise, approximately where the green hue touches the blue hue, the blue color appears to be lighter than the rest of the blue background. This is not the case though; it is just an optical illusion (you can even see the Mach bands in the legend if you look closely). Mark Monmonier in How to Lie with Maps gives an example of this, and uses it as a reason not to use continuous color ramps (another reason he gives is that it is very difficult to map a color to an exact numerical location on the ramp). Note that this isn’t just something that happens with this particular color ramp; it happens even when the hue is the same (the Wikipedia page gives an example with varying shades of grey).

So what, you say? Well, part of the reason it is a problem is that the artifact reinforces unnatural boundaries or groupings in the data, the exact opposite of what one wants with a continuous color ramp! Also the groupings are largely at the will of the computer, and I would think the analyst wants to define the groupings themselves when disseminating the maps (although this brings up another problem of how to define the color breaks). A general principle of how people interpret such maps is that they tend to form homogeneous groupings anyway, so for both exploratory purposes and disseminating maps we should keep this in mind.

This isn’t a problem limited to isopleth maps either. The ColorBrewer online app is explicitly made to demonstrate this phenomenon for choropleth maps visualizing irregular polygons. What happens is that one county that is spatially outlying compared to its neighbors appears more extreme on the color gradient than when it is surrounded by colors with the same hue and saturation. Below is a screen shot of what I am talking about, with some of the examples circled in red. It is easy to see that they are spatially outlying, but harder to map them to the actual color on the ramp (and it gets harder when you have more bins).

Even with these problems I think the default plots in the spatstat program are perfectly fine for exploratory analysis. I think to disseminate the plots though I would prefer discrete bins in many (perhaps most) situations. I’ll defer discussion on how to choose the bins to another time!
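To make "discrete bins" a bit more concrete, here is a small Python sketch (not spatstat or R, and not code from the original post). Quantile breaks are used purely as a placeholder, since how to choose the bins is a separate question; the function name `bin_values` and the parameter `n_bins` are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def bin_values(values, n_bins=5):
    """Assign each value a class from 0 to n_bins - 1 using quantile breaks.

    `values` is a flat list, e.g. the flattened cell values of a density
    surface. Quantile breaks are only one of many ways to pick the bins.
    """
    # statistics.quantiles returns the n_bins - 1 interior cut points.
    breaks = quantiles(values, n=n_bins)
    # bisect_right counts how many breaks are <= the value, which is its class.
    classes = [bisect_right(breaks, v) for v in values]
    return classes, breaks


surface = [0.1 * i for i in range(100)]   # stand-in for a smoothed density
classes, breaks = bin_values(surface, n_bins=5)
```

Each class would then get a single color, so the analyst (rather than the eye's reaction to a continuous gradient) decides where the visible boundaries fall.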

by Andy Wheeler on April 19, 2012

Posted in Data Visualization, Mapping

Tagged choropleth, data visualization, mapping, optical illusion, stackexchange

https://andrewpwheeler.com/2012/04/19/beware-of-mach-bands-in-continuous-color-ramps/

Co-maps and Hot spot plots! Temporal stats and small multiple maps to visualize space-time interaction.

One of the problems with visualizing and interpreting spatial data is that there are characteristics of the geographical data that are hard to display on a static, two dimensional map. Friendly (2007) makes the pertinent distinction between map and non-map based graphics, and so the challenge is to effectively interweave them. One way to try to overcome this is to create graphics intended to supplement the map based data. Below I give two examples pertinent to analyzing point level crime patterns with attached temporal data, co-maps (Brunsdon et al., 2009) and the hot spot plot (Townsley, 2008).

co-maps

The concept of co-maps is an extension of co-plots, a visualization technique for small multiple scatterplots originally introduced by William Cleveland (1994). Co-plots are in essence a series of small multiple scatterplots in which the visualized scatterplot is conditioned on a third (or potentially fourth) variable. What is unique about co-plots, though, is that the categories of the conditioning variable(s) are not mutually exclusive, so the conditions overlap.

The point of co-plots is in general to see if the relationship between two variables has an interaction with a third continuous variable. When the conditioning variable is continuous, we wouldn’t expect the interaction to change dramatically with discrete cut-offs of the continuous variable, so we want to examine the interaction effect at varying levels of the conditioning variable. It is also useful in instances in which the data is sparse, and you don’t want to introduce artifactual relationships by making arbitrary cut-offs for the conditioning variable.
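The overlapping conditions can be sketched in a few lines of Python. This is a simplified illustration, not Cleveland's procedure or R's coplot implementation: it uses equal-width intervals, and the function names `overlapping_intervals` and `condition` and their parameters are hypothetical.

```python
def overlapping_intervals(lo, hi, number=4, overlap=0.5):
    """Equal-width intervals covering [lo, hi].

    Each interval shares the fraction `overlap` of its width with the next
    one (a simplifying assumption; other schemes balance counts instead).
    """
    width = (hi - lo) / (1 + (number - 1) * (1 - overlap))
    step = width * (1 - overlap)
    return [(lo + i * step, lo + i * step + width) for i in range(number)]


def condition(points, intervals):
    """Group (x, y, z) points into one panel per interval of z.

    Because the intervals overlap, a point can appear in several panels.
    """
    return [[p for p in points if a <= p[2] <= b] for a, b in intervals]


intervals = overlapping_intervals(0, 10, number=4, overlap=0.5)
panels = condition([(1.0, 2.0, 3.0), (4.0, 5.0, 9.0)], intervals)
```

Here the intervals are (0, 4), (2, 6), (4, 8) and (6, 10), so the point with z = 3 shows up in the first two panels. Each panel would then get its own scatterplot (or, for co-maps, its own map).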

Besides the Cleveland paper cited (which is publicly available, link in citations at bottom of post), there are some good examples of coplot scatterplots from the R graphical manual.

Brunsdon et al. (2009) extend the concept to analyzing point patterns, when time is the conditioning variable. Also because the geographic data are numerous, they apply kernel density estimation (kde) to visualize the results (instead of a sea of overlapping points). When visualizing geographic data, having too many points is common, and the solutions to visualizing the data are essentially the same as people use for scatterplots (this thread at the stats site gives a few resources and examples concerning that). Below I’ve copied a picture from Brunsdon et al., 2009 to show it applied to crime data.

Although the example is conditional on temperature (instead of time), it should be easy to see how it could be extended to make the same plot conditional on time. Also note the bar graph at the top denotes the temperature range, with the lowest bar corresponding to the graphic that is in the panel on the bottom left.

Also of potential interest, the same authors applied the same visualization technique to reported fires in another publication (Corcoran et al., 2007).

the hot spot plot

Another similarly motivated graphical presentation of the interaction of time and space is the hot-spot plot proposed by Michael Townsley (2008). Below is an example.

So the motivation here is having coincident graphics simultaneously depicting long term temporal trends (in a sparkline like graphic at the top of the plot), spatial hot spots depicted using kde, and a lower bar graphic depicting hourly fluctuations. This allows one to identify spatial hot spots, and then quickly assess their temporal nature. The example from the Townsley article I give is a secondary plot showing zoomed in locations of several analyst chosen hot spots, with the cut out remaining events left as a baseline.

Some food for thought when examining space-time trends with point pattern crime data.

Citations

Brunsdon, Chris, Jonathan Corcoran, Gary Higgs & Andrew Ware. 2009. The influence of weather on local geographical patterns of police calls for service. Environment and Planning B 36(5): 906-926.
Corcoran, Jonathan, Gary Higgs, Chris Brunsdon & Andrew Ware. 2007. The use of comaps to explore the spatial and temporal dynamics of fire incidents: A case study in South Wales, United Kingdom. The Professional Geographer 59(4): 521-536.
Cleveland, William. 1994. Coplots, nonparametric regression, and conditionally parametric fits. IMS Lecture Notes Monograph Series 24: 21-36. PDF available in link from Project Euclid.
Friendly, Michael. 2007. A.-M. Guerry’s moral statistics of France: Challenges for multivariable spatial analysis. Statistical Science 22(3): 368-399. PDF available from publisher.
Townsley, Michael. 2008. Visualizing space time patterns in crime: The hotspot plot. Crime Patterns and Analysis 1(1): 61-74. PDF available from publisher.

Reference lines for star plots aid interpretation

The other day I was reading Nathan Yau’s Visualize This, and in his chapter on visualizing multi-variate relationships, he brought up star plots (also referred to as radar charts by Wikipedia). Below is an example picture taken from a Michael Friendly conference paper in 1991.

Update: The old link and image do not work. Here is a crappy version of the image, and an updated link to a printed version of the paper.

One of the things that came to mind when I was viewing the graph is that a reference line to signify points along the stars would be nice (similar to an anchor figure I mention in the making tables post on the CV blog). Lo and behold, the author of the recently published EffectStars package for R must have been projecting his thoughts into my mind. Here is an example taken from their vignette on the British Election Panel Study.

Although the use case is not exactly what I had in mind (some sort of summary statistics for coefficients in multi-nomial logistic regression models), the idea is still the same. The small multiple radar charts typically lack a scale with which to locate values around the star (see a google image search of star plots to reinforce my assertion). Although I understand data reduction is necessary when plotting a series of small multiples like this, I find it less than useful to lack the ability to identify the actual value along the star in that particular node. Utilizing reference lines (like the median or mean of the distribution, along with the maximum value) should help with this (at least you can compare whether nodes are above/below said reference line). It would be similar to inserting a guideline for the median value in a parallel coordinates plot (but obviously this is not necessary).

Here I’ve attempted to display what I am talking about in an SPSS chart. Code posted here to replicate this and all of the other graphics in this post. If you open the image in a new tab you can see it in its full grandeur (same with all of the other images in this post).

Let’s back up a bit to explain in greater detail what a star plot is. To start out, the coordinate system of the plot is polar (instead of rectangular). Basically the way I think of it is that the X axis in a rectangular coordinate system is replaced by the location around the circumference of a circle, and the Y axis is replaced by the distance from the center of the circle (i.e. the radius). Here is an example, using fake data for time of day events. The chart on the left is a “typical” bar chart, and the chart on the right shows the same bars displayed in polar coordinates.
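The mapping from rectangular to polar coordinates can be written out in a few lines. The charts in this post were made in SPSS; the Python below is only a sketch of the arithmetic, and the function name `to_polar_xy` is hypothetical. It assumes positive values, and it starts the first category at 3 o'clock and goes counter-clockwise (a clock-style chart would instead start at 12 and go clockwise).

```python
import math


def to_polar_xy(values, max_radius=1.0):
    """Place each value around a circle and return plotting (x, y) pairs.

    The position in the list plays the role of the X axis (the angle) and
    the value plays the role of the Y axis (the radius). Assumes the
    values are positive.
    """
    n = len(values)
    top = max(values)
    points = []
    for i, v in enumerate(values):
        theta = 2 * math.pi * i / n   # X axis: location around the circle
        r = max_radius * v / top      # Y axis: distance from the center
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


# Fake counts of events for four 6-hour blocks of the day.
tips = to_polar_xy([2, 4, 4, 1])
```

Connecting the resulting points in order gives the outline of a star; drawing a bar from the center out to each point gives the polar bar chart.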

The star plots I displayed before are essentially built from the same stuff, they just have various aesthetic parts of the graph (referred to as “guides” in SPSS’s graphics language) not included in the graph. When one is making only one graphic, one typically has the guides for the reference coordinate system (as in the above charts). In particular here I’m saying the gridlines for the radius axis are really helpful.

Another thing that should be mentioned is that when comparing multi-variate data one typically needs to normalize the locations along each node in the chart for the comparison to make sense. An example might be if one node around the star represents a baseball player’s batting average, and another represents their number of home runs. You can’t put them on the same scale (which is the radius in a polar coordinate system), as their values are so disparate. All of the home runs would be much closer to the circumference of the circle, and the batting averages would all be clustered towards the center.

The image below uses the same US average crime rate data from Nathan Yau’s book (available here) to demonstrate this. The frequencies of some of the more serious crimes, such as homicide, are much smaller than those of less serious crimes such as assault and burglary. Mapping all of these types of crimes to the same radius in the chart does not make sense. Here I just use points to demonstrate the distributions, and a jittered dot plot is on the right to demonstrate the same problem (but more clearly).

So to make the different categories of crimes comparable, one needs to transform the distributions to be on similar scales. What is typically done in parallel coordinate plots is to rescale the distribution for each variable to between 0 and 1 (a simple example would be new_x = (x - x_min)/(x_max - x_min), where new_x is the new value, x is the old value, x_min is the minimum of all the x values, and x_max is the maximum of all the x values).¹ But depending on the data you could use other transformations (for example, if all the variables could be re-expressed as proportions of something). Here I will rank the data.

¹ This re-scaling procedure will not work out well if you have an outlier. There is probably no universally good way to do the rescaling for comparisons like these, and best practices will vary depending on context.
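Both rescalings are easy to write down. Below is a small Python sketch (the graphics in this post were made in SPSS, so this is only an illustration); the function names `min_max` and `ranks` are hypothetical, and giving tied values their average rank is an assumption of this example rather than something stated in the post.

```python
def min_max(xs):
    """Rescale to [0, 1] via (x - x_min) / (x_max - x_min)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def ranks(xs):
    """Rank from 1 (smallest) to n (largest); ties share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average_rank
        i = j + 1
    return out


rates = [2, 4, 6, 102]       # made-up values with one outlier
scaled = min_max(rates)      # the three small values are squeezed near 0
ranked = ranks(rates)        # ranks spread the values out evenly
```

The made-up `rates` show the outlier problem from the footnote: after min-max rescaling the three small values sit at 0.04 or below while the outlier sits at 1, whereas ranking spaces them evenly. With 51 units (50 states plus D.C.) and no ties, the middle rank is (51 + 1) / 2 = 26.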

So here the reference guide is not as useful (since the data is rescaled it is not as readily intuitive as the original rates). But, we could still include reference guides for say the maximum value (which would amount to a circle around the star plot) or some other value (like the median of any node) or a value along the rescaled distribution (like the mid-point – which won’t be the same as the original median). If you use something like the median in the original distribution it won’t be a perfect circle around the star.

Here the background reference line in the plot on the left is the middle rank (26 out of 50 states plus D.C.). The reference guide in the plot on the right is the ranking if the US average were ranked as well (so all the points more towards the center of the circle are below the US average).

Long story short, all I’m suggesting is that if you’re in a situation in which the reference guides are best omitted, an unobtrusive reference guide can help. Below is an example for the 50 states (plus Washington, D.C.), and the circular reference guide marks the 26th rank in the distribution. The plot I posted at the beginning of the blog post is just this spruced up a little bit, plus a visual legend with annotations.

Part of the reason I am interested in such displays is that they are useful in visualizing multi-variate geographic data. The star plots (unlike bar graphs or line graphs) are self contained, and don’t need a common scale (i.e. they don’t need to be placed in a regular fashion on the map to still be interpretable). Examples of this can be found in this map made by Charles Minard utilizing pie charts, Dan Carr’s small glyphs (page 7), or in a paper by Michael Friendly revisiting the moral statistics produced by old school criminologist Andre Guerry. An example from the Friendly paper is presented below (and I had already posted it as an example for visualizing multi-variate data on the GIS stackexchange site).

An example of how it is difficult to visualize lines without a common scale is given in this working paper of Hadley Wickham’s (and Cleveland talks about it and gives an example of bar charts in The Elements). Cleveland’s solution is to provide the bar a container which provides an absolute reference for the length of that particular bar, although it is still really hard to assess spatial patterns that way (the same could probably be said of the star plots too though).

Given models with many spatially varying parameters I think this has potential to be applied in a wider variety of situations. Instances that first come to mind are spatial discrete choice models, but perhaps it could be extended to situations such as geographically weighted regression (see a paper, Visual comparison of Moving Window Kriging Models by Demsar & Harris, 2010 for an example) or models which have spatial interactions (e.g. multi-level models where the hierarchy is some type of spatial unit).

Don’t take this as I’m saying that star charts are a panacea or anything, visualizing geographic patterns is difficult with these as well. Baby steps though, and reference lines are good.

I know the newest version of SPSS has the ability to place some charts, like pie charts, on a map (see this white paper), but I will have to see if it is possible to use polar coordinates like this. Since a US state map is part of the base installation for the new version 20, if it is possible someone could use the data I presented here fairly easily, I would think.

Also as a note, when making these star plots I found this post on the Nabble SPSS forum to be very helpful, especially the examples given by ViAnn Beadle and Mariusz Trejtowicz.

Example (good and bad) uses of 3d choropleth maps

A frequent critique of choropleth maps is that, in the process of choosing color bins, one can hide substantial variation within each of the bins. An example of this is in this critique of a map in the Bad maps thread on the GIS stackexchange site. In particular, Laurent argues that the classification scheme (in that example map) is misleading because China’s population (1.3 billion) and Indonesia’s population (0.2 billion) are within the same color bin although the difference between them is noteworthy.

I think it is a reasonable note, and such a difference would be noteworthy in a number of contexts. One possible solution to this problem is to utilize 3d choropleth maps, where the height of the bar maps to a quantitative value. An example use of this can be found at Alasdair Rae’s blog, Daytime Population in the United States.

The use of 3d allows one to see the dramatic difference in daytime population estimates between the cities (mainly on the east coast), whereas a 2d map relying on a legend can’t really demonstrate the dramatic magnitude of differences between legend items like that.

I’m not saying a 3d map like this is always the best way to go. Frequent critiques are that the bars will hide/obstruct data. Also it is very difficult to really evaluate where the bars lie on the height dimension. For an example of what I am talking about, see the screen shot used for this demonstration, A Historical Snapshot of US Birth Trends, from ge.com (taken from the infosthetics blog).

If you took the colors away, would you be able to tell that Virginia is below average?

Still, I think used sparingly and to demonstrate dramatic differences they can be used effectively. I give a few more examples and/or reading to those interested below.

References

Ratti, Carlo, Stanislav Sobolevsky, Francesco Calabrese, Clio Andris, Jonathan Reades, Mauro Martino, Rob Claxton & Steven H. Strogatz. (2010) Redrawing the map of Great Britain from a Network of Human Interactions. PLoS ONE 5(12). Article is open access from link.

This paper is an example of using 3d arcs for visualization.

Stewart, James & Patrick J. Kennelly. 2010. Illuminated choropleth maps. Annals of the Association of American Geographers 100(3): 513-534.

Here is a public PDF by one of the same authors demonstrating the concept. This paper gives an example of using 3d choropleth maps, and in particular is a useful way to utilize a 3d shadow effect that slightly enhances distinguishing differences between two adjacent polygons. This technique doesn’t really map height to a continuous variable though; it just uses shading to distinguish between adjacent polygons.

Another example use of small multiples, many different point elements on a map

I recently had a post at the Cross Validated blog about small multiple graphs, AndyW says Small Multiples are the Most Underused Data Visualization. In that post I give an example (taken from Carr and Pickle, 2009) where visualizing multiple lines on one graph is very difficult. A potential solution to the complexity is to split the line graph into a set of small multiples.

In this example, Carr and Pickle explain that the reason the graphic is difficult to comprehend is that we are not only viewing 6 lines individually, but that when viewing the line graphs we are trying to make a series of comparisons between the lines. This suggests that in the graph on the left there are potentially 30 pairwise comparisons between lines, whereas in the small multiple graphics each panel has only 6 potential pairwise comparisons.

Another recent example I came across in my work, where I believe small multiples were more effective, was plotting multiple point elements on the same map. The two examples are below.

In the initial map it is very difficult to separate out each individual point pattern from the others, and it is even difficult to tell the prevalence of each point pattern in the map including all elements. The small multiple plots allow you to visualize each individual pattern, and then, after evaluating each pattern on its own, make comparisons between patterns.

Of course there are some drawbacks to the use of small multiple charts. Making comparisons between panels is surely more difficult than making comparisons within panels. But I think that trade-off is worth it in the examples I gave here.

I’m just starting to read the book How Maps Work, by Alan MacEachren, and in the second chapter he gives a similar example of a many-element point pattern map compared to small multiples. In that chapter he also goes into a much more detailed description of the potential cognitive processes that are at play when we view such graphics (e.g. why the small multiple maps are easier to interpret), such as how the locations of objects in a Cartesian coordinate system take precedence in how we categorize objects (as opposed to, say, color or shape). I highly suggest you read it, though, as opposed to taking my word for it!

References

Carr, Daniel & Linda Pickle. 2009. Visualizing Data Patterns with Micromaps. Boca Raton, FL. CRC Press.

MacEachren, Alan. 2004. How maps work: Representation, visualization, and design. New York, NY. Guilford Press.

by Andy Wheeler on January 7, 2012

Posted in Data Visualization, Mapping

Tagged data visualization, mapping, small-multiples

https://andrewpwheeler.com/2012/01/07/another-example-use-of-small-multiples-many-different-point-elements-on-a-map/
