Choropleth

A map chart is a great way to visualize spatial relationships in data by plotting the data on a geographical map. People are very good at reading maps, which allows a map chart to communicate a great deal of information effectively.

In this blog post I’d like to focus on a particular type of map chart called a choropleth. In this type of map chart we color areas on a map to indicate a value or a category of data for each area.
In the following example, we show US unemployment data at the state level. The chart below is interactive and I encourage you to play with the settings. We explain these settings later in this article.

Data

The data shown in a choropleth can be categories or numerical values. The unemployment rates in our example are numerical values. For an example of categories, we could show the result of an election, with one color for each political party.

When showing categories, the colors are assigned from a palette of distinctive colors, similar to the colors used in a pie chart. When showing numerical values, we use color schemes whose shades reflect the magnitude of the value.
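As a rough illustration of the categorical case, here is a minimal Python sketch that gives each distinct category its own palette color. The palette hex values, the function name `assign_category_colors`, and the election data are all hypothetical; this is not the code behind the chart in this post.

```python
from itertools import cycle

# Hypothetical palette of visually distinct colors (hex values are illustrative).
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]

def assign_category_colors(categories, palette=PALETTE):
    """Give each distinct category a palette color, in first-seen order.

    Colors repeat if there are more categories than palette entries.
    """
    colors = {}
    palette_cycle = cycle(palette)
    for category in categories:
        if category not in colors:
            colors[category] = next(palette_cycle)
    return colors

# Hypothetical election winners per area.
winners = {"A": "Party X", "B": "Party Y", "C": "Party X", "D": "Party Z"}
party_colors = assign_category_colors(winners.values())
state_colors = {area: party_colors[party] for area, party in winners.items()}
```

Areas won by the same party end up with the same color, while different parties get different colors as long as the palette is large enough.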

Let’s consider two types of color schemes. The first, sequential, is designed for data whose values progress from low to high. The second, diverging, puts equal emphasis on a critical mid-range value and on the extremes at both ends of the data range. An example of diverging data is acidity: a pH of 7 is neutral, and lower and higher pH values diverge toward acidity and alkalinity, respectively.
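To make the difference concrete, here is a small Python sketch of how a value could be positioned on each kind of scheme. The function names and the 0 to 14 pH range are assumptions made for illustration. A sequential scheme only needs the two ends of the range, while a diverging scheme also needs the midpoint.

```python
def sequential_position(value, low, high):
    """Map value to [0, 1]: 0 at low, 1 at high (clamped to the range)."""
    pos = (value - low) / (high - low)
    return max(0.0, min(1.0, pos))

def diverging_position(value, low, mid, high):
    """Map value to [-1, 1]: -1 at low, 0 at mid, +1 at high (clamped)."""
    if value <= mid:
        pos = (value - mid) / (mid - low)
    else:
        pos = (value - mid) / (high - mid)
    return max(-1.0, min(1.0, pos))

# pH example: 7 is neutral; lower values are acidic, higher values are alkaline.
neutral = diverging_position(7, low=0, mid=7, high=14)
acidic = diverging_position(3.5, low=0, mid=7, high=14)
alkaline = diverging_position(14, low=0, mid=7, high=14)
```

A negative position would select a shade from one arm of the diverging scheme, a positive position from the other arm, and a position near zero would select the neutral middle color.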

In our example we used a sequential scheme that reflects progression from low unemployment in states such as North Dakota to high unemployment in states such as Nevada.

Levels

The range of numerical values is divided into a number of levels, and each level is mapped to a color. You can set the number of levels anywhere from 3 to 9 to control the granularity of this mapping. In our example, we used just 5 levels.
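The following Python sketch shows one way to produce one color per level. It is a simplification: it interpolates linearly between two assumed RGB endpoints, whereas published schemes are usually hand-picked. The function name `color_ramp` and the endpoint colors are hypothetical.

```python
def color_ramp(n_levels, start=(239, 243, 255), end=(8, 81, 156)):
    """Return n_levels hex colors blended evenly from start to end (RGB)."""
    if not 3 <= n_levels <= 9:
        raise ValueError("number of levels must be between 3 and 9")
    ramp = []
    for i in range(n_levels):
        t = i / (n_levels - 1)  # 0.0 for the first level, 1.0 for the last
        rgb = tuple(round(s + (e - s) * t) for s, e in zip(start, end))
        ramp.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return ramp

# Five levels, lightest to darkest; level k is drawn with colors[k].
colors = color_ramp(5)
```

More levels give finer distinctions between areas, but neighboring shades become harder to tell apart, which is one reason to keep the count small.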

Scale

There are two options for mapping numerical values to colors:

  • In a linear scale, the range of data values is divided into equal-width intervals, one per level.
  • In a quantile scale, the range of data values is divided so that each level contains roughly the same number of data points.

In our example, we use the quantile scale. Since we selected 5 levels, each level is a quintile. The first quintile contains the 20% of states with the lowest unemployment, the next quintile contains the next 20%, and so on.
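The two scales can be compared with a short Python sketch. The function names and the sample data are made up for illustration, and the quantile rule used here is one simple convention among several. Each function returns the break points between levels, and `level_of` looks up the level index for a value.

```python
from bisect import bisect_right

def linear_breaks(values, n_levels):
    """Break points that split [min, max] into n_levels equal-width intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_levels
    return [lo + step * i for i in range(1, n_levels)]

def quantile_breaks(values, n_levels):
    """Break points that put roughly the same number of values in each level."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_levels] for i in range(1, n_levels)]

def level_of(value, breaks):
    """Index of the level (0-based) that a value falls into."""
    return bisect_right(breaks, value)

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
linear_levels = [level_of(v, linear_breaks(data, 5)) for v in data]
quantile_levels = [level_of(v, quantile_breaks(data, 5)) for v in data]
```

With this data, the linear scale puts nine of the ten values in the lowest level because the single outlier stretches the range, while the quantile scale places two values in each of the five levels. This is why a quantile scale often yields a more evenly colored map when the data is skewed.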

Color Schemes

Depending on whether the data is sequential or diverging, we have a choice of several appropriate color schemes. The scheme you see in our example is based on a design by Cynthia Brewer and Mark Harrower at The Pennsylvania State University.