Secrets from the Data Cave, October 2013

Posted on October 29, 2013 | in Uncategorized | by CRC

Introducing... Secrets from the Data Cave

_by Sarah McCruden_

Welcome to CRC’s new monthly series of articles on all things techie: Secrets from the Data Cave! For those who don’t know, the title references the room in our office where the data staff can geek out in an isolated setting that is fondly referred to as the bat cave. We will be offering you a sneak peek into this fascinating environment every month with the latest updates and tips on what we’re implementing here in the CRC data cave!

October 2013: Beware the Pumpkin Pie

This month’s secret comes from Stephen Few’s Show Me the Numbers: Designing Tables and Graphs to Enlighten. This volume, basically a textbook, covers more info on data visualization than one could hope to fit into a blog, and is worth reading in its entirety if you get the chance. It is Few’s breakdown of how the human brain processes raw sensory data from visual stimuli, and why that matters when you’re deciding between a pie chart and a bar graph, that has really changed the way I look at data (pun intended)! So, I wanted to share this with our readers in this first-ever Secrets from the Data Cave. In the “spirit” of the season, I’ll be using data from a 2013 consumer survey (National Retail Federation [1]) on Americans’ chosen Halloween costumes. Per this survey, the top Halloween costume choices for children (as represented by surveyed parents who had already decided on a costume) were:

[Table: top Halloween costume choices for children, 2013 National Retail Federation survey]

So how would you represent these findings in chart or graph form? Often, the go-to choice for data like this is a pie chart. Here’s what that might look like:

[Figure: pie chart of the children’s costume choices]

The pie chart, a familiar graphic for anyone in research, is used to “encode values in an assortment of 2D shapes that represent values in proportion to their area” [2]; that is, to communicate the size of the piece relative to the whole. Yet this has a fatal flaw: we’re not very good at estimating the comparative sizes of 2D areas like the slices of the pie above. Without labels to tell us the numeric and/or percent values for each category, the pie chart is going to fall short in communicating numbers that are close in value (e.g., the number of Pumpkins vs. Vampires vs. Ninjas).
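To put rough numbers on this, here is a minimal Python sketch (an illustration added here, not something taken from Few’s book). For a pie of fixed radius, each slice’s area and central angle are both proportional to its value, so the sketch simply converts counts into slice angles. The counts and the `slice_angles` helper are hypothetical placeholders, not the survey results.

```python
# Placeholder counts for illustration only; these are NOT the NRF survey figures.
example_counts = {"Pumpkin": 30, "Vampire": 28, "Ninja": 27, "Other": 115}


def slice_angles(counts):
    """Return the central angle (in degrees) of each pie slice."""
    total = sum(counts.values())
    # Each slice gets a share of the full 360 degrees equal to its share of the total.
    return {label: 360 * value / total for label, value in counts.items()}


angles = slice_angles(example_counts)
for label, angle in angles.items():
    print(f"{label:<8} {angle:6.1f} degrees")
```

With these made-up counts, the Pumpkin, Vampire, and Ninja slices end up less than four degrees apart from their neighbors, which is exactly the kind of gap the eye struggles to judge without labels.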

See for yourself: Can you rank the pieces from largest to smallest, without cheating and looking at the table above? And even if you can get the rank correct, can you tell how much bigger/smaller each piece is than the one before? This is tough, to say the least. Yet if we look at the same data in bar graph form, the rank and approximate values are much more easily discerned:

[Figure: bar graph of the same costume choices]
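The same idea can be sketched in a few lines of Python: a plain-text bar chart in which each bar’s length is proportional to its value and the bars are sorted from largest to smallest. The counts are made-up placeholders (not the NRF figures), and `text_bars` and `max_width` are names invented for this sketch.

```python
# Placeholder counts for illustration only; these are NOT the NRF survey figures.
example_counts = {"Vampire": 28, "Pumpkin": 30, "Ninja": 27}


def text_bars(counts, max_width=40):
    """Return one line per category, largest first, with bar length proportional to value."""
    largest = max(counts.values())
    label_width = max(len(label) for label in counts)
    lines = []
    # Sort by value, descending, so rank can be read straight down the chart.
    for label, value in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        bar = "#" * round(max_width * value / largest)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines


chart = text_bars(example_counts)
print("\n".join(chart))
```

Because every bar starts from the same baseline, the rank can be read straight down the left edge and the differences show up as differences in length, even when the values are close.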

This difference in perception lies in preattentive processing, which “occurs below the level of consciousness at an extremely high speed and is tuned to detect a specific set of visual attributes.” [2] Preattentive attributes draw our attention to certain aspects of any visual stimulus: the form (length, width, orientation, shape, size, and enclosure); the color (hue and intensity); and the spatial position [3]. We can exploit these preattentive attributes when designing charts and graphs to focus the viewer’s attention on the most important aspects of the data (in this case, the length of the bars in the bar graph, which represents the numeric value of each costume).

While the pie chart does use space to encode information, it relies on the 2D size of its slices, as opposed to their 2D position (an example of data encoded with 2D position would be a scatter plot, where the location of the dots relative to one another represents the data). And when it comes to 2D size, even though we can tell that one value is greater than another when we use width, size, or color intensity, it is difficult to perceive by how much or to assign a value to the area [2]. This doesn’t just apply to pie charts (which one could argue are harder to read because they don’t have a labeled horizontal axis): any comparison of area will be more challenging than the comparisons of length one would see in a bar graph.

So what should we use to encode our data? According to Few, we get the maximum perceptual benefits when we rely on length and 2D position of objects for quantitative data by using:

  • Points,
  • Lines,
  • Bars, and
  • Boxes

Examples of these would be our bar graph of costume choices above, as well as the examples from Few’s book below:

[Figure: example graphs from Few’s book]

All of the above choices would make it much easier to see the differences in how many child Pumpkins and child Ninjas we can expect to see trick-or-treating this year.

So the moral of the story is: Beware the Pumpkin pie! Pumpkin bars would be a much better choice.

Happy Halloween, everyone!

Sources:

  1. National Retail Federation (Oct 7, 2013). Traditional Costumes Favorites For Adults, Children This Halloween, According To NRF [data file]. Retrieved from https://www.nrf.com/modules.php?name=Documents&op=viewlive&sp_id=7682
  2. Few, Stephen (2012). Show Me The Numbers: Designing Tables and Graphs To Enlighten, Second Edition. Burlingame, CA: Analytics Press.
  3. Ware, Colin (2004). Information Visualization: Perception for Design, Second Edition. San Francisco, CA: Morgan Kaufmann.