Monday, 5 August 2019

Pie charts ain't such a bad guy!

About 3 years ago I did some research using Amazon Mechanical Turk into how well people judge segments in a part-to-whole chart. I mentioned it in a blog post back then, but didn't go into much detail.

There had been some excellent work done by Robert Kosara (from Tableau Research) along with Drew Skau into pie charts, doughnut charts and the differences between them. Robert has continued this with a number of papers since.

At the heart of my research was this nagging thought that pervaded the data visualization ecosystem - pie charts are bad, never use pie charts. This was kinda fine, because I could do other stuff - especially segmented bar charts or tree maps - but I always had business users asking for pie charts and not really getting me when I tried to explain that they weren't the best way.

In 2015 I had started studying for a Master's degree in Data Analytics, so I was getting back into the academic way of thinking and looking at stuff. When I had a break from studying during the summer of 2016 I started to look around and found that there was no real basis for anyone rejecting a pie chart for part-to-whole comparison, other than that they didn't really like them! When it came to actually testing pie charts against other types of charts, the pies seemed to do as well as or better than the alternatives.

There was some suggestion in a number of papers that pie charts actually do better as they have a number of natural visual cues - at 0%, 25%, 50% and 75% - whereas the bar chart has definitive visual cues at 0% and 100% and a less well defined cue at 50%.

So, being the curious person that I am, I decided to test things for myself. I put a little money into it and spun up an Amazon Mechanical Turk account. I created a number of images (using QlikView of course!), and had the "workers" judge the size of a segment in a chart. I used a set of "baseline" pie and bar charts (just standard pie and bar charts) and then a set of bar charts that had additional visual cues added.

The chart above shows the comparison of mean absolute error recorded by participants (basically, how far off the mark were they with their estimations) and the 95% confidence intervals of those results.
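
For anyone who wants to see what that measure looks like in practice, here is a minimal sketch in Python. It is not the analysis code from my study: the guesses are made-up numbers, the function name is just for illustration, and the interval uses a simple normal approximation (mean plus or minus 1.96 standard errors), which is only one of several ways to build a 95% confidence interval.

```python
import math
import statistics


def mean_absolute_error_ci(true_values, estimates, z=1.96):
    """Return (mae, low, high) for a set of estimates.

    Assumption: a normal-approximation interval (mean +/- z * standard error),
    with z=1.96 for roughly 95% coverage. Other interval methods exist.
    """
    # Absolute error of each guess, in percentage points
    errors = [abs(est - true) for true, est in zip(true_values, estimates)]
    mae = statistics.mean(errors)
    # Standard error of the mean, using the sample standard deviation
    sem = statistics.stdev(errors) / math.sqrt(len(errors))
    return mae, mae - z * sem, mae + z * sem


# Hypothetical data: the segment is really 30%, and five workers guessed these values
true_pct = [30, 30, 30, 30, 30]
guesses = [25, 35, 30, 40, 28]

mae, low, high = mean_absolute_error_ci(true_pct, guesses)
print(f"MAE = {mae:.2f} points, 95% CI = [{low:.2f}, {high:.2f}]")
```

A lower mean absolute error means the chart type let people estimate the segment more accurately, and two chart types whose intervals don't overlap are likely to be genuinely different.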

The baseline pie chart performed better than the baseline bar, even considering the confidence intervals. This was not a surprise as it confirmed the results of an experiment from 1915!
The bar chart with a numerical scale as a visual cue performs best of all - and this aligns with what Stephen Few says in his famous paper, Save the Pies for Dessert.

Of course, it is not always practical to have a numerical scale on a bar chart, and I have shown that adding a perceptible visual cue at the decile positions (every 10%) performs almost as well as the scale. It also performs much better than both the baseline bar and a bar with visual cues at the quartile positions (every 25%).

Interestingly, adding a visual cue at the quartile positions for the pie chart did not improve its performance significantly over the baseline. That may indicate that we do indeed pick up on those quartile cues automatically. More research needed here.

The upshot here is to not feel bad about using a pie chart for part-to-whole comparison. No need to feel embarrassed at the next visualization meetup or to share it on an online forum! Be bold!!!
The reality comes back to my Fundamental Rules of Visualization (or "Redmond's Rules") which are, in summary:

- Use the right visual variables
- Provide context with annotations
- SFW - make sure that the results are relevant to the viewer

I'm not the only one who is leaning in this direction and, as I blogged about previously, visualization can be that simple.

I finally got round to writing up an academic paper on my research and the good news (for me!) is that it has been accepted into the Short Papers section of IEEE VIS 2019 in Vancouver. If you are interested, a pre-print is available on arXiv.

Stephen Redmond is a practicing Data Professional with over 20 years' experience. He is the author of Mastering QlikView, QlikView Server and Publisher, and the QlikView for Developer's Cookbook.