Wednesday, 16 October 2013

Low-cardinality part-to-whole comparison

I came up with this term last week, during a presentation by Bill Lay at the Masters Summit for QlikView in London.

For me, it perfectly describes the correct use of a pie chart.  Let's break it down:

In database parlance, low cardinality means that a column has few unique values.  For example, a flag field - either 1 or 0 - has only 2 unique values.  The opposite, high cardinality, might be an Account Number field - lots of unique values.
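
To make the idea concrete, here is a minimal sketch in Python that counts the distinct values in a column.  The `cardinality` helper and the sample data are invented for illustration; they are not from QlikView or any particular database.

```python
def cardinality(values):
    """Return the number of distinct values in a column."""
    return len(set(values))


# Hypothetical sample columns, invented for illustration.
flag = [1, 0, 0, 1, 1, 0]
account_number = ["A1001", "A1002", "A1003", "A1004", "A1005", "A1006"]

flag_cardinality = cardinality(flag)               # low: only 1 and 0 occur
account_cardinality = cardinality(account_number)  # high: every row is different

print(flag_cardinality, account_cardinality)
```

The flag column stays at 2 distinct values no matter how many rows you add, while the account number column grows with the data.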

In a pie chart, for me, this means that you need to have a low number of segments.  Really only two or three.  Too many segments make it hard to discern the differences between them.

The part-to-whole comparison is critical for the correct use of a pie chart.  A pie chart is all about ratio - while you are comparing a segment against other segments, the correct context is the size of each segment versus the whole.  What is the ratio of one segment to the others, and to the whole?

As an example, suppose we are looking at sales by country but show only 3 countries - say Germany, USA and France.  In a pie chart we might see that Germany has about half of the sales.  But the context is just the sales of those 3 countries, and you can come to the incorrect conclusion that Germany is responsible for half of all sales.  If we show the sales for all countries - probably restricted to Germany and Others, or Germany, USA, France and Others - then we see the whole picture.
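
The arithmetic behind that example can be sketched in a few lines of Python.  The sales figures, the extra countries and the `part_to_whole` helper are all made up for illustration; the point is only to show how the same Germany figure gives two very different ratios depending on what you treat as the whole.

```python
def part_to_whole(sales, keep):
    """Return each kept segment's share of total sales, plus an 'Others' share."""
    total = sum(sales.values())
    shares = {name: sales[name] / total for name in keep}
    # Everything not shown individually is rolled up into one segment.
    others = sum(value for name, value in sales.items() if name not in keep)
    if others:
        shares["Others"] = others / total
    return shares


# Hypothetical sales figures, invented for illustration.
sales = {"Germany": 500, "USA": 300, "France": 200, "UK": 600, "Japan": 400}
shown = ["Germany", "USA", "France"]

# Misleading context: Germany against only the 3 countries shown.
subset_share = sales["Germany"] / sum(sales[name] for name in shown)

# Correct context: Germany against all sales, with the rest as Others.
shares = part_to_whole(sales, shown)

print(subset_share, shares["Germany"], shares["Others"])
```

With these invented numbers, Germany is half of the three-country subset but only a quarter of all sales, and the Others segment is what makes that visible.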

If we just want to compare the size of one country's sales versus other countries, then a bar chart is ideal.  The bar chart also works if you can't see all of the countries.  If you only present a subset of the countries in a pie chart, the context is incorrect.

So, there you go - low-cardinality part-to-whole comparison.  Feel free to use it in your next presentation.

