Sunday, 21 February 2016

Fundamental rules of data visualization

We read many "rules" of data visualization in many publications. Some contradict others and some just don't make any sense. Some are accompanied by extensive amounts of proofiness, but what is often missing is an appreciation of the fundamentals. I can use algebra to prove to you that 1+1=1, using what look like perfectly legitimate algebraic transformations, but the proof is invalid because it breaks a fundamental rule (for those who are interested, I have added it at the end of the post).

I like to preach three fundamental rules of data visualization to those who will listen:

1.  Data visualization is all about ratios
This is so fundamental that it seems almost ridiculous to have to mention it, but we need to mention it. Any visualization that seeks to juxtapose several values for interpretation must do so using some kind of visual ratio.

There are many kinds of visual ratios and some are more effective than others. Cleveland and McGill (1984) ranked them by how accurately people interpret them, from most to least accurate:

• Position on a common scale
• Position on non-aligned scales
• Length
• Direction
• Angle
• Area
• Volume
• Curvature
• Color saturation

Trying to create a data visualization that is not based on some kind of visual ratio is a fundamentally flawed approach. Not every ratio is appropriate for every visualization either, so we need to learn what works where.
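To make the idea of a visual ratio concrete, here is a minimal sketch in Python that draws plain-text bars whose lengths are in the same ratio as the values they represent. The function names (bar_lengths, render_bars) and the sales figures are my own inventions for illustration, not part of any tool or library.

```python
def bar_lengths(values, max_width=40):
    """Scale each value so that bar-length ratios equal value ratios."""
    largest = max(values.values())
    return {label: round(value / largest * max_width)
            for label, value in values.items()}


def render_bars(values, max_width=40):
    """Draw one text bar per value, all starting from a common baseline."""
    lengths = bar_lengths(values, max_width)
    label_width = max(len(label) for label in values)
    return "\n".join(
        f"{label.ljust(label_width)} | {'#' * lengths[label]}"
        for label in values
    )


# Hypothetical sales figures, invented purely for illustration.
sales = {"North": 200, "South": 100, "East": 50}
chart = render_bars(sales)
print(chart)
```

Because every bar starts at the same baseline, the reader compares position on a common scale, the first item in the list above: North is visibly twice South, and South twice East.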

2.  Data visualization is all about context
We can create the most wonderfully beautiful bar charts and present them on a large screen in Times Square or print them on the most opulent paper in the most vivid colors, but without context they are just rectangles.
Context devices include such simple elements as titles and axes - enough annotation to allow the reader to understand exactly what they are looking at.
As Amanda Cox, Graphics Editor at the New York Times, said in her Eyeo Festival talk:

The annotation layer is the most important thing we do... otherwise it's a case of here it is, you go figure it out.
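As a rough illustration of what an annotation layer adds, the sketch below renders plain-text bars together with a title, a note on what the bar length measures, and a value label on each bar. The function render_annotated, the title text and the figures are all hypothetical, chosen only to show the idea.

```python
def render_annotated(values, title, unit, max_width=40):
    """Render text bars with a title, a unit note and a value on each bar."""
    largest = max(values.values())
    label_width = max(len(label) for label in values)
    lines = [title, f"(bar length is proportional to {unit})"]
    for label, value in values.items():
        bar = "#" * round(value / largest * max_width)
        # The value label at the end of the bar is part of the context.
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical figures and wording, invented purely for illustration.
sales = {"North": 200, "South": 100, "East": 50}
annotated = render_annotated(sales, "Sales by region", "sales in USD thousands")
print(annotated)
```

Strip out the first two lines and the trailing numbers and the bars are still in the right ratio, but the reader no longer knows what they are a ratio of.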

3.  Data visualization is about SFW
This is the most important thing from a business point of view - and good data visualization is about creating a good solution for the business. SFW stands for So What.
I will always remember the day when I had spent hours on a great dashboard to present to a board-level executive at one of our most important clients. It was technically awesome! Really pushing the boundaries of what the tool could do.
I proudly showed it off at the executive presentation. My client sat patiently through it until, finally, he looked me straight in the eye and said:

So f***ing what?

He was right of course. My technically advanced dashboard had a huge fundamental flaw - I had failed to connect it correctly to the business problem. It wasn't a good solution at all - except in my head.
Fundamentally, we need to make sure that our data visualizations connect with the audience that they are intended for. The first two rules give us the correct technical result, the last gives us the brilliant business solution.

We can create some great business solutions by following these three rules. They may not look great, they may have garish colors, but if the CEO is able to use them to track his business then that is a very good dashboard.

To achieve glory among your peers, you need to start going beyond the fundamentals. Learn what works and what doesn't in most situations. Know when you should use a pie chart and when you shouldn't. Learn how to lay things out. Learn the best colors to use. This does lead to a fourth rule that could be considered fundamental:

4.  Get out of the way and show the numbers
We don't talk about all the color and layout stuff for the good of our health. There are good reasons for doing things in the ways that you will read about in the books. Learn the reasons for good consistent layout, easy-on-the-eye colors and clean presentation.
Above all, learn that if we don't follow the fundamentals, we risk obscuring the data, and that is a flaw we must correct.
Get out of the way and show the numbers.

For those that are interested, 1 + 1 = 1:

a = b = 1

a = b

a^2 = ab

a^2 - b^2 = ab - b^2

(a + b)(a - b) = b(a - b)

a + b = b

1 + 1 = 1
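For anyone who wants to find the broken rule, one way is to check each line numerically. The short Python sketch below does that with a = b = 1: every line holds until the step that cancels (a - b) from both sides, which is a division by zero. The variable names divisor and step_is_valid are my own, used only for illustration.

```python
a = b = 1

# Every line of the "proof" up to the factorized form is a true statement.
assert a == b
assert a**2 == a * b
assert a**2 - b**2 == a * b - b**2
assert (a + b) * (a - b) == b * (a - b)  # both sides equal 0

# The next step cancels (a - b) from both sides, i.e. divides by it.
divisor = a - b
try:
    (a + b) * divisor / divisor
    step_is_valid = True
except ZeroDivisionError:
    # Dividing by zero is undefined, so the cancellation is not allowed.
    step_is_valid = False

print(divisor)
print(step_is_valid)
```

Since a = b, the quantity a - b is zero, so the line a + b = b does not follow from the line before it.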

Stephen Redmond is a Data Visualization professional. He is the author of Mastering QlikView, QlikView Server and Publisher, and the QlikView for Developer's Cookbook.