Wednesday, 10 June 2015

Are you answering the right question?

I bring before you the story of the statistician Abraham Wald.

During World War II, Wald was part of a team looking at the problem of bomber losses and considering how the planes should be reinforced to better protect them. The proposal was to look at the frequency of damage sustained by returning bombers and to use that information to recommend where the planes should be reinforced.

Wald's brilliant insight was to turn the problem on its head. He suggested that the places where the returning planes were most frequently damaged were the places where a plane could sustain damage and, mostly, still return to base. The real question was where the planes that didn't return had been hit, because that was the damage that stopped them from coming home!
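To make that reasoning concrete, here is a minimal Python sketch. The aircraft sections, hit counts and survival rates are invented purely for illustration; they are not Wald's data.

```python
# Hypothetical illustration of survivorship bias; all numbers are made up.
# Assume every section of the aircraft is hit equally often (100 hits each),
# but a hit in some sections is far more likely to bring the plane down.
HITS_PER_SECTION = 100
survival_rate = {  # assumed chance that a plane returns after a hit there
    "engine": 0.40,
    "cockpit": 0.50,
    "fuselage": 0.90,
    "wings": 0.85,
}

# What the analysts can observe: damage on planes that made it home.
observed = {s: round(HITS_PER_SECTION * r) for s, r in survival_rate.items()}
# What they cannot observe: damage on the planes that were lost.
lost = {s: HITS_PER_SECTION - n for s, n in observed.items()}

most_damage_seen = max(observed, key=observed.get)
most_planes_lost = max(lost, key=lost.get)

print("Most damage seen on returning planes:", most_damage_seen)
print("Section whose hits cost the most planes:", most_planes_lost)
```

With these assumed numbers, the section that shows the most damage on returning planes is the one that least needs reinforcing, and the section that rarely shows damage is the one costing the most aircraft.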

It seems so simple when you think about it, but sometimes we are so sure that we are looking at the problem the right way that a new insight telling us we have it completely wrong is not always well received. But we should receive it, we should look at it, and we should only dismiss it if we can logically decide that it should be dismissed.

So, think outside the box and solve this problem:

Here is a pattern of 9 dots, arranged in a 3x3 grid:

Now, I want you to connect the pattern of 9 dots using four straight lines drawn without lifting the pen from the paper or retracing any lines. Simple, eh?

Please don't post solutions below. If you discover it, just be happy that you have done so and feel good that you have thought differently and bring that skill to your daily work.

Stephen Redmond is a Data Visualization professional. He is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
Follow me on Twitter   LinkedIn

Saturday, 30 May 2015

Pick the right chart to tell the right story

A friend of mine recently shared the following chart on LinkedIn:

I believe that it may originate from the Centre for Learning and Teaching in Hong Kong and may have been a student project. I wonder what the grade was?

Unfortunately, from a purist dataviz point of view, this chart is not a great way to represent the data. Polar charts like this are a poor choice for most uses, in much the same way as pie charts with lots of segments. In this case, the scales are all over the place, so it is really difficult to use the graphic to tell where the segments with the largest values are. For example, the largest segment seems to represent the 4 million Google searches which, by area, appears to be twice as big as the 3.3 million Facebook posts, and it is also much bigger than the 50 billion WhatsApp messages! The 41,000 Instagram photos segment is bigger than the segment of 215,000 for the previous year! It is also way bigger than the 1.4 million Skype calls right beside it. I can't really trust the size or position of any of the segments to convey any information to me.
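As a rough check on those proportions, here is a small Python sketch. It assumes a polar area chart with equal-angle sectors where the area of each sector is meant to be proportional to its value, so the radius has to scale with the square root of the value. The helper name `radius_for` is hypothetical, and the figures are the ones quoted from the chart.

```python
import math

# Figures quoted from the chart discussed above.
values = {
    "Google searches": 4_000_000,
    "Facebook posts": 3_300_000,
    "WhatsApp messages": 50_000_000_000,
    "Skype calls": 1_400_000,
    "Instagram photos": 41_000,
}


def radius_for(value, largest, max_radius=1.0):
    """Radius of an equal-angle sector whose AREA is proportional to value."""
    return max_radius * math.sqrt(value / largest)


largest = max(values.values())
for name, v in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:18s} value={v:>14,d} radius={radius_for(v, largest):.4f}")

# The Google segment should be only modestly larger than the Facebook one.
ratio = values["Google searches"] / values["Facebook posts"]
print(f"Google vs Facebook: {ratio:.2f}x")
```

On one consistent scale, the WhatsApp segment would dwarf everything else and the Instagram segment would be barely visible, which is the opposite of what the graphic suggests.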

One of the earliest and best-known Polar charts is the one created by Florence Nightingale in 1858 to demonstrate the causes of mortality during the Crimean campaign:

This was a revolutionary chart in its day. It made it very easy to see that the largest cause of death among British servicemen was preventable disease rather than wounds or other causes. But, while we can see the story from the sheer amount of blue ink on the chart, can we discern anything else? Is there any pattern or trend? I wonder if this is the best way to visually represent the data.

QlikView doesn't have a polar chart but we can approximate one with a radar chart:

Still not ideal. We are still looking at lots of blue, but no trend. Perhaps a stacked bar chart would be useful?

This is good because we can see the overall trend. However, one of the problems with stacked bar charts is that we can only clearly discern the trend of either the whole or the bottom bar of the stack (Wounds). It is harder to see the trend of the middle (Other) or top (Disease) bars.

Another way might be to try a Redmond Profile chart:

The profile chart shows us the trend of the whole and then shows the % split of each of the parts. This is useful but maybe still not ideal for this data-set. There is a variation of the Redmond Profile chart that might work here:

In this case, the profile bars, instead of showing the % share of the values, show the actual values. In this way, we get to see the trend for the whole as well as for the parts, in a way that the stacked bar chart doesn't allow.
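The difference between the two variations comes down to what is calculated for each period. A minimal Python sketch, using made-up monthly counts (not Nightingale's figures) and a hypothetical helper called `profile`:

```python
# Made-up monthly counts, for illustration only.
rows = [
    {"month": "Jan", "Disease": 80, "Wounds": 10, "Other": 10},
    {"month": "Feb", "Disease": 60, "Wounds": 25, "Other": 15},
]
PARTS = ("Disease", "Wounds", "Other")


def profile(row, as_share):
    """Return the total for the period plus either % shares or actual values."""
    total = sum(row[p] for p in PARTS)
    if as_share:
        # First variation: each part as a share of the whole.
        parts = {p: (row[p] / total if total else 0.0) for p in PARTS}
    else:
        # Second variation: each part at its actual value.
        parts = {p: row[p] for p in PARTS}
    return {"month": row["month"], "total": total, **parts}


share_view = [profile(r, as_share=True) for r in rows]
value_view = [profile(r, as_share=False) for r in rows]

for s, v in zip(share_view, value_view):
    print(s)
    print(v)
```

In both views the total carries the trend of the whole; only the treatment of the parts changes.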

If I were a gourmet chef, I might call it a deconstructed stacked bar chart.

There you go, gourmet data visualization.


Friday, 8 May 2015

Poacher turned gamekeeper

Today is a day of mixed feelings. I am excited because on Monday I will start a new career away from consulting. I am also sad because I am leaving behind a company that I have worked for since 1999 - over 16 years!

The last 16 years have been quite a rollercoaster and I have met, worked with, and drunk with some really great people. I have interacted online with so many other nice people whom I have yet to have the pleasure of meeting face-to-face. I have got to travel far and wide - from Seattle to Seoul - and I have attended some great events - from the old SalesLogix partner events in the early days to the Qlik Qonnections events in more recent years.

I need to give special thanks to all my colleagues at Capventis over the years. All of them have helped me grow by challenging me to be the best that I could. It has been my pleasure to have worked with an incredible bunch of smart people.

I am leaving a much bigger and stronger organisation than the one that I joined so many years ago. I know that they will continue to grow and continue to succeed into the future.


Thursday, 30 April 2015

Data Preparation for Qlik Sense

Today, Capventis have made my latest book, Data Preparation for Qlik Sense Desktop using Pentaho Kettle, available for free download:

So, what is data preparation and why does anyone need it?

Those of us who have some expertise in QlikView and Qlik Sense development probably don't need to worry about this at all. This is because we already do our data preparation using the Qlik script. All of the data loading, joining and mapping that we do is data preparation. A lot of us are really very good at using the script to manipulate data to meet the needs of business users.

There are, however, many potential users of QlikView and Qlik Sense who are not adept at scripting. Even telling them that they need to use script to load data will make them turn and run! But they are certainly happy to drag and drop files from one place to another and can handle setting properties in a dialog.

For that population, the new feature in Qlik Sense Desktop of being able to drag desktop data-sources into an application makes it really easy to create the self-service analyses that they need. But that feature - even with the announced changes to the data loader in Qlik Sense 2.0 - cannot really handle more complex loading and transformations. For those, we need to start thinking of the script again.

That is where graphical data preparation tools come in. They enable business users to perform those more complex loads and transforms in a graphical environment without having to learn any scripting. They can output a single file that can be dropped into a Qlik Sense app.

There are several Data Preparation tools on the market that have working plugins to extract data into QVX format that can be read into QlikView or Qlik Sense. Leading tools such as Lavastorm and Alteryx also have server-based options and integrations with advanced analytics engines like R.

I went for Pentaho Data Integration (PDI/Kettle) for this project because it is open source and the Community Edition is free - just like Qlik Sense Desktop. Once you have some experience with one, it makes it easier to transition to another. PDI doesn't have an out-of-box output to QVX, but output to Excel is usually good enough for most business users. For the more technical amongst you, there is Ralf Becher's excellent solution to stream data from Pentaho Kettle into QlikView via JDBC.

The eBook is about 80 pages and comes with support files to help you try out the exercises. Feel free to download it now.

Stephen Redmond is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis, a Qlik Elite Partner.

Thursday, 16 April 2015

Explaining Pie-Gauges

Back in December 2013, I discussed different KPI approaches and introduced the Pie-Gauge as a form of representation. I used them in the dashboard of the winning app in the recent Qlik UK partner competition:

Last month, I described how happy I was that QlikView 11.2 SR10 was released so that we could change the segment color in a Pie chart - making Pie-Gauges look better.

Prior to that, and indeed in my entry to the partner app competition, I had used an extension object for creating Pie-Gauges. This is published on Branch.

So, what, exactly, is a Pie-Gauge anyway? It is a gauge because it represents a KPI - one value versus a target value. However, it solves one issue with gauges in that I don't ever have to worry about the scale. This is especially good when I have several KPIs. Pie-Gauges are all about ratios.

Like all pie charts, it is a part-to-whole comparison. In this case, there are always three segments, two of which are mutually exclusive, making up the whole.

The first segment represents the amount by which we have fallen short of the target. The second segment represents the lower of the target value or the actual value. The third segment represents the amount by which we have exceeded the target.

We can see that the whole is therefore the higher of the target or the actual value. We can also see that the first and third segments cannot exist together - we can't fall short and exceed the target at the same time.
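The three segments described above can be sketched in a few lines of Python. This is only an illustration, not the Qlik expressions themselves: Python's `max` and `min` stand in for RangeMax and RangeMin, the function name `pie_gauge_segments` is hypothetical, and the sample numbers are made up. It assumes that both values are non-negative.

```python
def pie_gauge_segments(actual, target):
    """Return (shortfall, achieved, excess) for a Pie-Gauge.

    Assumes actual and target are both non-negative.
    """
    shortfall = max(target - actual, 0)  # first segment: amount below target
    achieved = min(actual, target)       # second segment: lower of the two
    excess = max(actual - target, 0)     # third segment: amount above target
    return shortfall, achieved, excess


for actual, target in [(80, 100), (130, 100), (100, 100)]:
    s, a, e = pie_gauge_segments(actual, target)
    # The whole is always the higher of actual and target...
    assert s + a + e == max(actual, target)
    # ...and the first and third segments never appear together.
    assert s == 0 or e == 0
    print(actual, target, (s, a, e))
```

Because the whole is always the larger of the two numbers, the same calculation works whatever the size of the gap between actual and target.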

The positioning of the segments is important. The first segment must be to the left of the top of the pie, and the third segment must be to the right - signifying below and above target.

The great thing about these gauges is that they will work no matter by how much we have fallen short or exceeded the target - they are always a part-to-whole comparison. Unlike gauges with fixed axes, they will just work.

The really important thing to grasp is that the actual % above or below target is not important! It is the representation of the ratio that is important. We are representing whether we are above or below target, not by how much.

The three values can be very easily calculated using Qlik's RangeMin and RangeMax functions.

The first segment is:


The second segment is:


The third segment is:


In QlikView, we can have these as three separate expressions in a Pie chart. In Qlik Sense, we can only have one expression so I use a ValueList dimension and an expression like this:

If(ValueList(' ', '  ', '   ')=' ',
If(ValueList(' ', '  ', '   ')='  ',
RangeMin(Sum(Actual), Sum(Target))+0.001,

Note the +0.001 on each - that stops Qlik Sense displaying the "chart contains zeros" message. The spaces in the value list are there just to stop additional text being displayed on the Pie.
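A rough Python analogue of that single-expression approach may help. It assumes the three segments are computed as described earlier (shortfall, lower of actual and target, excess). The names `SEGMENT_KEYS`, `EPSILON` and `segment_value` are hypothetical, and the sample figures are made up.

```python
EPSILON = 0.001  # small offset so that no segment is ever exactly zero

# One, two and three spaces: distinct dimension values with no visible text.
SEGMENT_KEYS = (" ", "  ", "   ")


def segment_value(key, actual, target):
    """Pick the calculation for one segment based on its dimension value."""
    if key == SEGMENT_KEYS[0]:
        value = max(target - actual, 0)   # below target
    elif key == SEGMENT_KEYS[1]:
        value = min(actual, target)       # lower of actual and target
    else:
        value = max(actual - target, 0)   # above target
    return value + EPSILON


values = [segment_value(k, actual=80, target=100) for k in SEGMENT_KEYS]
print(values)
```

The offset is small enough not to change how the pie looks, but it keeps every segment strictly positive.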

The color can then be calculated like this:

If(ValueList(' ', '  ', '   ')=' ',
If(ValueList(' ', '  ', '   ')='  ',
RGB(240,240,240), LightGreen()

Have fun with Pie-gauges.


Thursday, 9 April 2015

The vaccine effect

Sparked by Alberto Cairo's tweet sharing a blog on Recreating a famous visualisation, I decided that I would go onto the Project Tycho website and grab the data myself to play with in Qlik Sense.

It was, of course, fairly quick to import an Excel file into Qlik Sense Desktop, adding a CrossTable prefix to break out the data by State. The first visualization, the one that I usually turn to in order to see what I have, was the bar chart:

By adding a color expression to highlight bars up to 1963, when the measles vaccine was introduced, versus those after 1963, the data jumped out very quickly.
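For readers who have not used CrossTable, here is a minimal Python sketch of the two steps just described: unpivoting one-column-per-State data into rows, then flagging each row as falling up to or after 1963. The column layout, the sample numbers and the helper names `crosstable` and `era` are all assumptions for illustration. Skipping empty cells is a choice made in this sketch, not a statement about how Qlik behaves.

```python
# Hypothetical wide layout: one row per year, one column per State.
wide = [
    {"Year": 1962, "Alabama": 10, "Alaska": 2},
    {"Year": 1964, "Alabama": 7, "Alaska": None},
]


def crosstable(rows, key_fields, attribute_name, value_name):
    """Unpivot: one output row per (key, column) pair that has a value."""
    out = []
    for row in rows:
        keys = {k: row[k] for k in key_fields}
        for column, value in row.items():
            if column in key_fields or value is None:
                continue  # skip key columns and empty cells
            out.append({**keys, attribute_name: column, value_name: value})
    return out


def era(year, cutoff=1963):
    """Label used to drive the bar colors: up to the cutoff year, or after it."""
    return "up to 1963" if year <= cutoff else "after 1963"


long_rows = crosstable(wide, ["Year"], "State", "Cases")
for r in long_rows:
    r["Era"] = era(r["Year"])
    print(r)
```

Once the data is in this long form, State becomes an ordinary dimension and the era label can drive a color rule.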

I used the new pivot table in Qlik Sense to recreate the WSJ heatmap, along with similar color rules to those used by Mick Watson:

I also decided to have a look at the geographic spread, both before and after the vaccine:

Vaccines are really amazing. Just think: a little over 50 years ago, people were dying from a disease that, for most of us, now just doesn't exist.

Worth thinking about.


Monday, 30 March 2015

Do your Apps win Michelin Stars? Do they need to?

Back in 2012, at the TED@SXSWi event in Austin, Texas, JP Rangaswami, then Chief Scientist at Salesforce, now Chief Data Officer at Deutsche Bank, challenged us to consider that information is food. It is an interesting analogy. I especially liked the "Supersize Me" suggestion of having to watch Fox News for 30 days.

So, if information was food - what would you do differently?

If information was food, what kind of data visualization would you like to see?

Would you be happy with the daily stodge? Not too pretty to look at, and you are not 100% sure of where the ingredients come from (and you have only been ill a few times!). Perhaps of the variety sold to the residents of Ankh-Morpork by CMOT Dibbler?

Or would you be looking for the time-consuming, detail-attentive, incredibly beautiful and incredibly expensive, Michelin starred fare?

Or is it somewhere in between?

The reality is, boringly, that it doesn't matter how pretty nor how ugly the presentation layer is if the ingredients are suspect. As I said in a post last year, good governance prevents people from getting food poisoning.

The first step is to get the ingredients right (or as right as we possibly can!) - then we can focus on the presentation. And focus we must. A plate of great ingredients just mashed together will not encourage our diners to return. We must present our ingredients as best we can with the tools that we have available and then the foodies will keep coming back.

You never know, one day there may be a Michelin judge with them.
