Monday 23 November 2020

Revisiting the QlikView Cookbook - Boxplot

It has been 7 years since the QlikView for Developer's Cookbook was published! 

The book, which was very much targeted at QlikView developers, is still available. Just for fun, I thought that it might be interesting to write a series of posts looking back at some of those recipes and seeing how they could be recreated in Qlik Sense Business.

In this post, I'll look at the recipe on Creating a box plot chart for a simple data set.

The Boxplot in Qlik Sense is a little more advanced than the one that I originally wrote about in QlikView. It has an element of sophistication around it which may actually initially confuse a user, especially one who has been used to other statistics tools. When I use a Boxplot in QlikView, I generally expect to add one dimension, and then expect that the values will be presented for each value in that dimension. I can, of course, get quite sophisticated with my calculations, and there is even a Wizard to help me get started with creating the necessary AGGR expressions I may need.

The Qlik Sense Boxplot will actually assume that I want to do the calculations using similar AGGR expressions and, if I don't, I need to make some manual edits. It is probably easier if you let me show you what I mean with an example. First, some sample data - only slightly more complex than in the original recipe:

Load 
    'USA' as Country,
    Ceil(Rand()*12) as Month,
    Round(Rand()*50, 0.01) as Value
AutoGenerate(100);
Load 
    'UK' as Country,
    Ceil(Rand()*12) as Month,
    Round(Rand()*20, 0.01) as Value
AutoGenerate(100);

After loading, if I drop a Boxplot onto my sheet, it offers me the usual Add dimension and Add measure options. If I try to recreate the original QlikView Cookbook chart, I might try with Country and Value - actually you will note that, as usual, it forces you to add an aggregation.


Ok, so what happens next? Well, you might be surprised to see only 1 skinny box plotted!


So what happened? Why did it not split by Country as a QlikView user might have expected? Turning to the help file, you might find this description of adding the first dimension:

    This is the inner dimension, which defines a box.

Eh? What box does it define? Well, we have added a dimension and an expression, so this is the box it defines:


So, actually, there are only 2 values being represented by the Boxplot. The plot isn't looking at all the individual values (which can't be bigger than 50 because of the way we have created them), it is representing the statistics of all the calculated values - in this case only 970.77 and 2529.77.

Let's rejig slightly. I am going to replace the Box dimension with Month and set the X-axis dimension to be Country. The result looks something like this:


In this case, the "box" is now a table of the Sum of Value by Month, and there is a different "box" for each Country. For example, the UK "box" may look like this:


The Boxplot calculations are then calculated across the 12 values in each "box". Simples!

But what are the calculations? Well, if we look into the Box plot element properties, we can see that it defaults to some presets from which we can choose Standard (Tukey), Percentile-based or Standard deviation. If we turn off the Use-presets option, we can now see what set of calculations are being used for each element in the Boxplot. In fact, this also allows us to put in our own calculations! We could, for example, use this option to go back to a version of the original QlikView Boxplot.

If add a new Boxplot on the screen and then go straight to the properties and add Country for both the Box and X-axis dimension. For the Y-axis measure, I am just going to add the value 0 - it doesn't matter too much what value you use, though it does need to return a valid calculation or the chart won't work.

Now, when I go down to the Box plot element section and deselect the Use-presets option, I can specify new expression values:

Property
Expression
First whisker
Min(Value)
Box start
Fractile(Value, 0.25)
Center line
Fractile(Value, 0.5)
Box end
Fractile(Value, 0.75)
Last whisker
Max(Value)



Now we have something that is similar to the original!


As well as holding a Master's Degree in Data Analytics, Stephen Redmond is a practicing Data Professional of over 20 years experience. He is author of Mastering QlikViewQlikView for Developer's Cookbook and QlikView Server and Publisher.  
  LinkedIn