Tuesday, 27 January 2015

Don't smooth the path

One of the options in QlikView line charts is to apply a smooth option so that lines look nicely curved rather than ugly and jagged. However, while it can be aesthetically pleasing, it can also be a distortion of the data.

An interesting aspect of this distortion, and one that many designers might not think about, is how we deal with interpreting curved lines that are tracking along similar paths - for example, sales versus budget. Have a look at such a chart:


Now, if I was to ask someone where these lines diverge the most, the average person will suggest that the largest gap is in and around the apogee of the curve. But it's not correct! Would you believe me if I told you that the largest divergence is at the start and that the difference at the apogee is actually half way between the maximum and minimum? Here is the data:


The problem is that our visual system is looking at the minimal distance between the lines, we aren't looking at the vertical difference between the points that define the lines. Lines in a line chart are actually there to give some shape to show us a trend, they aren't the data!

The line chart below may be a better representation:


Here there is still the temptation to look at the distance between the lines but at least we have the points to help us. We could even add some further clarity using a combo chart:


It may not be ideal, but using the gestalt principles of enclosure will at least direct our visual system to the connected dots and give us some better understanding of the differences.

You might even get away with putting the curves back in!




Stephen Redmond is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis a Qlik Elite Partner.
Follow me on Twitter   LinkedIn

Friday, 16 January 2015

More on Same Day Last Year

Back in March, I published a post on calculating the same day last year.

In a recent post on the Masters Summit for QlikView LinkedIn group (the group is only open to previous attendees - yet another good reason to attend!), someone was asking about the subject so I shared a link to my original post. Another member queried the logic and suggested that it only works "75% of the time".

I expanded a bit on my logic from the original post, so I though that I would update it here for public viewing.

My assumption was that by 75% he was really just mentally rounding and actually meant 83%. In the last 100 years (04-Jan-1915 to 28-Dec-2014) there have been 17 years with 53 weeks. I assumed that this was the "25%" where he assumed that the logic broke down.

I believe that the logic doesn't break down - ever.

As an example, let us consider 2009 which had 53 weeks - 28-Dec-09 to 03-Jan-2010. What should be the correct week in the past to compare this week to? After all, there is no week 53 in 2008. Should we just forget that this week exists?

Of course not.  What we really need to do is to consider the purpose of the comparison and that is to compare performance of equivalent weeks and equivalent days in those weeks. The equivalent week, Monday to Sunday,  to make the comparison to would have to be the week of 29-Dec-08 to 04-Jan-09 - and that is actually week 1 of 2009!

This logic continues through 2010. 2010-01 is compared to 2009-02, 2010-02 is compared to 2009-03, etc., all the way until 2010-52 is compared to 2009-53. The week numbers will then realign in 2011.

In this fashion, the Date-364 calculation is always correct (leap year, 53 week year or otherwise) because it always gives you the equivalent day one year ago.

For your pleasure, here is a small script that you can run in QlikView or Qlik Sense to compare the like-for-like dates for the last 100 years:

Let vStart=Floor(MakeDate(1915,1,4));
Let vEnd=Floor(MakeDate(2014,12,28));
Let vDiff=vEnd-vStart+1;

Calendar:
Load
DateID,
Date(DateID) As Date,
WeekYear(DateID) as WeekYear,
Year(DateID) As Year,
Week(DateID) As Week,
WeekDay(DateID) As WeekDay,
WeekYear(DateID) & '-' & Num(Week(DateID), '00') As YearWeek,
Date(DateID-364) As Date_LFL,
WeekYear(DateID-364) & '-' & Num(Week(DateID-364), '00') As YearWeek_LFL
;
Load 
$(vStart)-1+RecNo() As DateID
AutoGenerate($(vDiff));



Stephen Redmond is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis a Qlik Elite Partner.
Follow me on Twitter   LinkedIn

Tuesday, 16 December 2014

Sway to the Qlik

Microsoft have made their new Sway service generally available on "Preview".

So, what is this Sway?  Really, it is about telling stories - without having to know anything about design or building websites, you can upload some images, type in some text, and publish your story.  Your story is then viewable by any user, on device, in a responsive manner.

For example, witness the rather amusing antics of Flat Tony.

You can have your story go horizontally, or you can have it go vertically.  You can easily arrange things if you don't like the way the system presents them.  More ways of arranging things will be coming on stream quite soon.

Why might this be of interest to a Data Viz professional?  Because stories are what we tell - stories about data.

One of the features of Qlik Sense that most interested me was Stories.  One of the features of Qlik Sense that most disappointed me was ... Stories.  The concept is brilliant - easily snapshot images of my charts and then embed them in a story along with annotations to explain what is happening.  The problem is that I can only share these with other users of Qlik Sense - I can't export them to any other format or push them out on a website to non Qlik Sense users.

There is a logic to these restrictions.  The features of being able to view the charts "live" from within a story would not be available to external users.  OK, that can be an important thing, but is that the most important thing?

Here is something that I built in about 10 minutes using Sway (it may look familiar!):

Food Corp Sales Analysis 2014



It isn't earth shattering, but it wasn't several days of effort either.  I used the Windows Snipping tool to grab the images (apparently I can use APIs to get snapshots too), and just added the text.

The big deal about this link is that you can view it - and you can view it on any device.  And you don't need to have a Qlik Sense license.

I am sure that we will see Qlik Sense Stories evolving over time - this is definitely a direction that I would like to see them going.


Stephen Redmond is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis a Qlik Elite Partner.
Follow me on Twitter   LinkedIn

Friday, 12 December 2014

Season's Greetings

This QlikView script may be the most important one that you ever see (or not, mostly not).

Follow these steps.

1.   Create a new QlikView application and add the following script:

   For i = 0 to 100 step 5

Tree:
LOAD
RowNo() as id,
1 as Period,
$(i) As Branch,
-100+$(i) as X,
$(i)+10 as Y
AUTOGENERATE (1);

LOAD
RowNo() as id,
2 as Period,
$(i) As Branch,
0 As X,
$(i) As Y
AUTOGENERATE (1);

LOAD
RowNo() as id,
1 as Period,
$(i)+1 As Branch,
0 As X,
$(i) As Y
AUTOGENERATE (1);

LOAD
RowNo() as id,
2 as Period,
$(i)+1 As Branch,
100-$(i) as X,
$(i)+10 as Y
AUTOGENERATE (1);

   Next

      Reload the document.

2.   Add a new Scatter Chart.

3.   On the General tab, set the Title in Chart to:

   =chr(8902)

      In the Title Settings, set the font size to 26 and turn on Bold.
   
4.   On the Dimensions tab, add Period and Branch.

5.   Turn on Advanced Mode on the Expressions tab.  Remove the expressions that have been added and add the following 3 expressions:

   =X

   =Y

   =10+Avg(fabs(X))

6.   Under the first expression, set the following Background color expression:

   =RGB(1, 121, 111)

7.   On the Style tab, select the flat, connected bubbles Look (3rd one down in the 2nd column).

8.   On the Presentation tab, turn off Show Legend.  Add a Reference Line on the X-Axis with a value of 0.  The color should be the same RBG as in step 6.

9.   On the Axes tab, turn off the Forced 0 option and turn on the Hide Axis option for both axes.  For the Y Axis, set a Static Min of -5 and a Static Max of 110.

10.  On the Layout tab, turn off the border (or set to 0 pt).

11.  On the Caption tab, set an appropriate caption.

With a little tweaking (using Ctrl+Shift to move bits around), you should be able to come up with something like this:




Stephen Redmond is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis a Qlik Elite Partner.
Follow me on Twitter   LinkedIn

Friday, 5 December 2014

Searching Questions


I had the privilege recently of visiting Qlik Labs (or "Q Branch", as I like to call it) in London and listened to Alistair Eaves, the Director of Qlik Labs, give a talk on some of the work that they are doing there.

Needless to say, I can't go into the specifics of everything that was said (nothing about laser pens though!) but I can talk about one feature that has made it into Qlik Sense - global search.

I particularly wanted to talk about this one feature because I am not sure that enough has been said about it - probably because most people who are experienced with QlikView might miss the significance of it.

In the Search Object in QlikView, you can enter multiple terms.  It will then highlight where all of those terms might match in the different fields.  There is no concept though about how well those search terms have matched - do they all match, if not how many match?  If I type the query light beer minnesota into the object, then I might get some hits across some fields, but I don't really understand how well I have searched.  I don't know if there is any association between the values that have been found - and Qlik is all about association.


In Qlik Sense, however, the same query gives me far more interesting results.  It gives me qualitative information about how many hits that I have had, what fields I have hit, and how they associate together.  It also gives me information on places where I didn't have exact hits, but still might be very useful to know about.

This might appear to be quite a simple thing, but it is really very powerful.

Users are already used to asking questions of Google to find information.  In the future, they may expect to ask similar questions from their own data to discover insight.  Perhaps even verbally!

Watch this space...


Stephen Redmond is author of Mastering QlikView, QlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis a Qlik Elite Partner.
Follow me on Twitter   LinkedIn

Friday, 28 November 2014

Mini chart for alerts, text objects and popups

I was handed and interesting challenge yesterday.  The challenger was looking for a way to present both a list of dimensions, and an associated calculation, in a popup dialog.  This sounds like it should be quite achievable, but I couldn't think of an answer immediately!

We can, of course, present the list of dimensions using Concat:

=Concat(Customer, chr(10))

We can add a sort

=Concat(Customer, chr(10), -Aggr(Sum(LineValue),Customer))

If we tried to add the Sum into the Concat's first parameter, we would get an error about not allowing an aggregation.

We were racking our brains to come up with a way to add the value (without seeing the bleeding obvious! - see below).  Eventually, I came up with this horrendous expression:

=If(Len(FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),1))>0, 
FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),1)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),2))>0, 
FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),2)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),3))>0, 
FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),3)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),4))>0, 
FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),4)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),5))>0, 
FirstSortedValue(Customer, -Aggr(Sum(LineValue), Customer),5)
& ' - ' & Num(Sum({} LineValue), '#,##0')
, Null())

This just shows the top 5 - but only if there is no match on the expression - in which case it would hide the matching values (because firstsortedvalue returns null).  So I had some thoughts and came up with this:

=If(Len(FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),1))>0, 
FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),1)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),2))>0, 
FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),2)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),3))>0, 
FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),3)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),4))>0, 
FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),4)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null()) 

& If(Len(FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),5))>0, 
FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),5)
& ' - ' & Num(Sum({} LineValue), '#,##0')
& chr(10), Null())

& If(Len(FirstSortedValue(DISTINCT Customer, -Aggr(Sum(LineValue)+Rand(), Customer),6))>0, 
'Others'
& ' - ' & Num(Sum(LineValue)-Sum({} LineValue), '#,##0')
, Null())

This adds a small rand() value that appears to break the match enough to make things work - I am not sure if it will work on all occasions, but in several tests it did work.  So, it does solve a problem where I want to see just the top 5 and "Others".

But it doesn't show all the dimensions plus values - which was the original challenge.

A rethink was in order and I returned to the Concat with the sort.  I wonder...  If the sort expression accepts an Aggr, would that be allowed in the first parameter:

=Concat(Customer & ' - ' & Num(Aggr(Sum(LineValue), Customer), '#,##0'), chr(10), -Aggr(Sum(LineValue), Customer))

Yay!  It worked!!!

Potentially, this could be used in text objects, alert texts, and popup texts.


Stephen Redmond is author of Mastering QlikViewQlikView Server and Publisher and the QlikView for Developer's Cookbook
He is CTO of CapricornVentis a QlikView Elite Partner. We are always looking for the right people to join our team.
Follow me on Twitter: @stephencredmond

Wednesday, 19 November 2014

No nodistinct

This article is taken from my new book - Mastering QlikView.  Available from 26th November.



The Aggr function has, as an optional clause, the possibility of stating that the aggregation will be either distinct or nodistinct.

The default option is distinct and, as such, is rarely ever stated.  In this default operation, the aggregation will only product distinct results for every combination of dimensions - just as you would expect from a normal chart or straight table.

The nodistinct option only makes sense within a chart, one that has more dimensions than are in the Aggr statement.  In that case, the granularity of the chart is lower than the granularity of the Aggr and therefore QlikView will only calculate that Aggr for the first occurrence of lower granularity dimensions and will return null for the other rows.  If we specify nodistinct then the same result will be calculated across all of the lower granularity dimensions.

That can be difficult to understand without seeing an example, so let us look at a common use case for this option.  We will start with a data set:

ProductSales:
Load * Inline [
Product, Territory, Year, Sales
Product A, Territory A, 2013, 100
Product B, Territory A, 2013, 110
Product A, Territory B, 2013, 120
Product B, Territory B, 2013, 130
Product A, Territory A, 2014, 140
Product B, Territory A, 2014, 150
Product A, Territory B, 2014, 160
Product B, Territory B, 2014, 170
];

We will build a report from this data using a pivot table:


Now, we want to bring the value in the Total column into a new column under each year, perhaps to calculate a percentage for each year.  We might think that, because the total is the sum for each Product and Territory then we might use an Aggr like this:

   Sum(Aggr(Sum(Sales), Product, Territory))

However, as stated above, because the chart includes an additional dimension (Year) than the Aggr, then the expression will only be calculated for the first occurrence of each of the lower granularity dimensions (in this case, for Year = 2013):


The commonly suggested fix for this is to use the Aggr without the Sum and with a nodistinct like this:

   Aggr(NoDistinct Sum(Sales), Product, Territory)

At first, this will appear to solve the problem:


The problem occurs when we decide to have a total row on this chart:


Because there is no aggregation function surrounding the Aggr, it does not total correctly at the Product or Territory dimensions.  And we can't add an aggregation function - like Sum - because it will break one of the other totals.

There is, however, something different that we can do - something that doesn't involve Aggr at all!  We can use our old friend Total:

   Sum(Total<Product, Territory> Sales)

This will calculate correctly at all the levels:






Stephen Redmond is author of Mastering QlikView, the QlikView for Developer's Cookbook and QlikView Server and Publisher
He is CTO of CapricornVentis a QlikView Elite Partner. We are always looking for the right people to join our team.
Follow me on Twitter: @stephencredmond