Thursday, 10 October 2013

Flags and Dual values in Set Analysis

To avoid having to create very complex Set Analysis, it can be quite a good idea to create flags in your script.  Doing the complex calculation in the script to create a simple 1 or 0 value means it happens once - when your users are not involved - and makes the front end calculations a lot quicker - when your users are involved.

Sometimes we want to dual-purpose that flag field.  As well as having the value of 1 or 0, we want to allow the users to select the field.  "1" and "0" are not very friendly so we want to have the users click on something like "Yes"/"No" or "True"/"False".  There are a few ways of achieving this:

1.  Just create the 1 or 0 in the script and in the front-end use an Expression in the list box like:

   If(Flag=1, 'True', 'False')

This works fine but, of course, is putting some calculation back into the front-end.  While it is not an arduous calculation, it will re-calculate on all selections so it does add some clock cycles.  It also makes no impact on how you would use the value in Set Analysis so I don't need to worry about this option any further in this post.

2.  Create the 1 or 0 in the Flag field and create a 2nd field (e.g. Flag_Text) with the text values that you want.  The user gets to select the value from Flag_Text, but we use the Flag value in Set Expressions.

There are a couple of issues with this.  First off, it adds additional overhead in the Script because we need to do the same calculation twice and create 2 fields.  Secondly, if we use the Flag value in a Set, we need to ensure that we also exclude the Flag_Text value from that set:

   Sum({<Flag={1}, Flag_Text=>} Sales)

There is also the possibility that a user using collaboration features might add the Flag field as a list box and that might cause confusion.

3.  Instead of creating the values as 1 or 0, we create Dual values:

   ...
   If(Condition=True, Dual('Yes',1), Dual('No',0)) As Flag,
   ...

This is more efficient because everything happens in the script and we are only creating the one field there from one piece of code.

There is a complication though in your Set Analysis.  You would like to do something like this:

   Sum({<Flag={1}>} Sales)

But it will now fail!  This is because the set comparison always compares against the text value of a dual (except for Dates - more of that below).  That means that you would have to use:

   Sum({<Flag={'Yes'}>} Sales)

Of course, this is a text comparison so will not execute as efficiently as a numeric one.  What can we do?

Well, we can force the calculation to be numeric by using a search syntax instead of the actual value and that search should involve > or <.  For example:

   Sum({<Flag={">0"}>} Sales)

Will work!

Interestingly, this does not work immediately with a date value that has been created with one of the date functions (like Date(), MakeDate, etc.)  I am not totally sure of why but I suspect that this is handled differently because the system recognizes it as a date (so it understands what ">2/6/2013" means) and doesn't handle the same way as other numeric Duals.

This means that we often need to add formatting into a date set to get it to work.  Like this:

   Sum({<Date={$(=Date(vDate,'M/D/YYYY'))}>} Sales)

We can override this and make our dates handle like other Duals by loading them as a Dual explicitly in the Script.  Instead of this:

   ...
   Date(date_field) As Date,
   ...

We can do this:

   ...
   Dual(Date(date_field), Num(date_field)) As Date,
   ...

Have fun!


Stephen Redmond is author of QlikView for Developer's Cookbook
He is CTO of CapricornVentis a QlikView Elite Partner. We are always looking for the right people to join our team.
Follow me on Twitter: @stephencredmond

5 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Hi Stephen,

    thanks for your interesting post. I have one question, why do you think "Of course, this is a text comparison so will not execute as efficiently as a numeric one"?

    IMHO, this should be mutch faster than evaluating a search expression.

    Best regards,
    Ralf

    ReplyDelete
  3. Hi Ralf,

    I have tested both on a simple dataset of about 10million rows and the {">0"} set executes about 20% faster than the {'Yes'} comparison.

    Regards,


    Stephen

    ReplyDelete
  4. This comment has been removed by a blog administrator.

    ReplyDelete
  5. This comment has been removed by a blog administrator.

    ReplyDelete