Friday 19 November 2010

Do we really need maps for analysis?

This is a response to Barry's QlikFix post - http://t.co/q4jPQAs

While interesting that 80% of data may have a spatial component (depending on how you define it), the reality is that the majority of that data does not have a special dependency – it is not really that suitable for display on a map.

A common example for such displays might be something like sales by state. These are often displayed as bubbles geographically positioned over the state in question but do they really help with the analysis? Sure, you may be able to easily see that sales in California are larger than in Oregan, but how do they compare against sales in New York, New Jersey and Rhode Island that may be smudged together on the far side of the map. What might look “cool” on a demo may turn out to be something that is easier to analyse in a bar chart.

One other unfortunate flaw is the level of education that might be encountered. Studies have shown that percentages of schoolchildren in the UK have difficulty picking out their own country versus France or Germany! Can we expect that everyone who needs to analyse European data is actually aware that the boot shaped country is Italy?

Maps can be cool and have some great applications (many of which can be easily done in just Google maps without QlikView). Real world analysis can mostly do without them.

Sunday 14 November 2010

It is often simpler than it looks

Often, as we encounter more and more diverse data, we encounter data structures that would appear to call for more and more complex workings to associate them in QlikView. What we need to do though is take a step back and see if there is a different way. My experience is that once I do this, I find that it is often simpler than it looked.

For example, you have 2 or 3 tables that have several fields in common. Some might start down the route of generating complex keys (for example, with autonumber or autonumberhash256) and one or more key tables. A second look might reveal that if we simply concatenate all the tables - even though it appears that they are unlikely candidates for concatenation - then we will suddenly be left with a simple key table and some dimension tables.

Another way of simplifying things that I use all the time is ApplyMap. This is a really useful function (not unlike vlookup in excel) that enables us to map small tables into larger tables. Always look to see if you can do a mapping load.

It is often simpler than it looks.

Thursday 30 September 2010

Access DBs on 64bit

I have written before about the issues of 32bit drivers on 64bit QlikView. It is great to see that version 10 will get past this but, if you need to get Access data into your server now, on earlier versions, that is not a help.

I am writing this because it is fresh in my mind because I have had the exact problem with a client. They had an old Access database that they needed to get data into on 64bit QlikView. Now, I have a 64bit Access driver because I have 64bit Office 2010 and it works great in QlikView. Microsoft have released this as a separate download so you don't need office installed on your server:


You don't need newer format Access databases, it works with older mdb files too. Only criticism that I have is that the DataLinks properties box doesn't have an ellipse button to allow you browse for your database. This isn't the most difficult thing to get past, just paste the path into the properties box.

Hope this helps.

Tuesday 10 August 2010

Installation ended prematurely because of an error

I had started having this problem recently and decided to blog the solution because I couldn't find it elsewhere.

This was happening when I went to install QlikView 9 SR5 but it is actually an issue with the Wise installer rather than QlikView specifically. My problem is on Windows 7 64bit.

What happens is that the Wise installer is trying to use VBScript and if it fails then it will give this message. If you log the installer you will see messages like:

DEBUG: Error 2896: Executing action WiseNextDlg failed.

Googling the message led me to try re-registering the VBScript.dll file (both in System32 and SysWOW64). However, that did not help.

Using Process Viewer (sysinternals) I noticed that it was loading the following registry entry:

HKEY_CURRENT_USER\Software\Classes\Wow6432Node\CLSID\{B54F3741-5B07-11CF-A4B0-00AA004A55E8}

This GUID - B54F3741-5B07-11CF-A4B0-00AA004A55E8 - is the GUID for VBScript. However, the entries under it were invalid. There was also another HKEY_LOCAL_MACHINE entry that was valid. It appears that Windows will go for the current user entry first.

I renamed the registry entry under HKCU and re-ran the installer and all is now well.

Wednesday 23 June 2010

More on Dual - "Others" in chart using AGGR



In a previous blog post (More On Dual - http://qliktips.blogspot.com/2009/09/more-on-dual.html) I proposed a method for getting better labels in a Pie chart by using Dual to create the label and the value in the expression:

Dual(Country & ' (' & Num(Sum(Sales)/Sum(Total Sales),'0.00%') & ')' & chr(13) & Num(Sum(Sales),'#,##0'), Sum(Sales))

A problem has been pointed out to me by one of my readers - if the number of slices is restricted (10 by default), then the "Others" does not display correctly:

We can fix this by limiting the number of dimensions using an AGGR statement with RANK in a calculated dimension:

=If(aggr(rank(Sum(Sales)), Country) <= 4, Country, 'Others')

So here, I am checking if the rank of the dimension is less than or equal to 4. If so, just use the Country name, if not, use "Others".

Now, there is still a slight problem with this and the Dual - the name is still blank. So, we need to modify the expression to check for a blank name:

Dual(If(Len(Country)>0,Country,'Others') & ' (' & Num(Sum(Sales)/Sum(Total Sales),'0.00%') & ')' & chr(13) & Num(Sum(Sales),'#,##0'), Sum(Sales))

(It will be blank because there are actually multiple values).

So, all should be well!


Of course, this doesn't just apply to Pie charts (thankfully!) - this method of restricting the dimension using Rank and AGGR can be used in any chart - including Straight Tables!



Thursday 29 April 2010

Use a ramdrive!

Thanks to HÃ¥kan (the inventor of QlikView) for this handy tip.

If you are storing a temporary file to QVD and then reading it instead of using Resident (because it is way faster - something that you may find is improved greatly in version 10), if you configure a RamDisk (there are several options for this) and use that as the temporary location for the QVD, then it will be even faster!

I hadn't come across a RamDisk for many years (I think since MSDOS 6) but this is a very interesting tip if saving seconds is critical to you.

Beta testing QlikView 10

Just back from Qonnections - the annual QlikView Partner conference. This year it was held in the Fontainebleau Hotel, Miami Beach. Good time was had by all.

We got to hear about the new features coming in version 10. I can't say too much about it now, but there are some pretty cool things in there.

They have had a first beta internally for a couple of months now and beta 2 was launched to partners at the conference (and downloaded by me a few minutes later!)

There should be a public beta release in and around June - but that will depend on how beta 2 goes. Release in the autumn.

This is the longest beta cycle that QlikView have ever run. In fact, they are shortly going to upgrade all their own internal systems (including demo.qlikview.com) to version 10 - really eating their own dog food. Expect version 10 to be the most stable new release ever.

Friday 16 April 2010

How QlikView stores data

To become really good at QlikView, you have to really understand how QlikView stores data. This is an introduction that should help you along that road.

If I load a very simple dataset:

Fruit:
LOAD *
INLINE [
Fruit, Color
Banana, Yellow
Blueberry, Blue
Tomato, Red
Tomato, Green
];


And then add both fields as list boxes, I see that there are only 3 fruit listed - even though I have loaded 4 entries. "Tomato" is only stored once by QlikView. Internally, there are binary keys keeping track of the association between "Tomato" along with "Red" and "Green".

In a large data set with a lot of repeating values (especially text values) the fact that each field value is only stored once means that there is a huge saving in the amount of memory needed to store the information. This is why QlikView can store so much data in memory.

Let's add another table of data:

Colors:
LOAD * INLINE [
Color, Red, Green, Blue
Yellow, 255, 255, 0
Blue, 0, 0, 255
Red, 255, 0, 0
Green, 0, 255, 0
Amber, 255, 121, 0
];


I now might think that I have 2 tables of data - both of which have a field called "Color". But I don't! There is only one field called "Color" in our dataset and this field will have 5 values. Internally, the association between each of them and their associated values are maintained with binary keys. It just so happens that one of the values - "Amber" - is not associated with any of the values in the "Fruit" field.

Note that there is probably no direct association between "Fruit" and "Red". The association is indirect. However, because of the smaller data set in memory and the binary keys, the association is very fast.

Hope this is useful!

Thursday 8 April 2010

Radish Bits

A long-time news group buddy of mine, Walter Shpuntoff, has his own blog that covers QlikView. Well worth the read:

http://radishbits.wordpress.com

Watch your Security Connections

QlikView security is quite simple to implement - you just need to link the security information to the data and you can enable automatic reduction of the information that a user sees. However, you need to watch out for the data that is linking the user to the data - if it is a text field then all of the values in that text field must be uppercase (you know that all the field names in Section Access must be uppercase too, right?)

For example, this straightforward load will enable Tom, Jane or Bill to login:

Section Access;
LOAD * INLINE [
ACCESS, USERID, LNK
ADMIN, TOM, TOM
USER, JANE, JANE
USER, BILL, BILL
];
Section Application;

LOAD * INLINE [
LNK, Name, Month, Sales
TOM, Tom, Jan, 12321
JANE, Jane, Jan, 31232
BILL, Bill, Jan, 32131
TOM, Tom, Feb, 34342
JANE, Jane, Feb, 34545
BILL, Bill, Feb, 76575
TOM, Tom, Mar, 32133
JANE, Jane, Mar, 12123
BILL, Bill, Mar, 21321
];


If you enable the "Initial Data Reduction Based On Section Access" in the "Opening" tab of the Document Properties then they will be restricted to their own data.

However, if I make a slight change:

Section Access;
LOAD * INLINE [
ACCESS, USERID, LNK
ADMIN, TOM, TOM
USER, JANE, Jane
USER, BILL, BILL
];
Section Application;

LOAD * INLINE [
LNK, Name, Month, Sales
TOM, Tom, Jan, 12321
Jane, Jane, Jan, 31232
BILL, Bill, Jan, 32131
TOM, Tom, Feb, 34342
Jane, Jane, Feb, 34545
BILL, Bill, Feb, 76575
TOM, Tom, Mar, 32133
Jane, Jane, Mar, 12123
BILL, Bill, Mar, 21321
];


Now, Jane will not be able to login to this document - because the LNK field has mixed case!

Best advice is to use an integer where possible:

Section Access;
LOAD * INLINE [
ACCESS, USERID, LNK
ADMIN, TOM, 1
USER, JANE, 2
USER, BILL, 3
];
Section Application;

LOAD * INLINE [
LNK, Name, Month, Sales
1, Tom, Jan, 12321
2, Jane, Jan, 31232
3, Bill, Jan, 32131
1, Tom, Feb, 34342
2, Jane, Feb, 34545
3, Bill, Feb, 76575
1, Tom, Mar, 32133
2, Jane, Mar, 12123
3, Bill, Mar, 21321
];


If you need to link on a text field, make sure that you have all capitals - use the UPPER function if necessary.

Monday 22 February 2010

Seeing the Blindingly Obvious

Recently on QlikCommunity, I have proposed a solution however the very brilliant Oleg Troyansky proposed an even better solution: http://community.qlikview.com/forums/p/25381/96915.aspx#96915

Hi answer to the problem (excluding nulls, zero length strings, etc.) was this:

sum({$ < TextField = {"=len(trim(TextField))>0"}>} Qty )

Brilliantly fantastic! Of course, it is even more brilliant if you understand it and, alas, I just couldn't fathom it. It works. It works great. But I didn't see it.

Then my genius colleague, Karol Dorniak, spelt it out for me :- anything that you can type into a search dialog in QlikView can be used in a set! And that includes any "advanced" search syntax!!!

So this is what Oleg was doing - using advanced search. Brilliant.

When you right-click on a List Box in QlikView, there are 3 search options available - Search, Fuzzy Search and Advanced Search. When you click on a list box and just start typing, you get the default search for that List Box (usually the normal search but you can set it to Fuzzy as default on the General tab of the List Box properties). Did you know that you can delete the "**" from the default search and change it to a "~" to start doing a fuzzy search. You can also start typing an "=" followed by an expression (e.g. "=Sum(SalesAmount) > 5000") to do an advanced search - you don't need to go through the advanced search dialog.

So, you can use any of the 3 standard searches by just replacing the search condition with something else ("**", ">", "><", ">=", "~", "=...") Anything that you can type into a search can be used inside a Set!

Man, does this QlikView rock!

Monday 25 January 2010

More on Dual - Problem in Sets

I have discussed using Dual a couple of times in the past. It can be enormously useful. Sometimes it can throw you a bit - especially if you do not realise that you are using a Dual.

An example comes up in my response to the post of jhoffmann on QlikCommunity.

He was asking about doing calculations in a Set. This is, of course, eminently achievable using the dollar-expansion syntax. You do have to watch it when using a Dual value.

He was using a field called "RollingMonth". I assume, because it comes from the (old) training manual, that this was created in the Script using:

...
Date(MonthStart(date_field), 'MM/YY') As RollingMonth,
...


This syntax creates a Dual value - with the month's date (as an integer) and the text as per the format string.

This is quite fine in a Set if you use a simple calculation such as Max:

Sum({<RollingMonth = {"$(=Max(RollingMonth))"}>} Amount)

You see, Max knows about Dual and will return the correct value so that the comparison will work - it returns the Text value. It is the text value that is required in the Set comparison. If the max month is January 2010, Max will correctly return (in this case) "01/10" - even though it is alphabetically less than the "12/09" before it. Max does the Dual sort on the number and returns the text. Simple.

The problem arises with some other functions that are not so "Dual aware". For example, if we wanted to see the values from the same month last year, we might try this:

Sum({<RollingMonth = {"$(=AddMonths(Max(RollingMonth),-12))"}>} Amount)

The problem is that AddMonths takes a date value as a parameter and returns a date. It doesn't care about the text value. It uses the Dual's numeric value to do the calculation and then returns a date/numeric value. For example, it might be "01/01/2009". This will not work in our Set above (RollingMonth={"01/01/2009"}) because the text doesn't match and the Sum will return 0.

To fix this one, we just need to re-establish the Dual value:

Sum({<RollingMonth = {"$(=Date(AddMonths(Max(RollingMonth),-12),'MM/YY'))"}>} Amount)

Now the calculations will work perfectly.

More on "Sets =- or -=???"

A while ago I blogged about using =- or -= in a set.

I answered a post in QlikCommunity today (http://community.qlikview.com/forums/p/24894/95176.aspx#95176) that demonstrates the use very well.

The question was from sebagiar. He asked how to get an average out of values while ignoring 0s. For example if I have 3 values: 100, 0, 200 then he wants a result of 150, not the 100 that the straight Avg function will give.

My response is to use the following set on the field:

=Avg({< F1 -= {0}>} F1)

So, why "-=" instead of "=-"? Well, the later just tells the field to use all values except 0 - it will exclude any selections on that field. The first one is "additive" (well, I guess I should say "subtractive") and will include and selections on that field and then drop out the zeros.