  • TekAcademyLabs.com

    QlikView Tips and Tricks

    June 9, 2014

    QlikView Tips and Tricks is a document that aggregates content from the Qlik Community. The source of each piece of content is cited to avoid plagiarism.

    http://tekacademylabs.com/

  • Contents

    1. Load XML field from database (Oracle)
    2. Hash functions and Collisions
       Hash functions
       Hash functions in QlikView
       Detecting changed records
       Hash collisions
       The birthday problem
       Calculating the probability of a hash collision
    3. Autonumber Vs AutonumberHash Vs Autonumberhash128
    4. Loosely coupled tables
    5. Circular References
    6. Incremental Load
    7. Three types of Incremental Load
    8. Qlikview Associative data model
    9. The magic of variables
    10. The QlikView Cache
    11. Null handling in QlikView
    12. Text searches
    13. Automatic Number interpretation
    15. Colors in Chart
    16. Aggregations and Function Classes
    17. It's all Aggregations
    18. Dimensions and Measures
    19. Qlikview Quoteology
    20. The Crosstable Load
    21. On Boolean fields and functions
    22. The Dual() function
    23. A primer on Section Access
    24. Data reduction using multiple fields
    25. Color, state and vectors
    26. Handling multiple languages
    27. Dynamically selecting timeframes
    28. The Only() function
    29. AND and OR
    30. To JOIN or not to JOIN
    31. Canonical Date
    32. Linking to two or more dates
    33. IntervalMatch and Slowly Changing Dimension
    34. The Calculation engine
    35. Symbol Tables and Bit stuffed pointers
    36. Basics for Complex authorization
    37. Generic Keys
    38. Generate missing data in QlikView
    39. Strategies for creating key tables
    40. Recipe for a Gantt chart
    41. Relative Calendar Fields
    42. Master Calendar
    43. Year Over Year Comparisons
    44. Redefining the week numbers
    45. Preceding Load
    46. Macros are BAD
    47. Recipe for Pareto Analysis
    48. Monte Carlo Methods
    49. A myth about COUNT distinct
    50. Unbalanced n level hierarchies
    51. Hierarchies
    52. Loops in the Script
    53. IntervalMatch
    54. Counters in the Load
    55. Synthetic Keys
    56. Data types in QlikView
    57. The nature of Dual flags
    58. Don't Join, use ApplyMap instead
    59. Slowly Changing Dimension
    60. Search, but what shall you find?
    71. Cyclic or Sequential
    72. The magic of Dollar Expansion
    73. When should the Aggr function not be used
    74. Recipe for memory statistics analysis
    75. The Table Viewer
    76. Creating a Scatter Chart
    77. Fiscal Year
    78. The Master Time Table
    79. Create reference dates for intervals
    80. Fact table with mixed granularity
    81. How to populate a sparsely populated field
    82. Calculated Dimensions
    83. Finding Null
    84. Creating intervals from a single date
    85. Why don't my dates work
    86. Master table with multiple roles
    87. Rounding errors
    88. Generic Load
    89. Clarity Vs. Speed
    90. Green is the Colour
    91. Joins
    92. On format codes for numbers and date
    93. The Aggregation Scope


    1. Load XML field from database (Oracle)

    Source: http://community.qlik.com/thread/8453

    SQL SELECT extract(XML_DOCUMENTO, '/', 'xmlns="http://www.portalfiscal.inf.br/cte"').getClobVal() AS XML_DOCUMENTO
    FROM XML_DOCUMENTO_FISCAL;

  • This way the XML documents are extracted as text, and you can work with the data when loading
    from the QVD that is created.

    2. Hash functions and Collisions

    Source: http://www.qlikfix.com/2014/03/11/hash-functions-collisions/

    I'm currently updating my materials for the upcoming Masters Summit for QlikView in Chicago,
    and thought I'd share a little bit with you. In my session on data modeling, I explain how you
    can deal with various types of Slowly Changing Dimensions in QlikView. One of the techniques
    I explain is using hash functions to detect changes in (historical) records. During the
    previous events, this always led to two questions from the audience:

    What exactly are hash functions and hashes?

    And, from those who already know the answer to the first question: Aren't you worried
    about hash collisions?

    Today I will answer both questions and hopefully give you some insight into hash functions,
    their usefulness in QlikView and the risks of hash collisions.

    Hash functions

    A hash function is an algorithm that maps data of arbitrary length to data of a fixed length.

    The value returned by a hash function is like a fingerprint of the input value, and is called a

    hash value or simply hash. For example, all of the text above can be translated into the

    following MD5 hash: 357799131ceffdd43cc0fe9f52b36eeb.

    You will notice that this hash is much shorter than the original string used to generate
    it. Besides that, if only a single character in the text is changed this will lead to a
    completely different hash. This property makes hash functions very useful to compare things,
    for example files, but also historical versions of a record.
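    The two properties just described (fixed-length output, and a completely different hash after
    a one-character change) can be illustrated with a minimal Python sketch. It uses MD5 from
    Python's standard hashlib module because the text quotes an MD5 hash; the helper name md5_hex
    and the sample strings are made up for illustration.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 hash of text as a 32-character hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

short_hash = md5_hex("QlikView")
long_hash = md5_hex("QlikView " * 1000)   # a 9,000-character input
changed_hash = md5_hex("QlikVieW")        # one character differs from the first input

# All three hashes have the same length, whatever the input length,
# and the one-character change produces a different hash.
print(short_hash, long_hash, changed_hash, sep="\n")
```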

  • A hash function is deterministic, meaning that the same input value should always lead to
    the same hash value. Typically, a hash function is a one-way function: you cannot decode
    the original input value based on the hash value alone. Besides that, a good hash function is
    also uniform, which means that each hash value should have the same probability of being
    picked. The image at the top of this post illustrates a very simple hash function. Each of the
    four input values is mapped to a unique output value.

    Hash functions in QlikView

    In QlikView, the following hash functions are available:

    Hash128(): a 128 bit hash function that returns a 22 character string.

    Hash160(): a 160 bit hash function that returns a 27 character string.

    Hash256(): a 256 bit hash function that returns a 43 character string.

    The number of bits determines the output range of the function. A 128 bit hash can store

    2^128 (or, 340.282.366.920.938.000.000.000.000.000.000.000.000) different combinations.

    160 and 256 bit can store even more combinations (2^160 and 2^256 respectively).

    Besides these functions, QlikView also has the AutoNumberHash128() and AutoNumberHash256()
    functions. These functions basically take the output of the Hash128() and Hash256() functions
    and pass it through the AutoNumber() function. While I think they have a nicer syntax than the
    regular AutoNumber() (you can supply a comma-separated list of fields instead of a
    concatenated string), the usefulness of these functions eludes me.

    Detecting changed records

    Consider a QlikView application containing the following Employee table:

    Now, assume we get some new, changed data and want to quickly determine which rows

    have changed:

  • As you can see, Jim has moved to another office. How can we detect that this row has
    changed? We could compare each field in the table to each previous version of the field, but
    as we are only interested in detecting if the row has changed, using a hash function is a more
    elegant solution. Using Hash128(Name, Position, Office) we can calculate a hash value for
    each row:

    The hash value for Dwight's record hasn't changed, because the record hasn't changed either.
    Jim's changed record, however, does have a different hash value than the previous one. Once
    we've detected this we can do further processing on these records. This will be the topic of a
    future blog post. Or, if you don't want to wait for that, my data modeling session at
    the Masters Summit for QlikView.
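    The idea of comparing one hash per row instead of every field can be sketched in Python.
    This is not QlikView code: MD5 is used as a 128 bit stand-in for Hash128(), and the position
    and office values are hypothetical sample data (only the names and the fact that Jim changed
    office come from the text).

```python
import hashlib

def row_hash(*fields: str) -> str:
    """128 bit fingerprint of a row. MD5 stands in for QlikView's Hash128()."""
    # Join with a separator assumed not to occur in the data, so that
    # ("ab", "c") and ("a", "bc") do not end up as the same input string.
    joined = "\x1f".join(fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

# Hypothetical sample data: name -> (position, office)
previous = {"Dwight": ("Salesman", "Scranton"), "Jim": ("Salesman", "Scranton")}
current = {"Dwight": ("Salesman", "Scranton"), "Jim": ("Salesman", "Stamford")}

# Hashes stored from the previous load, one per row.
stored_hashes = {name: row_hash(name, *rest) for name, rest in previous.items()}

# A row counts as changed (or new) when its hash differs from the stored one.
changed = [
    name
    for name, rest in current.items()
    if row_hash(name, *rest) != stored_hashes.get(name)
]
print(changed)
```

    Only one comparison per row is needed, however many fields the row has.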

    Hash collisions

    As noted before, a hash function is an algorithm that maps data of arbitrary length to data of a

    fixed length. When different input values lead to the same output hash value, this is known as

    a hash collision. Consider the following, simplified hash function:

    In this example, both Michael and Toby get the same hash value of 2. It's easy to see what
    the problem is here: there are 5 input values and only 4 possible hash values. The input
    domain is greater than the output range.

    Now, you may think "this isn't a problem for me, the number of input values I deal with is
    much less than 2^128, let alone 2^256". It's a simple assumption to make, but also a wrong
    one, as hash collisions can occur long before the number of input values reaches the range of
    the hash function.

    The birthday problem

    Imagine you're in a room with a group of people. How many people do you think need to be
    in that room before the probability of two people sharing the same birthday reaches 50%?
    There are (excluding leap years) 365 days in a year, so maybe 185? 200?

    The answer is 23. Surprising, isn't it? If we raise the number of people to 75, the probability
    of at least two people sharing a birthday rises to more than 99.9%. This is known as the
    birthday problem.

    As this is a QlikView blog and not a math blog, I won't go through the complete solution and
    proof. Basically, instead of calculating the probability that two people in a group share a
    birthday, the trick is to calculate the probability that no one in the group shares a birthday.
    This is much easier to calculate. The result is then subtracted from 1, which gives the
    probability that at least two people in the group share a birthday.
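    The trick described above (compute the probability that nobody shares a birthday, then
    subtract it from 1) fits in a few lines of Python. This is a minimal sketch that assumes 365
    equally likely birthdays; the function name is made up for illustration.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_different = 1.0
    for i in range(people):
        # Person i+1 must avoid the i birthdays that are already taken.
        p_all_different *= (days - i) / days
    return 1.0 - p_all_different

for n in (22, 23, 75):
    print(n, round(shared_birthday_probability(n), 4))
```

    The probability crosses 50% between 22 and 23 people, which is why 23 is the answer.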

    Calculating the probability of a hash collision

    If you looked closely at the previous example, you may see that the people can be considered
    input values and that their birthdays can be considered hash values. When two people share
    the same birthday it's a hash collision! If we understand this, then we can apply the same
    logic to determine the probability of a hash collision in our data sets. To calculate the
    approximate probability of a hash collision we can use the following formula:
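    A commonly used approximation, assumed here because it reproduces the figures quoted in this
    section, is the following. The symbols are introduced for illustration: n is the number of
    input values and b is the number of bits in the hash.

```latex
p(n) \approx 1 - e^{-\frac{n(n-1)}{2 \cdot 2^{b}}} \approx \frac{n(n-1)}{2^{b+1}}
```

    The second step holds as long as the exponent is very small, which is the case for every
    example in this section.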

    I created a small Excel workbook to calculate the probability of a hash collision. Now, it's
    good to realize that Excel only works with 15 significant digits. As these probabilities are
    very small, this means that Excel is unable to calculate probabilities for very small numbers
    of input values. So, in the example below, I calculated the probability that 1 quadrillion
    (that's a 1 with 15 zeroes) input values could lead to a hash collision when using a 128 bit
    hash.

  • The probability of this happening is around 1 in 680 million. Or, to put it in perspective:

    Now, there is a small caveat with this calculation. It assumes the hash functions used in
    QlikView lead to a uniform output, meaning each value has the same probability. This may
    not be the case.

    On the other hand, we are not comparing a quadrillion records, we are only comparing two.
    When calculating the probability of a hash collision with just 2 records and a 128 bit hash
    using an online high precision calculator, the result is 2.938735877055718769922E-39 (1 in
    2^128, or roughly 1 in 340 undecillion). Or, to put it in perspective again, this is less
    likely than a single person winning the lottery, getting hit by a meteorite, getting attacked
    by a shark -and- becoming president of the USA in their lifetime.

    Switch to a 160 bit hash and the likelihood of a collision becomes lower than the combined
    probability of all events in the chart above. Now, just because it is very unlikely doesn't
    mean that it can't happen (see: Law of large numbers), but I like those odds!
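    The two probabilities quoted in this section can be checked without Excel. The Python sketch
    below assumes a uniform hash and uses the standard birthday approximation
    1 - exp(-n(n-1) / (2 * 2^bits)); the function name is made up for illustration, and this is
    not necessarily the exact formula used in the original workbook.

```python
import math

def collision_probability(n: int, bits: int) -> float:
    """Approximate probability of at least one collision among n uniform hashes."""
    exponent = n * (n - 1) / (2 * 2**bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)

p_quadrillion = collision_probability(10**15, 128)   # 1 quadrillion input values
p_two_records = collision_probability(2, 128)        # comparing just two records

print(f"1 in {1 / p_quadrillion:,.0f}")
print(p_two_records)
```

    Python integers have arbitrary precision, so n(n-1) and 2^128 are computed exactly before the
    division, which sidesteps the limited precision mentioned above.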

  • 3. Autonumber Vs AutonumberHash Vs Autonumberhash128

    As I understand it, autonumber stores the expression value and assigns it a unique integer
    value, whereas autonumberhash128 stores just the hash value (in 128 bits) of the corresponding
    expression value. Therefore, autonumberhash128 should be more efficient in data storage
    (particularly when the expression value is large), and so the document size is reduced.

    Other notes:

    Having our new baby (AKA the mini QlikView addict) around has meant very little time for

    anything, let alone blogging. So in order to ensure I at least manage the odd post or 2 I

    thought it would be good to start a new series of short posts on different qlikview functions

    and their uses. To kick things off I have decided to take a look at the autonumber() function

    and the closely related autonumberhash128() and autonumberhash256(). All 3 functions do a

    very similar thing so let's look at autonumber() first and then consider how the other 2

    functions differ.

    Autonumber() can be considered a lookup function. It takes a passed expression and looks up

    the value in a lookup table. If the expression value isn't found then it is added to the table and

    assigned an integer value which is returned. If the expression value is found then it returns

    the integer value that is assigned against it. Simply put, autonumber() converts each unique

    expression value into a unique integer value.

    Autonumber() is only useful within the QlikView script and has the following syntax:

    autonumber(expression [, index])

    The passed expression can be any string, numeric value or most commonly a field within a

    loaded table. The passed index is optional and can again be any string or numeric value. For

    each distinct value within the passed index, QlikView will create a separate lookup table and

    so the same passed expression values will result in a different returned integer if a different

    index is specified.
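    The lookup behaviour described above can be modelled in a few lines of Python. This is a toy
    model of the description, not QlikView's implementation; the class name and the sample key
    values are made up for illustration. Numbering starts at 1 here.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional

class AutoNumberSketch:
    """Toy model of the lookup behaviour described in the text (not QlikView code)."""

    def __init__(self) -> None:
        # One lookup table per distinct index value.
        self._tables: Dict[Optional[Hashable], Dict[Hashable, int]] = defaultdict(dict)

    def autonumber(self, value: Hashable, index: Optional[Hashable] = None) -> int:
        table = self._tables[index]
        if value not in table:
            # Unknown value: add it and assign the next integer, counting from 1.
            table[value] = len(table) + 1
        # Known value: return the integer already assigned to it.
        return table[value]

an = AutoNumberSketch()
keys = [an.autonumber(v) for v in ["DE-1001", "SE-2002", "DE-1001"]]
other = an.autonumber("SE-2002", index="Orders")   # separate lookup table
print(keys, other)
```

    The same value gets the same integer within one index, while a different index starts its own
    numbering.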

    So how exactly are the 3 autonumber functions different? Autonumber() stores the expression
    value in its lookup table whereas autonumberhash128() stores just the 128bit hash value of
    the expression value. I'm sure you can guess, therefore, that autonumberhash256() stores the
    256bit hash value of the expression value.

    Why on earth would I want to use any of these functions? Well, the answer is quite simply:
    for efficiency. Key fields between two or more tables in QlikView are most efficient if they
    contain only consecutive integer values starting from 0. All 3 of the autonumber functions
    allow you to convert any data value and type into a unique integer value, and so using them
    for key fields allows you to maintain optimum efficiency within your data model.

    A final word of warning. All 3 of the autonumber functions have one pitfall: the lookup
    table(s) exist only whilst the current script execution is active. After the script completes,
    the lookup table is destroyed and so the same expression value may be assigned different
    integer values in different script executions. This means that the autonumber functions can't
    be used for key fields within incremental loads.
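    Why this breaks incremental loads can be seen in a small Python sketch, assuming integers are
    handed out in order of first appearance within a single run. The function name and the sample
    values are made up for illustration.

```python
def number_keys(values):
    """Assign 1, 2, 3, ... in order of first appearance, as a single script run would."""
    lookup = {}
    for value in values:
        lookup.setdefault(value, len(lookup) + 1)
    return lookup

first_run = number_keys(["Albert", "Herbert"])
second_run = number_keys(["Herbert", "Albert"])   # same values, different load order

# The same value ends up with a different integer in each run.
print(first_run["Albert"], second_run["Albert"])
```

    Keys stored in a QVD by an earlier run would therefore not line up with keys generated by a
    later run.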

    2 comments:

    1. Mike Taylor, 8 March 2014 11:13

    Nice, simple explanation. Can you clarify how the autonumber function deals with null values?
    I had some issues recently and resorted back to using the original values where I had one
    table with nulls (which were assigned an autonumber of 0) and I was trying to join to another
    table that had no null values.

    2. Matthew Fryer, 24 April 2014 15:00

    Hi Mike

    First of all, how autonumber() will behave depends on if it is a true null or a zero length

    string. For true nulls, the result of the autonumber() will also be null. You can prove this by

    running the following script:

    LOAD autonumber(null()) AS field AUTOGENERATE 1;

    By adding "field" to a list box you will see no records. Being a true null and the fact that null

    values don't really exist in QlikView (they are simply the absence of a value) means that

    when using it for a key field, QlikView will not associate a null in one table to a null in the

    other.

    For a zero length string, autonumber() will assign it a value as it would any other passed

    value. The number assigned to the zero length string will depend on the order it appears in the

    values that are passed to autonumber(). You can see this by running the following script:

    LOAD autonumber('') AS field AUTOGENERATE 1;

    The result will be a single value in "field" of "1". Autonumber() is 1 indexed and so I'm not

    sure where you are getting your 0 value back.

    4. Loosely coupled tables

    Source: http://community.qlik.com/thread/104608

    NOTES:

    Loosely coupled tables are created automatically when a data model (three or more tables) that
    includes circular references is loaded into QlikView. QlikView does this to prevent the
    circular references from creating a loop in its internal logic. These loosely coupled tables
    need to be handled in order to visualize data in a way that is expected and understandable.

    See the article on Circular References.

    Any table can also be made loosely coupled interactively from the Document Properties dialog or
    via macros. Additionally, it is possible to declare loosely coupled tables explicitly in the
    script via the Loosen Table statement.

  • The normal QlikView associative logic is disconnected internally for loosely coupled tables.
    This means that selections in one field do not associate through to the other fields in the
    table. This is very useful for avoiding circular references in the data structure in various
    scenarios. For more examples of how this feature can be used, please refer to the QlikView
    Reference Manual, "Intentionally Creating Loosely Coupled Tables".

    One or more QlikView internal data tables can be explicitly declared loosely coupled during script

    execution by using a Loosen Table statement.

    The use of one or more Loosen Table statements in the script will make QlikView disregard any setting

    of tables as loosely coupled made before the script execution.

    The syntax is:

    Loosen Table[s] tablename [ , tablename2 ...]

    Either form, Loosen Table or Loosen Tables, can be used.

    Example:

    Table1:

    Select * from Trans;

    Loosen table Table1;

    Note!

    Should QlikView find circular references in the data structure which cannot be broken by tables
    declared loosely coupled interactively or explicitly in the script, one or more additional
    tables will be forced loosely coupled until no circular references remain. When this happens,
    the Loop Warning dialog gives a warning.

    5. Circular References

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/06/25/circular-references

    There are two Swedish car brands, Volvo and SAAB. Or, at least, there used to be... SAAB was
    made in Trollhättan and Volvo was and still is made in Gothenburg.

    Two fictional friends, Albert and Herbert, live in Trollhättan and Gothenburg, respectively.
    Albert drives a Volvo and Herbert drives a SAAB.

    If the above information is stored in a tabular form, you get the following three tables:

    Logically, these tables form a circular reference: The first two tables are linked through
    City; the next two through Person; the last and the first through Car. Further, the data forms
    an anomaly: Volvo implies Gothenburg; Gothenburg implies Herbert; and Herbert implies SAAB.
    Hence, Volvo implies SAAB, which doesn't make sense. This means that you have ambiguous results
    from the logical inference: different results depending on whether you evaluate clockwise or
    counterclockwise.
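    The anomaly can be reproduced by following the links by hand. The Python sketch below stores
    the three relationships from the text as plain dictionaries (the variable and function names
    are made up for illustration) and walks once around the loop.

```python
# The three tables from the text, each stored as a simple lookup.
made_in = {"Volvo": "Gothenburg", "SAAB": "Trollhättan"}        # Car -> City
lives_in = {"Albert": "Trollhättan", "Herbert": "Gothenburg"}   # Person -> City
drives = {"Albert": "Volvo", "Herbert": "SAAB"}                 # Person -> Car

def follow_loop(car: str) -> str:
    """Walk Car -> City -> Person -> Car and return the car we end up at."""
    city = made_in[car]
    person = next(p for p, c in lives_in.items() if c == city)
    return drives[person]

print("Volvo ->", follow_loop("Volvo"))
```

    Starting from one car and ending at the other is exactly the ambiguity that a circular
    reference introduces.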

  • If you load these tables into QlikView, the circular reference will be identified and you will get the

    following data model:

    To avoid ambiguous results, QlikView marks one of the tables as loosely coupled, which means that

    the logical inference cannot propagate through this table. In the document properties you can decide

    which table to use as the loosely coupled table. You will get different results from the logical inference

    depending on which you choose.

    So what did I do wrong? Why did I get a circular reference?

    It is not always obvious why they occur, but when I encounter circular references I always look for

    fields that are used in several different roles at the same time. One obvious example is if you have a

    table listing external organizations and this table is used in several roles: as Customers, as Suppliers

    and as Shippers. If you load the table only once and link to all three foreign keys, you will most likely

    get a circular reference. You need to break the circular reference and the solution is of course to load

    the table several times, once for each role.

    In the above data model you have a similar case. You can think of Car as "Car produced in the
    city" or "Car that our friend drives". And you can think of City as "City where car is
    produced" or "City where our friend lives". Again, you should break the circular reference by
    loading a table twice. One possible solution is the following:

    In real life circular references are not as obvious as this one. I once encountered a data model with

    many tables where I at first could not figure out what to do, but after some analyzing, the problem

    boiled down to the interaction between three fields: Customers, Machines and Devices. A customer

    had bought one or several machines; a device could be connected to some of the machine types

    but not to all; and a customer had bought some devices. Hence, the device field could have two roles:

    Devices that the customer actually had bought; and devices that would fit the machine that the customer

    had bought, i.e. devices that the customer potentially could buy. Two roles. The solution was to load

    the device table twice using different names.

    Bottom line: Avoid circular references. But you probably already knew that.

  • The post assumes that the reader knows what the Logical Inference engine does.

    The Logical Inference engine is the core of QlikView. It evaluates which field values are possible,

    given the selection. Basically it first evaluates which records are possible, and then the result of the

    evaluation "propagates" into the next table via the possible values of the key field, and then the next

    table is evaluated. It is this propagation that is disabled by the loosely coupled table.

    Read more about Logical Inference

    under http://community.qlik.com/blogs/qlikviewdesignblog/2013/07/15/logical-inference-and-aggregations

    5. Logical Inference and Aggregations

    Every time you click, QlikView recalculates everything.

    Everything.

    A new selection implies a new situation: Other field values than before are possible; other

    summations need to be made; the charts and the KPIs get other values than before. The state vectors

    and the objects are invalidated. Everything needs to be recalculated since this is what the user demands.

    Well, there is of course also a cache, so that QlikView doesn't have to recalculate something which

    has been calculated before. So it isn't quite true that everything is recalculated: If a calculation has

    been made before, the result is simply fetched from the cache. But it is true that nothing is

    pre-calculated. There is no need for that. Everything can be done in real-time.

    QlikView is an on-demand calculation engine.

    In principle, there are two steps in the recalculation of data: The logical inference in the data

    model, and the calculations of all objects, including sheet labels and alerts.

    The logical inference is done first. The goal is to figure out which field values in the symbol tables are

    possible and which records in the data tables are possible, given the new selection. There is no

    number crunching involved - it is a purely logical process. The result is stored in the state vectors.

    Think of it as if the selection propagates from one table in the data model to all other tables. Table by

    table is evaluated and QlikView figures out which values and records are possible, and which are

    excluded.

  • When the logical inference is done, QlikView starts to evaluate all exposed objects. List boxes and

    dimensions in charts must be populated and sorted. All expressions in charts, in text boxes, in

    labels, in alerts must be calculated. Objects that are on other sheets, minimized or hidden, are

    however not calculated.

    The calculations are always aggregations based on the data records that have been marked as

    possible by the logical inference engine. I.e., the objects do not persist any data on their own.

    The calculation phase is usually the phase that takes time; often over 90% of the response time is

    due to calculations. The calculations are asynchronous and multi-threaded on several levels: First of

    all, every object is calculated in its own thread. Secondly, in the 64-bit version, many aggregations,

    e.g. Sum(), are calculated using several threads, so that a sum in one single object can be calculated

    quickly using several CPUs.

    Finally, when an object has been calculated, it is rendered. Since the calculation is asynchronous and

    multi-threaded, some objects are rendered long before other objects are ready.

    And when an object has been rendered, you can click again. And everything is repeated.

    HIC

    If you want to read more about QlikView internals, see "Symbol Tables and Bit-Stuffed

    Pointers" and "Colors, states and state vectors".

    6. Incremental Load Source: http://www.resultdata.com/qlikview-incremental-loads-and-qvds/

    QlikView, by design, includes many new and innovative technologies such as the associative data

    model and highly effective data compression algorithms which make possible its state-of-the-art in-

    memory technology. QlikView allows us to load and keep all the data in memory for evaluation,

  • analysis and reporting. If you've worked with QlikView you understand the value of this approach,

    but it sometimes comes with a price. Very large data sets can often take a long time to load, bogging

    down the performance of your QlikView documents over time. This month we will take a look at a

    way to minimize the load time of very large data sets and increase your performance using

    incremental data loads.

    What is an Incremental Load?

    Incremental load is a term that describes loading only new or changed records from the database. It is

    a common task with databases and can greatly reduce the time needed to load data into your

    QlikView application. The bulk of the data needed will already be available within your application

    and only the new or changed data will be necessary to complete the picture. Incremental loads are

    possible through the use of .QVD files.

    What is a QVD file?

    A QVD file is a native QlikView file format optimized and compacted for speed when reading data

    from within a QlikView load script. Reading data from a QVD file can be 10-100 times faster than

    reading records directly from other data sources. A QVD file contains a single table of data from a

    QlikView application. While that may seem somewhat restricting, remember that the table can be the

    result of a concatenation or a join, so the structure you create in the application can greatly increase

    its use. You can also include all calculations and manipulations in the script that creates your QVD file,

    which further increases load performance at run time.

    How could you use a QVD file?

    There are several uses for a QVD file and in many cases more than one will be applicable at the same

    time.

    Decreasing Load Time

    Decreasing Database Server Loads

    Joining Data from Different QlikView Applications

    Incremental Data Loading

    Decreasing Load Time

    By saving data to and loading large amounts of data from a QVD file you eliminate most of the time

    used during load by using an optimized and compressed data file. By scripting all of your

    concatenation, joining, calculations and data manipulations in the file used to create the QVD you

    will increase your performance even more.

    Decreasing Database Server Loads

    By isolating your large data volumes and loading them from QVD files you will reduce the processing

    on the database server at load time and dramatically reduce the load time of your scripts as well.

  • You only need to provide data since the last load of your QVD to your QlikView document during

    refresh. The fresher the data in your QVD the less data needed from the database server.

    Joining Data from Different QlikView Applications

    Once you've formatted and manipulated your data and have it working just the way you want, you

    can save that table to a QVD and use the same vetted structure in other QlikView documents. While

    it is true that you could copy and paste your load script into another QlikView document, by using a

    QVD file instead you have the added advantage of dramatically faster loading. As your scripts

    become more and more complex based on the business questions asked by the users, you can

    standardize your scripts and maintain the logic in one place. This increases your ability to create a

    single version of the truth.

    Incremental Data Loading

    By adding business logic to the creation of your QVD files you can extend that logic to all of the

    QlikView Applications that use that data, creating a more dynamic loading scenario. Let's say you

    schedule your QVD loads monthly, after the close of business for the previous month. Your

    application now only needs to load data for the current period directly from the database and then

    load all previous periods from your QVD file.

    Incremental Load Scenarios

    The structure of your data, available evaluation fields and how you choose to store your data in

    QVDs will determine your particular scenario but here are a few examples to get you started

    thinking.

    Daily Incremental Reloads of a QVD

    This scenario requires a date field in data that identifies the creation or update time of all records.

    We can retrieve the last modified/created date from the existing QVD, use that date to retrieve new

    records from the database and then concatenate the previously saved records from the QVD file to

    our current data and save the QVD file again.

    1. Load the latest (max) modified date from your previously saved QVD. If you have not yet

    created the QVD then use the current date.

    2. Select only the records from the database where the last modified date is between the last

    modified date you retrieved in step one and right now.

    3. Concatenate all the data from the QVD file where there is no match in the new data table on the

    unique ID field. This allows QlikView to only add the records that do not exist and accounts

    for updated records as well as new records.

    4. Save the resulting data set by overwriting the QVD file with all of the records in the new data

    set.
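    The four steps above can be sketched as the script below. It assumes that Data.qvd already exists, that the table has the fields PrimaryKey, A, B, C and ModifyDate (names borrowed from the examples later in this document), and that the database accepts a timestamp literal written as 'YYYY-MM-DD hh:mm:ss'. The upper bound "right now" from step 2 is left out for brevity.

```qlikview
// Step 1: find the latest modification time already stored in the QVD.
LastLoad:
LOAD Max(ModifyDate) AS MaxModifyDate
FROM Data.qvd (qvd);

// Format the value as text so it can be placed in the SQL statement.
LET vLastLoad = Timestamp(Peek('MaxModifyDate', 0, 'LastLoad'), 'YYYY-MM-DD hh:mm:ss');
DROP TABLE LastLoad;

// Step 2: fetch only rows created or changed since that time.
// The literal format is an assumption; adjust it to your database.
Data:
SQL SELECT PrimaryKey, A, B, C, ModifyDate
FROM DB_Table
WHERE ModifyDate >= '$(vLastLoad)';

// Step 3: append the stored rows whose key was not just loaded.
CONCATENATE (Data)
LOAD PrimaryKey, A, B, C, ModifyDate
FROM Data.qvd (qvd)
WHERE NOT Exists(PrimaryKey);

// Step 4: overwrite the QVD with the combined table.
STORE Data INTO Data.qvd (qvd);
```

    Using >= rather than > re-reads the rows that share the last stored timestamp; the Exists() test in step 3 keeps them from being duplicated.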

  • This scenario will force QlikView into Fast mode instead of Super-Fast mode but will still be

    significantly faster than loading all data from the database. You may also need to extend this logic to

    your production QlikView Application if it needs to retrieve data since the last daily load.

    Daily/Monthly/Yearly Stacked QVDs

    At close of each Day, Month or Year (Month and/or Year being the most popular) you will create a

    QVD containing that period's data. Each QVD will be named so that the data in it is clearly

    identified by the name (e.g. 3-1-2010.qvd, 3-2010.qvd or 2010.qvd). You may wish to use a

    combination approach such as saving data from previous year in a yearly QVD and data within the

    current year in a monthly QVD. This will give you the option of loading only the appropriate data into

    your QlikView Applications. Depending on the target audience for your application you may need

    different combinations of data. One application might require all available data while others may only

    require a specific number of years past. A more analytic application may only require yearly and/or

    monthly data while others will require up-to-the-minute data. This approach will give you flexibility

    for all of those scenarios.

    Another advantage of this approach is that the daily, monthly or yearly data can be loaded in Super-

    Fast mode since no date evaluation is needed. Only the data needed to supplement the application

    since the last saved QVD file, if any, will be read directly from the database.
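    A minimal sketch of reading stacked QVDs, assuming one yearly file and several monthly files. The file names are hypothetical, and all files are assumed to contain exactly the same set of fields.

```qlikview
// Hypothetical file names: Sales_2009.qvd holds a closed year,
// Sales_2010-01.qvd, Sales_2010-02.qvd, ... hold closed months.
// Because every file has the same fields, QlikView appends each load
// to the same table automatically.
FOR EACH vFile IN FileList('Sales_2009.qvd'), FileList('Sales_2010-*.qvd')
    Sales:
    LOAD * FROM [$(vFile)] (qvd);
NEXT vFile
```

    Each LOAD has no WHERE clause or field transformation, which is what lets the QVD be read without any date evaluation. An application that only needs the current year would simply drop the yearly file from the list.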

    7. Three types of Incremental Load Source: http://www.resultdata.com/three-types-of-qlikview-incremental-loads/

    Large transactional tables can be significantly time consuming in a reload. Taking advantage of

    QlikView's incremental load logic can reduce that time considerably. An incremental load is a process

    of simply appending new records or updating specific records in an existing QVD. There are three key

    scripting options available for incremental loads.

    Insert Only

    Insert & Update

    Insert, Update, & Delete

    For a detailed review of QVDs and the concept of incremental loads, please review the following

    article

    Incremental Loads and QVDs

    SET UP

    Each of these three scenarios is designed to run once an INITIAL LOAD has occurred. An initial load is

    a task that creates the source QVDs. These QVDs from then on can be optimized to reload with one

    of the following incremental load scripts. Since an incremental load is designed to pull only new or

  • altered data, a source QVD is needed to hold all non-modified information and must exist before an

    incremental load can run.

    INSERT ONLY

    For an INSERT ONLY scenario, there is the assumption that new data will not create duplicate

    records. There is no set way to determine NEW data, so this must be reviewed case by case. Once a

    method for finding new records is determined, the reload process is a simple three step process.

    1. Load all NEW data from the data source

    2. Concatenate this data with a load of all data from the QVD file

    3. Store the entire table out to the QVD file

    As long as the QVD is named the same, this will overwrite the previous QVD so the process can

    repeat for the next reload.
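    A sketch of the three INSERT ONLY steps, written in the same style as the script examples further down. The test used to find new records (a CreateDate field compared with the variable vDate) is an assumption; as noted above, the right test has to be decided case by case.

```qlikview
// Step 1: load only the NEW rows from the data source.
// CreateDate and vDate are hypothetical; use whatever identifies new rows.
Data:
SQL SELECT
    PrimaryKey,
    A,
    B,
    C
FROM DB_Table
WHERE CreateDate >= $(vDate);

// Step 2: append everything already stored in the QVD.
CONCATENATE (Data)
LOAD
    PrimaryKey,
    A,
    B,
    C
FROM Data.qvd (qvd);

// Step 3: write the combined table back, replacing the old QVD.
STORE Data INTO Data.qvd (qvd);
```

    There is no WHERE clause on the QVD load here, because this scenario assumes that new data never duplicates stored records.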

    INSERT & UPDATE

    The INSERT & UPDATE scenario also takes new data from the source but it also pulls in updated

    records. Additional precautions need to be taken in order to avoid duplicate records. During the load

    from the QVD, exclude records where there is a match on the primary key. This will ensure that the

    updated records will not be duplicated.

    1. Load all NEW and UPDATED data from the data source

  • 2. Concatenate this data with a load of only the missing records from the QVD file

    3. Store the entire table out to the QVD file

    Example of Script

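    // New and updated rows from the database. $(vDate) is expanded as text
    // before the statement is sent, so it must hold a value the database accepts.
    Data:
    SQL SELECT
        PrimaryKey,
        A,
        B,
        C
    FROM DB_Table
    WHERE ModifyDate >= $(vDate);

    // Rows from the QVD whose key was not just loaded from the database.
    CONCATENATE (Data)
    LOAD
        PrimaryKey,
        A,
        B,
        C
    FROM Data.qvd (qvd)
    WHERE NOT Exists(PrimaryKey);

    // Overwrite the QVD with the combined table.
    STORE Data INTO Data.qvd (qvd);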

    Using the Exists() function keeps the QVD from loading the obsolete records since the UPDATED

    version is currently in memory.

    INSERT, UPDATE, & DELETE

    An INSERT, UPDATE, & DELETE script is very similar to the load process of the INSERT & UPDATE,

    however there is an additional step needed to remove deleted records. The most effective method is

    to load all the PrimaryKeys from the source and then apply an inner join. This will achieve the delete

    process.

    1. Load all NEW and UPDATED data from the data source

    2. Concatenate this data with a load of only the missing records from the QVD file

    3. Inner join all PrimaryKeys from the data source

    4. Store the entire table out to the QVD file

    Example of Script

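    // New and updated rows from the database.
    Data:
    SQL SELECT
        PrimaryKey,
        A,
        B,
        C
    FROM DB_Table
    WHERE ModifyDate >= $(vDate);

    // Rows from the QVD whose key was not just loaded from the database.
    CONCATENATE (Data)
    LOAD
        PrimaryKey,
        A,
        B,
        C
    FROM Data.qvd (qvd)
    WHERE NOT Exists(PrimaryKey);

    // Keep only rows whose key still exists in the source.
    // Rows deleted in the database drop out of the table here.
    INNER JOIN (Data)
    SQL SELECT
        PrimaryKey
    FROM DB_Table;

    // Overwrite the QVD with the result.
    STORE Data INTO Data.qvd (qvd);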

    Very large data sets can take a long time to load and greatly affect the performance of your QlikView

    documents over time. Implementing QVD optimization with incremental loads lets you

    perform faster loads that use fewer system resources.

    8. QlikView Associative data model Source: http://community.qlik.com/blogs/theqlikviewblog/2010/08/16/qlikview-is-associative-to-its-very-core

    One thing we're trying to do a better job of at QlikTech is communicating the associative nature of

    QlikView. I've seen lots of conversations taking place online (for example on the QlikCommunity

    site as well as Donald Farmer's blog and Curt Monash's blog). So I tapped into the brains of Dan

    English, our Global Product Manager for OEM and Integration for his explanation, and I'm sharing it

    with you here.

    First and foremost we should clear up the semantics. If one uses the Wikipedia definition of an

    associative model of data then it is correct to say that QlikView does not store data in an associative

    format. However, QlikTech uses the word associative in an entirely different sense. When we say

    that QlikView is associative we mean that at a data engine level QlikView creates and maintains real-

    time associations among all result sets, creating a cohesive and intuitive view of business

    information.

  • We describe QlikView's architecture as associative to differentiate it from query-based business

    intelligence tools. With all query-based BI tools (whether ROLAP, MOLAP, or HOLAP) each individual

    result set is returned from the underlying data engine without any inherent association back to the

    data repository as a whole, or to any other query result set (see figure below).

    When we say QlikView is associative, we aren't talking just about QlikView's intuitive user

    interface, the UI that utilizes green for selected data, white for associated data, and gray for

    unassociated data to illustrate relationships hidden in business information. (See this QlikView blog

    post.) We're talking about a revolution in data engine architecture, in that:

    Every data point in a QlikView document shares a common selection state. With QlikView's

    data engine, each and every discrete data point in a given QlikView document, whether it is

    part of an aggregated result set (e.g., straight table, pivot table, chart, etc.) or unaggregated

    data (e.g., data in a list box), shares a common selection state (e.g., universe of included and

    excluded data).

    All data points are constantly updated based on the selection state. All the data points in a

    QlikView document are continually and instantaneously updated based on changes the user

    makes to the selection state. The associations among result sets are maintained 100% by the

    underlying data engine, which is built on a column-store, in-memory architecture.

    QlikView's associative architecture delivers unprecedented flexibility

    Why is QlikView's associative engine so important? One might argue that a real-time query tool gives

    you the capability to answer any question you want. After all, within the limits of the tool's user

    interface, you can define any result set you want, right? We maintain that the answers to real-world

    business questions are almost never exposed in the result set of a single query. Almost always the

    answer can only be extracted by examining the relationships of two or more associated result sets,

    often aggregated along completely different dimensionality.

    The bottom line: QlikView represents a fundamentally different class of analytic engine. All

    associations are based on the data model set up when the QlikView document is developed. Those

    associations are used to update every single result set in real time each and every time the user

    changes the selection state. This is the source of QlikView's associative magic.

    9. The magic of variables Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/11/04/the-magic-of-variables

    Variables can be used in many ways in QlikView. They can have static values or they can be

    calculated. But when are they calculated? At script run-time or when the user clicks? And how

    should they be called? With or without dollar expansion?

  • One basic way to assign a value to a variable is to use a Let statement in the script:

    Let vToday = Num(Today()) ;

    This will calculate the expression and assign it to the variable when the script is run. This is exactly

    what you want if you want to use a variable as a numeric parameter in your expressions.

    But if you want the expression to be evaluated at a later stage, e.g. every time the user clicks, what

    should you do then? One way is to store the expression as a string in the variable, using either the

    Set or the Let statement or by defining it in the Document Properties -> Variables:

    Set vSales = Sum(Sales) ;

    Let vSales = 'Sum(Sales)' ;

    In neither case will the expression be calculated. The variable will contain the string "Sum(Sales)",

    which subsequently can be used in an expression using a dollar expansion: $(vSales).

    With a dollar expansion, QlikView will substitute the $(vSales) with Sum(Sales) before the

    expression with the dollar expansion is evaluated. Some of you will recognize this as an old style

    assembler macro expansion. The subsequent calculation will be made based on the evaluation of the

    resulting expression. Note the two steps: (1) Variable expansion; and (2) Expression evaluation.

  • In the chart above, you can see the result of using a normal variable reference (the first expression)

    or using a dollar expansion (the second expression). In the second expression, the variable is

    expanded and the numbers are calculated correctly.

    But this is just the beginning...

    It is also possible to calculate the variable value, i.e. determine how it should be expanded, by using

    an initial equal sign in the variable definition.

    Let vSales2 = '=Sum(Sales)';

    In this case, the variable value is calculated after each click, whereupon the dollar expansion in the

    chart expression is made, and finally the expression is evaluated. This means that the evaluation of

    Sum(Sales) is done before the variable expansion. Note the three steps: (1) Variable calculation; (2)

    Variable expansion; and (3) Expression evaluation.

    The table below summarizes the three methods.
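    As a recap built only from the statements and explanations above (not a reproduction of the original table), the three methods look like this side by side:

```qlikview
// (1) Calculated once, when the script runs. The variable holds a number.
Let vToday = Num(Today());

// (2) Nothing is calculated. The variable holds the text Sum(Sales).
//     $(vSales) in a chart expression is replaced by that text, and the
//     resulting expression is then evaluated.
Set vSales = Sum(Sales);

// (3) The variable holds the text =Sum(Sales). Because of the leading
//     equal sign, its value is calculated after each click, and $(vSales2)
//     is replaced by the calculated result before the chart expression
//     is evaluated.
Let vSales2 = '=Sum(Sales)';
```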

  • With the above, you can do almost magical things. You can for instance make conditional

    calculations that depend on e.g. selections, client platform or user.

    Example:

    Create a field [Field to Aggregate] containing the names of two other numeric fields:

    'Quantity' and 'Sales'

    Create a variable vConditionalAggregationField = '=Only([Field to Aggregate])'

    Create a chart with an expression = Sum($(vConditionalAggregationField))

    The calculation in a chart will now toggle between Sum(Quantity) and Sum(Sales) depending on your

    selection.
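    A minimal sketch of the example above in script form. The table name AggregationFields is hypothetical; the field and variable names are the ones used in the example.

```qlikview
// A small table, not linked to the rest of the data model, that holds
// the names of the two numeric fields.
AggregationFields:
LOAD * INLINE [
Field to Aggregate
Quantity
Sales
];

// The leading equal sign is stored as part of the variable text, so the
// variable is calculated after each click.
LET vConditionalAggregationField = '=Only([Field to Aggregate])';

// Chart expression (entered in the chart, not in the script):
//   Sum($(vConditionalAggregationField))
```

    Only() returns NULL unless exactly one value is possible, so the expansion has no field name to insert when neither or both values are selected. One way to guard against that, for example, is to enable "Always one selected value" on the list box for [Field to Aggregate].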

  • The use of variables is an extremely powerful tool that you can use to create flexible applications.

    Use it but with caution. Too much magic behind the curtains can be confusing.

    10. The QlikView Cache Source: http://community.qlik.com/blogs/qlikviewdesignblog/2014/04/14/the-qlikview-cache

    QlikView has a very efficient, patented caching algorithm that effectively eliminates the calculation

    time for calculations that have been made before. In other words, if you use the back button in the

    toolbar, or if you happen to make a selection that you have made before, you usually get the result

    immediately. No calculation is necessary.

    But how does it work? What is used as lookup ID?

    For each combination of data set and selection, or data sub-set and expression, QlikView calculates a

    digital fingerprint that identifies the context. This is used as lookup ID and stored in the cache

    together with the result of the calculation.

  • Here "calculation" means both the Logical Inference and Chart calculation - or in fact, any expression

    anywhere. This means that both intermediate and final results of a selection are stored.

    There are some peculiarities you need to know about the cache:

    The cache is global. It is used for all users and all documents. A cache entry does not belong

    to one specific document or one user only. So, if a user makes a selection that another user

    already has made, the cache is used. And if you have the same data in two different apps,

    one single cache entry can be used for both documents.

    Memory is not returned, when the document is unloaded. Cache entries will usually not be

    purged until the RAM usage is close to or has reached the lower working set limit. QlikView

    will then purge some entries and re-use the memory for other cache entries. This behavior

    sometimes makes people believe there is a memory leak in the product. But have no fear: it

    should be this way. So, you do not need to restart the service to clear the cache.

    The oldest cache entries are not purged first. Instead several factors are used to calculate a

    priority for each cache entry; factors like RAM usage, cost to calculate it again and time since

    the most recent usage. Entries with a combined low priority will be purged when needed.

    Hence, an entry that is cheap to calculate again will easily be purged, also if it recently was

    used. And another value that is expensive to recalculate or just uses a small amount of RAM

    will be kept for a much longer time.

    The cache is not cleared when running macros, contrary to what I have seen some people claim.

  • You need to write your expression exactly right. If the same expression is used in several

    places, it should be written exactly the same way (capitalization, same number of spaces,

    etc.); otherwise it will not be considered to be the same expression. If you do, there should

    be no big performance difference between repeating the formula, referring to a different

    expression using the label of the expression or using the Column() function.
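    One possible way to keep an expression character-for-character identical everywhere is to store its text once in a variable and expand it wherever it is needed. This is a suggestion that follows from the point above, not something stated in the original post; the variable and field names are hypothetical.

```qlikview
// Script: store the expression text once. Nothing is calculated here.
SET eNetSales = Sum(Sales - Discount);

// Chart expressions (entered in the objects, not in the script).
// Every use expands to exactly the same text, so capitalization and
// spacing cannot drift apart between objects:
//   $(eNetSales)
//   $(eNetSales) / Count(DISTINCT OrderID)
```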

    The cache efficiently speeds up QlikView. Basically it is a way to trade memory against CPU-time: If

    you put more memory in your server, you will be able to re-use more calculations and thus use less

    CPU-time.

    11. Null handling in QlikView Source: http://community.qlik.com/docs/DOC-3155

    Refer: Null and Nothing.pdf

    12. Text searches Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/10/16/text-searches

    One of the strengths of QlikView is its search engine. With it, you can find pieces of information in a

    fraction of a second and select the found field values. The response is immediate, which is necessary

    for the user experience. Without it, you would easily get an empty result set without understanding

    why.

    Search strings can be made in many different ways, and QlikView will respond differently depending

    on how the search string is defined. Normally you just enter a text, and QlikView will match this

    against the beginning of the words in the field values. If several strings are entered, QlikView will

    return the union of the matches of each of the strings.

  • But if you instead use a wildcard in your search string, the evaluation of the search string will be

    made in a different way: the entire search string with the wild card will be matched against the

    entire field value, sometimes yielding more matches, sometimes fewer.

    If you want to create more complex search strings (and e.g. store them in actions or bookmarks) you

    can do this too. Just use (, |, & and double quotes to define the syntax.

  • In all the above cases, the search and the selection are made in one and the same field. But

    sometimes you want to make the selection in one field, but make the search in another. This can be

    done using the associated search, which is an indirect search method. Start with the field where you

    want to make the selection, enter the search string, and click on the small chevron to the right. You

    will then get a list of other fields containing this search string. By clicking the desired match, you will

    narrow down the number of matches in the primary list to show just the relevant values. You can

    then make your selection by hitting Enter.

    Further, did you know that

    In the user preferences and in the list box properties, you can define how a default search

    string should be created, but this does not affect how it is evaluated, only how it is created.

    Once created, you can add or remove wild cards as you please.

    When you make a search and save the resulting selection in a bookmark, the bookmark will

    contain the search string and not the list of selected values. When the bookmark is applied,

    it will perform the search and select the found values. If data has changed, this may imply a

    different search result than before.

  • You can use the same search string in many places: In list boxes, in Set analysis, in the

    Advanced search dialog, in actions and in bookmarks.
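    As an illustration of a search string inside Set analysis, the chart expression below uses a wildcard search. Customer and Sales are hypothetical field names.

```qlikview
// Chart expression. The double-quoted text is a search string: the
// wildcards make it match against the entire field value, so this sums
// Sales for customers whose name contains "food" anywhere.
Sum({$<Customer = {"*food*"}>} Sales)
```

    Typing *food* into the Customer list box would find the same values.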

    Bottom line: The search string is a powerful tool that helps you find the values you want. Use it.

    13. Automatic Number interpretation Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/07/07/automatic-number-interpretation

    I have in several previous blog posts written about the importance of interpreting dates and numbers

    correctly, e.g. in "Why don't my dates work?". These posts have emphasized the use of interpretation

    functions in the script, e.g. Date#().

    But most of the time, you don't need any interpretation functions, since there is an automatic

    interpretation that kicks in before that.

    So, how does that work?

    In most cases when QlikView encounters a string, it tries to interpret the string as a number. It

    happens in the script when field values are loaded; it happens when strings are used in

    where-clauses, or in formulae in GUI objects, or as function parameters. This is a good thing: QlikView

    would otherwise not be able to interpret dates or decimal numbers in these situations.

    QlikView needs an interpretation algorithm since it can mix data from different sources, some typed,

    some not. For example, when you load a date from a text file, it is always a string: there are no data

    types in text files; it is all text. But when you want to link this field to a date from a database, which

    usually is a typed field, you would run into problems unless you have a good interpretation

    algorithm.

  • For loaded fields, QlikView uses the automatic interpretation when appropriate (See table: In a text

    file, all fields are text - also the ones with dates and timestamps.) QlikView does not use any

    automatic interpretation for QVD or QVX files, since the interpretation already is done. It was done

    when these files were created.

    The logic for the interpretation is straightforward: QlikView compares the encountered string with

    the information defined in the environment variables for numbers and dates in the beginning of the

    script. In addition, QlikView will also test for a number with decimal point and for a date with the ISO

    date format.

    If a match is found, the field value is stored in a dual format (see Data Types in QlikView) using the

    string as format. If no match is found, the field value is stored as text.
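    For reference, the environment variables mentioned above are ordinary SET statements at the top of the script. The values below are one example locale; yours may differ.

```qlikview
// Example values only; the actual values depend on the regional settings
// that were in effect when the script was created.
SET ThousandSep=',';
SET DecimalSep='.';
SET DateFormat='M/D/YYYY';
SET TimestampFormat='M/D/YYYY h:mm:ss[.fff] TT';

// With these settings, following the rules described above:
//   '3/31/2013'   matches DateFormat          -> stored as a date (dual)
//   '2013-03-31'  matches the ISO date format -> stored as a date (dual)
//   '31.03.2013'  matches neither             -> stored as text
```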

    An example: A where-clause in the script:

    Where Date > '2013-01-01' will make a correct comparison

    The field Date is a dual that is compared to a string. QlikView automatically interprets the string on

    the right hand side and makes a correct numeric date comparison. QlikView does not (at this stage)

    interpret the content of the field on the left hand side of the comparison. The interpretation should

    already have been done.

    A second example: The IsNum() function

    IsNum('2013-01-01') will evaluate as True

    IsNum('2013-01-32') will evaluate as False

    In both cases, strings are used as parameters. The first will be considered a number, since it can be

    interpreted as a date, but the second will not.

  • A third example: String concatenation

    Month(Year & '-' & Month & '-' & Day) will recognize correct dates and return the dual month

    value.

    Here the fields Year, Month and Day are concatenated with delimiters to form a valid date format.

    Since the Month() function expects a number (a date), the automatic number interpretation kicks in

    before the Month() function is evaluated, and the date is recognized.

    A final example: The Dual() function

    Dual('Googol - A large number', '1E100') will evaluate to a very large number

    Here the second parameter of Dual() is a string, but QlikView expects a number. Hence: automatic

    interpretation. Here, you can see that scientific notation is automatically interpreted. This

    sometimes causes problems, since strings that really are strings in some cases get interpreted as

    numbers. In such cases you need to wrap the field in the Text() function.
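
    As a sketch of that workaround, assuming a hypothetical field Code that holds identifiers which merely look like numbers: loading the field as-is leaves it open to numeric interpretation, while Text() keeps the original string.

```qlikview
// Hypothetical product codes that happen to look like numbers
Codes:
LOAD
    Code       as CodeInterpreted,  // '1E3' and '0012' are candidates for numeric interpretation
    Text(Code) as CodeAsText        // Text() forces the value to be stored as a string
INLINE [
Code
1E3
0012
ABC
];
```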

    With this, I hope that the QlikView number handling is a little clearer.

    14. Why don't my dates work?

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/02/19/why-don-t-my-dates-work

    A recurring question on the QlikCommunity forum is about dates that don't work. Here

    is some help with fixing the three most common causes. If you encounter such a question on the

    forum, just link to this post in your answer.

    1. Incorrect Date Interpretation

    When data is loaded into QlikView, dates are often read as strings. QlikView then tries to recognize a

    pattern in the string that looks like the date format specified in the DateFormat environment

    variable. This sometimes fails and then you need to use the Date#() function to help QlikView

    understand that it is a date.
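
    A minimal sketch of this fix, assuming a hypothetical source field OrderDateText whose format (DD.MM.YYYY) differs from the DateFormat variable. Date#() tells QlikView how to read the string, and the outer Date() formats the result with the default date format.

```qlikview
SET DateFormat='YYYY-MM-DD';

// Hypothetical data: the source dates do not match DateFormat
Orders:
LOAD
    OrderID,
    Date(Date#(OrderDateText, 'DD.MM.YYYY')) as OrderDate  // interpret first, then format
INLINE [
OrderID,OrderDateText
1,31.12.2012
2,19.02.2013
];
```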

  • How do I know that a date is correctly interpreted? That's easy. Just format it as a number and see

    what you get. (List box properties -> Number -> Integer)

    The question is now what your list box looks like. If you have a number which is roughly 40000

    (usually right-aligned), then you are all set. But if you still have a date stamp (usually left-aligned),

    then you need to use the Date#() function in the script. See QlikView Date fields.

    2. Linking integer dates with fractional dates

    You have a date in two different tables, and you want to use this date as a key, but it doesn't seem

    to work. Then you should suspect that you have true dates (integers) in one table and timestamps

    (fractional numbers) in the other, but the formatting of the dates hides this fact.

    How do I know whether this is the case? That's easy. Just format it as a timestamp and see what you

    get. (List box properties -> Number -> TimeStamp)

  • The question is now what your list box looks like. If you have timestamps where hours, minutes and

    seconds are all zero, then you are all set. But if you have numbers in these places, then you need to

    use the Floor() function in the script to get integer dates. See QlikView Date fields.
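
    A sketch of the Floor() approach, assuming a hypothetical timestamp field ShippedAt in a known format. Floor() strips the fractional (time) part so that the resulting key is an integer date.

```qlikview
// Hypothetical data: two shipments on the same day at different times
Shipments:
LOAD
    ShipmentID,
    Date(Floor(Timestamp#(ShippedAt, 'YYYY-MM-DD hh:mm:ss'))) as ShipDate  // integer date key
INLINE [
ShipmentID,ShippedAt
1,2013-02-19 08:15:00
2,2013-02-19 17:40:00
];
```

    Both rows then share the same ShipDate value and will link to a table that holds true (integer) dates.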

    3. Incorrect date comparisons

    The most subtle error is however the one with timestamps in comparisons, e.g.

    Where Date = '2011-12-31';

  • Will this work? Yes, provided that the date format inside the string is recognized by QlikView, i.e.

    that it corresponds to the date format specified in the environment variable DateFormat in the

    beginning of the script.

    It becomes even more complex if you use variables. Then it is important to use quotes correctly. The

    following will work:

    Let vToday = Today();

    Where Date = '$(vToday)';

    but the following will not:

    Where Date = $(vToday);

    The reason is that the $(vToday) will expand to the date, and then the comparison will be e.g.

    Where Date = 2/19/2013;

    So the date (which is approximately 40000) will be compared to 2 divided by 19 divided by 2013,

    which of course is not what you want.

    My recommendation is to always use numeric variables for dates. They always work - quotes or no

    quotes:

    Let vToday = Num(Today());

    Where Date = $(vToday);
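
    The difference between the two kinds of variable can be made visible in the script itself. This is only a sketch: the variable names are made up, and the DateFormat setting is an assumption chosen so that the expansion contains slashes.

```qlikview
SET DateFormat='M/D/YYYY';

LET vTodayText = Today();       // formatted date, e.g. 2/19/2013
LET vTodayNum  = Num(Today());  // date serial number, e.g. 41324

// Unquoted expansion of the text variable is evaluated as arithmetic:
// with 2/19/2013 it becomes 2 divided by 19 divided by 2013
LET vWrong = $(vTodayText);

// Unquoted expansion of the numeric variable is still the date number
LET vRight = $(vTodayNum);
```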

    15. Colors in Chart

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2012/12/04/colors-in-charts

    It is not uncommon that users want specific products or customers to be displayed in specific colors.

    The most obvious way to do this is to change the colors in the chart properties. This is in fact quite

    easy if you use the copy and paste functions found when you right-click a color button. Just copy one

    button and paste on another, and you have moved the color.

  • This way you can assign which color is to be used for the different values of the field. However, a

    prerequisite for this to work is that the order of the field values doesn't change.

    A more robust way is to use color functions. Usually, you want to set the color of a bar, line or

    bubble and this is done by using the Background Color on the Expression tab:

    By the way, don't use Visual cues. This feature is old and not very versatile. Use color functions as

    described here instead.

    In the picture above, both the product ID and the color are hard-coded in the expression. However,

    if you want to define colors for many products, the if-function will not be manageable. Then it is

    better to store this information in a table, either in the database, in an Excel sheet, or as an inline

    statement in a scriptlet that is included in the script. Hence,

  • 1. Create your color definition table and store it in an appropriate place. The Red, Green and

    Blue columns hold the different color components and define the color uniquely.

    2. Load the color definitions into a mapping table:

    ProductColors:

    Mapping Load ProductID, Rgb(Red,Green,Blue) as ProductColor From ProductColors

    3. Use this mapping table when loading the products table, creating a new field for the product

    color:

    Applymap('ProductColors', ProductID , lightgray()) as ProductColor

    The third parameter, here lightgray(), defines which color the unlisted products should get. If

    you instead use null(), the unlisted products will be multicolored according to the color

    settings in the chart properties.

    4. Finally, use this field as product color in the charts:

    This way it is easy to define which color specific products, customers, or other dimensions should

    get.
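
    Steps 1 to 3 can be put together in one script. This is a sketch only: the inline tables stand in for whatever source actually holds the color definitions and the products, and the product IDs, names and RGB values are invented.

```qlikview
// Steps 1 and 2: color definitions loaded into a mapping table
ProductColors:
MAPPING LOAD
    ProductID,
    RGB(Red, Green, Blue) as ProductColor
INLINE [
ProductID,Red,Green,Blue
A,255,0,0
B,0,128,0
];

// Step 3: look up the color while loading the products
Products:
LOAD
    ProductID,
    ProductName,
    ApplyMap('ProductColors', ProductID, LightGray()) as ProductColor  // product C is unlisted
INLINE [
ProductID,ProductName
A,Alpha
B,Beta
C,Gamma
];
```

    For step 4, the Background Color expression of the chart would then reference the ProductColor field instead of a hard-coded If().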

    Which colors to use? Oh, that is a completely different topic:

    Stephen Few has a number of good general recommendations.

    Adam Bellerby has some recommendations on how to avoid problems for color blind users.

    Shima Auzins suggests using colors as warning signals.

  • HIC

    16. Aggregations and Function Classes

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2014/05/19/function-classes

    A typical QlikView application may have one million records in the data, one hundred rows in a pivot

    table and a single number, a KPI, in a gauge or text box. Although different in magnitudes, all three

    numbers may still represent all data. The numbers are just different aggregation levels.

    There are many functions in QlikView that can help you write the necessary formulae to calculate

    aggregated KPIs. Some will collapse many records into one value, others will not. Today I will write

    about the different function classes, and how you can combine them.

    The Scalar Functions constitute the first class. Typical for these is that they are

    one-to-one functions, i.e. they take a single value as parameter and return a single value (of

    the dual data type). Examples: Left(), If(), Num(), Date(), Year(), Subfield(), etc.

    The Aggregation Functions constitute the second class. These are many-to-one

    functions, i.e. they use the values from many records as input and collapse these into one

    single value that summarizes all records. Examples: Sum(), Count(), Avg(), Min(), Only(),

    Concat(), etc.

  • Aggregation functions are special: You must use one to collapse several records into one number

    which means that you need them in pretty much any formula in QlikView: In Chart expressions, in

    Text boxes, in Labels, etc. If you don't write an aggregation function in your expression, QlikView will

    assign one for you: It will use the Only() function.

    Scalar functions can be used both inside and outside the aggregation function:

    Date( Min( Date ) )

    Money( Sum( If( Group='A', Amount ) ) )

    There is one restriction: You can normally not use an aggregation function inside another

    aggregation function. Hence, you usually need every field reference to be wrapped in exactly

    one aggregation function.

    The next function class has only one member: The Aggr Function. It is in spite

    of its name not an aggregation function. It is a many-to-many function, rather like a tensor

    or a matrix in mathematics. It converts an n-tuple (table) with N records to an n-tuple with M

    records. In other words: It returns an array of values. Regard it as a virtual straight table with

    one expression and one or several dimensions.

    Most places in QlikView demand that you write your expression so that it returns one single value.

    This means that you must wrap the Aggr function in an aggregation function to get a meaningful

    result. The only exception is if you use the Aggr function to define a calculated dimension or field.

    This means that you have two aggregation steps; one nested in the other:

    Avg( Aggr( Sum( Amount ), Month ) )
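
    The two nested aggregation steps can be mirrored in the load script with two Group By passes. This is a static sketch with hypothetical table and field names; unlike Aggr() it is calculated once at reload and does not respond to selections.

```qlikview
// Hypothetical transactions
Transactions:
LOAD * INLINE [
Month,Amount
Jan,100
Jan,50
Feb,200
];

// Inner step, like Aggr( Sum( Amount ), Month ): one row per month
MonthlyTotals:
LOAD
    Month,
    Sum(Amount) as MonthlyAmount
RESIDENT Transactions
GROUP BY Month;

// Outer step, like Avg( ... ): one single value over the monthly rows
AverageMonth:
LOAD
    Avg(MonthlyAmount) as AvgMonthlyAmount
RESIDENT MonthlyTotals;
```

    With the sample rows the monthly totals are 150 and 200, so the average is 175.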

  • Charts complicate the matters slightly: A chart is like a For-Next loop where the number of distinct

    dimension values determines the number of loops. In each loop, the expression must return one

    value only, and this is the value used for the bar/slice/pivot table row.

    However, sometimes you need values from other rows in the chart, and it could even be that you

    need values from several rows. To solve this, there are two additional classes of functions that

    should be used together:

    The Chart Inter-record Functions return values fetched from other rows in the

    chart. Some of these can return several values, i.e. an array of values. These functions are

    only meaningful inside a chart or Aggr() function. Examples: Above(), Below(), Top(), etc.

    The Range Functions are functions that can collapse a chart inter-record array

    into one single value. Examples: RangeSum(), RangeMin(), RangeMax(), etc.

    Example:

    RangeSum( Above( Sum( Amount ), 0, 12 ) )
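
    A few more chart expressions built on the same pattern, as a sketch. They are three separate expressions, each meant for a chart with a single dimension and a field called Amount.

```qlikview
// Rolling sum over three rows: the current row plus the two rows above it
RangeSum( Above( Sum( Amount ), 0, 3 ) )

// Running total from the first row down to the current row
RangeSum( Above( Sum( Amount ), 0, RowNo() ) )

// Change versus the previous row (NULL on the first row, since there is no row above)
Sum( Amount ) - Above( Sum( Amount ) )
```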

    Bottom line: Know your functions. It will help you write correct expressions.

    17. It's all Aggregations

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/08/06/it-s-all-aggregations

    I often see incorrect expressions being used in the QlikCommunity forum. Expressions that seem to

    work correctly but really don't.

    So, let me make this clear: Calculations in QlikView are aggregations.

  • It doesn't matter if it is a measure in a chart, or a calculated object label, or a show condition for an

    object, or a calculated color, or an advanced search: all expressions in the user interface are

    evaluated as aggregations. (Except calculated dimensions.)

    This means that it is correct to use the Sum() function in an expression, since this is an aggregation

    function - a function that uses several records as input. But if you omit the aggregation function or

    use a scalar function only, e.g. RangeSum(), you can get an unexpected behavior.

    Basically, all field references should be wrapped in an aggregation function. The Aggr() function and

    some constructions using the total qualifier can even have several layers of aggregations.

    But if the created expression does not contain an aggregation function, the expression is ill-formed

    and potentially incorrect.

    Examples:

    =Sum(Amount)

    =Count(OrderID)

    These are both correct aggregations. Amount is wrapped in the Sum() function which will sum

    several records of the field Amount. OrderID is wrapped in the Count() function, which will count the

    records where OrderID has a value.

    =Only(OrderID)

    This is also a correct aggregation. OrderID is wrapped in the Only() function, which will return the

    OrderID if there is only one value, otherwise NULL.

  • =OrderID

    A single field reference is not an aggregation, so this is an ill-formed expression. But QlikView will

    not throw an error. Instead it will use the Only() function to interpret the field reference. I.e., if there

    is only one value, this value will be used. But if there are several possible values, NULL will be used.

    So, it depends on the circumstances whether an expression without aggregation function is correct

    or not.

    =If(Year=Year(Today()), Sum(Amount1), Sum(Amount2))

    Here, both the amounts are correctly wrapped in the Sum() function. But the first parameter of the

    if() function, the condition, is not. Hence, this is an ill-formed expression. If it is used in a place where

    there are several possible Years, the field reference will evaluate to NULL and the condition will be

    evaluated as FALSE, which is not what you want. Instead, you probably want to wrap the Year in the

    Min() or Max() function.
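
    A sketch of the corrected expression, using Max() as one of the two options mentioned. The second line is an alternative that applies only if the condition is meant to be tested record by record, which is a different calculation.

```qlikview
// Condition wrapped in an aggregation: compares the latest possible Year with the current year
=If( Max(Year) = Year(Today()), Sum(Amount1), Sum(Amount2) )

// Alternative (different meaning): the condition is evaluated per record inside the aggregation
=Sum( If( Year = Year(Today()), Amount1, Amount2 ) )
```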

    =ProductGroup= 'Shoes'

    =IsNull(ProductGroup)

    These expressions can both be used as show conditions or as advanced searches. However, since

    there are no aggregation functions, the expressions are ill-formed. If you want to test whether

    Shoes or NULL values exist among the field values, you probably want to use the following instead:

    =Count(If(ProductGroup= 'Shoes', ProductGroup))>0

    =NullCount(ProductGroup)>0

    Conclusions:

    An aggregation function is a function that returns a single value describing some property

    of several records in the data.

    All UI expressions, except calculated dimensions, are evaluated as aggregations.

    All field references in expressions must be wrapped in an aggregation function. If they

    aren't, QlikView will use the Only() function.

    18. Dimensions and Measures

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/03/25/dimensions-and-measures

    To make a chart in QlikView (or in any Business Intelligence tool, for that matter) you need to know

    what Dimensions and Measures are. But not all people have a clear picture of the difference between

    the two. So this week's post will try to straighten out what's what.

  • When you make a chart, you should start by asking yourself "What do I want to show?" The answer

    is usually "Sales", "Quantity" or some other number. This is your Measure. In QlikView we have

    traditionally called this an Expression, but "Measure" is really the correct word. (There are

    expressions that are not measures, e.g. expressions used as labels, or as sort order definitions).

    The second question you should ask yourself is "How many times should this be calculated? Per

    what do I want to show this measure?" The answer could be once per "Month", per "Customer", per

    "Supplier" or something similar. This is your Dimension.

    In the bar chart below, you have one bar per month, and a general rule is that you always have one

    data point per distinct dimensional value: But depending on which visualization form you have

    chosen, it can be a row (in a table), a point (in a scatter chart) or a slice (in a pie chart).

    Measures

    A database or a QlikView app can consist of thousands or millions of records that each contains a

    small piece of information. A Measure is simply a calculation that can be made over multiple records

    in this data set. The calculation always returns one single value that summarizes all relevant records.

    This type of calculation is called an aggregation. There are several aggregation functions: Sum(),

    Count(), Min(), Max(), etc.

  • Examples:

    Each record contains a sales number. Then Sum(Sales) is a relevant measure that calculates

    the total sales value.

    Each record represents an order and OrderID is the key. Then Count(OrderID) is a relevant

    measure that calculates the number of orders.

    A Measure can be used almost anywhere in QlikView: In charts, in text boxes, as label for objects, in

    gauges, etc. Typical measures are all KPIs, Revenue, Number of orders, Performance, Cost, Quantity,

    Gross Margin, etc.

    Once again: A Measure is always based on an aggregation. Always!

    Dimensions

    Contrary to Measures, dimensions are descriptive attributes, typically textual fields or discrete

    numbers. A dimension is always an array of distinct values and the measure will be calculated once

    per element in the array.

    Example:

    The field Customer is used as dimension. The individual customers will then be listed and

    the measure will be calculated once per customer.

    Typical dimensions are Customer, Product, Location, Supplier, Activity, Time, Color, Size, etc.

    Like a For-Next loop

    You can regard a chart like a For-Next loop: The Dimension is the loop variable; the calculations will

    be made once per dimensional value. So the Dimension determines how many

    rows/bars/points/slices the chart will have. The Measure is what is calculated in each loop.

    Several Dimensions

  • If you have two or three dimensions in a chart, the dimensional values no longer form an array, but

    instead a matrix or a cube, where the measures are calculated once per cell in the cube.

    SQL

    You can also compare a chart with an SQL SELECT statement. The GROUP BY symbols are the

    dimensions and the aggregations are the Measures.
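
    The same comparison can be written as a QlikView Load statement with a Group By clause. This is a sketch with hypothetical table and field names: the grouped field plays the role of the dimension and the aggregation is the measure.

```qlikview
// Hypothetical order records
Orders:
LOAD * INLINE [
OrderID,Customer,Sales
1,Acme,100
2,Acme,250
3,Globex,75
];

// One output row per distinct Customer, like one bar per dimension value
SalesPerCustomer:
LOAD
    Customer,                  // the dimension: determines how many rows are produced
    Sum(Sales) as TotalSales   // the measure: calculated once per Customer
RESIDENT Orders
GROUP BY Customer;
```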

    With this, I hope that the difference between Dimensions and Measures is a little clearer.

    19. QlikView Quoteology

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2013/04/09/quoteology

  • In all programming environments there is a need for quotation marks, and QlikView is no exception.

    But which symbol should you use? " ", [ ], ` ` or ' '? This post will try to explain the differences

    between the different quotation marks.

    When creating the script or an expression in QlikView, you need to reference fields, explicit values

    and variables. To do this correctly, you sometimes need to write the string inside a pair of quotation

    marks. One common case is when a field name contains a symbol that prevents QlikView from

    parsing it correctly, like a space or a minus sign.

    For example, if you have a field called Unit Cost, then

    Load Unit Cost

    will cause a syntax error since QlikView expects an "as" or a comma after the word "Unit". If you

    instead write

    Load [Unit Cost]

    QlikView will load the field Unit Cost. Finally, if you write

    Load 'Unit Cost'

    QlikView will load the text string "Unit Cost" as field value. Hence, it is important that you choose

    the correct quotation mark.

    So, what are the rules? Which quote should I use? Single? Double? Square brackets?

    There are three basic rules:

    1. Single quotes are used to quote literals, e.g. strings that should be used as field values.

    2. Inside a Load statement, to the left of the "as", double quotes are used to quote source field

    references, i.e. names of fields.

    3. Double quotes can always be substituted by square brackets or by grave accents.
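
    The three rules side by side, as a sketch with a hypothetical inline table. The aliases are made up so that the same source field can be loaded several times.

```qlikview
Costs:
LOAD
    Product,
    "Unit Cost"  as CostDoubleQuotes,  // rule 2: double quotes reference the source field
    [Unit Cost]  as CostBrackets,      // rule 3: square brackets are equivalent
    `Unit Cost`  as CostGraveAccents,  // rule 3: grave accents are equivalent
    'Unit Cost'  as Label              // rule 1: single quotes give the literal string on every row
INLINE [
Product,Unit Cost
A,10
B,12
];
```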

    With these three rules, most cases are covered. However, they don't cover everything, so I'll

    continue:

  • In the script, but outside a Load statement, double quotes denote a variable reference and

    not a field reference. If double quotes are used, the enclosed string will be interpreted as a

    variable and the value of the variable will be used.

    A general rule in QlikView is that field references inside a Load must refer to the fields in the input

    table, i.e. the source of the Load statement. They are source field references or in-context field

    references. Aliases and fields that are created in the Load cannot be referred to, since they do not exist

    in the source. There are however a couple of exceptions: the functions Peek() and Exists(). The first

    parameters of these functions refer to fields that either have already been created or are in

    the output of the Load. These are out-of-context field references.

    Out-of-context field references and table references, e.g. the parameters in NoOfRows() and

    Peek(), should be regarded as literals and therefore need single quotes.

    Finally, in many places you are free to use any of the four quotation methods, e.g.

    o Inside a Set statement, to the right of the =

    o Inside a Load statement, to the right of the "as"

    o In places where QlikView expects a file name, a URL or a table name

    o Defining the beginning and end of an inline table

    o For the first parameter of Peek() or Exists() when used inside a Load
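
    A sketch of the out-of-context references mentioned above, used outside a Load statement, where the field and table names are passed as literals in single quotes. The table, fields and variable names are hypothetical.

```qlikview
Orders:
LOAD * INLINE [
OrderID,Customer
1,Acme
2,Globex
];

// Table and field names are literals here, hence single quotes
LET vRowCount      = NoOfRows('Orders');               // 2
LET vFirstCustomer = Peek('Customer', 0, 'Orders');    // Acme
LET vLastCustomer  = Peek('Customer', -1, 'Orders');   // Globex
```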

    I have deliberately chosen not to say anything about SELECT statements. The reason is that the rules

    depend on which database and which ODBC/OLEDB you have. But usually, rules 1-3 apply there also.

    With this, I hope that the QlikView quoteology is a little clearer.

    20. The Crosstable Load

    Source: http://community.qlik.com/blogs/qlikviewdesignblog/2014/03/24/crosstable

    There are a number of prefixes in QlikView that help you load and transform data. One of them is

    the Crosstable transformation.

    Whenever you have a crosstable of data, the Crosstable prefix can be used to transform the data

    and create the desired fields. A crosstable is basically a matrix where one of the fields is displayed

    vertically and another is displayed horizontally. In the input table below you have one column per

    month and one row per product.

  • But if you want to analyze this data, it is much easier to have all numbers in one field and all months

    in another, i.e. in a three-column table. It is not very practical to have one column per month, since

    you want to use Month as dimension and Sum(Sales) as measure.

    Enter the Crosstable prefix.

    It converts the data to a table with one column for Month and another for Sales. Another way to

    express it is to say that it takes field names and converts these to field values. If you compare it

    to the Generic prefix, you will find that they in principle are each other's inverses.

    The syntax is

    Crosstable (Month, Sales) Load Product, [Jan 2014], [Feb 2014], [Mar 2014], ... From ... ;

    There are however a couple of things worth noting:

    Usually the input data has only one column as qualifier field; as internal key (Product in the

    above example). But you can have several. If so, all qualifying fields must be listed before the

    attribute fields, and the third parameter to the Crosstable prefix must be used to define the

    number of qualifying fields.

  • It is not possible to have a preceding Load or a prefix in front of the Crosstable keyword.

    Auto-concatenate will however work.
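
    A sketch of a Crosstable load with two qualifying fields, which is the case where the third parameter is needed. The inline data and the field Region are hypothetical.

```qlikview
// Two qualifying fields (Product, Region) come first, then the attribute fields
Sales:
CROSSTABLE (Month, Sales, 2)
LOAD
    Product,
    Region,
    [Jan 2014],
    [Feb 2014],
    [Mar 2014]
INLINE [
Product,Region,Jan 2014,Feb 2014,Mar 2014
A,North,10,12,9
B,South,7,8,11
];
```

    The two input rows with three month columns each become six rows with the fields Product, Region, Month and Sales. The Month values are the former field names, so they are loaded as text.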

