Qlik QVDs and Why You Should Be Using Them

note: Hi hello and happy timezome, I’m gonna’ talk about QVDs in Qlik today. This tutorial assumes you have a basic familiarity with the Data Load Editor

What are QVDs?

Behind the scenes a QVD file contains three key components

An XML header with metadata
Symbol tables for efficient storage of field values
The actual data an optimized format

note: QVDs are proprietary file formats created by Qlik. They serve as an intermediate storage format between your source systems (SQL Server, Excel Files, etc.) and your Qlik applications. [READ SLIDE]

Why Use QVDs?

SPEED
Reduced Server Load
Data Transformation Layer

note: ONE—QVDs load data 10-100 times faster than most other data formats or direct connections to databases. QVDs are designed for Qlik, this means for any larger dataset this is the choice. TWO—By extracting data once and storing it as QVDs, it minimizes the load on source systems. Instead of multiple Qlik applications repeatedly querying your databases, they can access the pre-extracted QVDs. THREE—QVDs enable a structured ETL process with clear separation between extraction, transformation, and loading phases

An app before the app

Having a Qlik app to generate QVDs for your production app

note: Because QVDs are so powerful, this leads to the idea of having an app to generate data for the production app. This is a great practice as it has a separate, logical layer for all ETL steps, meaning the Production app can focus purely on the visuals and front-end, leading to less to maintain. This is a structured pipeline approach. That’s beauty

Optimized Loads Actions that COULD Break Them :)

These actions will break optimization and result in a standard load

Concatenating fields together
Applying functions to fields during the load, like as Date(), Upper(), or Num()
Using WHERE clauses to filter the data
Using MAPPING loads

note: QVDs allow for an ‘optimized load’, which can speed up reading in a large complicated file. I will say I have ran into trial and error with some of these rules, I went to an official Qlik presentation and they also talked about “Just play around and see what breaks it :)“

Practical Framework Example

Load data from source
Save data as QVD
Load data from QVD
Transform data
Save data as QVD
Load data from QVD

note: These 6 steps above are the ones I follow in my app before the app to generate QVDs for production. We’ll walk through each step one by one over in Qlik

1. Load data from source

// Extract from source system
random_data:
LOAD
    first_name,
    last_name,
    integer_column,
    float_column
FROM [lib://QVD_Tutorial/random_data_1M.csv]
(txt, codepage is 28591, embedded labels, delimiter is ',', msq);

TRACE Loaded .csv;
CALL AddTimestamp;

2. Save data as QVD (Extraction Layer)

// Store raw data as QVD
STORE random_data INTO [lib://QVD_Tutorial/random_data.qvd] (qvd);

DROP TABLE random_data;

TRACE Stored QVD;
CALL AddTimestamp;

3. Load data from QVD

// Load the raw QVD using optimized load
random_data:
LOAD *
FROM [lib://QVD_Tutorial/random_data.qvd] (qvd);

TRACE Loaded QVD;
CALL AddTimestamp;

4. Transform data

// Apply transformations
random_data_transformed:
NoConcatenate
LOAD
    LOWER(first_name) as first_name,
    UPPER(last_name)  as last_name,
    integer_column - 100 AS integer_column,
    float_column
RESIDENT random_data;

DROP TABLE random_data;

5. Save transformed data as QVD

// Store transformed data as QVD
STORE random_data_transformed INTO [lib://QVD_Tutorial/random_data_transformed.qvd] (qvd);

DROP TABLE random_data_transformed;

TRACE Transformations;
CALL AddTimestamp;

6. Load data from transformed QVD for analysis

// Final load for the application
LOAD *
FROM [lib://QVD_Tutorial/random_data_transformed.qvd] (qvd);

TRACE Final Load;
CALL AddTimestamp;

kthxbye

QVDs:

Dramatically faster data loading
Reduced strain on source systems
Structured data management approach
Support for incremental loading strategies

note: By implementing the layered approach we’ve demonstrated, you create a robust data architecture that separates the E the T and L concerns. This leads to more maintainable applications, faster reload times, and better overall performance. This presentation was made using Obsidian’s Advanced Slides, and my Obsidian Vault with all my notes will be available in the description

Wes' Public Vault

Explorer

Qlik Tutorial - QVDs