When to Use Subroutines in Qlik
note: Hi hello and happy time zone, my name is Wes, and welcome to the how’s the whys and the when to use Subroutines in Qlik.
Introduction
This Lesson covers:
- How subroutines work in Qlik and their basic syntax
- Some practical use cases
- Side-Notes
- Subroutine Documentation
- INCLUDE
note: [Read Lesson] If you find yourself re-using code within multiple sections of your app or sharing code between multiple apps, then subroutines are the next dart we should throw at the board. I’ll show you how to not hit the wall with them, let’s get started
What Are Subroutines in Qlik?
Basic Syntax:
SUB SubroutineName
// Code to execute when the subroutine is called
END SUB
// Calling the subroutine
CALL SubroutineName
note: In Qlik, a subroutine is a named block of script that can be called from elsewhere in your load script. Think of them as mini-functions that help you organize and reuse code. As always, let’s take a gander at the documentation [Link, explore, return]
Subroutine Characteristics
- Must be defined before they are called
- Can access and modify global variables
- Don’t directly return values
- Can set global variables to act as “return values”
- Can contain almost any valid Qlik script code
note: Subroutines have plenty of wonderful characteristics [Read list]
WHY do we care?
- Keep your code DRY
- Solve a problem ONCE
- Can take in parameters
- Can ‘return’ values by assigning variables in Qliks global memory
note: But WHY do we care? Subroutines, just like methods and functions in other languages, allow our code to contain multitudes. They allow us to not repeat ourselves, to solve a problem only once, to handle parameters, return values, and to not repeat ourselves. I attended a lunch and learn tutorial about Qlik Subroutines at my current job a few months back and when the audience was polled, “What are use cases for subroutines?” I was the only person in chat who typed something. So let’s walk through some simple examples where we can utilize this power
1.1 - Simple Example
SUB AddTimestamp
LET vTimestamp = Now();
TRACE Script execution time: $(vTimestamp);
END SUB
// Now we can call this anywhere in our script
CALL AddTimestamp;
note: This is a simple example of possibilities, we can add a timestamp wherever we want in our code. With this subroutine defined, we can add timestamps throughout our script with a single line of code. [Qlik, RECORDING BREAK]
1.2a - Abstracting a Loop with a Subroutine
SUB LoadRegionYearData(mRegion, mYear)
// Construct filename
LET vFileName = '$(mRegion)_$(mYear).csv';
TRACE Processing file: $(vFileName);
IF FileSize('lib://Data/$(vFileName)') > 0 THEN
[$(mRegion)_$(mYear)]:
LOAD
Date,
Product,
Amount,
'$(mRegion)' AS Region,
$(mYear) AS Year
FROM [lib://Data/$(vFileName)]
(txt, embedded labels, delimiter is ',');
TRACE Loaded $(NoOfRows($(mRegion)_$(mYear))) rows;
ELSE
TRACE File $(vFileName) not found, skipping...;
ENDIF
END SUB
note: [Slides] Obsidian Advanced slides doesn’t love when I’m code-verbose so let’s take this over to [Qlik]. We can take a problem where nested for loops could be the solution and abstract some elements. Subroutines, like all code, are infinitely customizable, but having a controlled space for code that we may want debug, re-use, easily share, easily document; all these are skills that make us more desirable as developers than we already are.
1.2b
// Now we can call this subroutine multiple times
// First, set the variables
LET vRegion = 'North';
LET vYear = 2023;
CALL LoadRegionYearData(vRegion, vYear);
// Process South region
LET vRegion = 'South';
LET vYear = 2023;
CALL LoadRegionYearData(vRegion, vYear);
// Process West region from previous year
LET vRegion = 'West';
LET vYear = 2022;
CALL LoadRegionYearData(vRegion, vYear);
note: So let’s play around with this, a new toy in our toolbox needs some mileage. So close your eyes, imagine you’re at a corporate meeting with lots of people in suits who stare at excel files and emails about excel files all day, and you tell them you can easily process all their business reports Instead of writing similar loops multiple times, we can abstract this into a subroutine: If we need to update the logic we only update logic in a single location, the subroutine, and all calls to it will use this updated logic. Isn’t that neat [Run Code, Pause recording]
1.3a - Assigning “Return Values” Using Global Variables
// Load some sales data
SalesData:
LOAD * INLINE [
Date, Product, Sales
2023-01-15, Widget A, 1500
2023-01-18, Widget B, 2300
2023-01-22, Widget C, 1800
2023-01-25, Widget A, 1650
];
note: [Start Recording, Slides] Let’s load up some test data to play around with another subroutine feature we can implement. While Qlik subroutines don’t have formal return values like functions in other languages, we can simulate this functionality using Qlik’s global variable scope. Let’s create a subroutine that performs calculations and “returns” the results through global variables
1.3b
// Define a subroutine to calculate sales metrics
SUB CalculateSalesMetrics
/*
Assumes SalesData is a table in memory
Returns:
- vTotalSales
- vAvgSales
- vMaxSale
- vStatus
*/
SET vStatus = 'Failed';
IF NoOfRows('SalesData') > 0 THEN
// Summary table of statistics. No GROUP BY so overall
Agg:
Load
SUM(Sales) AS totalSales,
Avg(Sales) AS avgSales,
Max(Sales) AS maxSales
RESIDENT SalesData;
// Calculate metrics
LET vTotalSales = PEEK('totalSales',0,'Agg');
LET vAvgSales = PEEK('avgSales',0,'Agg');
LET vMaxSale = PEEK('maxSales',0,'Agg');
SET vStatus = 'Success';
// Clean up table now that we're done with it
DROP TABLE Agg;
ENDIF
// These variables are now available after the subroutine call
END SUB
note: [Qlik] We’re going to write a subroutine SO full of good practices. The end goal of this subroutine is to have a number of different metrics that we could want to know about all of our sales. We need to correctly handle whether or not there is sales data, and then define those data aggregations [Code]
1.3c
// Call the subroutine to calculate metrics
CALL CalculateSalesMetrics;
// Now we can use the "returned" values
IF '$(vStatus)' = 'Success' THEN
TRACE Analysis complete;
TRACE Total Sales: $(vTotalSales);
TRACE Average Sale: $(vAvgSales);
TRACE Largest Sale: $(vMaxSale);
ELSE
TRACE No data available for analysis;
ENDIF
note: This pattern effectively simulates return values from our subroutine. It’s important to initialize the “return” variables at the beginning of the subroutine to ensure they’re cleared from any previous calls.
1-4 - Creating Reusable Data Transformation Logic
// Define a subroutine to standardize date fields
SUB StandardizeDateFields(vTableName, vDateField, vOriginalFormat)
// Temporary table name
LET vTempTable = '$(vTableName)_Temp';
// Create a temporary table with transformed dates
[$(vTempTable)]:
LOAD
*,
Date(Date#($(vDateField), '$(vOriginalFormat)'), 'YYYY-MM-DD') AS StandardDate
RESIDENT [$(vTableName)];
// Drop the original table
DROP TABLE [$(vTableName)];
// Rename the temp table to the original name
RENAME TABLE [$(vTempTable)] TO [$(vTableName)];
// Set a return value - count of processed rows
LET vProcessedRows = NoOfRows('$(vTableName)');
END SUB
note: [Slides] We can abstract data transformation logic that may needed to be applied to multiple tables. We can create a subroutine to standardize date formats across these many tables, which can be done without copy-pasting code [Qlik, RECORDING BREAK] So let’s code up a standardization subroutine for multiple possible tables
1.4b
Sales:
LOAD * INLINE [
OrderID, OrderDate, Amount
1001, 01/15/2023, 1500
1002, 01/28/2023, 2300
];
// Standardize dates in both tables
LET vTableName = 'Sales';
LET vDateField = 'OrderDate';
LET vOriginalFormat = 'MM/dd/yyyy';
CALL StandardizeDateFields;
TRACE Processed $(vProcessedRows) rows in Sales table;
Purchases:
LOAD * INLINE [
PurchaseID, PurchaseDate, Cost
2001, 2023.01.05, 3500
2002, 2023.01.22, 4200
];
LET vTableName = 'Purchases';
LET vDateField = 'PurchaseDate';
LET vOriginalFormat = 'yyyy.MM.dd';
CALL StandardizeDateFields;
TRACE Processed $(vProcessedRows) rows in Purchases table;
note: Now that our subroutine exists let’s get some text data here to ensure that we can test that our subroutine works as expected [Run, Recording break]
S-1 Subroutine Documentation
// 1.2a
/*
Loads Region Year Data
Args:
- mRegion: A region, North, South, East, or West
- mYear: A Year in the format of yyyy
*/
// 1.4a
/*
Arguments:
- vTableName - Name of a table, assumed to be in memory
- vDateField - Name of date field in table to standardize
- vOriginalFormat - Current format of vDateField
Return value:
- vProcessedRows - Number of rows processed for the table
*/
note: As developers we hold so much knowledge inside of us, it is our duty to share this internal dialogue with others. For they can learn our assumptions, the input we assumed our subroutine would take when we designed it, and the output we assumed it would return if given the correct input. The style for your docstring isn’t as important as your consistency that you use for all your subroutines in your code. Remember, documentation isn’t just for current you and current co-workers, it’s for future you.
S-2 INCLUDE
$(Include=subroutines.txt);
$(Must_Include=subroutines.txt);
note: Include is a keyword that can be used within the Data Load Editor which allows you to abstract a bunch of code into a file that can be imported to not only one app, but to multiple apps. On runtime Qlik will take whatever is within that file and place it within the section of the script that called the INCLUDE
Conclusion
- If you’re copy-pasting code routinely, consider subroutines
- Even if subroutines can’t ‘return’ variables, we have global variables as a workaround
- Write documentation that tells of your assumptions
note: Wow subroutines are so neat and make our code very readable, [Read Slide]. Thank you for watching,I hope all recursive methods you write find no infinite loops. This presentation was made using Obsidian’s Advanced Slides, and my Obsidian Vault with all my notes will be available in the description