DAX — KEEPFILTERS vs FILTER
Every now and then we need to create a calculation containing filtered values. Like on the picture bellow.
As you can see, among others, we can use the KEEPFILTERS() or FILTER() function. What is the difference? Performance of course.
As an example, we use an Iowa Liquor Sales dataset. It contains 21 million records.
Source: Iowa Liquor Sales | data.iowa.gov
When we go with the KEEPFILTERS() solution…
DEFINE MEASURE
Iowa_Liquor_Sales[KeepFiltersMeasure] =
CALCULATE(
SUM(Iowa_Liquor_Sales[Volume Sold (Liters)]),
KEEPFILTERS(
Iowa_Liquor_Sales[City] in {"Aurelia", "Avoca", "Baxter"} &&
AND(
Iowa_Liquor_Sales[Sale (Dollars)] >= 10,
Iowa_Liquor_Sales[Sale (Dollars)] < 55
)
)
)
EVALUATE
SUMMARIZECOLUMNS(
Iowa_Liquor_Sales[County],
Iowa_Liquor_Sales[City],
Iowa_Liquor_Sales,
"KeepFiltersMeasure",
Iowa_Liquor_Sales[KeepFiltersMeasure])
…we get following statistics and a plan.
Once you use the FILTER() function…
DEFINE MEASURE
Iowa_Liquor_Sales[FilterMeasure] =
CALCULATE(
SUM(Iowa_Liquor_Sales[Volume Sold (Liters)]),
FILTER(
Iowa_Liquor_Sales,
Iowa_Liquor_Sales[City] in {"Aurelia", "Avoca", "Baxter"} &&
AND(
Iowa_Liquor_Sales[Sale (Dollars)] >= 10,
Iowa_Liquor_Sales[Sale (Dollars)] < 55
)
)
)
EVALUATE
SUMMARIZECOLUMNS(
Iowa_Liquor_Sales[County],
Iowa_Liquor_Sales[City],
Iowa_Liquor_Sales,
"FilterMeasure",
Iowa_Liquor_Sales[FilterMeasure]
)
…you get this result.
As you can see, here is the performance difference for the benefit of the KEEPFILTERS() function.
In this case the KEEPFILTERS() function is faster than the FILTER() calculation. Total time is 15 ms in comparison to 30 ms. The FILTER() uses two queries as opposed to the KEEPFILTERS() which uses only one query.
How the KEEPFILTERS() works? It keeps an existing context and compares context filters to function’s arguments. In other words, a query result is a combination of a filtered context and values from the function’s arguments.
How the CALCULATE() + FILTER() works? The FILTER() is an iterator. So, placed inside the CALCUALTE() function it evaluates every row of a provided table (by function arguments) inside a particular filter context.