POWER BI —Direct Query on Clustered Columnstore Index

Michal Molka
2 min readMay 14, 2021

When you don’t want to load your model into memory, you use a Direct Query mode. As you now, gap between an in-memory and DQ performance can be immense. We can use a Direct Query mode and attain a good effectiveness using a DQ on a Columnstore Clustered Indexed table.

I’ve prepared two identical tables.

The first: [GHCN_CCI] — Clustered Columnstore Index

CREATE CLUSTERED COLUMNSTORE INDEX CCI_GHC_CCI ON dbo.GHCN_CI

The second: [GHCN_RSI] — a rowstore containing two indexes: Clustered and Non-Clustered.

CREATE CLUSTERED INDEX IX_StationID_Alpha2 ON dbo.GHCN_RSI(StationID, Alpha2)
CREATE NONCLUSTERED INDEX IX_Date ON dbo.GHCN_RSI([Date])

Both tables contain 0.7 billion rows.

A Power BI model:

A first query:

DEFINE 
MEASURE GHCN_CCI[COUNT_CCI] = COUNTROWS(GHCN_CCI)
MEASURE GHCN_CCI[SUM_CCI] = SUM(GHCN_CCI[Value])/10
MEASURE GHCN_CCI[AVG_CCI] = AVERAGE(GHCN_CCI[VALUE])/10

EVALUATE
SUMMARIZECOLUMNS(
'ghcnd-countries'[CountryName],
CALCULATETABLE(
GHCN_CCI,
GHCN_CCI[Type] = "TAVG",
'ghcnd-countries'[Alpha2] in {"PL"}
),
"COUNT", GHCN_CCI[COUNT_CCI],
"SUM", GHCN_CCI[SUM_CCI],
"AVG", GHCN_CCI[AVG_CCI]
)

The rowstore table is partly covered by indexes: 24.5 sec.

The columnstore table: 0.7 sec.

A second query, the rowstore table is covered by indexes:

DEFINE 
MEASURE GHCN_CCI[COUNT_CCI] = COUNTROWS(GHCN_CCI)
MEASURE GHCN_CCI[SUM_CCI] = SUM(GHCN_CCI[Value])/10
MEASURE GHCN_CCI[AVG_CCI] = AVERAGE(GHCN_CCI[VALUE])/10

EVALUATE
SUMMARIZECOLUMNS(
'ghcnd-countries'[CountryName],
CALCULATETABLE(
GHCN_CCI,
DATESBETWEEN(GHCN_CCI[Date], "2020-01-01", "2020-10-01")
),
"COUNT", GHCN_CCI[COUNT_CCI],
"SUM", GHCN_CCI[SUM_CCI],
"AVG", GHCN_CCI[AVG_CCI]
)

The rowstore table: 37.7 sec. vs the columnstore table: 1.2 sec.

A third one, without filters:

DEFINE 
MEASURE GHCN_CCI[COUNT_CCI] = COUNTROWS(GHCN_CCI)
MEASURE GHCN_CCI[SUM_CCI] = SUM(GHCN_CCI[Value])/10
MEASURE GHCN_CCI[AVG_CCI] = AVERAGE(GHCN_CCI[VALUE])/10

EVALUATE
SUMMARIZECOLUMNS(
'ghcnd-countries'[CountryName],
GHCN_CCI,
"COUNT", GHCN_CCI[COUNT_CCI],
"SUM", GHCN_CCI[SUM_CCI],
"AVG", GHCN_CCI[AVG_CCI]
)

The rowstore table : 39 sec. vs the columnstore table: 2.3 sec.

As you can see. During developing Power BI solutions. You may take an advantage of a columnstore index in combination with a Direct Query.

--

--