The effect of Summarize by Default on a Tabular Model

Another one of my recent experiences was a strange-looking query generated off my Power BI reports. These reports were running off a live connection to a tabular model on Analysis Services. And of course, these queries ate up my CPU, leaving it at 100% for quite a long while, with the reports not responding. One thing about the reports was that they had quite a number of parameters (which of course were a client requirement that we couldn’t do much about).

Here’s just a tip-of-the-iceberg sample of the huge query that gets generated:

Big Query

The data model was quite a large one, with 30+ measures and 10+ dimensions. One thing we had missed was that all the numeric fields that were not used as measures, such as the Year, MonthNumber and WeekNumber fields of the Calendar dimension, and the EmployeeAge field of the Employee dimension, had their Summarize By property set to Default. As a result, Power BI treats these fields as implicit measures and aggregates them in the queries it generates. And this was not indicated in any way in Power BI, especially in the Fields section, where fields that are summarized usually show up with a Σ when the data model is a Power BI data model.
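To give a rough illustration of what that means, here is a hedged sketch (not the actual generated query; the table and column names are from this model) of how Power BI builds its DAX when a column such as Year is left at Default: the column ends up wrapped in an aggregation as an implicit measure instead of being grouped by:

EVALUATE
SUMMARIZECOLUMNS (
    'Calendar'[MonthNumber],
    // Year becomes an implicit SUM measure rather than a group-by column,
    // because its Summarize By property was left at Default
    "SumOfYear", CALCULATE ( SUM ( 'Calendar'[Year] ) )
)

Setting the Summarize By property of such columns to None in the tabular model stops these implicit aggregations from being generated.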

The phenomenon is explained in detail in Pinal’s SQLAuthority. Make sure you take a look. Happy New Year!


Report and Dashboard Usage Metrics in Power BI

One thing we tend to overlook as technical folk working in the business intelligence arena is the importance of analytics about analytics. To business folk, however, it is quite important to know how valuable their reports and dashboards are to others; if nothing is being used, what’s the purpose in them doing what they do? The same applies to reports and dashboards in Power BI.

Power BI Usage Metrics Location

Power BI offers something known as Usage Metrics for each report or dashboard that you access on the Power BI portal. You can find it at the top left of each report; when clicked, it gives you a summary of how your report or dashboard has been used.

Power BI Usage Metrics

It includes quite a few useful insights into the report’s usage. However, as humans can never be satisfied, the business user will naturally ask for more analysis on top of what they see in the summary. For instance: “I need to see at a glance which user looks at which page of the report more often”. That’s when you wish you had a way of providing users with customization options for report usage metrics. But there actually is a way.

Each app workspace gets its own report usage metrics dataset; it’s just that you don’t see it when you are in the portal. In order to access it (at least for now) you need to use Power BI Desktop. Open Power BI Desktop, sign in with the appropriate login, and choose Power BI service from the Get Data menu. You are then presented with a list of app workspaces; within each you will find all the datasets that were ever published to that workspace. Additionally, Power BI will also give you two more datasets: Report Usage Metrics Model and Dashboard Usage Metrics Model. However, these data models will only show up if you have attempted to view usage metrics at least once on one of the reports of the app workspace. The moment you click on the Usage Metrics link and the following happens is when the model is created:

Usage Metrics Generating

Usage Metrics Generated

And now, it’s your playground, a whole new data model that gives you analytics on the produce of another data model.
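For example, measures along these lines could answer the “which user looks at which page more often” question when placed in a matrix against users and report pages. This is only a sketch: the 'Views' table and its grain are assumptions of mine, so check the actual schema of the Report Usage Metrics Model once you connect:

// Hypothetical measures; assumes the usage metrics model exposes a
// 'Views' table with one row per report page view
Page Views = COUNTROWS ( 'Views' )

// Share of all currently selected views that the current
// user/page combination accounts for
Page Views % =
DIVIDE ( [Page Views], CALCULATE ( [Page Views], ALLSELECTED () ) )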

Power BI service


Unexpected Exception in Power BI connected to SQL Server 2016 Analysis Services

One interesting issue that reared its ugly head in my recent project was an unexpected exception that popped up on all visuals of a Power BI report. How it happened was quite strange; one moment all the reports were working fine. The next moment, when a report visual was modified and deployed to Power BI, all the visuals turned into grey boxes with a cross and a “See details” link. The details weren’t at all helpful. Just: “An unexpected exception occurred.”

Clicking on the “Copy Details” link, pasting into a text editor, and quickly scanning through showed this:

And to make matters worse, all subsequent report loads, including those of other reports, left the reports hanging until Analysis Services was restarted. The cycle would start again when the offending report was opened.

A little digging around pointed to a bug in SQL Server 2016 SP1 Analysis Services. I’m not sure if the problem occurs on versions prior to SP1; however, there is a fix: SP1 CU5 (KB article: KB4037354). Once installed, the problem disappears. I also tested the issue on SQL Server 2017 Analysis Services, where the problem does not occur.

Speeding up ETL data cleansing

While I was at the PASS Summit this year, I was sitting in Paul Turley’s (b|t) session on creating integrated BI solutions. Paul was talking about how ETL takes up most of a BI solution’s development lifetime, regardless of how well you plan for it. That’s when someone from the audience asked: “…methodology or tool in the MS stack to shorten the time taken for ETL?”. Paul’s answer was: “There’s no silver bullet, but I’m going to say it’s Power Query”. This got me thinking, since for a few weeks my team had been struggling to get data cleansed on the project they were working on.

Despite having a data profiling document, which recorded how source data mapped to the data model, along with the rules that needed to be applied to that data, such as nullability, constraints, valid domain values and so forth, bad data continued to come in. Apart from bad data, there were scenarios coming in with the data that had not been identified or even thought of before. Though we do usually expect some amount of bad data or anomalies, which can be reported on, this was pretty bad. We needed a data set that would not fail the ETL, so that it could at least be tested against all the business logic we had identified as requirements.

The issue was that the team had to build the Integration Services packages first, before running them to finally find out the various data problems they kept spitting out. So it took a while for a developer to identify a problem, realize it was a data issue, report it to the client, have the client fix it (and sometimes fix the package as well), and repeat. There were times when I had to jump in to help figure out what the data problems were. But because getting onto a tool (especially a non-Microsoft one), connecting to the source, and then writing SQL queries to retrieve and check the data was quite a slow process, I preferred using Power Query.

Power BI is a self-service tool; it can connect to many different types of data sources, and the best thing about it is the ease of pulling in data: building joins, filtering, sorting and performing all the data manipulations you want is made super easy. And you can save it all in just one file, and use it the next time you want it. So I got it (or at least formed an impression in my head), and related to it, when Paul named “Power Query” as the tool to speed up the lifetime of ETL development.

Explaining how you could go about it is a post for the future, but I am presenting it at the Sri Lankan Data Community Meetup on Nov 29th in Colombo, if you are interested 😉

Distinct Count of Dimension based on a Filtered Measure

Okay, so we have this brand new awesome project going on; our first time on a tabular model, and that too with a large number of measures, connected to Power BI and all that jazz. The model would contain all the standard measures that the organization used, while we built Power BI reports using these measures, with some visuals needing improvisation on top of the standard measures, i.e. requirements specific to the reports.

Enthusiastic as we were, one of the hardest nuts to crack, though it seemed so simple during requirements gathering, was performing a distinct count of a dimension based on a filtered measure on a couple of the reports. To sketch it out with some context: you have products, several more dimensions, and a whole lot of measures, including one called Fulfillment (a calculation based on a couple of measures from two separate tables). The requirement was to get a count of all those products (filtered, of course, by the other slicers on the Power BI report) where Fulfillment was less than 100%, i.e. the number of products that had not reached their targets.

Simple as the requirement seemed, the hardest part in getting it done was our limited knowledge of DAX, specifically, knowing which function to use. We first tried building it into the data model itself, but our choice of DAX formulae and the number of records we had (50 million+) soon saw us running out of memory in seconds on a 28GB box; not too good, given that the rest of the model didn’t even utilize more than half the memory.
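My hedged guess at the kind of formula that causes this (the 'Sales' table name is assumed for illustration; it is not necessarily what we wrote): iterating a measure over the full fact table forces FILTER to materialize and evaluate Fulfillment for each of the 50 million+ rows, whereas the eventual solution further below iterates only the distinct product codes:

// Hypothetical anti-pattern: FILTER materializes the entire fact table
// and evaluates the [Fulfillment] measure once per row
Products Short of Target (Slow) =
COUNTROWS (
    SUMMARIZE (
        FILTER ( 'Sales', [Fulfillment] < 1 ),
        'Product'[ProductCode]
    )
)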

Using calculated columns was not a possibility either, since the columns that made up the measures behind Fulfillment came from a different table, and the calculation does not aggregate up.

Since it was a report requirement, we tried doing it visually, by pulling the Product and Fulfillment fields onto a table visual, and then filtering on Fulfillment, like so:

We then performed a distinct count on Product, and voila! it performed a count, but alas! we realized that the filter on Fulfillment gets disabled/removed when that happens, which means the count always ends up being a count of all products.

A frenzy of DAX formulae at the Power BI level did not help either, until we tried out the VALUES() function, courtesy of direction from Gerhard Brueckl (b|t); trust the MVP community to have your back.

The VALUES() function returns a single-column table of the unique values of a specified column. Hence, using the FILTER() function, we were able to extract the unique products where Fulfillment was less than 100%, and then used the COUNTROWS() function to simply count the number of records returned. Because Fulfillment is a measure, it is evaluated through context transition for each product code that FILTER iterates, so every product is checked against its own fulfillment value.

Product Short of Target =
VAR ProductCount =
    // Count the distinct product codes whose Fulfillment
    // (evaluated per product via context transition) is below 100%
    COUNTROWS ( FILTER ( VALUES ( 'Product'[ProductCode] ), [Fulfillment] < 1 ) )
RETURN
    // Show 0 instead of a blank when no product falls short
    IF ( ISBLANK ( ProductCount ), 0, ProductCount )

It is very easy to overlook VALUES as a function that gives you unique values. That is why it is important to have an overall understanding of what each DAX function can be used for, or at least to have a DAX cheat sheet of sorts pinned to your wall. Glad this worked out, though.