Managed SSBI with Power BI: Core Dimensions


In this post I want to revisit the core datasets I talked about in a previous post about Managed SSBI.

One problem that we may have overlooked when building a bunch of core datasets in that post, is that certain dimensions tend to duplicate across the datasets. Imagine a scenario where the single master data source of a managed self-service setup is a data warehouse, which sources all the required dimensions. When you have, for example, core reseller sales, internet sales, and finance datasets, each one will have a calendar dimension and a few others created in each of these datasets. This is not ideal if you think about the extent of the duplication and effort that is required.

This is where, once again, using DQ for PBI datasets and AS comes into play, where you could draw up a layered core dataset architecture. If we take the example of AdventureWorks’ fact tables in the data warehouse (single master data source) you can figure out what the business processes are. Then, when you draw up an enterprise bus matrix, this is a snippet of what you will arrive at:

You will note that dimensions such as Calendar (Date), Salesperson, Product and a few others span multiple business processes. These are ideal candidates for the first layer of the core datasets. The corresponding core workspace looks like this; a dataset each for the business processes (with the exception of sales quota and inventory):

Each dataset has at least one of the following dimensions: Calendar, Salesperson and Product. Hence, I shall focus on these for the post.

Since, reseller sales has all the three dimensions, I shall focus on this first, and download the dataset from the service:

The reseller sales dataset

I will get to work by keeping Calendar, Salesperson and Product, and removing the Reseller Sales and Metrics tables:

Dataset stripped down to only the common dimensions

I will name this dataset Core Dimensions and publish to the core workspace. This will be the dataset that all core datasets will refer to for common dimensions.

Then, I will edit each core dataset by removing the dimensions that correspond to what’s in Core Dimensions. Next, I will connect to the data hub, select the Core Dimensions dataset and select the required dimensions. Finally, I re-create the relationships, for the dataset to look like this (This is the new and improved reseller sales):

If you look at the lineage in the core workspace, this is what you will now see:

Note: For the steps to direct query a dataset, you can refer to my previous post on extending a core dataset at the edge.

Leave a comment