Hot on the heels of the question I was asked a few days ago comes another closely related one: “What’s the difference between Connect live to a data source and DirectQuery a data source in Power BI?”
We had already established that there are two ways to interact with data using Power BI: loading the data into Power BI, or accessing the data source directly.
Connecting live and DirectQuery both fall into the latter method, but there is a difference.
In DirectQuery mode, you access a data source such as a relational database or data mart, and then create calculated columns or measures on top of it in Power BI. This generates a data model layer, something similar to a database view, if you will. The data still lives at the data source, but it is pulled through the data model in Power BI onto the visuals. End users and report creators see and interact with the data model in Power BI.
In the case of Connect live, the data model itself is at the source. You interact with it directly, and no data model layer is created in Power BI. All measures, calculated columns and KPIs are provided by the data model at the source, along with the data. End users and report authors see and interact with this data model through Power BI.
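To make the distinction concrete, here is a toy Python sketch of where the model layer lives in each case. It is only an illustration under loud assumptions: the in-memory SQLite table, the class names (`DirectQueryReport`, `SourceModel`, `LiveConnectionReport`) and the "Total Sales" measure are all hypothetical stand-ins, not anything Power BI or Analysis Services actually exposes.

```python
import sqlite3

# Hypothetical stand-in for a relational source; the data stays here in both modes.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("South", 250.0), ("North", 50.0)],
)


class DirectQueryReport:
    """Toy DirectQuery: measures are defined in the report layer,
    and every evaluation sends a query to the source."""

    def __init__(self, conn):
        self.conn = conn
        self.measures = {}  # the model layer lives on the report side

    def add_measure(self, name, sql_expr):
        self.measures[name] = sql_expr

    def evaluate(self, name):
        query = f"SELECT {self.measures[name]} FROM sales"
        return self.conn.execute(query).fetchone()[0]


class SourceModel:
    """Toy model hosted at the source (think of an Analysis Services model),
    with its measures already defined."""

    def __init__(self, conn):
        self.conn = conn
        self.measures = {"Total Sales": "SUM(amount)"}

    def evaluate(self, name):
        query = f"SELECT {self.measures[name]} FROM sales"
        return self.conn.execute(query).fetchone()[0]


class LiveConnectionReport:
    """Toy live connection: no model layer of its own,
    it only forwards requests to the model at the source."""

    def __init__(self, model):
        self.model = model

    def evaluate(self, name):
        return self.model.evaluate(name)


dq_report = DirectQueryReport(source)
dq_report.add_measure("Total Sales", "SUM(amount)")  # defined in the report
live_report = LiveConnectionReport(SourceModel(source))  # defined at the source

dq_total = dq_report.evaluate("Total Sales")
live_total = live_report.evaluate("Total Sales")
```

Both reports arrive at the same total. What differs is who owns the measure definition: the report in the first case, the source-side model in the second.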
Comparing these two methods on a conceptual level, DirectQuery mode is used in self-service scenarios where you have data marts or a data warehouse on a relational database, and business users build their own data models off this for their business needs. The data marts or data warehouse integrate data from various systems and provide base measures with related dimensions. Business users may create custom measures and calculated columns on top of this to suit their reporting and analytic requirements, and then explore the data and build visual reports. Think of this as the data discovery phase of the self-service exercise.
Live connections would probably be used in scenarios where the analytic needs are better understood, and/or the type of analytics described above has matured and become mainstream in the organization. Here data models are built off the data warehouse using Analysis Services (multidimensional or tabular), incorporating the measures, calculations and KPIs that were earlier part of the self-service (and data discovery) exercise. Business users now have established reports and dashboards that showcase organizational performance, powered by established data models. Think of this as the phase where things have evolved into corporate BI that gives real value.
[SUBJECT TO CHANGE] Out of the whole bunch of supported data sources, Power BI currently supports the following in DirectQuery mode:
- SQL Server
- Azure SQL Database
- Azure SQL Data Warehouse
- SAP HANA
- Oracle Database
- Teradata Database
- Amazon Redshift (Preview)
- Impala (Preview)
- Snowflake (Preview)
and the following using a Live connection:
- Analysis Services Tabular
- Analysis Services Multidimensional
- Azure Analysis Services (Preview)
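
As a quick reference, the two lists above can be captured in a small lookup. This is just a sketch that mirrors the lists as written in this post (which are subject to change). The names `CONNECTION_MODES` and `connection_mode` are made up for illustration and are not part of any Power BI API.

```python
# Source-to-mode lookup, copied from the lists in this post (subject to change).
CONNECTION_MODES = {
    "SQL Server": "DirectQuery",
    "Azure SQL Database": "DirectQuery",
    "Azure SQL Data Warehouse": "DirectQuery",
    "SAP HANA": "DirectQuery",
    "Oracle Database": "DirectQuery",
    "Teradata Database": "DirectQuery",
    "Amazon Redshift": "DirectQuery",  # preview
    "Impala": "DirectQuery",  # preview
    "Snowflake": "DirectQuery",  # preview
    "Analysis Services Tabular": "Live connection",
    "Analysis Services Multidimensional": "Live connection",
    "Azure Analysis Services": "Live connection",  # preview
}


def connection_mode(source_name):
    """Return the direct-access mode listed for a source, or None if it is not listed."""
    return CONNECTION_MODES.get(source_name)
```

For example, `connection_mode("Teradata Database")` returns `"DirectQuery"`, while a source that appears in neither list returns `None`.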