Enabling real-time dashboards: Power BI DirectQuery mode and Dataflows support in Azure Cosmos DB

Rodrigo Souza

Revin Chalil

The Azure Cosmos DB V2 Connector for Power BI, with support for querying the Azure Cosmos DB transactional store in both DirectQuery and Import modes, is now in preview. The new V2 connector also supports Azure Cosmos DB as a data source in Power BI dataflows. It’s available now on Power BI components including Desktop, Power BI service, Power BI Mobile and on the Power Query platform.

Real-time reporting without importing data

The DirectQuery mode enables queries pushdown, including aggregations to the Azure Cosmos DB container, when a filter on partition key is specified. This new capability unlocks real time reporting and enables connecting directly to Cosmos DB without needing to import the data into Power BI cache. The DirectQuery mode in the V2 connector enables you to query and visualize large Azure Cosmos DB datasets without the limitations imposed by the Import mode. 

Before the V2 connector, Power BI developers and users were limited to the imported dataset, refreshed or not with a fixed schedule. Now with the support for in-partition aggregations, you can obtain data in real time and optimize performance with the aggregations being executed in Azure Cosmos DB, avoiding the load of big volumes of data into Power BI cache. 

Imagine a hypothetical IoT scenario where thousands of devices send billions of data points per day. Users want to monitor totals per day and when necessary, dig into one specific device. With the Import mode in V1 connector, users had to import the entire dataset into Power BI cache to dynamically perform aggregations and apply filters based on the given device. Now with the DirectQuery Mode, users can create a dashboard that will dynamically pass the DeviceId as a parameter to the query that is pushed down to Azure Cosmos DB. 

For partitioned Azure Cosmos DB containers, a SQL query with an aggregate function is pushed down to Azure Cosmos DB if the query also contains a filter (WHERE clause) on the Partition Key. For example, considering DeviceId from the scenario above as the partition key, then a SQL query with aggregations can be pushed down when there is a filter on this property: 

SELECT SUM(ColumnName) FROM TableName WHERE DeviceId = ‘1234’  

This is what we call in-partition aggregations. If you need cross-partition aggregations, we recommend Azure Synapse Link for Azure Cosmos DB.

Please note that although aggregations would not be pushed down to Azure Cosmos DB when the query does not contain a filter on the Partition Key, the projections and filters would still be pushed down. For eg: if DeviceId was not the partition key on the Cosmos DB container in the above query, the filter will be pushed down to Cosmos DB and the Sum aggregation of ColumnName will be performed in the Power BI cache. 

The most optimal way to specify Partition Key filter is to use Power BI dynamic M Parameters. For that, you need a Dashboard Slicer to list the possible Partition Keys and the selected value is used as a parameter for the dashboard with the aggregation.  

How to use Azure Cosmos DB V2 connector for Power BI 

To start creating your Power BI dashboard, follow these steps: 

Step 1 – Get Data

Click on Get Data in Power BI Desktop and search for Azure Cosmos DB and choose the Azure Cosmos DB V2 (Beta) option. If a third-party warning pops up, ignore it, and click on continue. In the next screen is where you will decide between Import or DirectQuery modes. They are both supported now.

Configure your connection to Azure Cosmos DB using the V2 connector, including the DirectQuery mode now
Configure your connection to Azure Cosmos DB using the V2 connector, including the DirectQuery mode now

Step 2 – Specify the partition key dynamic filter

To specify the Partition Key filter using dynamic M parameters after the initial configuration, you would create a dataset with unique Partition Key values, create a parameter, add it as filter on main dataset, bind it to the unique Partition key dataset, and use it as a slicer for the main dataset. Please follow the detailed steps to specify the Partition Key filter using dynamic M parameters. You can then add visualizations and apply Partition Key filter from the slicer as shown below.

Filter on a partition key to push aggregations down to Azure Cosmos DB
Filter on a partition key to push aggregations down to Azure Cosmos DB

Use Azure Cosmos DB as source in Power BI dataflows

The Azure Cosmos DB V2 Connector for Power BI supports Azure Cosmos DB as data source in Power BI dataflows, which can be used to create reusable transformation logic and can be shared across datasets, reports, and dashboards. To create a dataflow with Azure Cosmos DB as data source, use the “Define new tables” option from dataflow interface, choose Azure Cosmos DB v2 as data source, as shown below and specify any transformation logic that that can be reused across Power BI artifacts. 

Image PBI V2 Connector Data Sources

Azure Cosmos DB V2 is now a supported data source on Power Query platform.

Once the Dataflow is created, it can be consumed from Power BI Desktop and Power BI service. To consume the Azure Cosmos DB based Dataflow from Power BI Desktop, select “Dataflows” in the Get Data dropdown. Select the Azure Cosmos DB Dataflow to create a dataset that will allow you to create reports and dashboards. You can connect to the Dataflow in both DirectQuery and Import modes.

Dataflow Get Data in Power BI Desktop
Dataflow Get Data in Power BI Desktop

What to use When 

With this announcement, customers now have three options to implement Business Intelligence solutions on top of their Azure Cosmos DB data: Azure Cosmos DB V2 connector for Power BI, Azure Synapse Link, and Azure Cosmos DB ODBC driver. But what is the right option for you? 

Option  Used for… 
Azure Cosmos DB V1 connector for Power BI 
  • Legacy version now, upgrade to V2 when possible. 
Azure Cosmos DB V2 connector for Power BI 
  • Azure Cosmos DB for NoSQL API only. 
  • Real time BI: Create Power BI reports using the last inserted data on its original JSON format and Power BI DirectQuery mode. 
  • In-partition aggregations: Filter on a Partition Key and get back only aggregated data from the database, improving performance by reducing the amount of data transferred. 
  • Joins in Power BI: Add collections to your Power BI dashboard as datasets and create relationships between them. The join happens in Power BI. 
  • Share throughput between transactional and BI workloads: Use your Azure Cosmos DB RU/s to support your transactional and BI workloads, provisioning throughput enough for both. 
Azure Synapse Link for Cosmos DB 
  • Azure Cosmos DB for NoSQL, MongoDB, or Gremlin APIs. 
  • No impact on OLTP performance: No impact to OLTP performance while running BI queries. 
  • In-partition or cross partitions aggregations: Perform aggregations with no limitations.  
  • Joins between collections: Using Azure Synapse Serverless SQL pools, you can leverage many T-SQL capabilities including views, joins between collections, stored procedures, and joins between collections and data located in your Azure Data Lake Store. The joins happen in the storage level. 
  • Advanced Analytics, Data Science, and Data Engineering: unlock typical big data workloads using Apache Spark in Azure Synapse Analytics
  • Near real time BI: Create BI reports, with any BI tool, using analytical store, Azure Synapse Serverless SQL pools, and Power BI DirectQuery mode. Transactional store data will be available within 2 minutes.  
  • Reduce your Azure CosmosDB Costs: Consumption-based pricing. No impact to RUs provisioned for transactional workloads.
Azure Cosmos DB ODBC driver 
  • You can’t use Power BI 
  • You can’t use Azure Synapse Link 

 

Get Started

Try Azure Cosmos DB free with no sign-up or credit cards.

Power BI Desktop is free, and you can get started now by downloading the latest version here.

 

Conclusion 

Azure Cosmos DB now offers multiple options for BI and Analytics. You can leverage our new V2 connector to improve the performance of your Power BI dashboards, reports, or dataflows by pushing filters and aggregations down in real time with the DirectQuery mode. 

 

0 comments

Discussion is closed.

Feedback usabilla icon