The Azure Cosmos DB V2 Connector for Power BI, with support for querying the Azure Cosmos DB transactional store in both DirectQuery and Import modes, is now in preview. The new V2 connector also supports Azure Cosmos DB as a data source in Power BI dataflows. It’s available now on Power BI components including Desktop, Power BI service, Power BI Mobile and on the Power Query platform.
Real-time reporting without importing data
The DirectQuery mode enables queries pushdown, including aggregations to the Azure Cosmos DB container, when a filter on partition key is specified. This new capability unlocks real time reporting and enables connecting directly to Cosmos DB without needing to import the data into Power BI cache. The DirectQuery mode in the V2 connector enables you to query and visualize large Azure Cosmos DB datasets without the limitations imposed by the Import mode.
Before the V2 connector, Power BI developers and users were limited to the imported dataset, refreshed or not with a fixed schedule. Now with the support for in-partition aggregations, you can obtain data in real time and optimize performance with the aggregations being executed in Azure Cosmos DB, avoiding the load of big volumes of data into Power BI cache.
Imagine a hypothetical IoT scenario where thousands of devices send billions of data points per day. Users want to monitor totals per day and when necessary, dig into one specific device. With the Import mode in V1 connector, users had to import the entire dataset into Power BI cache to dynamically perform aggregations and apply filters based on the given device. Now with the DirectQuery Mode, users can create a dashboard that will dynamically pass the DeviceId as a parameter to the query that is pushed down to Azure Cosmos DB.
For partitioned Azure Cosmos DB containers, a SQL query with an aggregate function is pushed down to Azure Cosmos DB if the query also contains a filter (WHERE clause) on the Partition Key. For example, considering DeviceId from the scenario above as the partition key, then a SQL query with aggregations can be pushed down when there is a filter on this property:
SELECT SUM(ColumnName) FROM TableName WHERE DeviceId = ‘1234’
This is what we call in-partition aggregations. If you need cross-partition aggregations, we recommend Azure Synapse Link for Azure Cosmos DB.
Please note that although aggregations would not be pushed down to Azure Cosmos DB when the query does not contain a filter on the Partition Key, the projections and filters would still be pushed down. For eg: if DeviceId was not the partition key on the Cosmos DB container in the above query, the filter will be pushed down to Cosmos DB and the Sum aggregation of ColumnName will be performed in the Power BI cache.
The most optimal way to specify Partition Key filter is to use Power BI dynamic M Parameters. For that, you need a Dashboard Slicer to list the possible Partition Keys and the selected value is used as a parameter for the dashboard with the aggregation.
How to use Azure Cosmos DB V2 connector for Power BI
To start creating your Power BI dashboard, follow these steps:
Step 1 – Get Data
Click on Get Data in Power BI Desktop and search for Azure Cosmos DB and choose the Azure Cosmos DB V2 (Beta) option. If a third-party warning pops up, ignore it, and click on continue. In the next screen is where you will decide between Import or DirectQuery modes. They are both supported now.
Step 2 – Specify the partition key dynamic filter
To specify the Partition Key filter using dynamic M parameters after the initial configuration, you would create a dataset with unique Partition Key values, create a parameter, add it as filter on main dataset, bind it to the unique Partition key dataset, and use it as a slicer for the main dataset. Please follow the detailed steps to specify the Partition Key filter using dynamic M parameters. You can then add visualizations and apply Partition Key filter from the slicer as shown below.
Use Azure Cosmos DB as source in Power BI dataflows
The Azure Cosmos DB V2 Connector for Power BI supports Azure Cosmos DB as data source in Power BI dataflows, which can be used to create reusable transformation logic and can be shared across datasets, reports, and dashboards. To create a dataflow with Azure Cosmos DB as data source, use the “Define new tables” option from dataflow interface, choose Azure Cosmos DB v2 as data source, as shown below and specify any transformation logic that that can be reused across Power BI artifacts.
Azure Cosmos DB V2 is now a supported data source on Power Query platform.
Once the Dataflow is created, it can be consumed from Power BI Desktop and Power BI service. To consume the Azure Cosmos DB based Dataflow from Power BI Desktop, select “Dataflows” in the Get Data dropdown. Select the Azure Cosmos DB Dataflow to create a dataset that will allow you to create reports and dashboards. You can connect to the Dataflow in both DirectQuery and Import modes.
What to use When
With this announcement, customers now have three options to implement Business Intelligence solutions on top of their Azure Cosmos DB data: Azure Cosmos DB V2 connector for Power BI, Azure Synapse Link, and Azure Cosmos DB ODBC driver. But what is the right option for you?
Option | Used for… |
Azure Cosmos DB V1 connector for Power BI |
|
Azure Cosmos DB V2 connector for Power BI |
|
Azure Synapse Link for Cosmos DB |
|
Azure Cosmos DB ODBC driver |
|
Get Started
Try Azure Cosmos DB free with no sign-up or credit cards.
Power BI Desktop is free, and you can get started now by downloading the latest version here.
Conclusion
Azure Cosmos DB now offers multiple options for BI and Analytics. You can leverage our new V2 connector to improve the performance of your Power BI dashboards, reports, or dataflows by pushing filters and aggregations down in real time with the DirectQuery mode.
0 comments