Power BI from Rookie to Rock Star Archives - Microsoft Dynamics 365 Blog
http://microsoftdynamics.in/category/power-bi-from-rookie-to-rock-star/

What is Data Factory in Microsoft Fabric
http://microsoftdynamics.in/2023/05/30/what-is-data-factory-in-microsoft-fabric/
Tue, 30 May 2023 04:49:34 +0000
The post What is Data Factory in Microsoft Fabric appeared first on RADACAD (https://radacad.com/?p=18157).

Microsoft Fabric is an end-to-end data analytics solution in the cloud, and one of its workloads is called Data Factory. In this article, you will learn what Data Factory is, how it works with the rest of Microsoft Fabric, and what its elements and functions are.

Microsoft Fabric

To understand Data Factory, it is best to understand Microsoft Fabric first. Microsoft Fabric is an end-to-end data analytics software-as-a-service offering from Microsoft. It combines several products and services into a single, easy-to-use platform for data analytics. Here are the components (also called workloads) of Microsoft Fabric.

Microsoft Fabric

To learn more about Microsoft Fabric and how to enable it in your organization, I recommend reading the articles below.

Data Factory Origin

Microsoft Fabric has a workload for data integration. Any end-to-end data analytics system should have a data integration component. Microsoft has been a leader in data integration tools and services for decades. This started with SQL Server tools such as DTS (Data Transformation Services) and SSIS (SQL Server Integration Services) and then moved into cloud-based technologies such as ADF (Azure Data Factory). Microsoft also built Power Query, a data transformation engine that first targeted citizen data analysts.

Data Factory is the data integration component of Microsoft Fabric, which brings the power of Azure Data Factory and Power Query Dataflows into one place. For many years, these two technologies handled data transformation separately. Now they are combined under Fabric as Data Factory.

Power Query

Power Query Dataflows was first announced a few years ago as an addition to Power BI: a cloud data transformation technology that is simple for data analysts to use. It soon became more than a Power BI feature and grew into Power Platform Dataflows. These days, Power Query Dataflows are used for data transformation in Power BI projects and for data migration in Power Apps projects.

Power Query

Although Power Query Dataflows covers the data flow side well, it needed enhancements in scalability and in controlling execution with control flow elements (such as loop structures and conditional execution).

Azure Data Factory

Azure Data Factory came into the market many years ago as the next generation of SSIS for in-the-cloud ETL. However, the data transformation engine of Azure Data Factory was not built on a strong basis, so most of the time ADF was used for data ingestion, with the transformation done afterward by SQL stored procedures and similar means. ADF was not a tool for citizen data analysts; it was for data engineers and developers. ADF used data pipelines to execute a group of activities as a flow, and those activities included tasks such as copying data and running a stored procedure.

Azure Data Factory. Image sourced from: https://learn.microsoft.com/en-us/azure/data-factory/introduction

For the past few years, we have had this split: if you wanted a simple-to-use data transformation engine and did not have much data, you used Power Query Dataflows; if you wanted scalable data ingestion, you used Azure Data Factory.

Best of Both Worlds

Now, Microsoft Fabric combines the best of Power Query Dataflows and Azure Data Factory Data Pipelines into one stream: Data Factory. You still have the simple-to-use and powerful Power Query engine for data transformation, and you also have the scalability of Data Pipelines, which let you build a control flow for executing the ETL. In other words, Data Factory is a state-of-the-art ETL software-as-a-service offering in Microsoft Fabric.

Data Factory in Microsoft Fabric combines Azure Data Factory and Power Query Dataflows together.

Elements of Data Factory

Combining these two services brings features that make Data Factory a very capable ETL service. Here are some of them:

Data Connectors

For an ETL (Extract, Transform, Load) system, one of the most important aspects is what sources the data can be fetched from. Data Factory offers hundreds of data connectors, enabling you to get data from sources such as databases, files, folders, software-as-a-service systems, etc.

Data Factory Connectors

It is also possible to create your own connector if you are keen.

Dataflows

Dataflows are the heart of Data Factory. This is where you get the data from the sources, define the data transformations that prepare it in whatever shape is needed, and finally load it into destinations. Dataflows use the Power Query data transformation engine, and you create them in the simple-to-use Power Query Editor online.

Dataflow
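The get data, transform, and load sequence described above can be sketched in a few lines. This is only an illustration in Python, not how Data Factory or Power Query is implemented; the sample rows, the step functions, and the `lakehouse_table` list standing in for a destination are all hypothetical.

```python
from typing import Callable, Iterable

Row = dict[str, object]


def extract() -> list[Row]:
    # Hypothetical source rows; a real dataflow would read them through a connector.
    return [
        {"customer": " alice ", "amount": "120.50"},
        {"customer": "BOB", "amount": "80"},
        {"customer": "", "amount": "15"},
    ]


def clean_names(rows: Iterable[Row]) -> list[Row]:
    # Trim whitespace and normalize the casing of customer names.
    return [{**r, "customer": str(r["customer"]).strip().title()} for r in rows]


def drop_blank_customers(rows: Iterable[Row]) -> list[Row]:
    # Remove rows that have no customer name.
    return [r for r in rows if r["customer"]]


def parse_amounts(rows: Iterable[Row]) -> list[Row]:
    # Convert the text amounts into numbers.
    return [{**r, "amount": float(str(r["amount"]))} for r in rows]


def run_dataflow(
    source: Callable[[], list[Row]],
    steps: list[Callable[[Iterable[Row]], list[Row]]],
    destination: list[Row],
) -> None:
    # Get the data, apply each transformation step in order, then load the result.
    rows = source()
    for step in steps:
        rows = step(rows)
    destination.extend(rows)


lakehouse_table: list[Row] = []  # stand-in for a real destination
run_dataflow(extract, [clean_names, drop_blank_customers, parse_amounts], lakehouse_table)
print(lakehouse_table)
```

Each step takes rows and returns new rows, which loosely mirrors how a Power Query query is built as an ordered list of applied steps.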

Power Query Editor online is not only powerful in its graphical interface; it also lets the developer write code in M, the data transformation language of Power Query.

Power Query Editor online

To learn more about Dataflows, I suggest reading my article below.

At the time of writing this article, Dataflows support the following destinations:

  • Azure Data Explorer (Kusto)
  • Azure SQL Database
  • Data Warehouse
  • Lakehouse

Data Pipelines

Although Dataflows are the main ETL component of Data Factory, they can be enhanced by wrapping them in a control flow execution component, called a Data Pipeline. A Data Pipeline is a group of activities (or tasks) run in a defined flow of execution. The activities in a Pipeline can include copying data, running a Dataflow, executing a stored procedure, looping until a certain condition is met, or executing a particular set of activities only if a condition is met.

Data Pipeline

Data Pipelines can then be scheduled, and a monitoring tool shows the execution status of each pipeline run. In addition, each activity has state outputs that let you define what happens if a certain task fails or succeeds.
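To make the idea of activities connected by success and failure outputs concrete, here is a minimal sketch in Python. It is not the Data Factory engine or its API; the `Activity` class, the outcome names, and the activity names are assumptions made up for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Activity:
    name: str
    run: Callable[[], None]
    # Maps an upstream activity name to the outcome it must have had
    # ("Succeeded" or "Failed") for this activity to run.
    depends_on: dict[str, str] = field(default_factory=dict)


def run_pipeline(activities: list[Activity]) -> dict[str, str]:
    """Run activities in list order and return the outcome of each one."""
    outcomes: dict[str, str] = {}
    for activity in activities:
        ready = all(
            outcomes.get(upstream) == wanted
            for upstream, wanted in activity.depends_on.items()
        )
        if not ready:
            outcomes[activity.name] = "Skipped"
            continue
        try:
            activity.run()
            outcomes[activity.name] = "Succeeded"
        except Exception:
            outcomes[activity.name] = "Failed"
    return outcomes


def failing_dataflow() -> None:
    # Simulates a dataflow refresh that fails.
    raise RuntimeError("source unavailable")


pipeline = [
    Activity("CopyData", lambda: None),
    Activity("RunDataflow", failing_dataflow, {"CopyData": "Succeeded"}),
    Activity("RunStoredProcedure", lambda: None, {"RunDataflow": "Succeeded"}),
    Activity("NotifyOnFailure", lambda: None, {"RunDataflow": "Failed"}),
]
result = run_pipeline(pipeline)
print(result)
```

In this run the simulated dataflow fails, so the stored procedure is skipped and the notification activity runs instead, which is the kind of branching the activity state outputs allow.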

As mentioned, one of the most important activities in a Pipeline is the execution of a Dataflow. This is where Dataflows and Data Pipelines work best together.

Executing Dataflows from Data Pipeline

To learn more about Data Pipelines, read my article below.

Summary

Data Factory is an ETL-in-the-cloud solution and the data integration workload of Microsoft Fabric. Data Factory is not an entirely new product or service; it comes from many years of Microsoft data transformation tools and services and is built on top of Power Query and Azure Data Factory. Data Factory uses two main components to deliver ETL scenarios: Dataflows and Data Pipelines. Dataflows handle the main get data, transform, and load process, and Data Pipelines control the rest of the execution with control flow activities.

I highly recommend reading the articles below to learn more about Data Factory.

Streamline Power BI Refresh: Refresh dataset after a successful refresh of dataflow
http://microsoftdynamics.in/2021/01/07/streamline-power-bi-refresh-refresh-dataset-after-a-successful-refresh-of-dataflow/
Thu, 07 Jan 2021 02:58:11 +0000
The post Streamline Power BI Refresh: Refresh dataset after a successful refresh of dataflow appeared first on RADACAD (https://radacad.com/?p=14515).

streamline Power BI dataflow and dataset refresh

Do you have a Power BI dataset that gets data from a dataflow? Have you ever thought, “Can I get the dataset refreshed only after the refresh of the dataflow has completed successfully?” The answer is yes, you can. One of the recent updates from the data integration team of Power BI made this possible. Let’s see how in this blog and video.

The scenario

If you are using both dataflows and datasets in your Power BI architecture, then your datasets are very likely getting part of their data from Power BI dataflows. It would be great if you could get the Power BI dataset refreshed right after a successful refresh of the dataflow. In fact, you can build a scenario like the one below.

streamline the refresh of Power BI dataset automatically after successful refresh of the Power BI dataflow

Power Automate connector for dataflow

Power Automate recently announced the availability of a connector that allows you to trigger a flow when a dataflow refresh completes.

Trigger for when the dataflow refresh completes

Choosing the dataflow

You can then choose the workspace (or environment if you are using Power Platform dataflows), and the dataflow.

Dataflow setting in the Power Automate dataflow connector

Condition on success or fail

The dataflow refresh can succeed or fail, and you can choose the proper action in each case. To do this, add a condition that checks whether the result of the refresh is Success.

checking if the dataflow refresh was successful

Refresh Power BI dataset

In the event of a successful refresh of the dataflow, you can then run the refresh of the Power BI dataset.

refresh Power BI dataset from Power Automate

Refreshing a Power BI dataset through Power Automate has been available in the service for some time.

Capture the failure

If the refresh fails, you can capture the failure details and send a notification, or add a record to a database table for later log reporting.

send email notification if the dataflow refresh failed

Overall flow

The overall flow is a really simple but effective way to control the refresh, as you can see below.

refresh Power BI dataset after dataflow
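The branching in this flow can also be written down as a small Python sketch. This is not how Power Automate runs the flow; the handler, its callbacks, and the IDs are hypothetical, and the status value "Success" is taken from the condition step described earlier. The URL follows the Power BI REST API "Refresh Dataset In Group" operation, which would be called with an authenticated POST request; the sketch only builds the URL and does not send anything.

```python
from typing import Callable


def dataset_refresh_url(group_id: str, dataset_id: str) -> str:
    # Endpoint of the Power BI REST API "Refresh Dataset In Group" operation (POST).
    return (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{group_id}/datasets/{dataset_id}/refreshes"
    )


def on_dataflow_refresh_completed(
    status: str,
    refresh_dataset: Callable[[], None],
    notify: Callable[[str], None],
) -> str:
    # Condition step: refresh the dataset only when the dataflow refresh succeeded.
    if status == "Success":
        refresh_dataset()
        return "dataset refresh requested"
    # Otherwise capture the failure and notify someone.
    notify(f"Dataflow refresh ended with status: {status}")
    return "failure notification sent"


# Usage with stand-in actions that only record what would happen.
calls: list[str] = []
url = dataset_refresh_url("my-workspace-id", "my-dataset-id")  # hypothetical IDs

ok = on_dataflow_refresh_completed("Success", lambda: calls.append(f"POST {url}"), calls.append)
failed = on_dataflow_refresh_completed("Failed", lambda: calls.append(f"POST {url}"), calls.append)
print(ok, failed, calls)
```

Passing the two actions in as callbacks keeps the decision logic separate from the actual refresh and notification calls, so the same handler could log to a database table instead of sending an email.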

My thoughts

Making sure that the refresh of the dataset happens after the refresh of the dataflow was one of the challenges for Power BI developers who use dataflows. Now, using this simple functionality, you can streamline the refresh process all the way from the dataflow.

A dataflow refresh can also be run as an action in Power Automate, which might be useful for some scenarios, such as refreshing the dataflow after a certain event.

refresh a dataflow from Power Automate

This is useful not only for refreshing the Power BI dataset after the dataflow, but also for refreshing one dataflow after another. In best practice scenarios, I always recommend having layers of dataflows for staging, data transformation, and so on, as I explained in the article below.

multi-layered dataflow. source: https://docs.microsoft.com/en-us/power-query/dataflows/best-practices-reusing-dataflows
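With several layers, the order of refreshes follows from which item reads from which. The sketch below works that order out with Python's standard `graphlib` module and stops the chain at the first failure. The item names and the `refresh` callback are hypothetical, and in practice each step would be its own Power Automate trigger and action rather than a Python loop.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical items: each key lists the items it reads its data from.
depends_on = {
    "staging_dataflow": set(),
    "transformation_dataflow": {"staging_dataflow"},
    "sales_dataset": {"transformation_dataflow"},
}

# Upstream items come first in the resulting order.
refresh_order = list(TopologicalSorter(depends_on).static_order())


def refresh_in_order(order: list[str], refresh: Callable[[str], bool]) -> list[str]:
    # Refresh each item in turn and stop as soon as one refresh fails,
    # so downstream items are never refreshed on top of stale data.
    done: list[str] = []
    for item in order:
        if not refresh(item):
            break
        done.append(item)
    return done


# Simulate a failure in the transformation layer.
completed = refresh_in_order(refresh_order, lambda item: item != "transformation_dataflow")
print(refresh_order, completed)
```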

A dataflow is not a replacement for a data warehouse. However, features like this help the usability and adoption of this great transformation service.

Can you think of any scenarios that you would use this for? Let me know in the comments below; I’d love to hear about them.
