Hands-on Introduction to Azure Data Factory - Part 1

Data Factory is a cloud-based data integration service that lets us build data-driven workflows in the cloud to orchestrate and automate data movement and data transformation. It is essentially an ETL (Extract, Transform and Load) service offered in the Azure cloud. Through the Data Factory service, one can set up and schedule automated data pipelines that extract data from various sources, transform it based on business logic and rules, and load it into the target.

For better understanding, I'm going to split this topic into several parts. First, let's start with the need for Data Factory.

Why do we need Data Factory?

To understand the need for Data Factory, let's assume you have a company that stores all of its data in an on-premises data center across different databases, for instance Oracle, SQL Server, etc. At a later stage, due to the growth of the company, or to take advantage of cloud benefits such as high scalability, durability and availability, or to perform advanced analytics, you want to move your data from on-premises to the cloud. To migrate that data, any organization needs to follow a sequence of steps: extract the data by connecting to the source, transform the data (that is, clean it), and load it into the destination. Using Data Factory, we can perform these steps. In an on-premises data center, ETL activities are generally performed with software such as Informatica or SSIS.
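The extract, transform, load sequence described above can be sketched in a few lines of Python. This is a minimal, self-contained illustration: the source rows, the cleaning rule and the target list are hypothetical stand-ins for a real database and sink, not anything Data Factory itself provides.

```python
# Minimal ETL sketch: extract rows from a "source", clean them, load into a "target".
# The data and rules here are hypothetical; in Data Factory these steps become
# pipeline activities wired to real linked services.

source = [
    {"id": 1, "name": "  Alice ", "revenue": "1200"},
    {"id": 2, "name": "Bob", "revenue": None},   # dirty row: missing revenue
    {"id": 3, "name": "Carol", "revenue": "850"},
]

def extract(rows):
    """Read raw rows from the source system."""
    return list(rows)

def transform(rows):
    """Clean the data: trim names, drop rows without revenue, cast numbers."""
    return [
        {"id": r["id"], "name": r["name"].strip(), "revenue": int(r["revenue"])}
        for r in rows
        if r["revenue"] is not None
    ]

def load(rows, target):
    """Write the cleaned rows into the destination."""
    target.extend(rows)
    return target

target = load(transform(extract(source)), [])
print(target)  # the two clean rows, with names trimmed and revenue as int
```

The same three-stage shape carries over to Data Factory, where each stage becomes an activity in a pipeline instead of a local function call.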

Important note: unstructured data can be stored in Blob Storage or Data Lake. For structured data, we can use SQL Database, SQL Data Warehouse, etc.

At this stage, I hope you understand the importance of the Data Factory service. It falls under the Platform as a Service (PaaS) category. Now, let's dive into the Data Factory workflow.

Data Factory Workflow :

First, we create a pipeline. Then we connect to different data sources to extract data from them. After that, we transform and enrich the data through cleaning and sorting operations. Next, we publish the changes so that whenever we come back to the pipeline we can see the enriched data. Finally, we monitor the pipeline and its operations to keep it healthy, ensure high availability, and maintain good performance on incoming data.
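The workflow above (create a pipeline, ingest, transform, publish, monitor) can be sketched as an ordered list of stages with a simple run log standing in for monitoring. The stage names and the logging decorator are my own illustration of the idea, not the Data Factory API.

```python
# Sketch of the Data Factory workflow as an ordered pipeline of stages.
# Each stage is a function; "monitoring" here is just a run log of what executed.

run_log = []

def stage(name):
    """Decorator that records each stage execution in the run log (monitoring)."""
    def wrap(fn):
        def inner(data):
            run_log.append(name)
            return fn(data)
        return inner
    return wrap

@stage("ingest")
def ingest(_):
    return ["raw_b", "raw_a"]          # pretend we connected to a source

@stage("transform")
def transform(rows):
    # Clean (strip the "raw_" prefix) and sort, as in the cleaning/sorting step.
    return sorted(r[len("raw_"):] for r in rows)

@stage("publish")
def publish(rows):
    return {"published": rows}         # make the enriched data visible

pipeline = [ingest, transform, publish]

result = None
for step in pipeline:
    result = step(result)

print(result)    # {'published': ['a', 'b']}
print(run_log)   # ['ingest', 'transform', 'publish']
```

Reading the run log after a run is the toy analogue of Data Factory's monitoring view, which shows which activities ran and in what order.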

How to create a Data Factory from the Azure portal?

1. Go to portal.azure.com 

2. Search for Data Factory in the Azure portal.

3. First, select your subscription. Give your Data Factory a name, select or create a resource group for it, and choose the region where you want the service created. Optionally, configure Git if you have a repository ready for version-controlling your changes; you can skip this and set it up later. After filling in all this information, click Create.
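The portal form in step 3 boils down to a handful of parameters: subscription, name, resource group and region. The sketch below assembles and sanity-checks them before a hypothetical creation call. The naming rule encoded here (3-63 characters, letters, digits and hyphens, starting and ending with a letter or digit) is my reading of Azure's Data Factory naming restrictions, so verify it against the official documentation; the field names and sample values are likewise illustrative.

```python
import re

# Assumed naming rule: 3-63 chars, letters/digits/hyphens, alphanumeric at both
# ends. Check Azure's current Data Factory naming restrictions before relying on it.
NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]{1,61}[A-Za-z0-9]$")

def build_factory_request(name, resource_group, region, subscription_id):
    """Collect the portal form fields into one request payload (illustrative)."""
    if not NAME_RE.match(name):
        raise ValueError(f"invalid Data Factory name: {name!r}")
    return {
        "subscriptionId": subscription_id,
        "resourceGroup": resource_group,
        "factoryName": name,
        "location": region,
    }

# Hypothetical values matching the portal walkthrough above.
req = build_factory_request(
    name="my-adf-demo-01",
    resource_group="rg-demo",
    region="westeurope",
    subscription_id="sub-id-placeholder",
)
print(req["factoryName"], req["location"])  # my-adf-demo-01 westeurope
```

Validating the name locally is cheap insurance, since the Data Factory name must be globally unique and a rejected name otherwise only surfaces after you submit the form.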


That's it: your Data Factory will become available in a couple of minutes. I hope you enjoyed this article. We recommend subscribing to our zukunfttech blog to get notified about upcoming articles. Feel free to share, and raise your questions in the comments.
