Copy an Azure Data Factory pipeline to Synapse Studio


In this post, I want to share an alternative way to copy an Azure Data Factory pipeline to Synapse Studio, because I think it can be useful.

For those who are not aware, Synapse Studio is the frontend that comes with Azure Synapse Analytics. You can find out more about it in another post I did, which was a five-minute crash course about Synapse Studio.

By the end of this post, you will know one way to copy the objects used by an Azure Data Factory pipeline to Synapse Studio. It works as long as both are configured to use Git.

Azure Data Factory example

For this example, I decided to use the pipeline objects that I created for another post, which showed an Azure Test Plans example for Azure Data Factory. The pipeline uses a mapping data flow, as you can see below.

Data flow in Azure Data Factory

To use this method, both Azure Data Factory and Azure Synapse Analytics need to be set up to use source control. For this demo, I have them stored in Azure Repos within Azure DevOps.

However, they can just as easily be in a GitHub Enterprise repository instead.

Copy an Azure Data Factory pipeline to Synapse Studio

The copy itself was very simple. I just copied all the individual objects from the Azure Data Factory repository to the Azure Synapse repository, keeping the same folder structure.

Below are the objects I needed for the pipeline in the Azure Data Factory repository: the linked services, the datasets, the data flow and of course the pipeline itself. Each one is stored as a separate JSON file.
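If you are not sure which objects a pipeline depends on, one option is to read the references out of the JSON itself. The snippet below is only a minimal sketch in Python, not something I used for this post. It assumes that references appear as small JSON objects with a referenceName and a type such as DataFlowReference, DatasetReference or LinkedServiceReference, and the sample pipeline and its names are made up for illustration.

```python
def collect_references(node):
    """Recursively gather (reference type, referenced name) pairs from parsed JSON."""
    found = set()
    if isinstance(node, dict):
        name = node.get("referenceName")
        ref_type = node.get("type")
        # Assumption: a reference is any object carrying both of these keys.
        if isinstance(name, str) and isinstance(ref_type, str):
            found.add((ref_type, name))
        for value in node.values():
            found |= collect_references(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_references(item)
    return found


# Hypothetical pipeline definition with a single data flow activity.
sample_pipeline = {
    "name": "ExamplePipeline",
    "properties": {
        "activities": [
            {
                "name": "RunDataFlow",
                "type": "ExecuteDataFlow",
                "typeProperties": {
                    "dataflow": {
                        "referenceName": "ExampleDataFlow",
                        "type": "DataFlowReference",
                    }
                },
            }
        ]
    },
}

print(sorted(collect_references(sample_pipeline)))
```

Running the same function over the data flow and dataset files should then surface the datasets and linked services they point to. To use it on a real file, load the file with json.loads first and pass in the result.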

Azure Data Factory repository objects

I copied the JSON files from the Azure Data Factory repository to the same locations in the Azure Synapse workspace repository, as you can see below. I made sure they went into the same branch that I was working on in Synapse Studio.
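I did the copy by hand, but the same idea can be scripted against local clones of the two repositories. Below is a minimal sketch in Python under a few assumptions: the folder names (linkedService, dataset, dataflow and pipeline) match the ones in your repositories, both clones are already checked out on the branches you want, and copy_objects is just a hypothetical helper name. Note that it copies every JSON file in those folders rather than only the ones a single pipeline needs.

```python
import shutil
from pathlib import Path

# Assumed folder names; check them against your own repositories.
OBJECT_FOLDERS = ("linkedService", "dataset", "dataflow", "pipeline")


def copy_objects(adf_root, synapse_root, folders=OBJECT_FOLDERS):
    """Copy *.json files folder by folder, keeping the same relative layout."""
    copied = []
    for folder in folders:
        source_dir = Path(adf_root) / folder
        if not source_dir.is_dir():
            # Skip object types that the source repository does not contain.
            continue
        target_dir = Path(synapse_root) / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        for json_file in sorted(source_dir.glob("*.json")):
            shutil.copy2(json_file, target_dir / json_file.name)
            copied.append(f"{folder}/{json_file.name}")
    return copied
```

Calling copy_objects with the paths of the two local clones returns the list of files it copied, which you can then review, commit and push to the branch you are using in Synapse Studio. Files with the same name in the target folder are overwritten, so check the diff before committing.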

Same objects in Azure Synapse repository

You can see that there are some extra objects in the Azure Synapse repository. These get added by default when you connect an Azure Synapse workspace to a Git repository within Synapse Studio.

One key point is that it does this even if you do not select the option to import existing resources into Git.

Testing in Synapse Studio

Now, copying the Azure Data Factory objects this way is all well and good, but does it work?

Well, to test this thoroughly I recreated the two Azure SQL databases that were used in the initial data flow. The source database is based on the AdventureWorksLT sample database and the other database is blank.

Afterwards, I opened up Synapse Studio and went to the Manage hub, where I changed the linked services for the two databases to connect to the new ones.

Linked services in the Manage hub

Once I had done that, I went into the Develop hub in Synapse Studio. I then opened the new Data flow and enabled Data flow debug.

I then tested the connection to the dataset, as you can see below. In addition, I was able to preview the data.

Data flow in Synapse Studio

Afterwards, I went to the Integrate hub. From there I ran the pipeline in Synapse Studio by clicking Debug. The run succeeded, as you can see below.

Pipeline in Synapse Studio

To be absolutely sure, I went into the Azure SQL database that was used for the destination (aka sink) in the Azure Portal. A quick note on terminology: in pipelines and data flows the destination is called the sink.

I then logged into the Query editor and ran the query below to make sure that new rows were in the database.

Checking rows existed using Query editor

This confirms that the method worked, because that database was blank before I ran the pipeline.

DataWeekender lightning talks

In reality, something simple and effective like this can be explained within ten minutes as a lightning talk. With this in mind, if you have something like this that you want to share with the community, feel free to submit a lightning talk session to DataWeekender v4.2.

I thought I had better mention this since the call for speakers is still open. You can get to the Sessionize page by clicking on this DataWeekender v4.2 call for speakers link or on the image below.

Final words about copying an Azure Data Factory pipeline to Synapse Studio

I hope this post about an alternative way to copy an Azure Data Factory pipeline to Synapse Studio helps some of you.

I like this method because it shows a simple and effective way to copy objects from Azure Data Factory to Azure Synapse.

I discovered this whilst looking to create more Azure DevOps templates after a previous post, which introduced Azure DevOps templates for Data Platform deployments. So, expect more templates to appear on my GitHub site.

Of course, if you have any comments or queries about this post, feel free to reach out to me.

Please note – This blog was originally published on my personal blog here.

Kevin Chant

About

Lead BI & Analytics Architect originally from the UK and now living in the Netherlands. Currently a Microsoft Data Platform MVP and Microsoft Certified Trainer Alumni. Has many years' experience in the IT sector, and has supported databases for companies in the top 10 of the Fortune 500 list. In addition to a lot of Data Platform experience, also has a fair few Microsoft certifications, and was probably the last ever person in the world to gain the MCSD Azure Architect certification. Has real-life experience with Microsoft Data Platform and Azure DevOps. Previously SQL Server Product Owner of around 1,900 instances. In addition, has done various things for the Data Platform community, one of the most recent being co-organizing the online DataWeekender conference.

More on Kevin Chant.
