
Azure Purview

Jan 15, 2021
Sogeti Labs

We create the new resource group

We create the Purview resource

One of the first actions I need to take is to assign a role to Purview on the storage accounts we created.

In our case

From that moment on, we already have the option to scan our data source. We note that the options on the left side of our Purview console have increased.

Once we have enabled the read role in our data sources, we can proceed to work with them through Azure Purview. We will start with our Azure Data Lake Gen2. Click on Register and select the resource from the set on the right.

And we register our resource, creating a collection called Azure-Synapse-Workshop.

We proceed to register it

Completed

Once registered, we proceed to perform a scan. To do this, we click on the AzureBlob target.

The Purview engine executes the process, connecting to the source and showing the different folders that exist in it. Click Continue.

We select the scan rule. In our case, as we have not created any additional ones, we will work with the one that exists by default.

We can even program the scan frequency.

This time, we will set it to run only once.

We do a check

And we proceed. Now we can only wait for the results

Completed

We see that Purview has successfully scanned the resource and found three assets, but none of them contains information identified as classified.

After this step, we are going to register another resource, in this case Azure Synapse

By linking the resource to the same collection, we see that it is included just below the Azure Blob.

IMPORTANT: Azure Synapse is a bit more laborious than the previous source. Here the SQL pool must be running, and we must also grant permissions to our Azure Purview account through T-SQL.

Let's see how. The first thing is to open the Azure Synapse workspace.
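As a rough sketch of the T-SQL permissions step mentioned above, the script below creates a database user for the Purview managed identity and gives it read access. It assumes it is run inside the dedicated SQL pool database while connected as an Azure Active Directory admin. The name my-purview-account is a hypothetical placeholder for the Purview account name, and db_datareader is an assumed minimum role, not something taken from the original walkthrough.

```sql
-- Sketch only: run in the dedicated SQL pool database, not in master.
-- [my-purview-account] is a hypothetical placeholder; replace it with
-- the name of your own Purview account (its managed identity).
CREATE USER [my-purview-account] FROM EXTERNAL PROVIDER
GO

-- Assumption: read access is enough for Purview to scan the tables.
EXEC sp_addrolemember 'db_datareader', [my-purview-account]
GO
```

The SQL pool has to be running when the script is executed, and it has to stay running while Purview scans it.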

We have it

We see all the tables and proceed to execute the scan

We see the result

As in Azure Synapse we have a table with customer information, let’s see how it looks.

To show us the lineage of the data, we must use Data Factory and

I create a new database, and a Data Factory pipeline that replicates the one previously created in Azure Synapse.

To view the server in Purview, we must grant read permissions and add permissions in the database.
To do this, we must create a user in Azure Active Directory.

Assign the role to that user

Reset the password: sign in with this username and its temporary password, then set a new one. The new password is the one used to connect with SSMS.
Then, in SSMS, connect using Active Directory - Password authentication to execute the script below.

CREATE USER [purviewaa] FROM EXTERNAL PROVIDER
GO

EXEC sp_addrolemember 'db_owner', [purviewaa]
GO
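To check that the script took effect, a query along these lines can list the roles held by the new user. This is only a sketch: it assumes the user name purviewaa from the script above and relies on the standard catalog views sys.database_role_members and sys.database_principals.

```sql
-- Sketch only: list the database roles that the user purviewaa belongs to.
SELECT r.name AS role_name,
       m.name AS member_name
FROM sys.database_role_members AS drm
JOIN sys.database_principals AS r
    ON drm.role_principal_id = r.principal_id   -- the role
JOIN sys.database_principals AS m
    ON drm.member_principal_id = m.principal_id -- the member of that role
WHERE m.name = 'purviewaa';
```

Since db_owner grants far more than read access, a narrower role such as db_datareader may be enough for scanning. That is an assumption worth testing in your own environment.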

NOTE:
I did the same with the rest of the data sources I was working with. For example, with the SQL pool of Azure Synapse

And even with the Azure Blob Storage account

And our Azure Synapse

Include details in datasets
In the case of experts or owners

In the case of the classification, we observe that the tool has made a first classification, but we have the possibility to modify it and even increase it. We will see

Now we complete the set

The result looks like this

Creating a Glossary

Connection with Data Factory

To then be able to see the Lineage

About the author

SogetiLabs gathers distinguished technology leaders from around the Sogeti world. It is an initiative explaining not how IT works, but what IT means for business.
