We create the new resource group
We create the Purview resource
One of the first things I need to do is assign a role to Purview on the storage accounts we created.
In our case
From that moment on, we have the option to scan our data source. Note that more options now appear on the left side of the Purview console.
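The role assignment described above is done in the portal, but it can also be scripted. The following is a minimal sketch that only builds the Azure CLI command, without running it. It assumes the read role in question is the built-in "Storage Blob Data Reader" role, and every ID and name in it is a hypothetical placeholder rather than a value from this walkthrough.

```python
from __future__ import annotations

import shlex

# Assumption: the "read role" granted to Purview is the built-in
# "Storage Blob Data Reader" role. Adjust if your setup differs.
READ_ROLE = "Storage Blob Data Reader"


def storage_account_scope(subscription_id: str, resource_group: str, account: str) -> str:
    """Build the ARM resource ID of a storage account, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )


def role_assignment_command(principal_id: str, scope: str, role: str = READ_ROLE) -> list[str]:
    """Return the Azure CLI arguments that assign `role` to `principal_id` on `scope`."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_id,
        "--role", role,
        "--scope", scope,
    ]


if __name__ == "__main__":
    scope = storage_account_scope(
        "00000000-0000-0000-0000-000000000000",  # hypothetical subscription ID
        "my-resource-group",                     # hypothetical resource group
        "mystorageaccount",                      # hypothetical storage account
    )
    # Hypothetical object ID of the Purview account's managed identity.
    cmd = role_assignment_command("11111111-1111-1111-1111-111111111111", scope)
    # Print a copy-pasteable command; nothing is executed against Azure here.
    print(shlex.join(cmd))
```

Keeping the command as a list of arguments makes it easy to review before running it, or to repeat it for each storage account that Purview needs to read.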
Once we have granted the read role on our data sources, we can work with them through Azure Purview. We will start with our Azure Data Lake Storage Gen2 account. Click Register and select the resource from the list on the right.
And we register our resource, creating a collection called Azure-Synapse-Workshop.
We proceed to register it
Once registered, we proceed to perform a scan. To do this, we click on the AzureBlob target.
The Purview engine runs the process, connecting to the source and showing the folders it contains. Click Continue.
We select the scan rule. In our case, as we have not created any additional ones, we will work with the one that exists by default.
We can even schedule the scan frequency.
This time, we will run it just once.
We review the configuration
And we proceed. Now all we can do is wait for the results
We see that the scan completed successfully and found three assets, but none of them contains information identified as classified.
After this step, we register another resource, in this case Azure Synapse.
By linking the resource to the same collection, we see that it is included just below the Azure Blob.
IMPORTANT: Azure Synapse is a bit more laborious than the previous source. The SQL pool must be running, and we must also grant permissions to Azure Purview through T-SQL.
Let's see how. The first step is to open the Azure Synapse workspace.
We have it
We see all the tables and proceed to execute the scan
We see the result
Since we have a table with customer information in Azure Synapse, let's see how it looks.
To show us the lineage of the data, we must use Data Factory and
I create a new database, and a Data Factory pipeline that replicates the one previously created in Azure Synapse.
To view the server in Purview, we must add read permissions and also add permissions in the database.
To do this, we must create a user in Active Directory
Assign the role to that user
Reset the password: sign in with this username and the temporary password, then set a new one. The new password is the one used to connect with SSMS.
Then connect through SSMS using Active Directory - Password authentication to run the script below.
CREATE USER [purviewaa] FROM EXTERNAL PROVIDER;
EXEC sp_addrolemember 'db_owner', [purviewaa];
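If the same grant has to be repeated on several databases, it can help to generate the script rather than retype it. The sketch below renders the two statements above for a given principal and role. The helper names are hypothetical, and the default role simply mirrors the db_owner choice made here. A narrower role such as db_datareader may be enough for scanning alone, but that is an assumption to verify for your own setup.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Single-quote a T-SQL string literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def purview_grant_script(principal: str, role: str = "db_owner") -> str:
    """Render the CREATE USER / sp_addrolemember pair for one principal."""
    return "\n".join([
        f"CREATE USER {quote_identifier(principal)} FROM EXTERNAL PROVIDER;",
        f"EXEC sp_addrolemember {quote_literal(role)}, {quote_identifier(principal)};",
    ])


if __name__ == "__main__":
    # Prints the same two statements used in this walkthrough.
    print(purview_grant_script("purviewaa"))
```

The output uses plain ASCII single quotes around the role name. Curly quotes pasted from a web page are a common reason this kind of script fails in SSMS.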
I did the same with the rest of the data sources I was working with. For example, with the Azure Synapse SQL pool
And even with the Azure Blob Storage account
And our Azure Synapse
Include details in datasets
In the case of experts or owners
As for classification, the tool has made an initial pass, but we can modify it and even extend it. Let's see
Now we complete the set
Being that way
Creating a Glossary
Connection with Data Factory
To then be able to see the Lineage
About Alberto Alonso Marcos
My name is Alberto Alonso. I currently work at Sogeti Spain in the Business Intelligence department with Microsoft technologies. My profile is very customer-oriented, focused on how data can improve the organization. My first steps in data management were in the pharmaceutical sector (I am a pharmacist too). I worked hard to extract data and build procedures for gathering information across the organization, measuring all kinds of events and aggregating different sources such as ERP, LIMS, HVAC, OEE tools, and machine productivity reports.