Dataverse Link to Azure Synapse: Dataverse to Data Lake and query via Azure Data Lake Gen2
In this blog you will learn about Dataverse Link to Azure Synapse
In this tutorial I will teach you how to use Dataverse link to Synapse and sync your Dataverse tables with Azure Data Lake Gen2. In this blog you will learn about:
- Create Data Lake Gen2 Storage Account
- Create Synapse Analytics and link it with Data Lake Gen2 Storage Account
- Create Dataverse to Synapse link
- Sync tables in Dataverse to Data Lake Gen2
- Query synced data in Data Lake Gen2 using Synapse Analytics
1. Navigate to your PowerApps environment and click "Dataverse"
2. Click "Azure Synapse Link"
3. Click "New link"
4. After clicking on the "New link", you will see your environment's location and some fields below it that needs to be configured to have this "link" working. Location of my environment is "North Europe" and the associated resources to this Synapse Link should also reside in the same region. Don't worry we will do it step by step.
5. For Synapse Link to work it requires configuration with Storage Account that would act as Data Lake. Let's create one by navigating to Azure Portal.
6. Go to your resource group if already created else create a new one. Click "Create" and type "storage account".
7. Click "Storage account"
8. Click "Create"
9. Add a storage account name and set your region to "North Europe" just like your PowerApps environments then click "Next"
10. Make sure to check "Enable hierarchical namespace"
11. Click "Create" the storage account
12. Next, you will also need to "Create" an instance of Azure Synapse Analytics Workspace.
13. Type "azure synapse" and you will see "Azure Synapse Analytics" below the search bar. Click on it
14. Set a name for Managed resource group, Workspace name and region. Region should be according to your PowerApps environment. In my case, its North Europe
15. In the fields below region, you will need to attach the "Data Lake Storage Gen2" that was created earlier.
16. Create a new file system, I am calling it "dataverse".
17. Make sure to check "Assign myself the Storage Blob .......". This will allow you to query Datalake from Synapse Workspace interactively.
18. Review and create you Azure Synapse Workspace instance
19. Navigate back to PowerApps environment and create "New link".
20. Fill in the necessary details based on the resources you created in Azure and click next.
21. Select the tables that you want to explore/export outside Dataverse and click "Save". You can have partitions based on "Month" or "Year". For this example, I am choosing "Month"
22. Dataverse Link to Synapse will take few seconds to minutes for the configuration. You can now navigate back to Storage Account and click "Containers".
23. Notice that a new container is created with a weird name. This weird name is not so weird if you get to know what it's made of. It is made up of dataverse-environmentName-organizationUniqueName. You can navigate into this container to explore.
24. Inside that container you will see a list of folders that resemble the name of your Dataverse tables. These folders will have csv files and snapshots of Dataverse tables.
"The model.json file, along with its name and version, provides a list of tables that have been exported to the data lake. The model.json file also includes the initial sync status and sync completion time."
25. Now, navigate to your Synapse Workspace instance
26. Click "Open"
27. Go to Data
28. Expand Lake database
29. Isn't this great! You can see you Dataverse synced tables in your Data Lake storage and not just that, you can query them using SQL.
30. Right-hand click on "systemuser" (which is User table in Dataverse) and "Select TOP 100 rows"
31. After running the query, you can see the results. I will conclude this part here, stay tuned for the remaining two parts.