Azure Data Factory: Copy Data tool

In this tutorial, you will learn about one of the out-of-the-box solutions in Azure Data Factory: the "Copy Data tool", a guided experience built on top of the Copy activity. It helps you copy data from one source to another.

For my use case, I will copy a zip file from one blob storage container to another and have it extracted along the way.

1. Navigate to "Storage Account". Click "Containers"

2. Click "+ Container" to create a new container.

3. Click the "Name" field.

4. Type "datazip". Click "Create"

5. Create another container and call it "dataunzip". Click "Create"
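
As an aside: if you prefer infrastructure-as-code over portal clicks, the same containers can be declared in an ARM template. This is a minimal sketch, assuming a "storageAccountName" template parameter; the apiVersion may differ in your environment:

```json
{
  "type": "Microsoft.Storage/storageAccounts/blobServices/containers",
  "apiVersion": "2023-01-01",
  "name": "[concat(parameters('storageAccountName'), '/default/datazip')]",
  "properties": {
    "publicAccess": "None"
  }
}
```

Repeat the same resource with "/default/dataunzip" in the name for the second container.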

6. Click the "Search containers by prefix" field. Type "data"

7. Click "datazip", then click "Upload" to upload the sample zip file.

8. Select the file and click "Upload"

9. Navigate to your Azure Data Factory instance and click "Launch studio"

10. Click "Manage".

11. I have already set up linked services, but I will not be using all of them in this tutorial.
Next, click "Author" and let's start working with the "Copy Data" tool.


For more information on creating linked services, refer to:

https://learn.microsoft.com/en-us/azure/data-factory/
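
For reference, an Azure Blob Storage linked service is just a small JSON document. Here is a minimal sketch; the name is made up for this tutorial and the connection string values are placeholders:

```json
{
    "name": "AzureBlobStorageLinkedService",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<your account>;AccountKey=<your key>;EndpointSuffix=core.windows.net"
        }
    }
}
```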

12. Click the "Add new resource" button.
Click "Copy Data tool"

13. You will be directed to a guided setup.
Select "Built-in copy task".
Click "Next"

14. Set "Source type" to "Azure Blob Storage".
For "Connection", choose the linked service for your Azure Blob Storage account.

15. For "File or folder", click "Browse"

16. Navigate to "datazip" > "<your sample zip file>".
Click "OK"

17. Make sure "Binary copy" is checked.
Select "ZipDeflate" for "Compression type". "ZipDeflate" tells the service to decompress the zip file at the destination.
Click "Next"
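
For context, the wizard captures these choices in a Binary dataset, with the compression setting attached to it. A rough sketch of what gets generated, with names assumed from this walkthrough:

```json
{
    "name": "SourceDataset_Zip",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "datazip",
                "fileName": "<your sample zip file>"
            },
            "compression": {
                "type": "ZipDeflate"
            }
        }
    }
}
```

Because the compression block lives on the source dataset, the Copy activity reads the zip, inflates it, and writes the inflated contents to the sink.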

18. Select "Destination type" to be "Azure Blob Storage".
For "Connection", choose the linked service for the Azure Blob Storage account.
For "Folder path", click "Browse".
Select "dataunzip" and click "OK".

Click "Next"

19. Enter the "Task name". I am calling it CopyPipeline_Zip_To_Unzip. Click "Next"

20. You will be directed to the "Review" section.
Click "Next"

21. In the "Deployment" section, "Datasets" and "Pipelines" will be created.
Click "Finish"

22. "CopyPipeline_Zip_To_Unzip" pipeline has been created

23. Similarly, the source and destination datasets are also created. Under the hood, these artifacts are plain JSON definitions, like the sketch below.
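
The pipeline itself wraps a single Copy activity that reads the zipped source dataset and writes to the destination dataset. Roughly, with activity and dataset names assumed from the earlier sketches:

```json
{
    "name": "CopyPipeline_Zip_To_Unzip",
    "properties": {
        "activities": [
            {
                "name": "Copy_Zip_To_Unzip",
                "type": "Copy",
                "inputs": [
                    { "referenceName": "SourceDataset_Zip", "type": "DatasetReference" }
                ],
                "outputs": [
                    { "referenceName": "DestinationDataset_Unzip", "type": "DatasetReference" }
                ],
                "typeProperties": {
                    "source": {
                        "type": "BinarySource",
                        "storeSettings": { "type": "AzureBlobStorageReadSettings" }
                    },
                    "sink": {
                        "type": "BinarySink",
                        "storeSettings": { "type": "AzureBlobStorageWriteSettings" }
                    }
                }
            }
        ]
    }
}
```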

24. To test the pipeline, click "Debug"

25. The pipeline ran successfully.

26. Let's check the "dataunzip" container to see whether the extracted files are there.

27. Voila! The extracted files are in the container; the Copy Data tool handled the extraction, so there was no need to unzip anything manually.

The Copy Data tool eases and optimizes the process of ingesting data from one location to another, which is usually the first step in an end-to-end data integration scenario. It saves time, especially when you use the service to ingest data from a data source for the first time. To learn more about the "Copy Data tool", refer to

https://learn.microsoft.com/en-us/azure/data-factory/copy-data-tool?tabs=data-factory
