| page_type | languages | products | description | urlFragment |
| --- | --- | --- | --- | --- |
| sample | | | Samples on how to import data (JSON, CSV, Flat-Files, etc) into Azure SQL | azure-sql-db-import-data-samples |
# Samples on how to import data (JSON, CSV, Flat-Files, etc) into Azure SQL
All samples are in the `script` folder. Sample data used for running the samples is in the `json` and `csv` folders.
Make sure you have a database in Azure that you can use for tests. If you are new to Azure SQL and need help creating a new database, make sure to watch this 5-minute video:
Demo: Deploy Azure SQL Database
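If you prefer the command line, here is a minimal sketch using the Azure CLI; the resource group, server, database, location, and credentials below are placeholders, not values from this repo:

```powershell
# Minimal sketch: create a logical SQL server and a test database with the Azure CLI.
# All names, the location, and the admin credentials are placeholders - change them.
az sql server create --name my-test-sqlserver --resource-group my-rg `
    --location westeurope --admin-user demoadmin --admin-password 'Use-A-Str0ng-Passw0rd-Here'

az sql db create --resource-group my-rg --server my-test-sqlserver `
    --name import-samples-db --service-objective GP_Gen5_2
```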
Remember that import speed is always tied to the maximum "Log Rate Limit" of the database tier. More details in this article: Raising log rate limits for General Purpose service tier in Azure SQL Database.
Remember that Azure SQL Hyperscale has a 100 MB/sec limit no matter the number of vCores: How much time would it take to bring in X amount of data to Hyperscale
Run the `00-on-prem-tools-user-setup.sql` script in order to have the `customer` table used by the samples ready to be used.
The script will also create a demo user that will be used to run the scripts. Feel free to change the user name and password if you wish.
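For example, a minimal sketch of running the setup script from PowerShell, assuming the SqlServer module is installed and using placeholder connection values:

```powershell
# Minimal sketch: run the setup script against the demo database.
# Requires the SqlServer module (Install-Module SqlServer).
# Server, database, and admin credentials are placeholders.
Invoke-Sqlcmd `
    -ServerInstance "my-test-sqlserver.database.windows.net" `
    -Database "import-samples-db" `
    -Username "demoadmin" `
    -Password "Use-A-Str0ng-Passw0rd-Here" `
    -InputFile "./script/00-on-prem-tools-user-setup.sql"
```

You can of course run the same script from SQL Server Management Studio or Azure Data Studio instead.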
Create a `.env.ps1` file in the `script` folder using the provided `.env.ps1.template` file. Make sure to fill the variables with the correct values to access the demo Azure SQL database that you have decided to use.
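A hypothetical sketch of what a filled-in `.env.ps1` could look like; the actual variable names are defined by `.env.ps1.template`, so use that file as the reference:

```powershell
# Hypothetical example of a filled-in .env.ps1: the real variable names come
# from .env.ps1.template, and all values below are placeholders.
$ServerName   = "my-test-sqlserver.database.windows.net"
$DatabaseName = "import-samples-db"
$UserName     = "import_demo_user"
$Password     = "Use-A-Str0ng-Passw0rd-Here"
```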
BCP (Bulk Copy Program) is of course an option, probably the easiest one and one of the fastest. Make sure to get the latest version from: bcp Utility
Then, a working sample is available in the `02-import-bcp.ps1` script.
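A minimal sketch of the kind of bcp invocation involved, assuming the `customer` table created by the setup script and a CSV file with a header row; server, database, credentials, and the file path are placeholders:

```powershell
# Minimal sketch: bulk copy a CSV file into the customer table with bcp.
# -c: character data, -t: field terminator, -F 2: skip the header row.
# Server, database, credentials, and the CSV path are placeholders.
bcp dbo.customer in .\csv\customer.csv `
    -S "my-test-sqlserver.database.windows.net" -d "import-samples-db" `
    -U "import_demo_user" -P "Use-A-Str0ng-Passw0rd-Here" `
    -c -t "," -F 2
```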
If your data is in Azure Blob Storage, you can import or read the file right from Azure SQL, without the need for any external tool.
Sample is here: `03-bulkinsert-openrowset.sql`
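As a minimal T-SQL sketch of the idea, assuming a database scoped credential and an external data source named `MyBlobStorage` (a placeholder name) have already been created:

```sql
-- Minimal sketch: load a CSV file straight from Azure Blob Storage.
-- Assumes a database scoped credential and an external data source named
-- MyBlobStorage (placeholder) pointing to the container already exist.
BULK INSERT dbo.customer
FROM 'customer.csv'
WITH (
    DATA_SOURCE = 'MyBlobStorage',
    FORMAT = 'CSV',
    FIRSTROW = 2
);
```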
If you are a PowerShell user, you can use the Write-DbaDbTableData cmdlet made available by the amazing dbatools project.
Sample is available in the `01-import-csv.ps1` script.
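A minimal sketch of the approach, assuming the dbatools module is installed (`Install-Module dbatools`) and using placeholder connection values and file path:

```powershell
# Minimal sketch: import a CSV file with dbatools' Write-DbaDbTableData.
# Server, database, credentials, and the CSV path are placeholders.
$cred = Get-Credential   # e.g. the demo user created by the setup script

Import-Csv -Path ".\csv\customer.csv" |
    Write-DbaDbTableData `
        -SqlInstance "my-test-sqlserver.database.windows.net" `
        -SqlCredential $cred `
        -Database "import-samples-db" `
        -Table "dbo.customer" `
        -AutoCreateTable   # creates the table if it does not exist yet
```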
Of course there are more ways to import (and export) data into Azure SQL. Here are the most common ones.
Azure Data Factory is the easiest option, also with great performance. It supports a variety of data sources, not only CSV and JSON, but also Parquet, AVRO, ORC and so on.
The Copy Data tool guides you through the entire process of importing data with an easy-to-follow, wizard-style interface.
This YouTube video shows the best practices to load data as fast as possible into Azure SQL via Azure Data Factory: Best Practices Using Azure SQL as Sink in ADF
Apache Spark is another option if you need more flexibility. It can read from almost any data format, and it can efficiently write data into Azure SQL. Full end-to-end sample on how to import data as fast as possible here: Fast Data Loading in Azure SQL DB using Azure Databricks
Azure Synapse is another way to read common data formats, like Parquet or sets of CSV files, from Azure SQL, using Azure Synapse Serverless SQL Pools as a bridge. Here's a detailed article on how to do that: Azure SQL can read Azure Data Lake storage files using Synapse SQL external tables
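As a minimal sketch of the Synapse Serverless SQL Pool side, with a placeholder storage account and path; the linked article explains how to expose this kind of query as an external table that Azure SQL can then read:

```sql
-- Minimal sketch (runs on a Synapse Serverless SQL Pool, not on Azure SQL):
-- read a folder of Parquet files from the data lake.
-- Storage account, container, and path are placeholders.
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://mydatalake.dfs.core.windows.net/data/customer/*.parquet',
    FORMAT = 'PARQUET'
) AS customer_rows;
```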