SQL Server INSERT Performance (SQL, Azure SQL Database)
Introduction
In this article, we will explore the performance issues related to inserting data into a SQL Server database, specifically in the context of Azure SQL Database. We will analyze the given scenario and discuss potential solutions to improve the insert performance.
Problem Description
The problem description gives an overview of the issue the user faced. Two tables, A and B, are involved, with approximately 400k rows in table A and 12k rows in table B. The user needs to select ~350k rows from table A and insert them into table B. The operation runs inside a stored procedure because several related tasks are performed together. When the user runs this procedure, the insert alone takes around 1 minute and 55 seconds.
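For concreteness, the core of the operation is an INSERT ... SELECT moving ~350k rows; a minimal sketch using the placeholder column names from the table definitions below (the actual column mapping and filter are not given in the question):

```sql
INSERT INTO dbo.B (Field1, Field2, Field3 /* , ... remaining columns */)
SELECT Field1, Field2, Field3 /* , ... remaining columns */
FROM dbo.A
WHERE 1 = 1; -- hypothetical filter selecting ~350k of the 400k rows
```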
Table Structure
The problem description provides the table structure for both tables A and B.
CREATE TABLE [dbo].[A]
(
[Field1] [uniqueidentifier] NOT NULL,
[Field2] [int] NOT NULL,
[Field3] [uniqueidentifier] NOT NULL,
[Field4] [nvarchar](max) NULL,
[Field5] [bit] NOT NULL,
[Field6] [int] NULL,
[Field7] [tinyint] NULL,
CONSTRAINT [PK_A]
PRIMARY KEY CLUSTERED ([Field1] ASC, [Field2] ASC, [Field3] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[A] WITH CHECK
ADD CONSTRAINT [FK_...]
FOREIGN KEY([Field2]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[A] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field3]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[A] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field1]) REFERENCES [dbo].[A] ([Id])
GO
CREATE TABLE [dbo].[B]
(
[Field1] [uniqueidentifier] NOT NULL,
[Field2] [int] NOT NULL,
[Field3] [uniqueidentifier] NOT NULL,
[Field4] [tinyint] NULL,
[Field5] [nvarchar](max) NULL,
[Field6] [bit] NOT NULL,
[Field7] [int] NULL,
CONSTRAINT [PK_B]
PRIMARY KEY CLUSTERED ([Field1] ASC, [Field2] ASC, [Field3] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[B] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field2]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[B] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field3]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[B] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field1]) REFERENCES [dbo].[...] ([Id])
GO
Infrastructure
The problem description provides information about the infrastructure used.
- SQL Azure Database: Plan S1 - 20 DTU
- Entity Framework Core for executing the stored procedure
Wait Stats
The wait stats are provided, which indicate that the query runtime is dominated by IO waits.
Analysis of Wait Stats
The wait stats show three wait types: PAGEIOLATCH_EX, PAGEIOLATCH_SH, and LOG_RATE_GOVERNOR. These waits indicate that the query spends most of its time on IO, specifically reads and writes against disk.
- PAGEIOLATCH_EX: a thread is waiting for an exclusive latch on a data page while that page is read from disk so it can be modified.
- PAGEIOLATCH_SH: a thread is waiting for a shared latch on a data page while it is read from disk.
- LOG_RATE_GOVERNOR: Azure SQL Database is throttling the rate at which the database can write to the transaction log.
The Standard tier DTU model provisions only 1-4 IOPS (input/output operations per second) per DTU, so a 20 DTU database gets at most roughly 20-80 IOPS. With so little IO capacity, it is not surprising that the query spends most of its runtime waiting on IO.
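In Azure SQL Database, the database-scoped wait stats shown above can be reproduced with the sys.dm_db_wait_stats DMV; a minimal sketch:

```sql
-- Top waits for this database since the stats were last reset
SELECT TOP (10)
       wait_type,
       wait_time_ms,
       waiting_tasks_count
FROM sys.dm_db_wait_stats
ORDER BY wait_time_ms DESC;
```

If PAGEIOLATCH_* and LOG_RATE_GOVERNOR dominate this list, the workload is IO-bound rather than CPU-bound.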
Potential Solutions
Based on the analysis of the wait stats and the infrastructure information, here are some potential solutions to improve the insert performance:
1. Reduce Data Size
One way to reduce the amount of data written to disk is to eliminate columns, especially the nvarchar(max) column if it holds large values.
- Compressing the data with page compression or a clustered columnstore index reduces the number of pages that must be written to disk.
- Alternatively, we can apply the COMPRESS T-SQL function to the nvarchar(max) column if it is large.
ALTER TABLE [dbo].[B] REBUILD WITH (DATA_COMPRESSION = PAGE);
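As a sketch of the COMPRESS approach: the function GZIP-compresses a value into varbinary(max), so the large nvarchar(max) column would be stored in a varbinary(max) column instead (the target column name here is hypothetical, and the mapping of A's Field4 into a compressed column is an assumption):

```sql
-- Store the large string payload compressed as varbinary(max)
INSERT INTO dbo.B (Field1, Field2, Field3, Field5_compressed) -- hypothetical varbinary(max) column
SELECT Field1, Field2, Field3, COMPRESS(Field4)
FROM dbo.A;

-- Read it back by decompressing and casting to the original type
SELECT CAST(DECOMPRESS(Field5_compressed) AS nvarchar(max)) AS Field5
FROM dbo.B;
```

The trade-off is CPU time spent compressing and decompressing in exchange for fewer pages written, which is usually a good deal on an IO-starved tier.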
2. Provide More Resources
Another way to improve insert performance is to provide more resources to the database.
- We can scale up to a higher DTU or vCore configuration.
- Alternatively, the serverless vCore tier with auto-scaling can reduce cost while still handling bursts of load.
- Moving this database into an Elastic Pool where it can share a larger pool of resources with other databases is another option.
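Scaling up can be done from the portal or with a single T-SQL statement run against the master database; a sketch (the database name and the S3 target objective are placeholders):

```sql
-- Move the database to a higher Standard tier service objective
ALTER DATABASE [YourDb] MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S3');
```

The scale operation runs online, but connections are dropped briefly when it completes, so it is best done outside a load window.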
3. Table Partitioning
Table partitioning won't reduce the amount of data written, and in Azure SQL Database all partitions live in the PRIMARY filegroup, so it mainly helps with maintenance and partition switching rather than raw insert speed.
In-Memory OLTP is another way to avoid PAGEIOLATCH waits entirely, but it is only available in the Premium/Business Critical tiers, which already provide much higher IOPS. It may therefore not be a cost-effective option here.
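For completeness, a memory-optimized version of table B would look roughly like this (a sketch only; the table name, bucket count, and hash index choice are assumptions, and it requires a Premium/Business Critical database):

```sql
-- Hypothetical memory-optimized variant of table B
CREATE TABLE dbo.B_inmem
(
    [Field1] uniqueidentifier NOT NULL,
    [Field2] int NOT NULL,
    [Field3] uniqueidentifier NOT NULL,
    [Field4] tinyint NULL,
    [Field5] nvarchar(max) NULL,
    [Field6] bit NOT NULL,
    [Field7] int NULL,
    CONSTRAINT PK_B_inmem
        PRIMARY KEY NONCLUSTERED HASH ([Field1], [Field2], [Field3])
        WITH (BUCKET_COUNT = 1048576)  -- sized for ~500k+ rows
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```

Inserts into such a table avoid page latches entirely, though the transaction log is still written, so LOG_RATE_GOVERNOR waits can remain.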
4. Indexing
Indexing cuts both ways for inserts: every additional nonclustered index on the target table must be maintained as rows arrive, so each extra index adds IO. Dropping nonessential indexes before a large load and recreating them afterwards is a common way to speed up inserts.
To reduce the write volume itself, we can use index compression or a clustered columnstore index, which compress the data and reduce the number of pages flushed to disk.
ALTER INDEX [PK_B] ON [dbo].[B] REBUILD WITH (DATA_COMPRESSION = PAGE);
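Independent of indexing, splitting the ~350k-row insert into smaller batches keeps each transaction's log usage below the governor's threshold and shortens individual IO bursts. A sketch using the tables above (the 10,000-row batch size and the A-to-B column mapping are assumptions):

```sql
-- Copy rows from A to B in batches to limit log pressure per transaction
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    INSERT INTO dbo.B (Field1, Field2, Field3, Field6)
    SELECT TOP (10000) a.Field1, a.Field2, a.Field3, a.Field5
    FROM dbo.A AS a
    WHERE NOT EXISTS
          (SELECT 1
           FROM dbo.B AS b
           WHERE b.Field1 = a.Field1
             AND b.Field2 = a.Field2
             AND b.Field3 = a.Field3);

    SET @rows = @@ROWCOUNT;  -- loop ends when no new rows remain
END;
```

Each iteration commits separately, so a failure midway loses only the current batch, and the log rate governor has smaller transactions to absorb.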
Conclusion
Insert performance is a critical aspect of database development, especially when working with large datasets. By analyzing the wait stats and infrastructure information, we can identify potential solutions to improve insert performance.
In this article, we discussed four potential solutions: reducing data size, providing more resources, table partitioning (along with In-Memory OLTP), and indexing. We also touched on related options such as moving the database into an Elastic Pool and batching the insert to stay under the log rate governor.
By understanding the factors that affect insert performance and implementing the right strategies, developers can create high-performance database applications that meet the needs of their users.
Last modified on 2024-08-26