SQL Server Optimizations for High Concurrency

It's time to get outdoors and Explore

SQL Server Optimizations for High Concurrency

Our business needs very robust, low latency, highly available and durable online transactional system which supports high concurrency for about four weeks in a year. It's almost like Thanksgiving sale where you mark down very popular item (think of iPhone ) by 100 % . We operate on state level ( K-12 ) online assessment market where entire state takes test in spring for few weeks. To operate in this market , you need to have a robust system which can absorb such a sudden spike in transaction volumes.

Infrastructure:
We use SQL Server 2016 on Windows Server Failover cluster. Our storage layer includes RAID 10 SAN Storage and Local SSD for TempDB.

LATCH Waits
During initial load testing, we have noticed high occurrences of LATCH Waits. While investigating the root cause for LATCH waits , we found that few tables being accessed by high number of concurrent sessions thus creating LATCH Waits. Most of our transactions are very small data set inserts and updates. We also encountered Last Page Insertion Contention.

Latches are internal to the SQL engine and are used to provide memory consistency, whereas locks are used by SQL Server to provide logical transactional consistency.

PAGELATCH is used to synchronize short term access to database pages that reside in the Buffer cache, as opposed to a PAGEIOLATCH, which is used to synchronize physical access to pages in disk. These are normal in every system, the problem being when there is contention . In our use case many concurrent sessions accesses a single page, causing waits and hindering the ability to perform these inserts and updates efficiently.

A page in SQL Server is 8KB and can store multiple rows. To increase concurrency and performance, buffer latches are held only for the duration of the physical operation on the page, unlike locks which are held for the duration of the logical transaction.

From the root cause Analysis, it became very clear that we have LATCH Contention on few tables which needed to be alleviated to improve the throughput.

Latch Waits Noticed

Latch Waits Noticed

From the above screenshot, you can see that we had very high LATCH contention for Page 5261488. You can turn on the Trace Flag 3604 to further investigate the Page Contents.

Latch Waits and PFS Page or Not

TempDB Allocation Page Contention


DBCC TRACEON(3604)
CREATE TABLE PageResults (ParentObject sysname, OBJECT sysname ,Field sysname ,VALUE nvarchar(MAX))
INSERT INTO PageResults (ParentObject,  Object  ,Field, VALUE)
EXEC ('DBCC PAGE(67,1,3987384,3) WITH tableresults')

Root Cause Identification:

Clear WaitStats and Buffer Cache to Initialize


-- Remove all elements from the plan cache for one database
DECLARE @intDBID INT;
SET @intDBID = ( SELECT [dbid]
FROM   master.dbo.sysdatabases
WHERE  name = 'LoadTestDB'
);
-- Flush the procedure cache for one database only
IF @intDBID IS NOT NULL
BEGIN
DBCC FLUSHPROCINDB (@intDBID);
END;
GO
-- Reset wait and latch statistics.
DBCC SQLPERF('sys.dm_os_latch_stats' , CLEAR)
DBCC SQLPERF('sys.dm_os_wait_stats' , CLEAR)

Wait Stats Query:



Declare @ExcludedWaits Table (WaitType sysname not null primary key)
-- Waits that can be ignored
Insert Into @ExcludedWaits
Values ('CLR_SEMAPHORE'),
('SQLTRACE_BUFFER_FLUSH'),
('WAITFOR'),
('REQUEST_FOR_DEADLOCK_SEARCH'),
('XE_TIMER_EVENT'),
('BROKER_TO_FLUSH'),
('BROKER_TASK_STOP'),
('CLR_MANUAL_EVENT'),
('CLR_AUTO_EVENT'),
('FT_IFTS_SCHEDULER_IDLE_WAIT'),
('XE_DISPATCHER_WAIT'),
('XE_DISPATCHER_JOIN'),
('BROKER_RECEIVE_WAITFOR');
Select SessionID = WT.session_id,
WaitDuration_ms = WT.wait_duration_ms,
WaitType = WT.wait_type,
WaitResource = WT.resource_description,
Program = S.program_name,
QueryPlan = CP.query_plan,
SQLText = SUBSTRING(ST.text, (R.statement_start_offset/2)+1,
((Case R.statement_end_offset
When -1 Then DATALENGTH(ST.text)
Else R.statement_end_offset
End - R.statement_start_offset)/2) + 1),
DBName = DB_NAME(R.database_id),
BlocingSessionID = WT.blocking_session_id,
BlockerQueryPlan = CPBlocker.query_plan,
BlockerSQLText = SUBSTRING(STBlocker.text, (RBlocker.statement_start_offset/2)+1,
((Case RBlocker.statement_end_offset
When -1 Then DATALENGTH(STBlocker.text)
Else RBlocker.statement_end_offset
End - RBlocker.statement_start_offset)/2) + 1)
From sys.dm_os_waiting_tasks WT
Inner Join sys.dm_exec_sessions S on WT.session_id = S.session_id
Inner Join sys.dm_exec_requests R on R.session_id = WT.session_id
Outer Apply sys.dm_exec_query_plan (R.plan_handle) CP
Outer Apply sys.dm_exec_sql_text(R.sql_handle) ST
Left Join sys.dm_exec_requests RBlocker on RBlocker.session_id = WT.blocking_session_id
Outer Apply sys.dm_exec_query_plan (RBlocker.plan_handle) CPBlocker
Outer Apply sys.dm_exec_sql_text(RBlocker.sql_handle) STBlocker
Where R.status = 'suspended' -- Waiting on a resource
And S.is_user_process = 1 -- Is a used process
And R.session_id  @@spid -- Filter out this session
And WT.wait_type Not Like '%sleep%' -- more waits to ignore
And WT.wait_type Not Like '%queue%' -- more waits to ignore
And WT.wait_type Not Like -- more waits to ignore
Case When SERVERPROPERTY('IsHadrEnabled') = 0 Then 'HADR%'
Else 'zzzz' End
And Not Exists (Select 1 From @ExcludedWaits
Where WaitType = WT.wait_type)
ORDER BY WaitDuration_ms DESC
Option(Recompile); -- Don't save query plan in plan cache

Common Latch Waits

;

WITH [Latches] AS
(SELECT
[latch_class],
[wait_time_ms] / 1000.0 AS [WaitS],
[waiting_requests_count] AS [WaitCount],
100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],
ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
FROM sys.dm_os_latch_stats
WHERE [latch_class] NOT IN (
N'BUFFER')
AND [wait_time_ms] > 0
)
SELECT
MAX ([W1].[latch_class]) AS [LatchClass],
CAST (MAX ([W1].[WaitS]) AS DECIMAL(14, 2)) AS [Wait_S],
MAX ([W1].[WaitCount]) AS [WaitCount],
CAST (MAX ([W1].[Percentage]) AS DECIMAL(14, 2)) AS [Percentage],
CAST ((MAX ([W1].[WaitS]) / MAX ([W1].[WaitCount])) AS DECIMAL (14, 4)) AS [AvgWait_S]
FROM [Latches] AS [W1]
INNER JOIN [Latches] AS [W2]
ON [W2].[RowNum] 

Latch Waits Resolution:

Approaches Considered:

  1. In -Memory OLTP
  2. Replacing Identity Integer Column with GUID as leading Column for Index
  3. HASH Partioning with Computed Column

Out of these three options, In-Memory OLTP looked very promising but we didn't have enough time to implement in-memory OLTP Migration. So we adopted second solution. We replaced Identity Integer Columns with GUID which did increase page splits and index fragmentation but our workload is insert heavy. So we made this trade-off.This is not our preferred resolution but we made this trade off based on available time and resources.

In future, we are planning on migrating these hot tables to In-Memory OLTP. Just replacing INT's with GUIDs, we did see about 20-30 X Performance improvements.

[table id=51 /]

TempDB Allocation Page Contention  



SELECT  session_id ,
wait_type ,
wait_duration_ms ,
blocking_session_id ,
resource_description ,
ResourceType = CASE WHEN CAST(RIGHT(resource_description,
LEN(resource_description)
- CHARINDEX(':',
resource_description,
3)) AS INT) - 1 % 8088 = 0
THEN 'Is PFS Page'
WHEN CAST(RIGHT(resource_description,
LEN(resource_description)
- CHARINDEX(':',
resource_description,
3)) AS INT) - 2
% 511232 = 0 THEN 'Is GAM Page'
WHEN CAST(RIGHT(resource_description,
LEN(resource_description)
- CHARINDEX(':',
resource_description,
3)) AS INT) - 3
% 511232 = 0 THEN 'Is SGAM Page'
ELSE 'Is Not PFS, GAM, or SGAM page'
END
FROM    sys.dm_os_waiting_tasks
WHERE   wait_type LIKE 'PAGE%LATCH_%'
AND resource_description LIKE '2:%'


TempDB PFS Page Contention  Noticed

PFS_Page_Contention

PFS_Page_Contention

TempDB Non Allocation Page Contention

TempDB Non Allocation Page Contention

TempDB Non Allocation Page Contention

PFS Page Contention Alleviation:

From the root cause Analysis, it became very clear that number of TempDB Files we had was not sufficient. We followed the industry standard best practice of 8 Equal Sized TempDB Data Files because we had 8 Cores. We increased our TempDB File Counts by 4 and repeated the load Test Run. At 24 TempDB Files, we didn't encounter PFS Page Contention and we settled on 24 TempDB Files. This number is lot higher than the best practice recommendation but that's what worked for us and we derived this through numerous Load Test Runs.



ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp9',FILENAME = N'T:TempDBtempdb_mssql_9.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp10',FILENAME = N'T:TempDBtempdb_mssql_10.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp11',FILENAME = N'T:TempDBtempdb_mssql_11.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp12',FILENAME = N'T:TempDBtempdb_mssql_12.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp13',FILENAME = N'T:TempDBtempdb_mssql_13.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp14',FILENAME = N'T:TempDBtempdb_mssql_14.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp15',FILENAME = N'T:TempDBtempdb_mssql_15.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO
ALTER DATABASE [tempdb] ADD FILE (NAME = N'temp16',FILENAME = N'T:TempDBtempdb_mssql_16.ndf',SIZE = 2097152KB,FILEGROWTH =1048576KB )
GO

ThreadPool Waits Observed

We also encountered ThreadPool waits because we had few thousand transactions coming in per second. We increased our Max Worker Thread size. Extreme caution needs to be exercised to modify this configuration because this might lead to unwanted consequences.

Default Values

Max Number of Worker Threads Defaults

Max Number of Worker Threads Defaults



EXEC sp_configure 'max worker threads', 1472 ;  
GO  
RECONFIGURE;  
GO

ACCESS_METHODS_HOBT_VIRTUAL_ROOT Waits Noticed

ACCESS_METHODS_HOBT_VIRTUAL_ROOT

ACCESS_METHODS_HOBT_VIRTUAL_ROOT

This latch class is when a thread is waiting for access to the in-memory metadata entry containing a B-tree’s root page ID. EX access is required to change the root page ID, which typically happens when a B-tree becomes a level deeper during heavy insertions into a new index and the existing root page has to split. Every B-tree traversal has to start at the root page, which requires obtaining this latch in SH mode.

Max Degree of Parallelism and Cost Threshold

We did several run with different Max Degree of Parallelism and Cost Threshold and settled on following values based on our load Test Observation.



EXEC sp_configure 'max degree of parallelism','12'
RECONFIGURE

EXEC sp_configure 'cost threshold for parallelism','30'
RECONFIGURE

Recommended Values by Microsoft

[table id=52 /]

Lock Pages in Memory:

We also enabled lock Pages in Memory

[caption id="attachment_2984" align="alignnone" width="968"]Lock Pages in Memory Lock Pages in Memory[/caption]

Trace Flag 1118

In SQL Server 2016, By default this trace flag is on. This trace flag switches allocations in tempdb from single-page at a time for the first 8 pages, to immediately allocate an extent (8 pages). It’s used to help alleviate allocation bitmap contention in tempdb under a heavy load of small temp table creation and deletion. But previous versions, you need to explicitly turn on this trace flag.

Learnings:

We learned a lot during this exercise. Supporting high concurrency ( few thousands transactions per second ) needs very robust system and majority of the standard best practices are not applicable in these scenarios. We ran into Latch Waits, Thread Pool Waits, TempDB Allocation Contention and Write Log Waits and solved one problem at a time after carefully considering the tradeoffs. It took several weeks of coordinated effort from several teams to identify the bottleneck in the system and alleviate those bottlenecks.

References:

[table id=53 /]

 

Posted in Microsoft Technologies, SQL Server on Apr 09, 2017


Comments

Please sign in to comment!