Category Archives: Performance Tuning

Get Index Column Info with Includes for One or Many Tables

— by Lori Brown @SQLSupahStah

I was recently working with one of my clients on some low hanging fruit type of query tuning. We had checked the cache for plans with missing index warnings in them and were trying to see if we could tweak and existing index or add a new index to speed things up. If you ever work with missing indexes, you surely have seen it recommend crazy things, duplicates or existing indexes or it wants something that can be added to an existing index.

The bottom line for missing index recommendations is that you should NEVER, EVER create a missing index unless you know for absolutely sure that the index does not exist and that it will really help a known performance issue. For instance, you can have missing index warnings on small tables that only return a couple of rows to the query. Those are usually not worth working on since scanning a small table can usually be done very quickly by SQL. To this day, I still find many databases that are way over indexed with indexes that were implemented simply because someone found a missing index warning and did not do their homework or they ran the dreaded Database Tuning Advisor which shoved a bunch of duplicate indexes into tables.

If you are wondering how to get a list of missing indexes, please check out Jeff Schwartz’s blog post on how to do this. (https://blog.sqlrx.com/2017/06/02/query-tuning-and-missing-index-recommendations/ ) This will give you a place to start. It is better if you know what query is throwing the missing index warning so it is a good idea to collect those either in a trace or extended events. Jeff builds on his first post and in his second post on the subject (https://blog.sqlrx.com/2017/07/20/handling-multiple-missing-index-recommendations-for-the-same-table/ ) also goes over the fun of having multiple missing index recommendations for a single table and how to deal with them.

Here’s a handy set of links for some of Jeff’s great index tuning work that you should really check out:

https://blog.sqlrx.com/2016/01/28/sql-server-20122014-extended-events-for-developers-part-1/

https://blog.sqlrx.com/2017/06/02/query-tuning-and-missing-index-recommendations/

https://blog.sqlrx.com/2017/07/20/handling-multiple-missing-index-recommendations-for-the-same-table/

https://blog.sqlrx.com/2017/08/10/how-indexing-affects-deletion-queries/

One of the things that I usually need when performance tuning is to know information about the existing indexes on specific tables. I always want to know what columns are in the indexes along with the included columns so that I can compare the existing indexes to the missing recommendations. This way I can better figure out if a recommendation is something that can be added to an existing index (like an included column) or if I really need to create a brand new index if it does not exist at all.

Like most DBA’s, I keep a toolkit with all kinds of handy scripts. However, I did not have one that would give me index included columns. I also wanted the query to be able to return info from one or many tables at the same time. This would be useful when dealing with things with lots of joins. I know that there are a few bloggers who have posted something similar but I wanted to have the ability to filter on one or multiple tables. So, here is what I came up with:

/********************************

Returns index columns with included columns

plus other needed index info for tables

in @tablelist variable

*********************************/

 

DECLARE @tablelist VARCHAR(1000)

DECLARE @sqlstr NVARCHAR(MAX)

 

SET @tablelist = ‘InvoiceLines,OrderLines,StockItemHoldings’ — comma delimited list of tables, can be one or multiples EX: ‘mytable’ or ‘mytable,nothertable,thirdtable’

 

— Query the tables

IF @tablelist <> OR @tablelist <> ‘?,?,?’

BEGIN

SET @tablelist = REPLACE(QUOTENAME(@tablelist,””), ‘,’, ”’,”’) — Add quotes so the IN clause will work

 

SET @sqlstr = ‘SELECT SCHEMA_NAME(o.schema_id) AS SchemaName

       ,o.name AS TableName

       ,i.name AS IndexName

       ,i.type_desc AS IndexType

       ,c.name AS ColumnName

       ,ic.is_included_column

       ,ic.index_column_id

       ,t.name

       ,c.max_length

       ,c.precision

       ,c.scale

FROM sys.indexes i

JOIN sys.index_columns ic ON i.index_id = ic.index_id

JOIN sys.columns c ON ic.column_id = c.column_id AND ic.object_id = c.object_id

JOIN sys.objects o ON o.object_id = i.object_id AND c.object_id = i.object_id

JOIN sys.types t on t.user_type_id = c.user_type_id

WHERE o.name IN (‘+@tablelist+‘)

ORDER BY SchemaName, TableName, IndexName, index_column_id’

 

EXEC sp_executesql @sqlstr

–PRINT @sqlstr

END

All you have to provide is a comma separated list of the table(s) you are interested in for the @tablelist variable and it will do the rest. The output looks like this:

Blog_20170928_1

I found several bloggers who had made queries that would concatenate the columns together but truthfully I found those hard to read so I settled for a list with some extra info on the data and index types. Hope this is useful to someone out there.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

How Indexing Affects Deletion Queries

— by Jeff Schwartz

The Problem

Many articles concerning SQL Server discuss how record insertion overhead increases with each additional index. They discuss b-tree manipulations and page splits in addition to leaf and non-leaf levels. However, few discuss the fact that deletion overhead increases as well, especially when large numbers of records are deleted by individual queries. Recently, I was working with a client who regularly purged large numbers of records from tables that ranged in size from large to gigantic. For example, one table contained over 6.5 billion records. I added an index (4th overall) to one table expressly for the purpose of expediting the large deletion process, and the deletion run ran longer, despite using the new index! To determine how the numbers of indices and records to be deleted interact, I conducted an experiment to test several combinations. The specifics of the tests and their corresponding results are summarized below.

Test Table Creation & Load

To determine deletion/index behavior, a generic table was constructed and filled with 20 million records. Each record contained an identity column, an ID column, a text column, and 47 metric columns whose random values ranged between 1 and 1,000,000,000. The large number of table columns was used to insure SQL Server would choose an index option when appropriate. To minimize duplication of column values and create a uniform distribution of values, the ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase formula was used to generate values that were as random as possible, where @ModBase was set to one billion. Three indices were created initially: a clustered index that used the identity column as its only key, a nonclustered index that used DupID as its only column, and a nonclustered index that used Metric43 and Metric14 as keys with Metric01, Metric02, Metric03, and Metric04 as included columns. The scripts for the creation, loading, and initial indexing of the table are shown below.

— ##############################################################

— Create test table

— ##############################################################

drop table FewDuplicates;

CREATE TABLE FewDuplicates (

IDCol bigint identity (20000000,1),

DupID bigint,

MyText varchar(10),

Metric01 bigint, Metric02 bigint, Metric03 bigint, Metric04 bigint,

Metric05 bigint, Metric06 bigint, Metric07 bigint, Metric08 bigint,

Metric09 bigint, Metric10 bigint, Metric11 bigint, Metric12 bigint,

Metric13 bigint, Metric14 bigint, Metric15 bigint, Metric16 bigint,

Metric17 bigint, Metric18 bigint, Metric19 bigint, Metric20 bigint,

Metric21 bigint, Metric22 bigint, Metric23 bigint, Metric24 bigint,

Metric25 bigint, Metric26 bigint, Metric27 bigint, Metric28 bigint,

Metric29 bigint, Metric30 bigint, Metric31 bigint, Metric32 bigint,

Metric33 bigint, Metric34 bigint, Metric35 bigint, Metric36 bigint,

Metric37 bigint, Metric38 bigint, Metric39 bigint, Metric40 bigint,

Metric41 bigint, Metric42 bigint, Metric43 bigint, Metric44 bigint,

Metric45 bigint, Metric46 bigint, Metric47 bigint

)

— ##############################################################

— Load original table

— ##############################################################

declare @DupID bigint = 1

declare @NumRecs bigint = 20000000

declare @ModBase bigint = 1000000000

truncate table FewDuplicates

set nocount on

while (@DupID <= @NumRecs)

begin

insert into [dbo].[FewDuplicates] (

[DupID], [MyText]

,[Metric01], [Metric02], [Metric03], [Metric04], [Metric05], [Metric06]

,[Metric07], [Metric08], [Metric09], [Metric10], [Metric11], [Metric12]

,[Metric13], [Metric14], [Metric15], [Metric16], [Metric17], [Metric18]

,[Metric19], [Metric20], [Metric21], [Metric22], [Metric23], [Metric24]

,[Metric25], [Metric26], [Metric27], [Metric28], [Metric29], [Metric30]

,[Metric31], [Metric32], [Metric33], [Metric34], [Metric35], [Metric36]

,[Metric37], [Metric38], [Metric39], [Metric40], [Metric41], [Metric42]

,[Metric43], [Metric44], [Metric45], [Metric46], [Metric47]

)

VALUES (

@DupID,‘my text’,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

 ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

ABS(cast(CHECKSUM(NewId()) as bigint)) % @ModBase,

)

set @DupID += 1

end — group option loop

set nocount off

— ##############################################################

— Create indices on the test table

— ##############################################################

CREATE UNIQUE CLUSTERED INDEX [ci_RecID] ON [dbo].[FewDuplicates]

(

[IDCol] ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

CREATE NONCLUSTERED INDEX [ix_DupID] ON [dbo].[FewDuplicates]

(

DupID ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

CREATE NONCLUSTERED INDEX ix_CombinedIndex ON [dbo].[FewDuplicates]

(

[Metric43],

[Metric14]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric03],

[Metric04]

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Queries that Perform Varying Numbers of Record Deletions

To illustrate the issue, six queries were created that deleted various numbers of records and they are listed in Table 1. The deletions were designed to use the ix_CombinedIndex index, thereby emulating the client situation in which the optimal index for deletions was used. The only differences involved the numbers of records deleted and these are highlighted in the table below for easier comparison. The selected values were chosen so that small, medium, and large numbers of records would be deleted.

Blog_20170810_1

Table 1: All Six Deletion Queries

The plan for Query #4 with three indices is shown below in Figure 1 and it is fairly simple, as one would expect. Things become more complicated with additional indices, as discussed in the next section.

Blog_20170810_2

Figure 1: Query Plan for Deletion Query #4 with Three Indices

Indices that Cause Additional Deletion Overhead

To examine the effects of additional indices on the deletion queries, up to eight were added to the table using different keys. None of the indices was used for direct data access and the index definitions are shown below, followed by the commands to delete them. Only 4 of these indices were used for the 7-index test, whereas all 8 were used for the 11-index test.

— create indices

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber1 ON [dbo].[FewDuplicates]

(

[Metric01],

[Metric02]

)

INCLUDE (

[Metric03],

[Metric04],

[Metric43],

[Metric14]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber2 ON [dbo].[FewDuplicates]

(

[Metric03],

[Metric04]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric14]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber3 ON [dbo].[FewDuplicates]

(

[Metric05],

[Metric06]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric14]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber4 ON [dbo].[FewDuplicates]

(

[Metric07],

[Metric08]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric14]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber5 ON [dbo].[FewDuplicates]

(

[Metric09],

[Metric10]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric14]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber6 ON [dbo].[FewDuplicates]

(

[Metric11],

[Metric12]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric14]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber7 ON [dbo].[FewDuplicates]

(

[Metric13],

[Metric14]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric44]

)

CREATE NONCLUSTERED INDEX ix_AddedIndexNumber8 ON [dbo].[FewDuplicates]

(

[Metric15],

[Metric16]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric43],

[Metric14]

)

— drop indices

drop INDEX ix_AddedIndexNumber1 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber2 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber3 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber4 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber5 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber6 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber7 ON [dbo].[FewDuplicates]

drop INDEX ix_AddedIndexNumber8 ON [dbo].[FewDuplicates]

Figure 2 shows the plan for the same query that was shown in Figure 1, but this time with 11 indices on the table instead of 3. Clearly, the plan is MUCH more complex and the effects of the additional indices are obvious. This suggests what the test metrics will ultimately confirm: work grows demonstrably as the number of indices increases.

Blog_20170810_3

Blog_20170810_4

Figure 2: Query Plan for Deletion Query #4 with 11 Indices

Test Results

A total of 18 tests were conducted. Three index configurations were used with the following numbers of indices: 3, 7, and 11. Queries #1 – #6 were run against each configuration. As shown in Figure 3 through Figure 6, most of the metrics are comparable until the 100,000-record level is exceeded, at which point great divergence occurs. One of the most interesting findings involves writing, whose curve shape differs completely from those of the other metrics. These figures graphically illustrate why many deletions of small-to-moderate numbers of records are not often noticed as the number of indices increases. However, it also illustrates clearly how the size of the deletion and the number of indices can combine to negate the improvement provided by a specialty deletion index.

Blog_20170810_5

Figure 3: Deletion Queries – Duration (Seconds)

Blog_20170810_6

Figure 4: Deletion Queries – CPU (Seconds)

Blog_20170810_7

Figure 5: Deletion Queries – Reads

Blog_20170810_8

Figure 6: Deletion Queries – Writes

Conclusion

This article illustrated a situation in which the amount of work performed by deletion queries increased dramatically as the number of records to be deleted increased beyond approximately 100,000. Although this particular value is obviously applicable only to the test case, the overall message is clear – as the number of records to be deleted increases well beyond 100,000, the work performed by the query increases dramatically. In addition, this situation worsens considerably as the number of indices spanning a table increases. Therefore, although the overhead associated with deleting low numbers of records may not be noticed as indices are added, the performance of queries that delete large numbers of records will degrade noticeably as the number of indices increases. In some cases, the overhead may negate any improvement that might be gained by adding an index whose purpose is to expedite the deletion process. Therefore, before considering adding an index to improve deletion performance, insure that the batch of deleted records is not too large and the number of indices on the table is small.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

Handling Multiple Missing Index Recommendations for the Same Table

— By Jeff Schwartz

The Problem

Many articles concerning SQL Server missing index recommendations demonstrate the mechanics for obtaining them and often highlight whether the suggested key columns are used in equality or inequality relationships. Most of these examples emphasize a single index that can be implemented to improve performance. However, real-life situations often involve multiple or many suggested indices. For example, during a recent customer study, I observed 28 recommended indices for one table and 52 for another. Clearly, metrics such as improvement measure and user impact frequently can be used to determine the most important recommendations, but sometimes there are either too many similar recommendations – or – improvement measures are almost the same for several proposed indices, which makes determining an optimal index difficult. A large number of recommendations also often results in index proliferation, i.e., missing index recommendations are implemented piecemeal with no overall strategy resulting in tables with 14, 17, or 24 indices as the author observed in a recent customer performance study. The size of the table compounds this problem because it is especially desirable to limit the number of indices on these tables. For example, the table that had 24 indices on it contained over 30 million records.

The following examples of proposed indices that inspired this article illustrate the multiple recommendation phenomenon – RecIndex1: Keys (DateVal), Included Columns (Metric, ReptCat, LocationID, Total_Amount) and RecIndex2: Keys (LocationID, DateVal) Included Columns (Metric, ReptCat, Total_Amount). Clearly, without additional information, it is difficult to determine whether these recommendations must remain separate or could be combined into a single index. In this situation, knowing whether the proposed key columns are used in equality or inequality where clauses can be critical. This article will discuss how to use knowledge of equality and inequality relationships to determine an appropriate course of action.

Test Table Creation & Load

To determine missing index recommendation behavior, a generic table was constructed and filled with 20 million records. Each record contained an identity column, an ID column, a text column, and 47 metric columns whose values ranged between 1 and 10,000,000. The large number of table columns was used to insure SQL Server would choose an index option when appropriate. Six queries that incorporated various column combinations were executed (some of which differed only in column ordering). To minimize duplication of column values and skewing of query plans, the ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000 formula was used to generate values that were as random as possible. Two indices were created: a clustered index that used the identity column as its only key and a second nonclustered index that used DupID as its only column. The scripts for the creation, loading, and initial indexing of the table are shown below.

— ##############################################################

— Create test table

— ##############################################################

DROP TABLE FewDuplicates;

CREATE TABLE FewDuplicates (

IDCol bigint identity (20000000,1),

DupID bigint,

MyText varchar(10),

Metric01 bigint, Metric02 bigint, Metric03 bigint, Metric04 bigint,

Metric05 bigint, Metric06 bigint, Metric07 bigint, Metric08 bigint,

Metric09 bigint, Metric10 bigint, Metric11 bigint, Metric12 bigint,

Metric13 bigint, Metric14 bigint, Metric15 bigint, Metric16 bigint,

Metric17 bigint, Metric18 bigint, Metric19 bigint, Metric20 bigint,

Metric21 bigint, Metric22 bigint, Metric23 bigint, Metric24 bigint,

Metric25 bigint, Metric26 bigint, Metric27 bigint, Metric28 bigint,

Metric29 bigint, Metric30 bigint, Metric31 bigint, Metric32 bigint,

Metric33 bigint, Metric34 bigint, Metric35 bigint, Metric36 bigint,

Metric37 bigint, Metric38 bigint, Metric39 bigint, Metric40 bigint,

Metric41 bigint, Metric42 bigint, Metric43 bigint, Metric44 bigint,

Metric45 bigint, Metric46 bigint, Metric47 bigint

)

 

— ##############################################################

— Load original table

— ##############################################################

declare @DupID bigint = 1

declare @NumRecs bigint = 20000000

 

truncate table FewDuplicates

set nocount on

while (@DupID <= @NumRecs)

begin

insert into [dbo].[FewDuplicates] (

[DupID], [MyText],

[Metric01], [Metric02], [Metric03], [Metric04], [Metric05], [Metric06], [Metric07], [Metric08], [Metric09], [Metric10], [Metric11], [Metric12], [Metric13], [Metric14], [Metric15], [Metric16], [Metric17], [Metric18], [Metric19], [Metric20], [Metric21], [Metric22], [Metric23], [Metric24], [Metric25], [Metric26], [Metric27], [Metric28], [Metric29], [Metric30], [Metric31], [Metric32], [Metric33], [Metric34], [Metric35], [Metric36], [Metric37], [Metric38], [Metric39], [Metric40], [Metric41], [Metric42],

[Metric43], [Metric44], [Metric45], [Metric46], [Metric47]

)

VALUES (

@DupID,‘my text’,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000

)

set @DupID += 1

end — group option loop

set nocount off

 

— ##############################################################

— Create indices on the test table

— ##############################################################

CREATE UNIQUE CLUSTERED INDEX [ci_RecID] ON [dbo].[FewDuplicates]

(

[IDCol] ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

 

CREATE NONCLUSTERED INDEX [ix_DupID] ON [dbo].[FewDuplicates]

(

DupID ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Queries that Create Missing Index Recommendations

To illustrate the issue, two queries were created that were guaranteed to generate missing index recommendations as well as mimic the behavior of the ones cited in The Problem section. They are listed in Table 1 and the differences are highlighted for easier comparison. The query plans for the two queries are displayed in Table 2 and Table 3. Both queries performed full clustered index scans and generated missing index recommendations. The recommendations are shown below in two pieces within each table. The most important points are that Query Plan #1 specifies Metric14 first and Metric43 second, whereas Query Plan #2 specifies Metric43 alone with Metric14 as an included column. At first glance, these appear to be contradictory and potentially incompatible differences.

Blog_20170720_1

Blog_20170720_2

Blog_20170720_3

Table 4 summarizes the metrics, columns, and full index definitions suggested by SQL Server. Given the very large improvement measure values, the key ordering of the proposed indices, and the somewhat different included columns, it may be tempting to implement both indices despite the fact that the table contained 20 million records. However, more detailed analysis using the data in Table 5 shows that a single index that is constructed properly can accommodate both recommendations and, therefore, both queries. The most important metrics are displayed in column_usage because one uses two equality comparisons and the other only uses one. Therefore, if we specify the equality column used in both queries first and specify the equality/included column second, both queries will be satisfied. VERY IMPORTANT NOTE: Although the key and included column ordering appear obvious because of the column names used in this example table, i.e., suffixes in numerical order, when normal column names like DateVal or LocationID are used, ordering is much less obvious. As cited in my previous blog entitled Query Tuning and Missing Index Recommendations, when ordering is not crucial, e.g., when only equality operations or included columns are specified, SQL Server uses the ordering of the columns in the table itself rather than the ordering specified in the query.

In most cases SQL Server attempts to create covering indices, which are defined to be indices that contain all the columns of a particular query. Please reference the following web page for further information regarding covering indices: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/create-indexes-with-included-columns. In the author’s experience, implementing the keys of a suggested index wtihout the corresponding included columns often results in SQL Server ignoring the new index. Therefore, the included columns are vital to any missing index strategy. Clearly, a point of diminshing returns exists when the number of included columns approaches the total number of columns in the table (especially very large tables), but as long as the number of columns is reasonable, included columns should always be considered. [The queries to obtain the data shown in Table 4 and Table 5 are provided in Table 6.]

Blog_20170720_4

Blog_20170720_5

SELECT avg_total_user_cost * avg_user_impact * (user_seeks + user_scans)

as [Improvement Measure],

avg_user_impact as [Avg User Impact], user_seeks as Executions,

[statement] as TableName,

equality_columns as [Equality Usage], inequality_columns as [Inequality Usage], included_columns as [Included Columns],

‘Create Nonclustered Index ix_SQLRx_’ + PARSENAME([statement],1) + ‘_’ + CONVERT(varchar, group_handle) + ‘_’ + CONVERT(varchar, g.index_handle) + ‘ ON ‘ +

[statement] +‘ (‘ + ISNULL(replace(equality_columns,‘ ‘,),) +

CASE WHEN equality_columns IS NOT NULL AND inequality_columns IS NOT NULL THEN ‘,’ ELSE END +

ISNULL (replace(inequality_columns,‘ ‘,),) +‘)’ +

CASE WHEN included_columns IS NOT NULL THEN ‘ INCLUDE (‘ + included_columns + ‘)’ ELSE END AS [Create Index Statement]

FROM       sys.dm_db_missing_index_groups g

INNER JOIN   sys.dm_db_missing_index_group_stats s ON

s.group_handle = g.index_group_handle

INNER JOIN   sys.dm_db_missing_index_details d ON d.index_handle = g.index_handle

ORDER BY avg_total_user_cost * avg_user_impact * (user_seeks + user_scans) DESC;

 

SELECT statement AS [Table], column_id , column_name, column_usage,

migs.user_seeks as Executions, migs.avg_user_impact as [Avg User Impact]

FROM sys.dm_db_missing_index_details AS mid

CROSS APPLY sys.dm_db_missing_index_columns (mid.index_handle)

INNER JOIN sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle

INNER JOIN sys.dm_db_missing_index_group_stats AS migs ON mig.index_group_handle = migs.group_handle

ORDER BY mig.index_group_handle, mig.index_handle, column_id

Blog_20170720_8

Table 7 contains the composite index that satisfies both recommendations and queries. It is important to note that Metric43 is the FIRST key and Metric14 is the second. If this ordering is not followed, Query #1 will perform a full scan. Reviewing both queries demonstrates the following:

  1. The key ordering enables the equality operators to apply in both cases.
  2. Having Metric14 as the second key satisfies both the equality condition and the included column condition.
  3. The other columns specified by the queries are supplied so the table data need never be referenced.

 

CREATE NONCLUSTERED INDEX ix_CombinedIndex ON [dbo].[FewDuplicates]

(

[Metric43],

[Metric14]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric03],

[Metric04]

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Blog_20170720_9

Table 8 and Table 9 contain the updated query plans, which illustrate the facts that the clustered index scans have been replaced by index seek operators and the clustered index is not accessed to satisfy either query. Note also that parallelism was present in the query plans shown in Table 2 and Table 3, but is absent in the new query plans displayed in Table 8 and Table 9.

Blog_20170720_6

Blog_20170720_7

Conclusion

This article illustrated a situation in which multiple missing index recommendations first appeared to necessitate separate indices, resulting in a single index implementation. Careful examination of the key relationships used in the queries and specified by the missing index recommendations enabled the author to develop one composite recommendation that enabled optimal query performance.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

 

 

Query Tuning and Missing Index Recommendations

–By Jeffry Schwartz

Overview

Most analysts are familiar with missing index recommendations provided by SQL Server when query plans are displayed within SSMS or reported by various missing index DMVs. Several questions arise concerning these recommendations:

  1. What determines whether a column becomes a key or included in the new index?
  2. Where does the ordering of the included columns come from? Does column ordering in queries affect the recommendations, e.g., will query orderings of A, B, C and C, B, A in the selection column list result in the one or two recommendations?
  3. How sophisticated are the missing index recommendations, i.e., does SQL Server compare recommendations and perform any kind of recommendation consolidation or optimization?

This article provides a reproducible example using six different queries that deliberately caused SQL Server to generate missing index recommendations and provide answers to these questions. The example is used to study missing index recommendations in detail, including how they relate to the underlying table and how query columns affect these recommendations. The article also illustrates how a single consolidated index can address the performance needs of all six queries.

Test Table Creation & Load

To determine missing index recommendation behavior, a generic table was constructed and filled with 20 million records. Each record contained an identity column, an ID column, a text column, and 47 metric columns whose values ranged between 1 and 10,000,000. The large number of table columns was used to insure SQL Server would choose an index option when appropriate. Six queries that incorporated various column combinations were executed (some of which differed only in column ordering). To minimize duplication of column values and skewing of query plans, the ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000 formula was used to generate values that were as random as possible. Two indices were created: a clustered index that used the identity column as its only key and a second nonclustered index that used DupID as its only column. The scripts for the creation, loading, and initial indexing of the table are shown below.

— ##############################################################

— Create test table

— ##############################################################

drop table FewDuplicates;

CREATE TABLE FewDuplicates (

IDCol bigint identity (20000000,1),

DupID bigint,

MyText varchar(10),

Metric01 bigint, Metric02 bigint, Metric03 bigint, Metric04 bigint, Metric05 bigint,

Metric06 bigint, Metric07 bigint, Metric08 bigint, Metric09 bigint, Metric10 bigint,

Metric11 bigint, Metric12 bigint, Metric13 bigint, Metric14 bigint, Metric15 bigint,

Metric16 bigint, Metric17 bigint, Metric18 bigint, Metric19 bigint, Metric20 bigint,

Metric21 bigint, Metric22 bigint, Metric23 bigint, Metric24 bigint, Metric25 bigint,

Metric26 bigint, Metric27 bigint, Metric28 bigint, Metric29 bigint, Metric30 bigint,

Metric31 bigint, Metric32 bigint, Metric33 bigint, Metric34 bigint, Metric35 bigint,

Metric36 bigint, Metric37 bigint, Metric38 bigint, Metric39 bigint, Metric40 bigint,

Metric41 bigint, Metric42 bigint, Metric43 bigint, Metric44 bigint, Metric45 bigint,

Metric46 bigint, Metric47 bigint

)

 

— ##############################################################

— Load original table

— ##############################################################

declare @DupID bigint = 1

declare @NumRecs bigint = 20000000

 

truncate table FewDuplicates

set nocount on

while (@DupID <= @NumRecs)

begin

insert into [dbo].[FewDuplicates] (

[DupID], [MyText],

[Metric01], [Metric02], [Metric03], [Metric04], [Metric05], [Metric06], [Metric07],

[Metric08], [Metric09], [Metric10], [Metric11], [Metric12], [Metric13], [Metric14],

[Metric15], [Metric16], [Metric17], [Metric18], [Metric19], [Metric20], [Metric21],

[Metric22], [Metric23], [Metric24], [Metric25], [Metric26], [Metric27], [Metric28],

[Metric29], [Metric30], [Metric31], [Metric32], [Metric33], [Metric34], [Metric35],

[Metric36], [Metric37], [Metric38], [Metric39], [Metric40], [Metric41], [Metric42],

[Metric43], [Metric44], [Metric45], [Metric46], [Metric47]

)

VALUES (

@DupID,‘my text’,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000

)

set @DupID += 1

end — group option loop

set nocount off

 

— ##############################################################

— Create indices on the test table

— ##############################################################

CREATE UNIQUE CLUSTERED INDEX [ci_RecID] ON [dbo].[FewDuplicates]

(

[IDCol] ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

 

CREATE NONCLUSTERED INDEX [ix_DupID] ON [dbo].[FewDuplicates]

(

DupID ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

 

Queries & Execution Plans with Missing Index Recommendations

The six queries all performed a range scan based upon DupID and Metric01. Clearly, the ix_DupID index could be used for the first portion of the where clause, but none of the existing indices could assist in the resolution of the second portion of the where clause. Note: All queries used identical where clauses to insure that the only query differences involved the columns that were requested in the select clause. These variations employed different combinations of the first six metric columns included a variety of column orderings. All of the queries returned the same 809 rows. Note: Due to the random nature of the data, the reader will not obtain identical results, but they will be functionally similar. Each query was run separately after the following commands had been executed:

dbcc dropcleanbuffers with no_infomsgs

dbcc freeproccache with no_infomsgs

dbcc freesystemcache(‘TokenAndPermUserStore’) with no_infomsgs

These commands clear all the pertinent caches to insure reproducibility and prevent memory-resident portions of the database from skewing the results. Each query required approximately 90 seconds to execute as shown in Table 1. Each query required approximately one million logical and physical reads to complete. The SSMS option for returning the actual execution plan was set prior to execution, and the six queries and execution plans are shown in the individual query sections below. The first five queries resulted in different recommendations, but the sixth query’s plan and recommended missing index were identical to that of the fifth because the only difference between Queries #5 and #6 is the ordering of the columns (part of the answer to question #2). The differences among all the queries are summarized in Table 2 below.

Blog_20170601_T1

Table 1: Summary of Initial Query Executions & Timings

Blog_20170601_T2

Table 2: Summary of Query Selection Columns

Query #1

select [DupID], [MyText], [Metric01]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_1

Query #2

select [DupID], [MyText], [Metric02]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_2

Query #3

select [DupID], [MyText], [Metric03], [Metric06], [Metric04]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_3

Query #4

select [DupID], [MyText], [Metric04], [Metric06], [Metric02], [Metric01]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_4

Query #5

select [DupID], [MyText], [Metric01], [Metric03], [Metric05], [Metric02]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_5

Query #6

select [DupID], [MyText], [Metric05], [Metric02], [Metric01], [Metric03]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_6

Missing Index Recommendations Displayed via DMVs

The following query was used to provide some of the information enumerated in Table 3 below:

— ##############################################################

— Missing Index DMV Query

https://blogs.msdn.microsoft.com/bartd/2007/07/19/are-you-using-sqls-missing-index-dmvs/

— ##############################################################

SELECT migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,

‘CREATE INDEX [missing_index_’ + CONVERT (varchar, mig.index_group_handle) + ‘_’ + CONVERT (varchar, mid.index_handle)

+ ‘_’ + LEFT (PARSENAME(mid.statement, 1), 32) + ‘]’

+ ‘ ON ‘ + mid.statement

+ ‘ (‘ + ISNULL (mid.equality_columns,)

+ CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ‘,’ ELSE END

+ ISNULL (mid.inequality_columns,)

+ ‘)’

+ ISNULL (‘ INCLUDE (‘ + mid.included_columns + ‘)’, ) AS create_index_statement,

migs.*, mid.database_id, mid.[object_id]

FROM sys.dm_db_missing_index_groups mig

INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle

WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10

ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

The following table lists the missing index recommendations as displayed by the DMVs. It also summarizes all of the missing index recommendations to make it easier to determine what a consolidated index might require. It is important to note that the missing index recommendations for Query #5 and #6 are the same, just as they were in the query plan listings above. Although discussion of the calculation and interpretation of the Improvement Measure column is beyond the scope of this article, it should be noted that the total improvement measure was approximately twice that of any other index recommendation and further investigation reveals that this index would satisfy both Query #5 and #6.

Blog_20170601_T3

Table 3: Summary of All Missing Index Recommendations

Composite Index Recommendation for All Six Queries

Either Table 2 or Table 3 could be used to determine a composite index. In-depth investigation reveals that since DupID and Metric01 are specified in the where clauses as inequalities, these should be key columns in any index, and review of Table 3 highlights this for all index recommendations. Since MyText is used in all of the queries and Metric01 is to be used as a key column, the only remaining columns are Metric02, Metric03, Metric04, Metric05, and Metric06. An index that employs these columns as included columns can “cover” all the queries shown in this article. For further information about covering indices, please refer to the following article: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/create-indexes-with-included-columns. The composite (and covering) index is shown below:

CREATE NONCLUSTERED INDEX ix_ResolveMultipleQueriesNeeds ON [dbo].[FewDuplicates] (

[DupID],

[Metric01]

)

INCLUDE (

[MyText],

[Metric02],

[Metric03],

[Metric04],

[Metric05],

[Metric06]

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Query Plans for Queries #1 – #6 with Composite Index Added

Once the combined index was implemented, each query was run as before: caches cleared and SSMS actual execution plan option on. No query ran longer than one second even when the caches had been purged. Although the queries were run individually for the purposes of obtaining accurate timings, they were also run together for the purpose of showing the similarities of the execution plans with the new index in place. Table 4 lists the execution results and clearly, performance improved greatly for all queries. For example, durations decreased from approximately 90 seconds to about one second and reads dropped from approximately one million to four thousand. Examination of the query plans shows that all six queries used the new index to obtain the observed extremely fast performance.

Blog_20170601_T4

Table 4: Summary of Query Executions & Timings After Composite Index Added

Blog_20170601_7

Summary

Most readers probably already knew the answer to the first question: key columns are used to filter query results. The simplest example of this involves a where clause. Columns referenced by equalities are placed first, followed by columns that are used with inequalities. Some of the missing index DMVs actually differentiate between the types of relationships involved. The example queries show that table column ordering dictates included column order thereby answering question #2. Since table column ordering is used to specify included column ordering, SQL Server WILL NOT duplicate missing index recommendations when the columns are identical regardless of their ordering in the queries. The answer to the last question is that SQL Server does not appear to perform any consolidations or optimizations other than those cited in the answer to question #2. This knowledge is invaluable, particularly when evaluating missing index recommendations on production systems because unless the queries that cause SQL Server to generate the missing index recommendations have been executed recently, there is little information to associate the actual queries with the recommendations. Understanding those factors that affect SQL Server’s recommendations can simplify causing query identification and give an analyst more certainty that the correct associations have been made.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

Common Criteria Compliance and LCK_M_SCH_M

–By Lori Brown

If you have ever wondered what happens when you mistakenly enable Common Criteria compliance in SQL, well wonder no more!!

Blog_20170427_1

We have a client who has no idea how or when Common Criteria was enabled on their production system. All they know is that performance has been slowly degrading. After collecting performance data, we found that there were high LCK_M_SCH_M waits which is a schema modification lock that prevents access to a table while a DDL operation occurs. We also found blocked process records where a LOGIN_STATS table in the master database was waiting a lot. This table is used to hold login statistics. When there are a lot of logins and outs there can be contention in this table.

When you enable Common Criteria compliance, something called Residual Information Protection (RIP) is enabled. RIP is an additional security measure for memory and it makes it so that in memory a specific bit pattern must be present before memory can be reallocated(overwritten) to a new resource or login. So with lots of logins and outs, there is a performance hit in memory because overwriting the memory allocation has to be done.

Keep in mind if you enable Common Criteria compliance, you can run into slowdowns from locking and memory. Make sure that your server is able to handle this well and that applications are designed to minimize the impact of high logins\logouts.

To disable Common Criteria compliance you can use sp_configure or the GUI.

sp_configure ‘common criteria compliance enabled’, 0

GO

RECONFIGURE

GO

However, it is not really disabled until you reboot the server (it actually says to reboot the server in MSDN). Restarting the instance will not work for this configuration change. I believe that this is because of the Residual Information Protection that secures memory. It stands to reason that without a reboot that flushes memory that RIP is still doing its work and causing performance issues.

Check out the documentation from MSDN on Common Criteria compliance here:

https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/common-criteria-compliance-enabled-server-configuration-option

The biggest takeaway from this is to make sure to change your SQL configuration with good reason. Always know the side effects that can show up when you set any of the advanced configuration options.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

List Tables That May Be Over Indexed

— By Lori Brown

While not having enough indexes can be bad for query performance, having too many indexes can also be just as bad. Use the query below to get a list of tables in your database that has more than 10 indexes.

— Tables with large number of indexes

select t.name as TablesWithLargeNumInx, count(i.name) as CountIndexes

from sys.indexes i

inner join sys.tables t

on i.object_id = t.object_id

where i.index_id > 0

group by t.name

having count(i.name) > 10

If you suspect that you have too many indexes on your tables, you can also check the sys.dm_db_index_usage_stats dynamic management view to know if indexes on your heavily indexed tables are being used well. (Hint: seeks are good and scans are not so much)

select u.user_seeks, u.user_lookups, u.user_scans

from sys.dm_db_index_usage_stats u

inner join sys.indexes i

on u.object_id = i.object_id and u.index_id = i.index_id

WHERE u.object_id=object_id(‘dbo.SomeTableName’)

https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-index-usage-stats-transact-sql

Query Tuning – Adding Included Columns To An Existing Index

— by Lori Brown @SQLSupahStah

For the most part, I stick to my daily DBA duties of making sure that SQL Server maintenance is running and that the proverbial lights are on in all my SQL boxes. However, sometimes it is fun to do some tuning and it’s even better when you have a success.

I have a SQL Server that hosts some Microsoft ERP databases. There was a problem with some slow performance and after collecting some performance data, I narrowed the root cause down to a view (vp_03400Exception1) that was used in a stored procedure. Out of the 15 million reads performed by the proc when I captured it, 12 million were directly from the offending view. In checking the execution plan for the view I found that a large table (APTran) was scanned in one spot and there were key lookups going on in other places (APTran & APDoc).

Bad stuff in bad plan…thick arrow indicates that almost 4 million records were scanned

Blog_20170323_1

Key lookups are not good (**on small tables these might not matter but on large tables they do) and should be eliminated if possible. Key lookups occur when you have an index seek against a table, but your query wants extra columns that are not in that index. This causes SQL Server to have to go back and get those extra columns. Lots of trips back to get extra columns results in slow performance. Of course scans are not good either so I wanted to try to get rid of that too.

Since the key lookups have an output list in the execution plan that lists what extra columns were needed, I took a look and simply added those columns as included columns on the existing indexes that were being used in the plan. In effect, I created a covering index using included columns. The included columns are only used when specific queries need it. If other queries use the index with included columns but doesn’t actually need them, those won’t be pulled in.

Here are the modified indexes with the included columns.

Blog_20170323_4

With the new columns included in the indexes, index seeks with no key lookups are now being performed.  The thick arrow has disappeared because instead of needing to scan the whole APTran table, it knows to return only 4 records.  This means that the rest of the query does not have to perform anywhere near as much work as it did before.  See the new execution plan below:

Much better stuff in more efficient plan…small arrow indicates that only 4 records were needed from APTran so much less work is being done.

Blog_20170323_5

Below you can see that overall the new plan uses less memory, is smaller and just plain old more efficient.

Blog_20170323_6

Boom!! Mic drop!!

Blog_20170323_7

You might be thinking that I should not celebrate since my query only returned 1 row. However, what if the results were hundreds or thousands or millions of rows? The work done by the old execution plan would be significant and the users would feel it. So, what’s the bottom line here? If you have queries accessing large tables, check execution plans to see if they are being accessed efficiently. Sometimes a small change to an existing index will work wonders. Sometimes you have to create a new index to make queries efficient. The execution plan might just tell you what it needs if you know where to look.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

SQL 2016 SP1 USE HINT

–By Lori Brown   @SQLSupahStah

After installing Service Pack 1 for SQL 2016, I ran across USE HINT and thought I would put out what I could find to document it. Here’s where you find info from Microsoft: https://msdn.microsoft.com/en-us/library/ms181714.aspx and here is a link for all of the bugs that are fixed in SP1: https://support.microsoft.com/en-us/kb/3182545

USE HINT ( hint_name ) Provides one or more additional hints to the query processor as specified by a hint name inside single quotation marks. Hint names are case-insensitive. USE HINT can be utilized without having to be a member of the sysadmin server role.

The following hint names are supported:

  • ‘DISABLE_OPTIMIZED_NESTED_LOOP’ Instructs the query processor not to use a sort operation (batch sort) for optimized nested loop joins when generating a query plan. Equivalent to trace flag 2340.
  • ‘FORCE_LEGACY_CARDINALITY_ESTIMATION’ Forces the query optimizer to use Cardinality Estimation model of SQL Server 2012 and earlier versions. Equivalent to trace flag 9481 or Database Scoped Configuration setting LEGACY_CARDINALITY_ESTIMATION=ON.
  • ‘ENABLE_QUERY_OPTIMIZER_HOTFIXES’ Enables query optimizer hotfixes (changes released in SQL Server Cumulative Updates and Service Packs). Equivalent to trace flag 4199 or Database Scoped Configuration setting QUERY_OPTIMIZER_HOTFIXES=ON.
  • ‘DISABLE_PARAMETER_SNIFFING’ Instructs query optimizer to use average data distribution while compiling a query with one or more parameters, making the query plan independent of the parameter value which was first used when the query was compiled. Equivalent to trace flag 4136 or Database Scoped Configuration setting PARAMETER_SNIFFING=OFF.
  • ‘ASSUME_MIN_SELECTIVITY_FOR_FILTER_ESTIMATES’ Causes SQL Server to generate a plan using minimum selectivity when estimating AND predicates for filters to account for correlation. Equivalent to trace flag 4137 when used with cardinality estimation model of SQL Server 2012 and earlier versions, and has similar effect when trace flag 9471 is used with cardinality estimation model of SQL Server 2014 or higher.
  • ‘DISABLE_OPTIMIZER_ROWGOAL’ Causes SQL Server to generate a plan that does not use row goal adjustments with queries that contain TOP, OPTION (FAST N), IN, or EXISTS keywords. Equivalent to trace flag 4138.
  • ‘ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS’ Enables automatically generated quick statistics (histogram amendment) for any leading index column for which cardinality estimation is needed. The histogram used to estimate cardinality will be adjusted at query compile time to account for actual maximum or minimum value of this column. Equivalent to trace flag 4139.
  • ‘ASSUME_JOIN_PREDICATE_DEPENDS_ON_FILTERS’ Causes SQL Server to generate a query plan using the Simple Containment assumption instead of the default Base Containment assumption for joins, under the query optimizer Cardinality Estimation model of SQL Server 2014 or newer. Equivalent to trace flag 9476.
  • ‘FORCE_DEFAULT_CARDINALITY_ESTIMATION’ Forces the Query Optimizer to use Cardinality Estimation model that corresponds to the current database compatibility level. Use this hint to override Database Scoped Configuration setting LEGACY_CARDINALITY_ESTIMATION=ON or trace flag 9481.

The list of all supported USE HINT names can be queried using the dynamic management view sys.dm_exec_valid_use_hints. More this view can be found here: https://msdn.microsoft.com/en-us/library/mt791356.aspx

Some USE HINT hints may conflict with trace flags enabled at the global or session level, or database scoped configuration settings. In this case, the query level hint (USE HINT) always takes precedence. If a USE HINT conflicts with another query hint or a trace flag enabled at the query level (such as by QUERYTRACEON), SQL Server will generate an error when trying to execute the query.

Example:

DECLARE @qty INT

SET @qty = 4

SELECT h.OrderDate, h.Freight, d.OrderQty, d.UnitPrice

FROM Sales.SalesOrderDetail d

JOIN Sales.SalesOrderHeader h ON (d.SalesOrderID = h.SalesOrderID)

WHERE d.OrderQty > @qty

OPTION (USE HINT(‘DISABLE_PARAMETER_SNIFFING’,‘DISABLE_OPTIMIZED_NESTED_LOOP’));

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

Table-Valued Parameter Performance Using C# Data Tables

The following testing and resulting data is a direct result of a recent issue that was discovered at one of our client sites.  After setting up SQL monitoring that does use a SQL trace to capture long running queries, suddenly certain processes would lock up and be unable to complete.  Initially we thought it was everything except the SQL trace causing the issue until we finally turned off all trace captures and the problem went away.  This was very concerning to us since using either SQL trace or XEvents is the method that we generally use to capture query performance. Without a fix, our performance tuning attempts would be severely hamstrung and slowed down. This problem had to be fixed yesterday!

Another side effect that would happen when things locked up was that despite having the max server memory set, SQL would end up using substantially more memory that was allocated to it.  At this client, developers were importing data from an application using C# and table-valued parameters which should have not been an issue. After a lot of searching, we finally came across some comments by another SQL expert who mostly described our issue and recommended that any C# TVP’s should have their columns defined or SQL could lock up if either trace or XEvents Completed Events are captured.  When we checked the C# code we found that the string columns that were being imported were not defined in an effort to make the import code flexible.  We were able definitively fix this very obscure problem with just one line of C# code that is in the below post.  The end result is a FASTER import of data that can be captured in trace or XEvents.  Nice!!

Since we had to put together this fix for a very undocumented issue, we sincerely hope that any DBA or developer who runs into this problem can correct it much faster than we did.  A link to download a zip file of all code displayed in this post is included at the end.  Enjoy!

********************************************

Table-Valued Parameter Performance Using C# Data Tables

— By Jeff Schwartz

Overview

Much has been written about the use of table-valued parameters (TVP) and their performance. These narratives appear to be contradictory because some state that TVPs should be used only for small datasets, whereas others assert that TVPs scale very well. Adding to the confusion is that the fact that most of the performance-related discussions focused on TVP use within T-SQL. Many of these examples employ either SQL trace or Extended Events (XEvents) to evaluate performance, but almost none of them has discussed what can occur when C# data tables and SQL trace or XEvents are used simultaneously. This paper details the analysis and resolution of a situation that occurred at a customer site when an existing C# application suddenly began to run extremely slowly after a very lightweight SQL trace that utilized a several-minute duration threshold and captured no frequent events was activated.

Initial Research

Preliminary research indicated that under very specific conditions the combination of TVPs and SQL trace could result in unintended performance consequences, e.g., excessive SQL Server memory consumption to the point where SQL Server exhausted all memory on the server regardless of the maximum memory setting value. The research also suggested that string usage and lengths in unspecified locations might have an impact on this issue. The question of whether TVPs might have some record-count performance threshold above which performance would degrade was also examined. Although bulk insert was not used by the application, the research team also decided to compare bulk and TVP insertion speeds because that is another source of conflicting information.

Since very little literature exists that discusses the combination of TVPs and SQL trace, the only recourse involved creating test frameworks, performing actual data loads, and using internal application timers in conjunction with either SQL trace or XEvents to monitor performance.

Test Frameworks

1.Two frameworks were constructed: T-SQL-only and C# calling a T-SQL stored procedure.

a. The T-SQL-only test loaded data into a TVP, which then called a stored procedure to perform the insertion into the final table.

b. The development team provided C# code that illustrated how the user application read a flat file into a C# data table, which was then passed as a TVP to a T-SQL stored procedure. This code was followed as closely as possible in the test framework.

2.A SQL server table containing data captured from the sys.dm_db_index_operational_stats Data Management View (DMV) supplied the data for testing because

a. The data table contained approximately 3.2 million records, thereby being large enough to stress the application code and SQL Server adequately. The largest data file provided by the application team was approximately 1.6 million records, so the testing could push well beyond current record levels.

b. Each data record contained 47 columns, which was wider in terms of data columns than any application table being loaded.

c. No string value columns existed in the data. This insured that user string data could not affect any interaction between the C# application and SQL Server, and that any string-related behavior was strictly due to the manner in which the application loaded the data table or the TVP, or the way the TVP parameter-passing was recorded in the SQL trace. All data columns were smallint, int, bigint, or date. This also insured exact record lengths so that any record count-related behavior was independent of record size. Some of the data files being loaded by the development team contained only fixed-width data types, whereas others contained at least a few variable length string columns.

3.The frameworks were designed to load various numbers of records using the same starting point to determine whether a performance threshold existed for TVPs, regardless of whether SQL trace (or XEvents) was active, as well as to determine whether even small numbers of records could be processed with the SQL trace (or XEvents) active.

4.Thirty-one different record levels, shown in Table 1, were used for testing to insure any record-count gaps were small. A separate CSV file of each length was created from the data cited in #2, always beginning with the first record.

Blog_20160901_1

Table 1: Record Counts used in Testing

5.Several methods of loading the flat file CSV data into C# data tables were developed and tested. The C# data table configurations were as follows:

a. Unlimited string length columns (initially used in the production C# application) – herein noted as NoStringMax (String #N/A on graphs)

b. Fixed length string columns of at most 30, 100, 500, or 1000 characters each – herein noted as StringMax (String <#> on graphs). Note: the development team stated that no column would exceed 1,000 characters in length.

c. Exact data type mapping so that the data table column types matched those of the receiving SQL Server table exactly – herein noted as Map

d. A common routine was used by options a, b, and c above to send the data table to SQL Server using a TVP.

e. The section of code that loaded the data table from the flat file was timed separately from the routine cited in d, which also was surrounded by its own timers. This enabled comparison of data load, TVP passing, and combined times.

f. The name of the CSV file is passed in as a parameter along with an adjusted version of the file name so the appropriate test identifiers can be written to the application log file for later analysis.

6.The tests were run with the following SQL Server monitoring options:

a. No SQL Server monitoring at all.

b. SQL trace monitoring including all normally captured events except for completed ones. These were handled as follows:

i.Exclude all completed events

ii.Include ad-hoc and batch completed events with a two-second duration threshold

iii.Include ad-hoc and batch completed events as well as completed statement events with a two-second duration threshold

c. SQL Extended Events (XEvents) including all normally captured events except for completed ones. These were handled as follows:

i.Exclude all completed events

ii.Include ad-hoc and batch completed events with a two-duration threshold

iii.Include ad-hoc and batch completed events as well as completed statement events with a two-duration threshold

7.All tests were performed on the same server to eliminate network and hardware variability. This server had a quad-core Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz processor, 32 GB of RAM, and a 500 GB SSD HD.

8.Various combinations of cold and warm SQL Server caches, cold and warm Windows caches, and SQL Server maximum memory limit sizes were tested. Some of the earlier attempts will be discussed in the Results section below. The final specifications were as follows:

a. No applications other than SQL Server, SSMS, and the C# application were active on the test server.

b. SQL Server’s maximum memory limit was set to 12 GB. This allowed SQL to go over its limit without distorting overall system behavior by exhausting server memory entirely.

c. Hot Windows file cache, i.e., a full run that loaded ALL of the flat file data into RAM was performed before the timing tests were run. This insured that the HD was involved minimally.

d. Hot SQL Server buffer cache, achieved by running the next series soon after the last run completed (with no intervening runs), and by running each sequence four times in immediate succession.

C# Code

The NoStringMax routine cited in #5a in the previous section is shown in Figure 1.

Blog_20160901_2

Figure 1: Test C# Data Table Load Routine Using No String Length Maximums

The StringMax routine cited in #5b in the previous section is shown in Figure 2. The ONLY difference between this routine and the previous one is the addition of the MaxLength assignment that is highlighted below. This single line sets the maximum length of each data table column to the value of the iMaxStringLen parameter. As cited in #5b, this value ranged from 30 to 1,000 during testing.

Blog_20160901_3

Figure 2: Test C# Data Table Load Routine Using String Length Maximums

The Map routine cited in #5c in the previous section is shown in Figure 3 through Figure 6. Logically, this code is identical to that of #5a and #5b, but the individual column mappings make the routine much longer. Note: since no strings are used for the data table columns, the MaxLength parameter is unnecessary.

Blog_20160901_4

Figure 3: Test C# Data Table Load Routine Using Data Type Mapping – Part 1

Blog_20160901_5

Figure 4: Test C# Data Table Load Routine Using Data Type Mapping – Part 2

Blog_20160901_6

Figure 5: Test C# Data Table Load Routine Using Data Type Mapping – Part 3

Blog_20160901_7

Figure 6: Test C# Data Table Load Routine Using Data Type Mapping – Part 4

The routine cited in #5d in the previous section is shown in Figure 7.

Blog_20160901_8

Figure 7: C# Test Routine that Sends Data Table to SQL Server Using a TVP

Results

1.The T-SQL code alone, cited in Test Frameworks #1a did not recreate the problem observed at the customer site when either SQL trace or Extended Events (XEvents) were used, so additional testing was unnecessary.

2.No problems occurred when the C# application was used in conjunction with SQL trace or XEvents as long as no completed events were captured. Therefore, capturing only specific trace events created the problems.

3.Further research showed that when the C# code was running, adding ad-hoc or batch completed events to SQL traces or XEvent sessions caused issues. Adding completed statement events did not change things appreciably.

4.Interestingly, the extreme misbehavior was triggered by the combination of using NoStringMax C# code and having SQL trace or XEvents capture ad-hoc or batch completed events. Although the StringMax and Map routines ran a little more slowly for certain record count levels when SQL trace or XEvents captured ad-hoc or batch completed events, the dramatic memory consumption issues did not occur at all when these methods were used.

5.Initially, testing employed a 28 GB maximum SQL Server memory setting, but as occurred at the customer site, when the problem arose, SQL Server consumed all the memory on the server, which caused the server fundamentally to stop working. Using this setting and the NoStringMax C# code, CSV files with record counts up to 25,000 could be processed without taking hours, failing, or causing the server to run completely out of memory. However, the application could not reach the 50,000 record level.

6.Since testing needed to process MUCH higher numbers of records, the decision was made to reduce SQL Server memory to 4 GB in an attempt to provide additional memory space for SQL Server memory overflow. Another reason for lowering the SQL Server memory allocation was to insure that the tests consumed all of SQL Server’s buffer pool and that SQL Server had to operate against memory limits. This consideration, as well as the run-time variations, necessitated the four successive iterations for each combination.

7.Unfortunately, using the 4 GB setting, the lower-end NoStringMax C# code runs while capturing SQL trace ad-hoc or batch completed events caused application failures at approximately the 4,000 or 8,000 record levels. This clearly indicated an increase in SQL Server memory was required, so it was increased to 8 GB. Although the runs processed more records, they still crashed well before the 25,000 record level. Finally, moving the memory limit up to 12 GB enabled everything to run as before, effectively mimicking the 28 GB testing without running the risk of exhausting Windows memory.

8.Figure 8 highlights the NoStringMax C# code runs through the 25,000 record level while SQL trace or XEvents ad-hoc or batch completed events were being captured. It is evident that the run times were unacceptable. When the same test levels are viewed in Figure 9, the problem becomes abundantly clear. With SQL trace or XEvents capturing ad-hoc or batch completed events, the run times for the 10,000 record tests using the NoStringMax C# code ranged between 483 and 584 seconds. When ad-hoc or batch completed events were not captured, the run times were approximately 0.14 SECONDS! The values for all other tests, including NoStringMax C# code with SQL trace and XEvents ad-hoc or batch completed events off, are shown in Figure 9.

Blog_20160901_9

Figure 8: TVP Load Timing when Trace On or XEvents on

9.Although many lines appear in Figure 9, three groupings exist and these are the most important. Group 1, which consists of dotted lines, includes all tests during which ad-hoc or batch completed events were not captured. Group 2, which consists of solid lines, includes all tests during which SQL trace captured ad-hoc or batch completed events. Group 3, which consists of dashed lines, includes all tests during which XEvents captured ad-hoc or batch completed events. Important note: The NoStringMax runs shown in Figure 8 are NOT in Figure 9 because of scaling considerations. Figure 9 highlights several notable issues. Most importantly, once a maximum string length is specified, performance improves even when SQL trace or XEvents ad-hoc or batch completed events are not captured. In addition, the terrible performance problems go away. Interestingly, the behaviors of the various StringMax and Map runs were almost identical through approximately 175,000 records and then again at about 800,000 records. In between, unexplained divergence occurs, but it is only a few seconds and occurred primarily between the 200,000 and 700,000 record levels. The pattern shown in Figure 9 was repeated in every test sequence conducted. It is noteworthy that the StringMax and Map versions of the routine outperformed the NoStringMax under all comparable trace or XEvent-related conditions, and at higher volumes, even when completed events were captured.

Blog_20160901_10

Figure 9: TVP Load Timing when Trace/XEvents off OR Trace or XEvents on and Mapped Data Table or Limited-Length String Values Used

10.Once the number of records exceeeded two million, the load of the C# data table began to crash due to application memory exhaustion. Note: neither Windows nor SQL Server memory was consumed excessively during this phase, so the limitation was strictly application-related.

11.Figure 10 summarizes the overall insertion rates of all previously cited methods at the two million record level, in addition to the bulk insert method. This graph shows clearly that the worst performer was the unlimited string method, and that the best one was the bulk insert method. Note: these rates exclude the reading of the flat file into the C# data table as well as the creation of the file that was suitable for bulk insertion.

Blog_20160901_11

Figure 10: TVP versus Bulk Insert Records per Second

Conclusions

The use of unlimited-length C# strings with data tables not only performs worse without capturing any SQL trace or XEvents ad-hoc or batch completed events, it performs dreadfully when they are. Their use can cause poor overall server performance if the server’s memory is exhausted. Finally, their use prevents the ability to monitor SQL Server query performance using either SQL trace or XEvents. However, when a maximum string length is specified, regardless of its length, performance without SQL trace or XEvents improved and the problematic interaction with SQL trace or XEvents was mitigated almost completely. Since the Map method is inefficient and limiting from an application development perspective and performance was not substantially better than with limited-length strings, its use does not appear to be beneficial.

No table-valued parameter scalability issues were observed with the possible exception of the 200,000 to 700,000 record range when performance inexplicably, and consistently, dipped. However, from 800,000 to 2 million, performance experienced no such dip. Interestingly, standard bulk insert performance exceeded that of TVPs when the same data was used.

Recommendations

Since the code to implement the maximum-length string involves only one line and the performance monitoring and improvement benefits are substantial, the <data table name>.MaxLength = <n> statement should be included in data table load routines as shown by the highlighted line in Figure 2. <n> should be the maximum expected length of any data column. Limiting string length will allow C# TVP data loads to run more efficiently and also enable customers to conduct normal query performance monitoring.

********************************************

The code shown in this post can be downloaded from here….  SQLRX_TVP_CSharp_Code

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

SQLRx Presenting at SQL Saturday Baton Rouge 2016

SQLRx (@SQLRX) is proud to announce that two of our resident SQL Experts will be presenting sessions at SQL Saturday #515 in Baton Rouge, LA on August 6, 2016. If you have not signed up to attend, you are going to miss out!!

Blog_20160804_4

http://www.sqlsaturday.com/515/eventhome.aspx

Lori Brown (@SQLSupahStah) will give a lesson on how to be an Actively Proactive DBA. Come have fun and learn at her session, Actively Proactive – Straightforward Ways to Manage and Diagnose SQL. Attendees will be taken through the things a proactive DBA does and will also be treated to some quick ways to troubleshoot like a SQL Super Star!

Blog_20160804_1

Scripts will be given out to all who download the presentation materials. Come and get ‘em!!

Blog_20160804_2

Jeff Schwartz will presenting Rx for Demystifying Index Tuning Decisions. Jeff will cover ways to find poor performing queries and tuning them using various methods including execution plans, usage statistics and missing indexes.

Blog_20160804_3

Since Jeff has been tuning things for almost 40 years, he can show you some things that will be extremely useful and will help you tune like a pro.

If you are in Baton Rouge this weekend come by and visit with us! SQLRx is a sponsor of SQL Saturday Baton Rouge 2016 and will have a booth where you can come sit down with Lori or Jeff and talk tech. We will also be handing out goodies as well as giving away a 2 TB external drive at the end of the day. Also, one lucky attendee will win a free Ultra Light Performance Assessment from SQLRx at one of our presenter’s sessions. We will tell that lucky winner about their SQL performance for free….Winner, Winner, Chicken Dinner!!

Hope to see you in Baton Rouge!