Category Archives: Query Tuning

Get Index Column Info with Includes for One or Many Tables

— by Lori Brown @SQLSupahStah

I was recently working with one of my clients on some low hanging fruit type of query tuning. We had checked the cache for plans with missing index warnings in them and were trying to see if we could tweak and existing index or add a new index to speed things up. If you ever work with missing indexes, you surely have seen it recommend crazy things, duplicates or existing indexes or it wants something that can be added to an existing index.

The bottom line for missing index recommendations is that you should NEVER, EVER create a missing index unless you know for absolutely sure that the index does not exist and that it will really help a known performance issue. For instance, you can have missing index warnings on small tables that only return a couple of rows to the query. Those are usually not worth working on since scanning a small table can usually be done very quickly by SQL. To this day, I still find many databases that are way over indexed with indexes that were implemented simply because someone found a missing index warning and did not do their homework or they ran the dreaded Database Tuning Advisor which shoved a bunch of duplicate indexes into tables.

If you are wondering how to get a list of missing indexes, please check out Jeff Schwartz’s blog post on how to do this. (https://blog.sqlrx.com/2017/06/02/query-tuning-and-missing-index-recommendations/ ) This will give you a place to start. It is better if you know what query is throwing the missing index warning so it is a good idea to collect those either in a trace or extended events. Jeff builds on his first post and in his second post on the subject (https://blog.sqlrx.com/2017/07/20/handling-multiple-missing-index-recommendations-for-the-same-table/ ) also goes over the fun of having multiple missing index recommendations for a single table and how to deal with them.

Here’s a handy set of links for some of Jeff’s great index tuning work that you should really check out:

https://blog.sqlrx.com/2016/01/28/sql-server-20122014-extended-events-for-developers-part-1/

https://blog.sqlrx.com/2017/06/02/query-tuning-and-missing-index-recommendations/

https://blog.sqlrx.com/2017/07/20/handling-multiple-missing-index-recommendations-for-the-same-table/

https://blog.sqlrx.com/2017/08/10/how-indexing-affects-deletion-queries/

One of the things that I usually need when performance tuning is to know information about the existing indexes on specific tables. I always want to know what columns are in the indexes along with the included columns so that I can compare the existing indexes to the missing recommendations. This way I can better figure out if a recommendation is something that can be added to an existing index (like an included column) or if I really need to create a brand new index if it does not exist at all.

Like most DBA’s, I keep a toolkit with all kinds of handy scripts. However, I did not have one that would give me index included columns. I also wanted the query to be able to return info from one or many tables at the same time. This would be useful when dealing with things with lots of joins. I know that there are a few bloggers who have posted something similar but I wanted to have the ability to filter on one or multiple tables. So, here is what I came up with:

/********************************

Returns index columns with included columns

plus other needed index info for tables

in @tablelist variable

*********************************/

 

DECLARE @tablelist VARCHAR(1000)

DECLARE @sqlstr NVARCHAR(MAX)

 

SET @tablelist = ‘InvoiceLines,OrderLines,StockItemHoldings’ — comma delimited list of tables, can be one or multiples EX: ‘mytable’ or ‘mytable,nothertable,thirdtable’

 

— Query the tables

IF @tablelist <> OR @tablelist <> ‘?,?,?’

BEGIN

SET @tablelist = REPLACE(QUOTENAME(@tablelist,””), ‘,’, ”’,”’) — Add quotes so the IN clause will work

 

SET @sqlstr = ‘SELECT SCHEMA_NAME(o.schema_id) AS SchemaName

       ,o.name AS TableName

       ,i.name AS IndexName

       ,i.type_desc AS IndexType

       ,c.name AS ColumnName

       ,ic.is_included_column

       ,ic.index_column_id

       ,t.name

       ,c.max_length

       ,c.precision

       ,c.scale

FROM sys.indexes i

JOIN sys.index_columns ic ON i.index_id = ic.index_id

JOIN sys.columns c ON ic.column_id = c.column_id AND ic.object_id = c.object_id

JOIN sys.objects o ON o.object_id = i.object_id AND c.object_id = i.object_id

JOIN sys.types t on t.user_type_id = c.user_type_id

WHERE o.name IN (‘+@tablelist+‘)

ORDER BY SchemaName, TableName, IndexName, index_column_id’

 

EXEC sp_executesql @sqlstr

–PRINT @sqlstr

END

All you have to provide is a comma separated list of the table(s) you are interested in for the @tablelist variable and it will do the rest. The output looks like this:

Blog_20170928_1

I found several bloggers who had made queries that would concatenate the columns together but truthfully I found those hard to read so I settled for a list with some extra info on the data and index types. Hope this is useful to someone out there.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

Handling Multiple Missing Index Recommendations for the Same Table

— By Jeff Schwartz

The Problem

Many articles concerning SQL Server missing index recommendations demonstrate the mechanics for obtaining them and often highlight whether the suggested key columns are used in equality or inequality relationships. Most of these examples emphasize a single index that can be implemented to improve performance. However, real-life situations often involve multiple or many suggested indices. For example, during a recent customer study, I observed 28 recommended indices for one table and 52 for another. Clearly, metrics such as improvement measure and user impact frequently can be used to determine the most important recommendations, but sometimes there are either too many similar recommendations – or – improvement measures are almost the same for several proposed indices, which makes determining an optimal index difficult. A large number of recommendations also often results in index proliferation, i.e., missing index recommendations are implemented piecemeal with no overall strategy resulting in tables with 14, 17, or 24 indices as the author observed in a recent customer performance study. The size of the table compounds this problem because it is especially desirable to limit the number of indices on these tables. For example, the table that had 24 indices on it contained over 30 million records.

The following examples of proposed indices that inspired this article illustrate the multiple recommendation phenomenon – RecIndex1: Keys (DateVal), Included Columns (Metric, ReptCat, LocationID, Total_Amount) and RecIndex2: Keys (LocationID, DateVal) Included Columns (Metric, ReptCat, Total_Amount). Clearly, without additional information, it is difficult to determine whether these recommendations must remain separate or could be combined into a single index. In this situation, knowing whether the proposed key columns are used in equality or inequality where clauses can be critical. This article will discuss how to use knowledge of equality and inequality relationships to determine an appropriate course of action.

Test Table Creation & Load

To determine missing index recommendation behavior, a generic table was constructed and filled with 20 million records. Each record contained an identity column, an ID column, a text column, and 47 metric columns whose values ranged between 1 and 10,000,000. The large number of table columns was used to insure SQL Server would choose an index option when appropriate. Six queries that incorporated various column combinations were executed (some of which differed only in column ordering). To minimize duplication of column values and skewing of query plans, the ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000 formula was used to generate values that were as random as possible. Two indices were created: a clustered index that used the identity column as its only key and a second nonclustered index that used DupID as its only column. The scripts for the creation, loading, and initial indexing of the table are shown below.

— ##############################################################

— Create test table

— ##############################################################

DROP TABLE FewDuplicates;

CREATE TABLE FewDuplicates (

IDCol bigint identity (20000000,1),

DupID bigint,

MyText varchar(10),

Metric01 bigint, Metric02 bigint, Metric03 bigint, Metric04 bigint,

Metric05 bigint, Metric06 bigint, Metric07 bigint, Metric08 bigint,

Metric09 bigint, Metric10 bigint, Metric11 bigint, Metric12 bigint,

Metric13 bigint, Metric14 bigint, Metric15 bigint, Metric16 bigint,

Metric17 bigint, Metric18 bigint, Metric19 bigint, Metric20 bigint,

Metric21 bigint, Metric22 bigint, Metric23 bigint, Metric24 bigint,

Metric25 bigint, Metric26 bigint, Metric27 bigint, Metric28 bigint,

Metric29 bigint, Metric30 bigint, Metric31 bigint, Metric32 bigint,

Metric33 bigint, Metric34 bigint, Metric35 bigint, Metric36 bigint,

Metric37 bigint, Metric38 bigint, Metric39 bigint, Metric40 bigint,

Metric41 bigint, Metric42 bigint, Metric43 bigint, Metric44 bigint,

Metric45 bigint, Metric46 bigint, Metric47 bigint

)

 

— ##############################################################

— Load original table

— ##############################################################

declare @DupID bigint = 1

declare @NumRecs bigint = 20000000

 

truncate table FewDuplicates

set nocount on

while (@DupID <= @NumRecs)

begin

insert into [dbo].[FewDuplicates] (

[DupID], [MyText],

[Metric01], [Metric02], [Metric03], [Metric04], [Metric05], [Metric06], [Metric07], [Metric08], [Metric09], [Metric10], [Metric11], [Metric12], [Metric13], [Metric14], [Metric15], [Metric16], [Metric17], [Metric18], [Metric19], [Metric20], [Metric21], [Metric22], [Metric23], [Metric24], [Metric25], [Metric26], [Metric27], [Metric28], [Metric29], [Metric30], [Metric31], [Metric32], [Metric33], [Metric34], [Metric35], [Metric36], [Metric37], [Metric38], [Metric39], [Metric40], [Metric41], [Metric42],

[Metric43], [Metric44], [Metric45], [Metric46], [Metric47]

)

VALUES (

@DupID,‘my text’,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000

)

set @DupID += 1

end — group option loop

set nocount off

 

— ##############################################################

— Create indices on the test table

— ##############################################################

CREATE UNIQUE CLUSTERED INDEX [ci_RecID] ON [dbo].[FewDuplicates]

(

[IDCol] ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

 

CREATE NONCLUSTERED INDEX [ix_DupID] ON [dbo].[FewDuplicates]

(

DupID ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Queries that Create Missing Index Recommendations

To illustrate the issue, two queries were created that were guaranteed to generate missing index recommendations as well as mimic the behavior of the ones cited in The Problem section. They are listed in Table 1 and the differences are highlighted for easier comparison. The query plans for the two queries are displayed in Table 2 and Table 3. Both queries performed full clustered index scans and generated missing index recommendations. The recommendations are shown below in two pieces within each table. The most important points are that Query Plan #1 specifies Metric14 first and Metric43 second, whereas Query Plan #2 specifies Metric43 alone with Metric14 as an included column. At first glance, these appear to be contradictory and potentially incompatible differences.

Blog_20170720_1

Blog_20170720_2

Blog_20170720_3

Table 4 summarizes the metrics, columns, and full index definitions suggested by SQL Server. Given the very large improvement measure values, the key ordering of the proposed indices, and the somewhat different included columns, it may be tempting to implement both indices despite the fact that the table contained 20 million records. However, more detailed analysis using the data in Table 5 shows that a single index that is constructed properly can accommodate both recommendations and, therefore, both queries. The most important metrics are displayed in column_usage because one uses two equality comparisons and the other only uses one. Therefore, if we specify the equality column used in both queries first and specify the equality/included column second, both queries will be satisfied. VERY IMPORTANT NOTE: Although the key and included column ordering appear obvious because of the column names used in this example table, i.e., suffixes in numerical order, when normal column names like DateVal or LocationID are used, ordering is much less obvious. As cited in my previous blog entitled Query Tuning and Missing Index Recommendations, when ordering is not crucial, e.g., when only equality operations or included columns are specified, SQL Server uses the ordering of the columns in the table itself rather than the ordering specified in the query.

In most cases SQL Server attempts to create covering indices, which are defined to be indices that contain all the columns of a particular query. Please reference the following web page for further information regarding covering indices: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/create-indexes-with-included-columns. In the author’s experience, implementing the keys of a suggested index wtihout the corresponding included columns often results in SQL Server ignoring the new index. Therefore, the included columns are vital to any missing index strategy. Clearly, a point of diminshing returns exists when the number of included columns approaches the total number of columns in the table (especially very large tables), but as long as the number of columns is reasonable, included columns should always be considered. [The queries to obtain the data shown in Table 4 and Table 5 are provided in Table 6.]

Blog_20170720_4

Blog_20170720_5

SELECT avg_total_user_cost * avg_user_impact * (user_seeks + user_scans)

as [Improvement Measure],

avg_user_impact as [Avg User Impact], user_seeks as Executions,

[statement] as TableName,

equality_columns as [Equality Usage], inequality_columns as [Inequality Usage], included_columns as [Included Columns],

‘Create Nonclustered Index ix_SQLRx_’ + PARSENAME([statement],1) + ‘_’ + CONVERT(varchar, group_handle) + ‘_’ + CONVERT(varchar, g.index_handle) + ‘ ON ‘ +

[statement] +‘ (‘ + ISNULL(replace(equality_columns,‘ ‘,),) +

CASE WHEN equality_columns IS NOT NULL AND inequality_columns IS NOT NULL THEN ‘,’ ELSE END +

ISNULL (replace(inequality_columns,‘ ‘,),) +‘)’ +

CASE WHEN included_columns IS NOT NULL THEN ‘ INCLUDE (‘ + included_columns + ‘)’ ELSE END AS [Create Index Statement]

FROM       sys.dm_db_missing_index_groups g

INNER JOIN   sys.dm_db_missing_index_group_stats s ON

s.group_handle = g.index_group_handle

INNER JOIN   sys.dm_db_missing_index_details d ON d.index_handle = g.index_handle

ORDER BY avg_total_user_cost * avg_user_impact * (user_seeks + user_scans) DESC;

 

SELECT statement AS [Table], column_id , column_name, column_usage,

migs.user_seeks as Executions, migs.avg_user_impact as [Avg User Impact]

FROM sys.dm_db_missing_index_details AS mid

CROSS APPLY sys.dm_db_missing_index_columns (mid.index_handle)

INNER JOIN sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle

INNER JOIN sys.dm_db_missing_index_group_stats AS migs ON mig.index_group_handle = migs.group_handle

ORDER BY mig.index_group_handle, mig.index_handle, column_id

Blog_20170720_8

Table 7 contains the composite index that satisfies both recommendations and queries. It is important to note that Metric43 is the FIRST key and Metric14 is the second. If this ordering is not followed, Query #1 will perform a full scan. Reviewing both queries demonstrates the following:

  1. The key ordering enables the equality operators to apply in both cases.
  2. Having Metric14 as the second key satisfies both the equality condition and the included column condition.
  3. The other columns specified by the queries are supplied so the table data need never be referenced.

 

CREATE NONCLUSTERED INDEX ix_CombinedIndex ON [dbo].[FewDuplicates]

(

[Metric43],

[Metric14]

)

INCLUDE (

[Metric01],

[Metric02],

[Metric03],

[Metric04]

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Blog_20170720_9

Table 8 and Table 9 contain the updated query plans, which illustrate the facts that the clustered index scans have been replaced by index seek operators and the clustered index is not accessed to satisfy either query. Note also that parallelism was present in the query plans shown in Table 2 and Table 3, but is absent in the new query plans displayed in Table 8 and Table 9.

Blog_20170720_6

Blog_20170720_7

Conclusion

This article illustrated a situation in which multiple missing index recommendations first appeared to necessitate separate indices, resulting in a single index implementation. Careful examination of the key relationships used in the queries and specified by the missing index recommendations enabled the author to develop one composite recommendation that enabled optimal query performance.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

 

 

Query Tuning and Missing Index Recommendations

–By Jeffry Schwartz

Overview

Most analysts are familiar with missing index recommendations provided by SQL Server when query plans are displayed within SSMS or reported by various missing index DMVs. Several questions arise concerning these recommendations:

  1. What determines whether a column becomes a key or included in the new index?
  2. Where does the ordering of the included columns come from? Does column ordering in queries affect the recommendations, e.g., will query orderings of A, B, C and C, B, A in the selection column list result in the one or two recommendations?
  3. How sophisticated are the missing index recommendations, i.e., does SQL Server compare recommendations and perform any kind of recommendation consolidation or optimization?

This article provides a reproducible example using six different queries that deliberately caused SQL Server to generate missing index recommendations and provide answers to these questions. The example is used to study missing index recommendations in detail, including how they relate to the underlying table and how query columns affect these recommendations. The article also illustrates how a single consolidated index can address the performance needs of all six queries.

Test Table Creation & Load

To determine missing index recommendation behavior, a generic table was constructed and filled with 20 million records. Each record contained an identity column, an ID column, a text column, and 47 metric columns whose values ranged between 1 and 10,000,000. The large number of table columns was used to insure SQL Server would choose an index option when appropriate. Six queries that incorporated various column combinations were executed (some of which differed only in column ordering). To minimize duplication of column values and skewing of query plans, the ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000 formula was used to generate values that were as random as possible. Two indices were created: a clustered index that used the identity column as its only key and a second nonclustered index that used DupID as its only column. The scripts for the creation, loading, and initial indexing of the table are shown below.

— ##############################################################

— Create test table

— ##############################################################

drop table FewDuplicates;

CREATE TABLE FewDuplicates (

IDCol bigint identity (20000000,1),

DupID bigint,

MyText varchar(10),

Metric01 bigint, Metric02 bigint, Metric03 bigint, Metric04 bigint, Metric05 bigint,

Metric06 bigint, Metric07 bigint, Metric08 bigint, Metric09 bigint, Metric10 bigint,

Metric11 bigint, Metric12 bigint, Metric13 bigint, Metric14 bigint, Metric15 bigint,

Metric16 bigint, Metric17 bigint, Metric18 bigint, Metric19 bigint, Metric20 bigint,

Metric21 bigint, Metric22 bigint, Metric23 bigint, Metric24 bigint, Metric25 bigint,

Metric26 bigint, Metric27 bigint, Metric28 bigint, Metric29 bigint, Metric30 bigint,

Metric31 bigint, Metric32 bigint, Metric33 bigint, Metric34 bigint, Metric35 bigint,

Metric36 bigint, Metric37 bigint, Metric38 bigint, Metric39 bigint, Metric40 bigint,

Metric41 bigint, Metric42 bigint, Metric43 bigint, Metric44 bigint, Metric45 bigint,

Metric46 bigint, Metric47 bigint

)

 

— ##############################################################

— Load original table

— ##############################################################

declare @DupID bigint = 1

declare @NumRecs bigint = 20000000

 

truncate table FewDuplicates

set nocount on

while (@DupID <= @NumRecs)

begin

insert into [dbo].[FewDuplicates] (

[DupID], [MyText],

[Metric01], [Metric02], [Metric03], [Metric04], [Metric05], [Metric06], [Metric07],

[Metric08], [Metric09], [Metric10], [Metric11], [Metric12], [Metric13], [Metric14],

[Metric15], [Metric16], [Metric17], [Metric18], [Metric19], [Metric20], [Metric21],

[Metric22], [Metric23], [Metric24], [Metric25], [Metric26], [Metric27], [Metric28],

[Metric29], [Metric30], [Metric31], [Metric32], [Metric33], [Metric34], [Metric35],

[Metric36], [Metric37], [Metric38], [Metric39], [Metric40], [Metric41], [Metric42],

[Metric43], [Metric44], [Metric45], [Metric46], [Metric47]

)

VALUES (

@DupID,‘my text’,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000,

ABS(cast(CHECKSUM(NewId()) as bigint)) % 10000000

)

set @DupID += 1

end — group option loop

set nocount off

 

— ##############################################################

— Create indices on the test table

— ##############################################################

CREATE UNIQUE CLUSTERED INDEX [ci_RecID] ON [dbo].[FewDuplicates]

(

[IDCol] ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

 

CREATE NONCLUSTERED INDEX [ix_DupID] ON [dbo].[FewDuplicates]

(

DupID ASC

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

 

Queries & Execution Plans with Missing Index Recommendations

The six queries all performed a range scan based upon DupID and Metric01. Clearly, the ix_DupID index could be used for the first portion of the where clause, but none of the existing indices could assist in the resolution of the second portion of the where clause. Note: All queries used identical where clauses to insure that the only query differences involved the columns that were requested in the select clause. These variations employed different combinations of the first six metric columns included a variety of column orderings. All of the queries returned the same 809 rows. Note: Due to the random nature of the data, the reader will not obtain identical results, but they will be functionally similar. Each query was run separately after the following commands had been executed:

dbcc dropcleanbuffers with no_infomsgs

dbcc freeproccache with no_infomsgs

dbcc freesystemcache(‘TokenAndPermUserStore’) with no_infomsgs

These commands clear all the pertinent caches to insure reproducibility and prevent memory-resident portions of the database from skewing the results. Each query required approximately 90 seconds to execute as shown in Table 1. Each query required approximately one million logical and physical reads to complete. The SSMS option for returning the actual execution plan was set prior to execution, and the six queries and execution plans are shown in the individual query sections below. The first five queries resulted in different recommendations, but the sixth query’s plan and recommended missing index were identical to that of the fifth because the only difference between Queries #5 and #6 is the ordering of the columns (part of the answer to question #2). The differences among all the queries are summarized in Table 2 below.

Blog_20170601_T1

Table 1: Summary of Initial Query Executions & Timings

Blog_20170601_T2

Table 2: Summary of Query Selection Columns

Query #1

select [DupID], [MyText], [Metric01]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_1

Query #2

select [DupID], [MyText], [Metric02]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_2

Query #3

select [DupID], [MyText], [Metric03], [Metric06], [Metric04]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_3

Query #4

select [DupID], [MyText], [Metric04], [Metric06], [Metric02], [Metric01]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_4

Query #5

select [DupID], [MyText], [Metric01], [Metric03], [Metric05], [Metric02]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_5

Query #6

select [DupID], [MyText], [Metric05], [Metric02], [Metric01], [Metric03]

from FewDuplicates

where DupID between 100000 and 500000 and Metric01 between 10000 and 30000

Blog_20170601_6

Missing Index Recommendations Displayed via DMVs

The following query was used to provide some of the information enumerated in Table 3 below:

— ##############################################################

— Missing Index DMV Query

https://blogs.msdn.microsoft.com/bartd/2007/07/19/are-you-using-sqls-missing-index-dmvs/

— ##############################################################

SELECT migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,

‘CREATE INDEX [missing_index_’ + CONVERT (varchar, mig.index_group_handle) + ‘_’ + CONVERT (varchar, mid.index_handle)

+ ‘_’ + LEFT (PARSENAME(mid.statement, 1), 32) + ‘]’

+ ‘ ON ‘ + mid.statement

+ ‘ (‘ + ISNULL (mid.equality_columns,)

+ CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ‘,’ ELSE END

+ ISNULL (mid.inequality_columns,)

+ ‘)’

+ ISNULL (‘ INCLUDE (‘ + mid.included_columns + ‘)’, ) AS create_index_statement,

migs.*, mid.database_id, mid.[object_id]

FROM sys.dm_db_missing_index_groups mig

INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle

WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10

ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

The following table lists the missing index recommendations as displayed by the DMVs. It also summarizes all of the missing index recommendations to make it easier to determine what a consolidated index might require. It is important to note that the missing index recommendations for Query #5 and #6 are the same, just as they were in the query plan listings above. Although discussion of the calculation and interpretation of the Improvement Measure column is beyond the scope of this article, it should be noted that the total improvement measure was approximately twice that of any other index recommendation and further investigation reveals that this index would satisfy both Query #5 and #6.

Blog_20170601_T3

Table 3: Summary of All Missing Index Recommendations

Composite Index Recommendation for All Six Queries

Either Table 2 or Table 3 could be used to determine a composite index. In-depth investigation reveals that since DupID and Metric01 are specified in the where clauses as inequalities, these should be key columns in any index, and review of Table 3 highlights this for all index recommendations. Since MyText is used in all of the queries and Metric01 is to be used as a key column, the only remaining columns are Metric02, Metric03, Metric04, Metric05, and Metric06. An index that employs these columns as included columns can “cover” all the queries shown in this article. For further information about covering indices, please refer to the following article: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/create-indexes-with-included-columns. The composite (and covering) index is shown below:

CREATE NONCLUSTERED INDEX ix_ResolveMultipleQueriesNeeds ON [dbo].[FewDuplicates] (

[DupID],

[Metric01]

)

INCLUDE (

[MyText],

[Metric02],

[Metric03],

[Metric04],

[Metric05],

[Metric06]

)

WITH (fillfactor = 100, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

ON [PRIMARY]

Query Plans for Queries #1 – #6 with Composite Index Added

Once the combined index was implemented, each query was run as before: caches cleared and SSMS actual execution plan option on. No query ran longer than one second even when the caches had been purged. Although the queries were run individually for the purposes of obtaining accurate timings, they were also run together for the purpose of showing the similarities of the execution plans with the new index in place. Table 4 lists the execution results and clearly, performance improved greatly for all queries. For example, durations decreased from approximately 90 seconds to about one second and reads dropped from approximately one million to four thousand. Examination of the query plans shows that all six queries used the new index to obtain the observed extremely fast performance.

Blog_20170601_T4

Table 4: Summary of Query Executions & Timings After Composite Index Added

Blog_20170601_7

Summary

Most readers probably already knew the answer to the first question: key columns are used to filter query results. The simplest example of this involves a where clause. Columns referenced by equalities are placed first, followed by columns that are used with inequalities. Some of the missing index DMVs actually differentiate between the types of relationships involved. The example queries show that table column ordering dictates included column order thereby answering question #2. Since table column ordering is used to specify included column ordering, SQL Server WILL NOT duplicate missing index recommendations when the columns are identical regardless of their ordering in the queries. The answer to the last question is that SQL Server does not appear to perform any consolidations or optimizations other than those cited in the answer to question #2. This knowledge is invaluable, particularly when evaluating missing index recommendations on production systems because unless the queries that cause SQL Server to generate the missing index recommendations have been executed recently, there is little information to associate the actual queries with the recommendations. Understanding those factors that affect SQL Server’s recommendations can simplify causing query identification and give an analyst more certainty that the correct associations have been made.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!