Monthly Archives: December 2017

AlwaysOn – Connection Handshake Failure

— by Ginger Keys

While doing some work on a test environment recently I realized the AlwaysOn Group was failing. (On a production system I would have had several Alerts set to notify me of any issues). The dashboard on my test instance showed that the secondary replica was not synchronizing and from clicking on the blue warning links it actually showed that the secondary was not connected:

Blog_20171221_1

Blog_20171221_2

Blog_20171221_3

I confirmed that both the primary and secondary instances were up and running, but the secondary Availability Replica and Databases were not connected to the AlwaysOn Group:

Blog_20171221_4

Problem

The AlwaysOn Health Events were consistent with the SQL Server Logs. I discovered that the Windows Server Failover Cluster had been down (but was back online), and that the AlwaysOn Group had failed over and back. But the culprit to my secondary replica and databases being disconnected was an endpoint issue. I found in my SQL Server logs the service account running AlwaysOn had for some reason lost its permissions to connect to the endpoint:

Blog_20171221_5

Message

Database Mirroring login attempt by user ‘Domain\user.’ failed with error: ‘Connection handshake failed. The login ‘Domain\user’ does not have CONNECT permission on the endpoint. State 84.’. [CLIENT: 10.0.0.0]

Solution

Thankfully the solution to this issue was pretty simple. Run the following script on both the primary and secondary instances to grant the service account running AlwaysOn connect permission:

GRANT CONNECT ON ENDPOINT::hadr_endpoint TO [Domain\user]

GO

This immediately corrected the problem, and the secondary replica and databases were connected to the AlwaysOn Group without any further action.

Unfortunately the cause of issues on a SQL Server are not always apparent after the fact, so that is why it is important to be proactive and set up robust alerts and notifications. For more information on setting up alerting and monitoring for your AlwaysOn Group click here https://blog.sqlrx.com/2015/08/27/alwayson-monitoring-and-alerting/ .

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

 

 

 

 

Rx for Demystifying Index Tuning Decisions – Part 8

— by Jeffry Schwartz

Review

Due to the holiday week, this part will be shorter than usual. In Parts 1 through 7 of this blog series the author discussed various aspects of SQL Server indices such as:

  • Types of indices and their variations, e.g., clustered, nonclustered, filtered, and covering
  • Physical versus logical index access methods, e.g., seeks, scans, key/RID lookups, singleton lookups, and range scans
  • Why index tuning should begin with queries
  • Determining an Appropriate Strategy for Index Tuning
  • Determining Queries that Need Indices the Most
  • Capturing query performance information using Extended Events
  • Using query plans to determine relationships between queries and indices
  • Query Optimizer
  • Statistics and how they affect index usage and query performance
  • Consequences of too many indices
  • Overlapping Indices
  • Overview of Dynamic Management Functions and a detailed discussion of the sys.dm_db_index_usage_stats DMV
  • Detailed discussion of the sys.dm_db_index_operational_stats DMF (and how to use it)

 

https://blog.sqlrx.com/2017/10/26/rx-for-demystifying-index-tuning-decisions-part-1/

https://blog.sqlrx.com/2017/11/02/rx-for-demystifying-index-tuning-decisions-part-2/

https://blog.sqlrx.com/2017/11/09/rx-for-demystifying-index-tuning-decisions-part-3/

https://blog.sqlrx.com/2017/11/16/rx-for-demystifying-index-tuning-decisions-part-4/

https://blog.sqlrx.com/2017/11/30/rx-for-demystifying-index-tuning-decisions-part-5/

https://blog.sqlrx.com/2017/11/30/rx-for-demystifying-index-tuning-decisions-part-6/

https://blog.sqlrx.com/2017/12/07/rx-for-demystifying-index-tuning-decisions-part-7/

 

Part 8 contains discusses how to determine missing indices and their estimated impact (if implemented) as well as incorporating missing index recommendations either into existing indices or into as few new indices as possible.

Missing Index Information

SQL Server monitors missing index warnings (same ones as shown in query plans) and records the estimates of the relative cost of not having the recommended index as queries execute. I/O plays a dominant role in these statistics because the expected I/O amounts affect estimated cost and improvement factors. Execution counts also affect these numbers. The impact shown in Figure 9 is quite high and indicates that the query could make good use of the recommended index if it were implemented. In this example, the key lookup accounts for all the work performed by the query and is no doubt the reason for the missing index recommendation.

Blog_20171214_1

Figure 9: Missing Index Example

 

Missing Index DM Views

Four views are used to derive the missing index statistics recommendations:

  • sys.dm_db_missing_index_columns
  • sys.dm_db_missing_index_details
  • sys.dm_db_missing_index_group_stats
  • sys.dm_db_missing_index_groups

These four DMV/DMFs are dependent upon the missing index warnings that would appear in SSMS query plans. Although Table 13 does not show the actual index recommendations, it does show the numerics involved. Improvement Measure and User Impact are frequently good indicators of potentially useful indices. The number of query executions, referred to as user seeks in the DMV, is also a good indicator of an index that would benefit multiple query executions. This metric can also be useful for locating specific queries that might need this index because the counts can sometimes limit the search candidates. The reader should note that the names of some tables are repeated multiple times, indicating that multiple index recommendations were suggested by SQL Server. Generally, improvement measures of one million or higher should be considered seriously. The formula for improvement measure is as follows: avg_total_user_cost * avg_user_impact * (user_seeks + user_scans). Note: User_scans are almost always zero. Although the 894 million value shown in Table 13 may seem very high, the values shown in Table 14 emphasize how bad things can get.

Blog_20171214_2

Table 13: Consolidated Missing Index DM Views Sample Report

Blog_20171214_2a

Table 14: Extremely High Missing Index Values

 

Table 15 summarizes the recommendations by table and shows another way this data can be viewed. The number of recommended indices for the jkl_abc.dbo.dly_st_mtc_smy, jkl_abc.dbo.hr_st_mtc_smy, and jkl_def.dbo.nglentUS2017 tables are instructive because they are 28, 52, and 51, respectively. Clearly, these cannot be implemented without creating significant amounts of overhead and repeating the excessive indexing problems cited earlier in this blog series. Clearly, as indicated by the Total Improvement Measure and the Recommended Index Count, the jkl_abc.dbo.dly_st_mtc_smy table needs a great deal of indexing tuning help. Resolving the jkl_def.dbo.KDS_spofsvc table’s indexing issues should be simpler to resolve because SQL Server only recommended three indices.

Blog_20171214_3

Table 15: Missing Index Improvement by Table

 

It is also important to realize that these recommendations are driven by the queries, so very often many similar recommendations will be provided by SQL Server since it does not compare existing indices with recommended ones. The recommendations often make extensive use of included columns even to the extent that sometimes almost all of the columns in the table are recommended for inclusion. SQL Server treats every index as if it were new, even if an existing index could be modified slightly to accommodate the recommendation. The duplication issue is shown clearly in Table 16. All of these recommendations are for the same table and only show those recommendations that began with CID as the first key. Most of these recommendations are not worth implementing given the low improvement measure, but the 274,170 one might be worth considering. The table also shows that 35 indices already exist on this table, so adding more indices is not advisable unless absolutely necessary. Even then, every opportunity to drop an existing index to make room for a new one should be exercised. Using a cross-tabular format like the one below makes comparing the various recommendations much easier. In addition, the reader should note the operator in parentheses (= or <). The equality ones make things much easier because as long as both CID and CO are present, the order is not as critical. Therefore, an existing index that had CO as the first key already might be useful if the CID key were added. Using the color-coded cross-tab format also makes it much easier to identify the included columns that are common to or different from those of the other missing index recommendations.

Blog_20171214_4

Table 16: Missing Index Cross-Tab Example

 

Some articles suggest that the included columns in missing index recommendations are in no particular order. As discussed in the article that is available at the following link, the included columns are produced in TABLE column order, regardless of the order specified in the query that generated the missing index recommendation: https://blog.sqlrx.com/2017/06/02/query-tuning-and-missing-index-recommendations. Two other important points should be made regarding these recommendations and the DMVs. The cited DMVs also provide the data for the Database Tuning Advisor, although Microsoft says the tool is more comprehensive and “much better.” It is often tempting to assume that any poorly performing query will generate some sort of warning, especially those that perform full scans. As cited previously, versions of SQL Server prior to SQL Server 2014 never seemed to generate any missing index recommendations for queries that performed full table or index scans. Under certain conditions, the new optimizer seems to generate missing index recommendations for some full scans. However, if the query makes extensive use of numerous nested loops, recommendations will often still not be produced by SQL Server. For further information regarding missing index DMVs, please consult the following links:

http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx and http://www.mssqltips.com/tip.asp?tip=1634.

Database Engine Tuning Advisor (DTA)

This tool replays a single, previously captured workload, and enables the analyst to determine how this workload might perform on a new server, with a new version of SQL Server, or using a revised index structure. Unfortunately, the analyst does not have a great deal of control over how the workload is replayed and it uses a single T-SQL script or trace table replay quite often. The analyst can choose to manually or automatically implement the recommendations. Since it uses the missing index DMVs, it often implements many of the indices with the highest factors without regard to existing indices, i.e., no consideration is given to whether functionality would be duplicated or whether any existing index could be adjusted to satisfy the query’s need. This often results in significant duplication of indices, i.e., the overlapping indices that were discussed earlier. Since it often favors covering indices, it frequently implements indices that duplicate large portions of the table in order to make the index a covering one. The new indices use the _DTA prefix to distinguish them from user-generated indices.

Alternatives to Database Engine Tuning Advisor

Several methods exist for replaying a workload. One method is to replay the workload using Profiler and reload the database after each test. Other methods must be used when the databases being tested are huge, therefore requiring many hours to reload. One method is to use a read-only database, but this method does not work as well as it once did. On earlier releases of SQL Server, a query that changed the database would execute up to the point at which the database change was attempted. However, attempting this method on SQL Server 2016 does not work because the query fails immediately. Another method is to run the inquiry-only queries in parallel while controlling thread levels, and run the update queries sequentially utilizing a begin tran/rollback tran for each query that changes the database. This causes the updates to take longer, not only because the queries are run sequentially, but also because work is required to roll the transaction out. Fortunately, the work performed by the query is recorded separately from the rollback portion, so before-and-after query performance comparison is still easy to do. A variation of the last method is to execute only the queries that work with specific tables. This allows the testing to be more surgical.

Conclusion

Index tuning can be difficult, but it definitely can be accomplished. It is essential to understand how queries and indices interact as well as how to read and understand Query Plans, Index Usage and Operational Stats metrics, and Missing Index recommendations and DMVs. Capturing performance data from SQL Server instance that has been running for as long as possible is critical to insure that index usage is fully understood. The longer the instance has been running, the more comprehensive these metrics will be. Using some form of SQL trace replay is the ONLY way to perform comprehensive index tuning because it is surprising how many times index usage will be counterintuitive. Therefore, the analyst MUST see it in action! Significant index redesign can often be completed with 2-5 weeks’ worth of effort, and the results can be quite dramatic as shown in this series.

This concludes the Rx for Demystifying Index Tuning Decisions blog series. Hopefully, the reader has found the series enlightening and useful. Please let us know if you have any questions or suggestions for additional articles. Best of luck with all your tuning efforts and please contact us with any questions.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

Rx for Demystifying Index Tuning Decisions – Part 7

— by Jeffry Schwartz

Review

In Parts 1 through 6 of this blog series the author discussed various aspects of SQL Server indices such as:

  • Types of indices and their variations, e.g., clustered, nonclustered, filtered, and covering
  • Physical versus logical index access methods, e.g., seeks, scans, key/RID lookups, singleton lookups, and range scans
  • Why index tuning should begin with queries
  • Determining an Appropriate Strategy for Index Tuning
  • Determining Queries that Need Indices the Most
  • Capturing query performance information using Extended Events
  • Using query plans to determine relationships between queries and indices
  • Query Optimizer
  • Statistics and how they affect index usage and query performance
  • Consequences of too many indices
  • Overlapping Indices
  • Overview of Dynamic Management Functions and a detailed discussion of the sys.dm_db_index_usage_stats DMV

https://blog.sqlrx.com/2017/10/26/rx-for-demystifying-index-tuning-decisions-part-1/

https://blog.sqlrx.com/2017/11/02/rx-for-demystifying-index-tuning-decisions-part-2/

https://blog.sqlrx.com/2017/11/09/rx-for-demystifying-index-tuning-decisions-part-3/

https://blog.sqlrx.com/2017/11/16/rx-for-demystifying-index-tuning-decisions-part-4/

https://blog.sqlrx.com/2017/11/30/rx-for-demystifying-index-tuning-decisions-part-5/

https://blog.sqlrx.com/2017/11/30/rx-for-demystifying-index-tuning-decisions-part-6/

 

Part 7 contains a detailed discussion of the sys.dm_db_index_operational_stats DMF (and how to use it).

sys.dm_db_index_operational_stats DMF

The sys.dm_db_index_operational_stats DMF requires four parameters and returns one row per index. It provides additional information regarding how the seeks and scans from the sys.dm_db_index_usage_stats DMV are actually implemented within SQL Server. For example, this DMF records how many range and table scans or single record retrievals were performed against an index or heap. Note: although heaps are not indices, their activity is recorded by this DMF using an index ID of zero. Other important information regarding leaf and non-leaf level page and row lock and latch wait times is also provided. Finally, page split information for both the leaf and non-leaf levels is provided via this DMF as well.

Four parameters are required:

{ database_id | NULL | 0 | DEFAULT } (use db_id() for current db)

{ object_id | NULL | 0 | DEFAULT }

{ index_id | NULL | -1 | DEFAULT }

{ partition_number | NULL | 0 | DEFAULT }

 

Use NULL parameters to obtain information for all available entities for a given level, e.g., databases or tables. The following command will return information for all databases, tables, indices, and partitions that are attached to a particular SQL Server instance: select * from sys.dm_db_index_operational_stats (NULL, NULL, NULL, NULL).

Combining the outputs of sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats enable the analyst to determine whether the majority of usage seeks actually resulted index range scans or single record lookups. This combination can also be used to identify the approximate percentage of heap or clustered single record accesses that resulted from RID/Key lookups because these are usually performed in loops of single record accesses. Another excellent source of information involves row, page, page latch, and page I/O latch wait times because these can point to specific tables and indices (and indirectly, keys) that are involved in application delays. Finally, invaluable information involving page splits is also available via this DMF. Note: all of this information is available for both leaf and non-leaf levels. The lists below are provided to aid the reader in addressing specific performance problems.

To analyze common table or index partition access pattern use

  • leaf_insert_count
  • leaf_delete_count
  • leaf_update_count
  • leaf_ghost_count
  • range_scan_count
  • singleton_lookup_count

To identify overhead & contention caused by latching and locking

  • row_lock_count and page_lock_count

These values Indicate how many times Database Engine tried to acquire row and page locks (overhead)

  • row_lock_wait_in_ms and page_lock_wait_in_ms

These values Indicate whether lock contention exists on index or heap, and significance of contention

  • page_latch_wait_count and page_latch_wait_in_ms

These values Indicate whether latch contention exists on index or heap, and significance of contention

To analyze statistics of physical I/Os on an index or heap partition

  • page_io_latch_wait_count and page_io_latch_wait_in_ms

These values Indicate how many physical I/Os were issued to bring index or heap pages into memory and how much waiting was involved

Table 12 demonstrates the kind of report that is available using this data. The highlighted SQL Server tables in this example show that the ix_ci_RowNumEvent index is used almost exclusively for single record lookups, whereas the ix_RowNumEvent index is only ranged scanned. ix_CETRowNum and ix_Checksum are used similarly. This kind of report can be extended easily to include various wait types as well.

Blog_20171207_1

Table 12: Index Operational Stats Summary

 

The next and final article in this blog series will cover determining missing indices and their estimated impact if implemented as well as incorporating missing index recommendations, either into existing indices or as few indices as possible. Until then…good luck with your tuning efforts and please contact us with any questions.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!