Thursday 21 May 2020

Getting started with SQL Server Query Tuning

This article will cover some essential techniques for SQL query tuning. Query tuning is a very wide  topic to talk about, but some essential techniques never change in order to tune queries in SQL Server. Particularly, it is a difficult issue for those who are a newbie to SQL query tuning or who are thinking about starting it. So this article will be a good starting point for them. Also, other readers can refresh their knowledge with this article. In the next part of this article we will mention these techniques that help to tune queries.

Avoid using the asterisk "*" sign in SELECT statement


We should use the asterisk (star) sign solely when we need to return all columns of the table. However, this usage type becomes a bad habit by the programmers, and they start to type their queries with the "Select *" statement. At various times, I have experienced that the "Select *" statement is used to retrieve one column of the multi column tables and worse than that developer is not aware of this problem. This usage approach causes more network and IO activity, so it affects the query performance negatively because of the more resource consumption. Now we will make a pretty simple test to find out the performance difference between "Select *" statement against "Select column_name1, column_name2, column_nameN" statement. Assume that, on a web application, we only need to show two columns, but we used asterisk sign in our query.

SSMS offers a very helpful tool that help to analyze and compare the executed queries performance metrics. This tool name is Client Statistics, and we will activate this option before executing the following two queries.

As we can see obviously, there is a dramatic difference between the received network measurements of these two select statements in the Network Statistics section. At the same time, the Time Statistics measurement comparison result does not change for the "Select *" statement. It shows poor performance than the "Select column_name1, column_name2, column_nameN" statement. In light of this information we can come to this outcome "As possible as we should not use the asterisk signs in the SELECT statements".

Don't use scalar-valued functions in the WHERE clause


A scalar-valued function takes some parameters and returns a single value after a certain calculation. The main reason for why scalar-valued functions affect performance negatively, the indexes cannot be used on this usage approach. When we analyze the following query execution plan, we will see an index scan operator because of the SUBSTRING function usage in the WHERE clause.
The index scan operator reads all index pages in order to find the proper records. However, this operator consumes more I/O and takes more time. As possible, we should avoid the index scan operator when we see it in our execution plans.

On the other hand, particularly for this query, we can improve their performance with a little touch. Now we will change this query as below so the query optimizer will decide to use another operator.
The index seek operator only reads the qualified rows because this operator effectively uses the indexes to retrieve the selective rows. At the same time, it shows great performance when against the index scan operator.

Tip: Scalar-valued functions are executed for each row of the result set, so we should consider the rows number of the result set when we use them. They can damage the query performance when trying to use for the queries which will return a huge number of rows. However, Microsoft has broken this traditional chain with SQL Server 2019 and made some performance improvements in the query optimizer for the scalar-valued functions and it has been generating more accurate execution plans if any query contains the scalar-valued function.

Use the covering indexes to improve the query performance


Covering indexes contains all referenced columns of the query, so they improve the selectivity of the index, and if any query uses this index it accesses the data more efficiently. However, before creating any covered index, we need to figure out cost-benefit analyses because any new index directly affects the performance of the insert statement. Now we will analyze the execution plan of the following query.
As we learned, the index scan operations is not shown good performance during the execution of the query. To overcome this problem, we will create the following index. The main characteristics of this index is that it covers all the columns of the query through the index key or included columns.
The execution plan of the query has started to use an index seek operator, and this operator shows better performance than the index scan operator.

Use the UNION ALL operator instead of the UNION operator

UNION ALL and UNION operators are used to combine two or more than two result sets of the select statements. However, the main difference between these two operators is the UNION operator eliminates the duplicate rows from the result set. In terms of the query tuning, UNION ALL operator perform better performance than UNION operator. As a result, if we don't consider the duplicate rows in the result set, we should use UNION ALL operator in our select statements. When we compare execution plans of the following queries, we will see a noticeable difference between these two execution plans. At first, we will compare the execution plans.
When we analyze the comparison of the execution plans, we can see that the Sort Operator adds extra cost to the select statement, which is using the UNION operator. As a final word about these two operators, if we don't consider the duplicate records, we should use UNION ALL operator to combine the result sets.

Use Actual Execution Plans instead of Estimated Execution Plans

The execution plan offers us to the visual presentation of the query processing steps. When we analyze any execution plan, we can clearly understand the road map of the query, and it is also a significant beginning point to tune a query. The estimated and actual execution plans are the two types of execution plans that can be used by us to analyze the queries. During the creation of the estimated execution plan, the query does not execute but generated by the query optimizer. Despite that, it does not contain any runtime metrics and warnings.

On the other hand, the actual execution plan contains more reliable information and measurements about the query. Actual Execution plans provide an advantage to tune the query performance.

Another approach can be combo usage for the execution plans for the queries which have longer execution times. At first, we can check the estimated execution plan and then re-analyze the actual execution plan.

Tip: If a query includes any temp table, the estimated execution plan cannot be generated. It will be given an error when we try to generate it. The following query will return an error when trying to generate an estimated execution plan.

Conclusion

In this article, we learned essential techniques to tune SQL queries. Performance tuning is a very complicated and struggling task to accomplish, but we can learn this concept from easy to difficult, and this article can be a good beginning point to start. 

SQL Server Query Execution Plans for beginners – Clustered Index Operators

In this article we will continue discussing the various execution plan operators related to clustered indexes and what they do, when do the...