Baseline your servers and optimize your applications with Site24x7 SQL monitoring tool.
Maintaining up-to-date statistics is vital to ensure optimal SQL Server performance. SQL server statistics contain information about the distribution of data in a table or indexed view. The query optimizer uses this data to choose the best plan to execute a query.
Outdated statistics can lead to inaccurate query plans, which can, in turn, lead to inefficient queries. Therefore, it’s essential to ensure your statistics are up to date.
Let’s discuss how SQL Server uses statistics to optimize query execution plans. We’ll highlight how to update outdated statistics and explore best practices to help maintain your SQL Server statistics.
So, what are statistics in SQL Server? Statistics in SQL Server are binary large objects (BLOBs) stored in a particular internal structure called a statistics object. These objects contain information about the distribution of values in one or more columns of a table or indexed view. Here, the distribution of values refers to the pattern of values.
When you execute a query, SQL Server uses statistics to estimate the number of rows the query will return. Then, the SQL Server Query Optimizer uses this number to determine the most efficient execution plan for that query. Note that statistics aren’t used for all types of queries. For example, statistics aren’t used for queries that involve common table expressions (CTEs), table variables, or dynamic SQL queries.
Consider an Orders table with an OrderTotal column, which contains the cost value of an order. The statistics object for the OrderTotal column holds information about the range of order amounts, the number of distinct order amounts, and the frequency of each quantity. In this example, if there are more orders with small order totals than orders with large purchases, the distribution of values trends towards the smaller values.
This information is helpful for the SQL Server Query Optimizer to determine the most efficient way to execute queries. The SQL Server Query Optimizer analyzes the statistics to decide how to access and process the data in the table or indexed view. This approach significantly improves the query’s performance.
If a query selects all orders with an OrderTotal value greater than $100, SQL Server uses the column’s statistics to determine the best execution plan. Based on the distribution of the order totals, the SQL Server may choose an index scan or index seek method to execute this specific query.
When SQL Server estimates the query will return most rows in the table, the SQL Server Query Optimizer may use an index scan. The SQL Server reads all the data pages in the index and returns the rows that satisfy the query. In our example, these are the rows whose orderAmount value exceeds $100.
When few orders exceed $100, SQL Server may choose an index seek method to locate the specific rows matching the search criteria. This approach is more efficient than scanning the entire table.
Other SQL Server plans are nested loop join and hash join. SQL Server chooses the technique that executes fastest and uses the fewest resources.
As you just witnessed, the SQL Server Query Optimizer uses statistics to select the most efficient query execution plan. It relies on statistical information, such as the number of rows and distribution of data, to help the query optimizer to make informed decisions on which indexes and join algorithms to use when executing a query. When statistics aren’t updated and don’t reflect the current distribution of values, the SQL Server Query Optimizer may choose an execution plan that’s not optimal for the actual distribution.
For example, if the statistics suggest that there are only a few distinct values in a column when there are many, the optimizer chooses a plan that scans more data than necessary. This approach is detrimental to query performance because it causes the query to consume more central processing unit (CPU) and memory resources, slowing the execution times.
Statistics become outdated when the distribution of column values changes after operations like INSERT, UPDATE, and DELETE modify a table.
To identify outdated statistics in a SQL database, check the number of rows modified since the last update. If this number exceeds a certain threshold, such as 20%, consider them outdated. The 20% threshold isn’t fixed and may vary based on the table's size.
SQL server statistics reside in the database. You can use the DBCC SHOW_STATISTICS command to access them using the following syntax:
DBCC SHOW_STATISTICS (table_name, index_name)
For example, to view the statistics of the Orders table in a WideWorldImportersDW sample database, use the following command:
DBCC SHOW_STATISTICS ('WideWorldImporters.Sales.Orders', 'PK_Sales_Orders')
This command returns the statistics for the PK_Sales_Orders index.
You can also view the statistics for a specific table using the SQL Server Management Studio (SSMS). In SSMS Object Explorer, expand the Database folder to select the target database. Then, expand the Tables folder and choose the desired table. Next, expand the Statistics folder to view the statistics for various indexes. Finally, click on the sought-after statistics for a more detailed view.
Now you know how to view statistics and identify outdated ones. In the following section, you’ll learn how to update these statistics.
By default, the SQL Server Query Optimizer auto-updates the statistics. However, in some cases, you may need to manually update the statistics using the UPDATE STATISTICS command to improve query performance. This manual update gives you more control over the update process. However, it can be time-consuming, and Microsoft recommends that you refrain from updating frequently.
You can execute the UPDATE STATISTICS command using Transact-SQL to update a specific statistic object, all the statistics in a table, or all the statistics in a database.
To update a specific statistics object, specify the table and the target index. For example, the command below updates the statistics for the PK_Sales_Orders in the WideWorldImporters.Sales.Orders table:
UPDATE STATISTICS WideWorldImporters.Sales.Orders PK_Sales_Orders
Specify the target table to update all the statistics in a table. The following command updates all the statistics for the WideWorldImporters.Sales.Orders table:
UPDATE STATISTICS WideWorldImporters.Sales.Orders
You can also specify the number of rows the SQL Server should scan during the update. For example, using FULL SCAN scans all rows. The following command scans all the rows in the WideWorldImporters.Sales.Orders table:
UPDATE STATISTICS WideWorldImporters.Sales.Orders WITH FULL SCAN
Besides FULL SCAN, you can use SAMPLE to update the statistics based on a percentage. The WITH SAMPLE 50 PERCENT in the following command updates statistics based on a 50% sample of the rows:
UPDATE STATISTICS WideWorldImporters.Sales.Orders PK_Sales_Orders WITH SAMPLE 50 PERCENT;
Another option is RESAMPLE, which updates statistics for a specific table partition. For example, the WITH RESAMPLE ON PARTITIONS (1, 2, 3) option updates statistics for partitions 1, 2, and 3.
To update all the statistics in the entire database, use the sp_updatestats stored procedure. It updates all the tables and indexes in the specified database — even if only one row has changed. For example, executing the following command updates all the statistics in the WideWorldImporters table:
Using this command to update all table and index statistics can be resource intensive. Use it sparingly or, better yet, use the UPDATE STATISTICS command to update specific tables.
You can also use incremental updates to revise only the statistics of the modified partitions. You use fewer resources since the SQL Server doesn’t need to scan all the rows when updating the statistics.
Having up-to-date statistics is crucial for optimal database performance. Best practices for keeping statistics up to date include auto-updating statistics, regularly updating statistics, and monitoring performance metrics.
Enable the AUTO_UPDATE_STATISTICS option for SQL Server to auto-update statistics when a certain number of rows are modified. This SQL Server Query Optimizer executes the option synchronously. So, when you run a query and the table’s statistics are outdated, SQL Server updates the statistics before executing the query. Consequently, this query’s performance degrades.
To solve this performance issue, set AUTO_UPDATE_STATISTICS_ASYNC to ON. This way, the SQL Server Query Optimizer updates the statistics asynchronously.
Don’t just rely on SQL Server to auto-update statistics. Schedule regular statistics updates using SQL Server Agent jobs. Use the UPDATE STATISTICS command to update specific tables or indexes or the sp_updatestats command to update all tables.
Monitor query performance metrics such as query duration, CPU usage, and input/output (I/O) usage to determine whether statistics are outdated. If you notice a performance degradation, update statistics for the affected tables or indexes.
Schedule regular updates and monitor your database performance for degradation to identify when the statistics need updating. Also, enable AUTO_UPDATE_STATISTICS_ASYNC to update statistics asynchronously when the number of modified rows exceeds the set threshold. These regularly-updated statistics help SQL Query Optimizer choose query plans for optimal database performance.
Information is vital to keeping your SQL Server running at its best. Learn how Zoho Analytics helps visualize your SQL Server statistics to discover hidden insights for better performance.
Learn how to diagnose Azure SQL server performance problems. Troubleshoot Azure SQL performance issues following the best practices.➤
Learn how you can detect & resolve SQL deadlocks in SQL servers. Learn about the different types of SQL server deadlocks, how they happen & how to fix them.➤
Write for Site24x7 is a special writing program that supports writers who create content for Site24x7 “Learn” portal. Get paid for your writing.Apply Now