Master of Query Performance

Add Query Analyzer, SQL Profiler, Performance Monitor and good query-writing habits to your bag of tricks, and you may never have to worry about performance complaints again.

Performance optimization is an important part of any IT pro's career. It's simply not good enough that your servers remain available and your data backed up. Those are the prerequisites for a smoothly running environment. Just as important is understanding the performance issues related to your database servers. You should be able to pinpoint performance problems and find useful solutions.

If you've worked with SQL Server, you know that Microsoft has provided a great set of administrative tools. However, it's generally up to you — the database developer or administrator — to make the best of them. In this article, I'll discuss the sometimes-mysterious aspect of troubleshooting database-related performance.

The full topic of database performance optimization can fill (and has filled) many books, and it's outside the scope of this article to dive into every aspect of it. Instead, my goal is to provide a high-level overview of the tools and techniques at your disposal for measuring and optimizing SQL Server-related performance.

For further information and step-by-step details for various sections of this article, be sure to consult the extremely useful SQL Server 2000 Books Online (installed, by default, with SQL Server 2000). This article is intended for those who work with SQL Server and know the basics. I'll focus on SQL Server 2000, but the vast majority of information in this article will pertain to SQL Server 7.0, as well.

Understanding Database Performance Issues
Microsoft SQL Server, like all modern relational database systems, uses several techniques to ensure that queries are processed as quickly as possible. There are two main performance goals: response time (generally, the time it takes to return the first few records) and throughput (the time it takes to return the entire result set).

The specific requirements for real-world applications will vary. For example, if you're supporting an online e-commerce site, it might be critical for users to be able to view up-to-the-minute pricing and availability information for products. Should they choose to buy something, the "checkout" process must be quick. Contrast that scenario with a decision-support system. Suppose your executive management team wants to compare numbers for Q1 and Q2 of this year. In that case, accuracy of information may be the most important goal, and the extra time that it takes to generate the requested data will be well spent.

The first and, arguably, most important step in performance optimization is to understand both your application and the business problems it's trying to solve. You should know whether a 200-millisecond access time is reasonable and how long your users are willing to wait to receive the information they requested. Armed with this information, it's time to look at how you can optimize the performance of your own applications. Let's start by discussing the extremely important topic of SQL joins.

Optimizing SQL Joins
Although the simplest SQL queries might retrieve information from only one table, most real-world queries combine information from multiple tables to get what they need. For example, you might want a list of customers (stored in the Customer table) and all of the orders they've placed in the last two weeks (stored in the Orders table). The solution is to use SQL joins — something any database developer is familiar with. However, although joins are extremely important for getting the information your users need, they can be the Achilles' heel of an otherwise well-designed system.

When writing queries that use multiple tables, you must join all of the tables to each other to preserve the relationships between them. If you don't, you may get a result that's incorrect (or, worse yet, one that just seems correct) and takes a long time to process. A typical mistake is to hide the resulting duplicate rows with a SELECT DISTINCT clause, which masks the symptom while making the underlying problem worse. To illustrate this, consider the case of three tables:

  • Customer (350 rows)
  • Employee (50 rows)
  • Order (500 rows)

If you were to run a query such as the following:

SELECT e.FirstName, e.LastName
FROM Customer c, Employee e, [Order] o
WHERE e.FirstName = 'Anil'

Since you haven't explicitly joined any of the tables, the result is a Cartesian product. A Cartesian join (or "cross product") contains every possible combination of rows from the tables involved: in this case, 350 x 50 x 500 = 8,750,000 rows! If I added another table with just 10 rows, the product would grow to more than 87 million rows. Processing all of that eats up a lot of time and system resources (RAM, CPU and disk space). The correct way to write this query is:

SELECT e.FirstName, e.LastName
FROM Customer c, Employee e, [Order] o
WHERE e.FirstName = 'Anil'
AND c.ID = o.CustomerID
AND o.SalesPersonID = e.ID

Note that all of the tables are joined together in this example. The join conditions dramatically filter the result set during processing and allow SQL Server to return the results you requested quickly. In other words, users get their data fast, and SQL Server doesn't break a sweat!
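The same query can also be written in the ANSI join syntax, which many developers prefer because the join conditions live in the FROM clause and are harder to leave out by accident. A sketch using the tables and columns from the example above:

```sql
-- Equivalent query with explicit INNER JOINs; omitting an ON clause
-- here is a syntax error, so accidental Cartesian products are unlikely
SELECT e.FirstName, e.LastName
FROM Customer c
    INNER JOIN [Order] o ON c.ID = o.CustomerID
    INNER JOIN Employee e ON o.SalesPersonID = e.ID
WHERE e.FirstName = 'Anil'
```

Both forms produce the same execution plan; the ANSI form simply makes the relationships impossible to overlook.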

Measuring Server, Database and Query Performance
Since performance monitoring and optimization are important concerns in real-world applications, Microsoft has included several tools that can be used to measure database performance. In this section, we'll cover the most important ones included with SQL Server 2000. Keep in mind that each method of performance monitoring focuses on a different aspect of system performance (with some overlap). However, information from all of the tools can be useful in pinpointing performance problems.

Measuring Server Statistics with Performance Monitor
An old favorite of experienced Windows NT/2000 systems administrators is the Performance tool, which gives you a great overall collection of information about how your existing server resources are being used. Points of interest might include memory use, CPU use, network usage information and, of course, statistics related to SQL Server.

Figure 1 shows an example of some standard SQL Server 2000 performance counters and objects, many of which may be of interest to database administrators and developers. For example, you might want to start by viewing overall CPU and memory usage. Is the number of pages/second high? If so, perhaps you're running some memory-intensive queries on the server. This is often the case for large databases, which have queries that return large result sets. Or, if you're seeing long, sustained CPU spikes, it might be time to look at upgrading processors (or adding more of them, if your hardware supports SMP).

Figure 1. Viewing server performance information in the Windows 2000 Performance tool.

Fortunately, when you're adding a performance-related counter to the performance charts and reports, you can click on the Explain button to get some additional information about what you're measuring. And, if that's not enough, you can find additional details within SQL Books Online. Now that you have a method for collecting high-level performance data, let's drill down one step.

Monitoring Queries with SQL Profiler
Regardless of how much planning and testing you perform, the best test of any application is how it's used in the real world. The SQL Profiler tool was designed to provide you with just that type of information. You can monitor any or all queries that are executing on your database servers, including dozens of parameters for each operation executed by SQL Server. For example, you can measure how many CPU cycles and IO operations a specific query required. Or, you can find out which user called a long-running query. Furthermore, you can filter the results to those items of interest. Perhaps you know that Monica in Accounting has a tendency to write some extremely long-running ad-hoc queries. In that case, you can simply monitor the queries that she's running to have some hard evidence of this!

SQL Profiler was designed to be an easy and intuitive tool (both good requirements for any product that systems administrators are expected to use). To use SQL Profiler, you begin by designing a trace definition, which specifies exactly what you want to monitor. Figure 2 shows the options at your disposal for storing collected information. Note that the data can be stored in a trace file (generally faster, but more difficult to analyze) or in a database table (which allows much more powerful analysis through SQL queries, but requires more system resources). For ease of administration, you can develop trace templates for creating new traces in the future. If you choose to save your trace information, you'll also be able to take advantage of SQL Server's Index Tuning Wizard (described later in this article).
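Once trace data is sitting in a table, finding the worst offenders is a simple query away. A minimal sketch, assuming the trace was saved to a table named MyTrace (the table name is my own; TextData, Duration, CPU, Reads and Writes are standard Profiler data columns, with Duration measured in milliseconds in SQL Server 2000):

```sql
-- Top 10 slowest statements captured in the trace
SELECT TOP 10
    TextData,   -- the text of the query that ran
    Duration,   -- elapsed time, in milliseconds
    CPU,        -- CPU time consumed, in milliseconds
    Reads,      -- logical reads performed
    Writes      -- physical writes performed
FROM MyTrace
WHERE TextData IS NOT NULL
ORDER BY Duration DESC
```

This is exactly the kind of "more powerful analysis" that saving to a table buys you over a flat trace file.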

Figure 2. Viewing the General settings for a new Trace file.

As you might have already guessed, running Profiler does come at a cost. Although in some cases the performance effects of logging queries might be negligible, on larger systems the performance impact could be huge. Therefore, you want to make sure to collect only the data of interest. For example, suppose you want to isolate poorly performing queries. You can tell Profiler to record information about queries that take longer than 15 seconds to complete. Figure 3 shows the options you can enable to specify exactly what information is recorded.

Figure 3. Choosing filters to restrict the types of events that are recorded.

Once you've told Profiler what information you want to record, you're ready to begin the trace. Figure 4 shows a sample of the types of information you might choose to collect. In this case, I chose to record all queries that take more than a second to execute. I'm saving the text of the queries, how long they took to run and other server performance statistics for the operation.

Figure 4. Recording query information using SQL Profiler.

SQL Server Profiler also lets you replay the trace data you've collected. This can be useful in determining the effects of any changes you might have made. A typical scenario might begin when a user complains of slow performance. To isolate and troubleshoot the problem, you might decide to create a trace of the specific queries being run. Then you go into the system and make changes to some queries. In order to ensure that overall performance has improved, you can rerun the operations you recorded and measure the time it takes on the server.

Thus far, we've looked at server- and database-level monitoring. Now, let's drill down to the level of the SQL query itself.

Optimizing Query Logic using SQL Query Analyzer
Any developer who has worked with SQL Server 2000 is familiar with the Query Analyzer tool. Query Analyzer goes far beyond just color-coding queries. It allows for complex analysis of the steps that SQL Server has to go through when you ask it to run a SQL query. Figure 5 shows the various options available for viewing query-related information.

Figure 5. Viewing the available options in the Query menu of Query Analyzer.

Of particular interest is the option to Show Execution Plan. When this option is enabled, every time you run a query, Query Analyzer displays the steps SQL Server performed to complete it. Figure 6 provides an example for a somewhat complicated query. Note that you can hover your mouse over any part of the diagram to view detailed statistics about the operation that was performed.
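If you prefer a text version of the plan, or need to capture one from a script rather than the GUI, the same information is available through session settings. A quick sketch, reusing the Employee table from the earlier example (note that SHOWPLAN_TEXT must be the only statement in its batch, hence the GO separators):

```sql
-- Ask SQL Server to return the estimated plan as text
-- instead of actually executing the query
SET SHOWPLAN_TEXT ON
GO

SELECT e.FirstName, e.LastName
FROM Employee e
WHERE e.FirstName = 'Anil'
GO

SET SHOWPLAN_TEXT OFF
GO
```

The output lists each operator (Index Seek, Table Scan and so on) in the same tree you'd see graphically in Figure 6.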

Figure 6. Examining query execution plan information in Query Analyzer.

By using Query Analyzer, you can pinpoint which operations within a query are taking the most time and server resources. For example, you might have a large table that's missing an index, causing SQL Server to perform a table scan every time you join to it (trust me, that's generally a bad thing). In this case, Query Analyzer might show that 45 percent of the query's total time was spent performing the table scan. You can right-click on the offending operation and choose "Manage Indexes." A few well-placed mouse clicks, and you've indexed the table! Overall, Query Analyzer's features can save hours of headaches (and support calls from users who are tired of getting up for coffee every time they run a query).
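If you'd rather script the fix than click through a dialog, the same index can be created by hand. A sketch using the Order table from the join example earlier (the index name is my own convention):

```sql
-- Index the foreign-key column used by the join, so SQL Server
-- can seek into the index instead of scanning the whole table
CREATE INDEX IX_Order_CustomerID
ON [Order] (CustomerID)
```

Rerunning the query with Show Execution Plan enabled should now show an index seek where the table scan used to be.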

Dealing with Caching
When optimizing queries, it's important to understand how you can ensure reliable and consistent results. Let's discuss caching.

When you're testing queries, you might run something once and notice that it takes a long time. However, you run it again, and things look much better. Or you run the query on one system and it seems to be going great, but the same query takes forever on another machine. One reason for this is the way that SQL Server caches query information. Therefore, anytime you want to get consistent results on query performance, use the following two commands:

  • DBCC DROPCLEANBUFFERS: Clears all data pages cached in RAM by previous queries, so subsequent queries start with a cold cache.
  • DBCC FREEPROCCACHE: Clears the procedure cache, which may contain execution plans compiled during previous runs of your queries.
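Put together, a typical timing run on a test server might look like the following sketch. (Issuing CHECKPOINT first flushes dirty pages to disk so DBCC DROPCLEANBUFFERS leaves a truly cold cache; needless to say, don't run this against a production box.)

```sql
-- Flush dirty pages, then clear the data and plan caches
CHECKPOINT
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

-- Report compile/execution times and I/O for what follows
SET STATISTICS TIME ON
SET STATISTICS IO ON

-- The query under test (from the earlier join example)
SELECT e.FirstName, e.LastName
FROM Customer c
    INNER JOIN [Order] o ON c.ID = o.CustomerID
    INNER JOIN Employee e ON o.SalesPersonID = e.ID
WHERE e.FirstName = 'Anil'
```

Because every run starts from the same cold state, the elapsed times you measure are finally comparable from run to run and from machine to machine.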

Building Indexes with the Index Tuning Wizard
The efficient creation of optimal indexes is an art form that often results in large paychecks for DBAs. The presence of optimal indexes can dramatically improve query performance. However, it's possible to have too much of a good thing. Indexing comes at a cost (disk space and the resources required to build the index), so it's important that you choose wisely. The trick becomes choosing what indexes are most worthwhile. In the real world, this can be a huge undertaking for all but the smallest applications. Fortunately, SQL Server's Index Tuning Wizard can dramatically reduce the amount of effort it takes to find missing or desirable indexes.

The Index Tuning Wizard can take a Profiler trace file as input and analyze the queries within it to determine the optimal indexing solution. It can even consider the use of indexed views (SQL Server 2000 Enterprise Edition's implementation of materialized views). Figures 7 and 8 show the Index Tuning Wizard in action. You can access it from within Enterprise Manager, Query Analyzer or SQL Profiler. It just doesn't get much easier than this!

Figure 7. Using the Index Tuning Wizard to find index recommendations.

Figure 8. Viewing details about expected performance improvements based on suggestions from the Index Tuning Wizard.

Choosing the Right Tool
As you've just seen, each of these tools measures a specific aspect of overall database server performance. For example, you might use Performance Monitor to find out whether your server would benefit from a hardware upgrade (additional RAM or more CPUs, say). Suppose you notice that load is high on a database server that hosts many databases. How can you find out which databases are seeing the most load (so you can move them or target them for optimization)? SQL Profiler is the right tool for that job. You can record a sample set of queries during the day, store the results in a database table and then write your own SQL queries to pinpoint potential performance hotspots.

Finally, suppose you isolate the problem to a few misbehaving queries. Now it's time to put on your development hat and optimize performance at the code level. Query Analyzer offers many features that can assist in this otherwise unpleasant task, providing the information you need to modify the query for better performance. If you determine that the problem is index-related, you might want to run the query through the Index Tuning Wizard's wringer to see what shakes out. Once you get proficient with these steps, you'll find that you can isolate, measure and optimize performance issues very quickly (you're optimizing your own performance, so to speak).

The Bottom Line
If you've been paying any attention at all, you're probably already aware that SQL Server 2000 provides database administrators and developers with a wide array of performance monitoring and optimization tools. Although the tools are easy to use, it's up to you to understand how to make the best of them. Add Query Analyzer, SQL Profiler, Performance Monitor and good query-writing habits to your bag of tricks, and you may never have to worry about performance complaints again. Good luck and good speed!
