In my last post, I talked about covering the query and the join expression, with a single index, http://jahaines.blogspot.com/2010/01/optimizing-sql-server-joins.html. Covering the join expression allows the optimizer to seek the joining column and has the benefit of removing the very costly Key/RID lookup. In this post, I will be talking about optimizing join order.
Let’s start with a very simplistic definition of join order. Join order is the order in which the optimizer processes tables involved in the join clause. When tables are processed, the tables are placed in either the top or bottom input, which dictates how the data is searched and how many rows are affected. There is plenty of misinformation out there stating that the order you specify your joins in can directly impact query performance. While this is partially true, it is more the exception than the rule. This misinformation came about because the optimizer is a cost based optimizer. The optimizer has to determine the best plan in the shortest amount of time. The optimizer does not have all the time in the world to test each and every join order or permutation. Because the optimizer cannot evaluate every possible permutation (in some cases), it is possible for the optimizer to miss a more optimal join order, but this would only happen when a query has an exuberant number of joins. Like I said before this is not a regular occurrence. In almost all cases, the join order is of negligible concern. Let’s see this in action.
First I will create my sample tables and load them with data.
USE tempdb GO IF OBJECT_ID('tempdb.dbo.Customers') IS NOT NULL BEGIN DROP TABLE dbo.Customers; END GO CREATE TABLE dbo.Customers( CustId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, FName VARCHAR(50), LName VARCHAR(50) ); GO ;WITH L0 AS (SELECT 1 AS C UNION ALL SELECT 1) --2 rows ,L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B) --4 rows (2x2) ,L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B) --16 rows (4x4) ,L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B) --256 rows (16x16) ,L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B) --65536 rows (256x256) ,L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B) --4,294,967,296 rows (65536x65536) ,Nums AS (SELECT row_number() OVER (ORDER BY (SELECT 0)) AS N FROM L5) INSERT dbo.Customers(FName,LName) SELECT 'FName' + CAST(N AS VARCHAR(10)), 'LName' + CAST(N AS VARCHAR(10)) FROM Nums WHERE N<=1000; GO IF OBJECT_ID('tempdb.dbo.Orders') IS NOT NULL BEGIN DROP TABLE dbo.Orders; END GO CREATE TABLE dbo.Orders( OrderId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, CustId INT ); GO ;WITH L0 AS (SELECT 1 AS C UNION ALL SELECT 1) --2 rows ,L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B) --4 rows (2x2) ,L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B) --16 rows (4x4) ,L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B) --256 rows (16x16) ,L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B) --65536 rows (256x256) ,L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B) --4,294,967,296 rows (65536x65536) ,Nums AS (SELECT row_number() OVER (ORDER BY (SELECT 0)) AS N FROM L5) INSERT dbo.Orders(CustId) SELECT ABS(CHECKSUM(NEWID())%250+65) FROM Nums WHERE N<=100; GO CREATE NONCLUSTERED INDEX ncl_idx_CustId ON dbo.Orders(CustId); GO
SET STATISTICS IO ON; SET STATISTICS PROFILE ON; GO SELECT o.CustId,c.FName,c.LName FROM dbo.Orders o INNER JOIN dbo.Customers c ON c.CustId = o.CustId WHERE c.LName = 'LName151' SET STATISTICS IO OFF; SET STATISTICS PROFILE OFF; GO /* Table 'Orders'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Customers'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SELECT o.CustId,c.FName,c.LName FROM dbo.Orders o INNER JOIN dbo.Customers c ON c.CustId = o.CustId WHERE c.LName = 'LName151' |--Nested Loops(Inner Join, OUTER REFERENCES:([c].[CustId])) |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[Customers].[PK__Customers__7E6CC920] AS [c]), WHERE:([tempdb].[dbo].[Customers].[LName] as [c].[LName]='LName151')) |--Index Seek(OBJECT:([tempdb].[dbo].[Orders].[ncl_idx_CustId] AS [o]), SEEK:([o].[CustId]=[tempdb].[dbo].[Customers].[CustId] as [c].[CustId]) ORDERED FORWARD) */ GO
SET FORCEPLAN ON; SET STATISTICS IO ON; SET STATISTICS PROFILE ON; GO SELECT o.CustId,c.FName,c.LName FROM dbo.Orders o INNER JOIN dbo.Customers c ON c.CustId = o.CustId WHERE c.LName = 'LName151' SET FORCEPLAN OFF; SET STATISTICS IO OFF; SET STATISTICS PROFILE OFF; GO /* Table 'Customers'. Scan count 0, logical reads 200, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Orders'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SELECT o.CustId,c.FName,c.LName FROM dbo.Orders o INNER JOIN dbo.Customers c ON c.CustId = o.CustId WHERE c.LName = 'LName151' |--Nested Loops(Inner Join, OUTER REFERENCES:([o].[CustId])) |--Index Scan(OBJECT:([tempdb].[dbo].[Orders].[ncl_idx_CustId] AS [o])) |--Clustered Index Seek(OBJECT:([tempdb].[dbo].[Customers].[PK__Customers__7E6CC920] AS [c]), SEEK:([c].[CustId]=[tempdb].[dbo].[Orders].[CustId] as [o].[CustId]), WHERE:([tempdb].[dbo].[Customers].[LName] as [c].[LName]='LName151') ORDERED FORWARD) */
Until next time happy coding.
5 comments:
In FF i couldn't see a lot of the details in the coder until i highlighted it.
Hi Brian,
I am using a loner laptop and had to get a different code editor, which really did not work out well. It seems you have to highlight the code to view the underlying data.
I have since modified the code editor, are you still experiencing the problem?
Nope.
Thanx Adam. Keep up the good work. You're blog is excellent.
Thanks for the review! Also you may turn your attention on outsourced outsource software development.
We're glad to read about this. I'm happy to share with cheap quotes on insurance that is provided by common insurance companies. You may save on instant auto insurance rates which allow customers to get cheap policy.
Post a Comment