Saturday, July 17, 2010

Order By Does Not Always Guarantee Sort Order

A week or so ago, I saw an interesting question on the MSDN SQL Server forums and I thought it would make a great blog post.  The forum question asked about an Order By clause that does not guarantee sort.  The query was really two queries merged together via a UNION.  The OP noticed that when UNION ALL was used the sort order was different than the same query using UNION, even though an ORDER BY clause was used.  If you are familiar with UNION and UNION ALL, you know that UNION has to perform a distinct sort and remove duplicates, while UNION ALL does not.  The query plan between the two queries is identical, other than a sort vs. a distinct sort.

Here is a small scale repro of the problem.

SET NOCOUNT ON;
GO

IF OBJECT_ID('tempdb.dbo.#t') IS NOT NULL
DROP TABLE #t;
GO

CREATE TABLE #t(id INT, col CHAR(1),col2 BIT);

INSERT INTO #t VALUES (1,'a',1)
INSERT INTO #t VALUES (1,'a',0)
GO

SELECT id, col, col2
FROM #t 

UNION ALL

SELECT id,col,col2
FROM #t 
ORDER BY id, col
GO

SELECT id,col,col2
FROM #t 

UNION

SELECT id,col,col2
FROM #t 
ORDER BY id, col
GO

/*
id          col  col2
----------- ---- -----
1           a    1
1           a    0
1           a    1
1           a    0


id          col  col2
----------- ---- -----
1           a    0
1           a    1
*/

As you can see that the order of col2 is not the same between the two queries. The root of this problem is I am using columns that contain duplicates in the ORDER BY clause and col2 is not included in the ORDER BY clause.  I can never guarantee the order of the “duplicate” rows because I cannot guarantee how the optimizer will build and execute the query plan.  In this example, the UNION query sorts by all columns in the select list, which includes col2, while the UNION query does not. You can guarantee that the order will be  id, col, but the col2 value order may vary between executions.  You will need to add col2 to the ORDER BY clause to guarantee the sort.

SELECT id, col, col2
FROM #t 

UNION ALL

SELECT id,col,col2
FROM #t 
ORDER BY id, col, col2
GO

SELECT id,col,col2
FROM #t 

UNION

SELECT id,col,col2
FROM #t 
ORDER BY id, col, col2
GO

/*
id          col  col2
----------- ---- -----
1           a    0
1           a    0
1           a    1
1           a    1


id          col  col2
----------- ---- -----
1           a    0
1           a    1
*/

I thought this was a good reminder to all that even with an ORDER BY clause specified, the order of the rows may not be what you expect.  You have to use an ORDER BY clause and make sure all the columns you want to sort by are listed in the ORDER BY.

Until next time happy coding.

No comments: