Tuesday, December 15, 2009

SQL Server 2005 – How To Move 10 Million Rows In 1 Millisecond

This blog post is more of a tip that I picked up while at PASS 2009.  Have you ever needed to copy the contents of an entire table into another table?  Traditionally, we as developers will use a SELECT INTO or an INSERT INTO statement to load a destination table.  This is still a great way of accomplishing the task at hand, but it is not nearly as fast as what I am about to show you.  The method I am about to show you is not for all scenarios, but it can be very handy.  I do not know how many Oracle guys are reading this, but I have one question for you, “can your RDBMS move 10 million rows of data in <= 1 millisecond?”  I would be willing to bet that most will answer no, but I have to admit that this is not really a fair fight.  Why is this a little unfair… let’s have a look at how this works under the hood.

This method derives its power from the new partitioning functionality in SQL Server 2005.  If you have used partitioning in SQL Server 2005, you probably have a good idea where I am going with this.  If not, SQL Server 2005 has built-in functionality that allows a table to be split, or divided, into what I will call virtual tables, whose values depend on predefined boundaries. When the partitioning column is used in a query predicate, the optimizer knows which partition the data resides in, which makes queries more IO efficient.  This is amazing functionality because it does not require application changes and significantly reduces the amount of data SQL Server has to sift through. The partitioning feature I will be focusing on is the one that allows SQL Server to switch, or trade, partitions out, aptly named SWITCH.  This is commonly used in situations where you want to move data to a different partition, either because the boundaries have changed or because you need to phase data out.  The real benefit of using SWITCH is that SQL Server does not actually move the data; it updates the metadata pointers to the data.  Because no data actually moves, the operation is nearly instantaneous, regardless of the number of rows. This is why I said it is not fair, but hey, what in life is fair :^)
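To make the boundary idea concrete, here is a minimal partitioning sketch; the pfOrderDate function, psOrderDate scheme, and Orders table are hypothetical names of my own and are not part of the demo below.

```sql
--Hypothetical sketch: a partition function defines the boundary values,
--a partition scheme maps partitions to filegroups, and the table is
--created on the scheme so rows land in the right partition automatically.
CREATE PARTITION FUNCTION pfOrderDate (DATETIME)
AS RANGE RIGHT FOR VALUES ('20090101', '20090201');
GO

CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);
GO

CREATE TABLE [dbo].[Orders](
    OrderDate DATETIME NOT NULL,
    Amount MONEY
) ON psOrderDate (OrderDate);
GO
```

With this in place, a predicate like WHERE OrderDate >= '20090201' lets the optimizer touch only the partition that can contain matching rows.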

Okay let’s see an example.  I will start by creating a sample table.

USE [tempdb]
GO

IF EXISTS(SELECT 1 FROM sys.tables WHERE NAME = 'TestData')
BEGIN
    DROP TABLE [dbo].[TestData];
END
GO

CREATE TABLE [dbo].[TestData](
    RowNum INT PRIMARY KEY,
    SomeId INT,
    SomeCode CHAR(2)
);
GO

INSERT INTO [dbo].[TestData]
SELECT TOP 10000000 
    ROW_NUMBER() OVER (ORDER BY t1.NAME) AS RowNum,
    ABS(CHECKSUM(NEWID()))%2500+1 AS SomeId, 
    CHAR(ABS(CHECKSUM(NEWID()))%26+65)
    + CHAR(ABS(CHECKSUM(NEWID()))%26+65) AS SomeCode
FROM 
    master.dbo.syscolumns t1, --cross join to generate enough rows
    master.dbo.syscolumns t2
GO

IF EXISTS(SELECT 1 FROM sys.tables WHERE NAME = 'NewTestData')
BEGIN
    DROP TABLE [dbo].[NewTestData];
END
GO

--Create New Table To Move Data To
CREATE TABLE [dbo].[NewTestData](
    RowNum INT PRIMARY KEY,
    SomeId INT,
    SomeCode CHAR(2)
);
GO

Now the fun part… behold the power of SQL Server!

--Move data to the new table
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

ALTER TABLE [dbo].[TestData] SWITCH TO [dbo].[NewTestData];

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO

/*
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms.
*/

Next, I will verify the results.

SELECT COUNT(*) FROM [dbo].[TestData]; --0
SELECT COUNT(*) FROM [dbo].[NewTestData]; --10,000,000

/*
-----------
0

(1 row(s) affected)


-----------
10000000

(1 row(s) affected)
*/

There you have it.  I have successfully moved 10 million rows into a new table in 1 millisecond and, per the IO statistics, incurred no IO; some IO does have to occur to update the metadata, but it is minimal.  This method has limited use, but it can be extremely advantageous.  There are stipulations that have to be met for SWITCH, such as: the table you are switching to must be empty, both tables must have the same schema, and both must reside on the same filegroup.  For a comprehensive list of requirements, please refer to this link: http://technet.microsoft.com/en-us/library/ms191160.aspx
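The same metadata trick also works for phasing out a single partition of a partitioned table by number. A hedged sketch, assuming a hypothetical partitioned [dbo].[Orders] table and an empty [dbo].[OrdersArchive] table with an identical schema on the same filegroup:

```sql
--Hypothetical sketch: switch partition 1 (the oldest boundary) out of
--the partitioned table and into the empty archive table; as with the
--demo above, only metadata pointers change, so this is near-instant.
ALTER TABLE [dbo].[Orders] SWITCH PARTITION 1 TO [dbo].[OrdersArchive];
```

The archive table can then be dropped, truncated, or moved off at leisure, without ever touching the live table's data pages.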

Until next time, happy coding.

16 comments:

Michael J. Swart said...

That's awesome. I think I was at the same session at PASS. Here's my take on the same tip:

http://michaeljswart.com/?p=280

Cheers

Adam Haines said...

Interesting comparison, Michael. I had not really thought about comparing sp_rename and SWITCH. Thanks for the feedback.

Tharindu Dhaneenja said...

Hi Adam,
Wow, that was awesome, and thanks for summarizing it here. I’m a huge fan of SQL Server and Oracle, but I didn’t notice this until reading your blog. I have done the same thing many times in Oracle (same execution time: 1 ms).

----
/*Oracle PL/SQL */

ALTER TABLE testdata
EXCHANGE PARTITION tab_h1
WITH TABLE newtestdata;

----

Thanks,
Tharindu Dhaneenja

Adam Haines said...

Tharindu,

Thanks for the feedback. I figured Oracle had some variation of this functionality, but I was thinking it may be SQL Server specific. What version of Oracle is the syntax you posted compatible with?

Tharindu.Net said...

Hi Adam,
it's Oracle Database 10g Enterprise Edition Release 10.2.0.1.0

Thank you

blog2beton said...

I was going to say Oracle can too, but somebody else beat me to it ^_^

Partitioning is awesome.
I have an almost 2-billion-row table containing 1 month of data which we have to maintain.

Partitioning becomes a must.

In your case, you already got your data somewhere and all you need is a move.

I need to load data anywhere between 100 and 150 million rows a day.
The best way I've found is:
1) Load the data into a table identical to the master table, except with no indexes or primary key (that means you trust your source)
2) Create the primary key, indexes, and constraints
3) Switch into the master table

--

At the other end of the master table:

1) Create a table with the same structure
2) Switch into the temp table
3) Drop the temp table

The speed gain then comes from creating the indexes on only a small portion of the data.

The load itself goes into an empty table with no indexes.

Cleaning 150 million rows from the master table also becomes a breeze.

Worth mentioning: since it's a pointer switch and there's no data movement, the data has to be in the same filegroup.
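The load pattern above might be sketched like this (all object names and the partition number are hypothetical):

```sql
--1) Bulk load into a staging heap with no indexes or constraints.
CREATE TABLE [dbo].[StageData](
    RowNum INT,
    SomeId INT,
    SomeCode CHAR(2)
) ON [PRIMARY]; --same filegroup as the master table's target partition
--...BULK INSERT / bcp into [dbo].[StageData] here...

--2) Build the primary key (and any other indexes) on just this slice.
--A CHECK constraint matching the target partition's boundaries is
--also required before switching into a specific partition.
ALTER TABLE [dbo].[StageData]
ADD CONSTRAINT [PK_StageData] PRIMARY KEY (RowNum);

--3) Switch the staged slice into the appropriate master partition.
ALTER TABLE [dbo].[StageData] SWITCH TO [dbo].[MasterData] PARTITION 5;
```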

blog2beton said...

Also.

Thanks for the scheduling SP on the Microsoft forums. Highly helpful.

I'll have to go with something similar.

Ranjith said...

This is Great stuff. Thanks for sharing.

Unknown said...

Thanks much for this post. How do you think it would affect replication, if the table the data gets moved to is replicated?
Will truncating data and then applying the SWITCH statement have any negative effect on replication?

Anonymous said...

Hello,
can this be used with a SELECT and a WHERE clause?

i.e.

SELECT * FROM mytable
WHERE date = xxxxxx

Also, what are the table size implications?

What is the size of TestData vs. NewTestData?
