tag:blogger.com,1999:blog-46461374383666878952024-03-14T10:10:09.769-07:00Demystifying SQL ServerA blog for SQL Server best practices, performance considerations, advanced TSQL techniques and handy tips and tricks.Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.comBlogger62125tag:blogger.com,1999:blog-4646137438366687895.post-68540944587510300952011-08-13T17:45:00.001-07:002011-08-16T16:17:49.114-07:00Book Review–Microsoft SQL Server 2008 R2 Administration Cookbook<p> </p> <h4>Brief Review:</h4> <p><a href="https://www.packtpub.com/microsoft-sql-server-2008-r2-administration-cookbook/book#in_detail" target="_blank">Microsoft SQL Server 2008 R2 Administration Cookbook</a> is a collection of recipes that help database administrators with day-to-day tasks.  The recipes from this book have a broad spectrum of topics, including  Piecemeal Restores, Optimistic Concurrency, Data-Tier Applications, Master Data Services, Replication, Multi-Server Management, Utility Control Point etc..  
This book is a great reference on how to accomplish daily administrative tasks and challenges.</p> <h4>Detailed Review:</h4> <p><a href="https://www.packtpub.com/microsoft-sql-server-2008-r2-administration-cookbook/book#in_detail"><img style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: right; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="Microsoft%20SQL%20Server%202008%20R2%20Administration%20Cookbook" border="0" alt="Microsoft%20SQL%20Server%202008%20R2%20Administration%20Cookbook" align="right" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaiUd96f_NrwgaCCuuL-iDKxzjiWfW7jjWzGBBA7TX6l3Ml68tOv45OiMhsJcWBE-nKWIBjx15LCJwVSKhwCUEAG1zx9k-CssTUnvWK54aYhlCrR-99puqh_o_4MqSj5s50PMpS0nZGw/?imgmax=800" width="250" height="312" /></a>There are few books available that can keep your technical curiosity engaged, while presenting the information in a easy to understand manner.  Microsoft SQL Server 2008 R2 is one such book.  This book delivers quality content in a simplistic manner that will benefit all database administrators.  My favorite thing about this book is that it delivers real world solutions to real world problems.  In today’s IT environment, the database administrator can be a seasoned IT professional with 10 to 20 years of database experience, or what some people call an accidental DBA.  An accidental DBA is a database administrator that assumes the role of database administrator, either through necessity or circumstance, and usually has little to no real world experience as a database administrator.  This book caters to both the seasoned and the accidental database administrator.  </p> <p>As the name implies, Microsoft SQL Server 2008 R2 Administration Cookbook focusing on the administrative side of SQL Server.  This book contains over 70 quality recipes that database administrators can use in their day-to-day duties.  
The great thing about this book is the diversity of the recipes.  The author does a great job of covering SQL Server administration from a lot of different angles using different emerging technologies.  The recipes include Resource Governor, Multi-Server Administration, Business Intelligence, High Availability and more.  There is an enormous amount of information spanning a broad spectrum of topics. Without sounding cliché, there is something for everybody in this book.</p> <p>The author’s writing style is clear and concise.  You do not have to be a Microsoft Certified Master to understand the fundamental concepts presented in each recipe.  Another thing I like about this book is the code samples.  I cannot tell you how many times I have cracked open a book and had to analyze and reread a code sample to understand what it is doing.  This book does not have that problem. The code samples are very easy to understand and follow.  By keeping the samples simple, the author delivers a better experience to the reader. </p> <p>My favorite chapters in this book are Chapter 7 – Managing The Core Database Engine and Chapter 5 – Managing Core SQL Server 2008 R2 Technologies, because these chapters contain a lot of useful information that a lot of database administrators do not know about, including Utility Control Point, SQL Azure, StreamInsight, and Master Data Services.  </p> <h4>The Verdict: 5/5</h4> <p>I recommend <a href="https://www.packtpub.com/microsoft-sql-server-2008-r2-administration-cookbook/book#in_detail" target="_blank">Microsoft SQL Server 2008 R2 Administration Cookbook</a> to any IT Professional who is currently working as a database administrator, or who wants to get into database administration, with SQL Server 2008 R2.  
This book is a great administrative reference that you will want to keep close to your desk, regardless of experience level.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com13tag:blogger.com,1999:blog-4646137438366687895.post-84607498708160481682011-03-15T19:41:00.001-07:002011-03-15T19:41:30.703-07:00TSQL Challenge Quiz: Win an IPAD And Bragging Rights!<p>First and foremost, sorry for the long delay between posts, I have recently switched jobs and my laptop crashed.  I am just now getting back into a steady routine and will be posting more regularly in the coming weeks, so stayed tuned.  So now that I have the formalities out of the way….. How does a FREE IPAD sound?</p> <p>TSQLChallenges.com is  currently running TSQL Quiz 2011.  TSQL Quiz 2011 will be running a TSQL SQL Server question each day in March 2011.  Each question is orchestrated by SQL Server experts and community leaders to address SQL Server problem areas and/or best practices.  TSQL Quiz is a great opportunity to interact with fellow database professionals, strengthen your SQL Server knowledge, and most importantly win an IPAD, compliments of Red Gate Software ! Even if you feel like you are not online enough to compete, do not fret because the questions remain open for 30 days. You really do not have anything to lose.</p> <p>If you are interested in participating in TSQL Quiz 2011, you can start by clicking <a href="http://beyondrelational.com/quiz/SQLServer/TSQL/2011/default.aspx">http://beyondrelational.com/quiz/SQLServer/TSQL/2011/default.aspx</a>.  
All the information you need is provided on the site.</p> <p>If you want to help TSQL Challenges by becoming a Quiz Master, you can click here, <a title="http://beyondrelational.com/quiz/nominations/0/new.aspx" href="http://beyondrelational.com/quiz/nominations/0/new.aspx">http://beyondrelational.com/quiz/nominations/0/new.aspx</a></p> <p><a href="http://beyondrelational.com/quiz/SQLServer/TSQL/2011/default.aspx"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="BeyondRelational_TSQL_Quiz" border="0" alt="BeyondRelational_TSQL_Quiz" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKr43pZATQSolbj9BMLuhvQOz4CxoKFsNTVsMz8bokTEkPV1O2gvL4az9LL9DIa8jUGhuR0VpUbWikX3V8JbYQyKdDN0gw2lO6xacrWJJ1udFp-V6bfhlBwDko7PlavELASbm2nLSPmA/?imgmax=800" width="108" height="110" /></a></p> <p>Good luck and happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com3tag:blogger.com,1999:blog-4646137438366687895.post-73352349348239663272010-08-19T19:16:00.001-07:002010-08-19T19:21:57.649-07:00SQL Server Parameter Sniffing<div class="wlWriterHeaderFooter" style="float:right; margin:0px; padding:0px 0px 4px 8px;"><script type="text/javascript">digg_url = "http://jahaines.blogspot.com/2010/08/sql-server-parameter-sniffing.html";digg_title = "SQL Server Parameter Sniffing";digg_bgcolor = "#FFFFFF";digg_skin = "normal";</script><script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script><script type="text/javascript">digg_url = undefined;digg_title = undefined;digg_bgcolor = undefined;digg_skin = undefined;</script></div><p>Today on the MSDN TSQL forums I was asked about a performance problem and to me the problem seemed to be directly related to parameter sniffing.  The poster then stated that he is not using stored procedures, so it cannot be a parameter sniffing .  
Truth be told, there are a lot of misconceptions surrounding parameter sniffing.  The best way to understand parameter sniffing is to understand why it happens.  </p> <p>Parameter sniffing occurs when a parameterized query reuses a cached plan whose cardinality estimates were compiled for a different set of parameter values.  The problem occurs when the first execution has atypical parameter values.  For each subsequent execution the optimizer is going to assume the estimates are good even though the estimates may be way off.  For example, say you have a stored procedure that returns all id values between 1 and 1000.  If the stored procedure is first executed with this large range of parameter values, the optimizer compiles and caches a plan built for these atypical values, which causes it to overestimate cardinality on later executions.  The problem is that a typical execution may only return a few rows.  This “sniffing” can cause queries to scan a table as opposed to performing a seek, because the optimizer is working from inaccurate cardinality estimates.  The easiest way to tell if this problem is occurring in your environment is to look at the query plan XML. Inside the query plan XML, you will see something similar to the code snippet below:</p> <pre class="csharpcode"><ColumnReference <span class="kwrd">Column</span>="@1" ParameterCompiledValue="(1000)" ParameterRuntimeValue="(10)" />
<ColumnReference <span class="kwrd">Column</span>="@0" ParameterCompiledValue="(1)" ParameterRuntimeValue="(1)" /></pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>In the snippet above, the query plan was compiled assuming that @1 has a value of 1000 and @0 has a value of 1, while the actual runtime values are 10 and 1 respectively.  </p>
<p>There are three different methods to incorporate parameterization in SQL Server: auto/simple parameterization, stored procedures, and dynamic TSQL (executed with sp_executesql).  One of the most common misconceptions I have seen surrounding parameter sniffing is thinking that it is limited to stored procedures.  Any of these three methods can produce a cached plan that is reused with different parameter values.  Now that we know more about parameter sniffing, let’s have a look at an example.  I will be using the AdventureWorks database for my example.  In this example, I will select a few rows from the Sales.SalesOrderHeader table and then issue the same query, but return a lot more rows.</p>
<p>Code:</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> Sales.SalesOrderHeader <span class="kwrd">WHERE</span> CustomerID <span class="kwrd">between</span> 1 <span class="kwrd">and</span> 10
<span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> Sales.SalesOrderHeader <span class="kwrd">WHERE</span> CustomerID <span class="kwrd">between</span> 1 <span class="kwrd">and</span> 500</pre>
<p>Query Plan:</p>
<p><a href="http://lh4.ggpht.com/_ayZBUzPGG9A/TG3laERqgdI/AAAAAAAAAbs/XkH-BIazCjw/s1600-h/image%5B3%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/TG3la1rH5VI/AAAAAAAAAbw/MpjuPu6Ac_w/image_thumb%5B1%5D.png?imgmax=800" width="602" height="305" /></a> </p>
<p>As you can see, the query plan changes based on the number of rows returned.  In this case, the optimizer hit a tipping point where the estimated cost of the key lookups is greater than the cost of an index scan.  Let’s see what happens when a parameter sniffing problem occurs.</p>
<pre class="csharpcode"><span class="kwrd">DBCC</span> freeproccache
<span class="kwrd">GO</span>
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=500
<span class="kwrd">GO</span>
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=10
GO</pre>
<p>Query Plans:</p>
<p><a href="http://lh3.ggpht.com/_ayZBUzPGG9A/TG3lbS7wE8I/AAAAAAAAAb0/L9P6uwhY4gE/s1600-h/image%5B7%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/TG3lbyXYpTI/AAAAAAAAAb4/QCubbFEGGHE/image_thumb%5B3%5D.png?imgmax=800" width="652" height="286" /></a> </p>
<p>The execution plans are identical for both queries even though the number of rows greatly decreased.  This is a parameter sniffing problem. This problem occurs because we executed and cached the atypical execution first, the one that returns CustomerID values between 1 and 500.  We can look into the execution plan to see the compiled parameter values, and we can compare the estimated rows against the actual rows to validate.</p>
<pre class="csharpcode"><ColumnReference <span class="kwrd">Column</span>="@<span class="kwrd">End</span>" ParameterCompiledValue="(500)" ParameterRuntimeValue="(10)" />
<ColumnReference <span class="kwrd">Column</span>="@<span class="kwrd">Start</span>" ParameterCompiledValue="(1)" ParameterRuntimeValue="(1)" /></pre>
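<p>If you want to check this yourself, the sketch below shows two ways to get at the plan XML.  It assumes the AdventureWorks database and the sp_executesql statement used above; the LIKE filters are just my way of narrowing down the plan cache and are not part of the original example.  SET STATISTICS XML returns the actual plan, which carries both ParameterCompiledValue and ParameterRuntimeValue, while a plan pulled from the cache DMVs should only show ParameterCompiledValue because it is not tied to a specific execution.  Querying the DMVs requires the VIEW SERVER STATE permission.</p>
<pre class="csharpcode">--Actual plan for a single execution (compiled and runtime values)
SET STATISTICS XML ON;
GO
EXEC sp_executesql N'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;',N'@Start INT,@End INT',@Start=1,@End=10
GO
SET STATISTICS XML OFF;
GO
--Cached plan for the same statement (compiled values only)
--The second LIKE filter keeps this query from returning itself
SELECT cp.usecounts, cp.objtype, st.text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE st.text LIKE N'%Sales.SalesOrderHeader%'
AND st.text NOT LIKE N'%dm_exec_cached_plans%';
GO</pre>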
<p>What can you do to solve the parameter sniffing problem?  You have a few options.  You can use local variables, which makes the optimizer estimate cardinality from density information or a fixed guess instead of the sniffed values; you can use OPTION (RECOMPILE), which compiles a fresh plan on every execution; or you can use the OPTIMIZE FOR hint.  </p>
<pre class="csharpcode">--<span class="kwrd">Declare</span> <span class="kwrd">local</span> variables
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'declare @dynStart INT,@dynEnd INT; SET @dynStart=@Start; SET @dynEnd=@End;SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @dynStart and @dynEnd;'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=500
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'declare @dynStart INT,@dynEnd INT; SET @dynStart=@Start; SET @dynEnd=@End;SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @dynStart and @dynEnd;'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=10
--Solution <span class="kwrd">Using</span> <span class="kwrd">option</span>(recompile)
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(RECOMPILE);'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=500
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(RECOMPILE);'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=10
--Solution <span class="kwrd">Using</span> OPTIMIZE <span class="kwrd">FOR</span> HINT
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start=1,@End=10));'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=500
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start=1,@End=10));'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=10
--Solution <span class="kwrd">Using</span> OPTIMIZE <span class="kwrd">FOR</span> <span class="kwrd">UNKNOWN (SQL 2008 and later)</span>
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start UNKNOWN,@End UNKNOWN));'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=500
<span class="kwrd">EXEC</span> sp_executesql N<span class="str">'SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End OPTION(OPTIMIZE FOR (@Start UNKNOWN,@End UNKNOWN));'</span>,N<span class="str">'@Start INT,@End INT'</span>,@<span class="kwrd">Start</span>=1,@<span class="kwrd">End</span>=10</pre>
<p>Now each of the above methods will help alleviate some of the problems associated with parameter sniffing, but that does not mean it will give you an optimal query plan.  You should test each of the methods to see which makes the most sense for your environment.  If none of these options perform well, another option is to use control flow logic to execute different variations of the TSQL or stored procedure, allowing for more control over which execution plan gets used.  The thing to remember here is that you have to cater to your customers and their usage patterns to ultimately decide which solution is best for your environment.</p>
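<p>To illustrate the control flow idea, here is a minimal sketch, again assuming the AdventureWorks database.  The cutoff of 50 customers and the comment tags inside the strings are made up for this example; you would pick a cutoff based on your own data distribution.  Because the two statement texts differ, each branch should get its own cached plan, and each plan is only sniffed with the kind of values that branch receives.</p>
<pre class="csharpcode">DECLARE @Start INT, @End INT, @sql NVARCHAR(MAX)
SET @Start = 1
SET @End = 10

--Hypothetical cutoff: ranges of up to 50 customers are treated as "narrow"
IF @End - @Start BETWEEN 0 AND 50
    SET @sql = N'/*narrow range*/ SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;'
ELSE
    SET @sql = N'/*wide range*/ SELECT * FROM Sales.SalesOrderHeader WHERE CustomerID between @Start and @End;'

--Each statement text is cached separately, so each branch keeps its own plan
EXEC sp_executesql @sql,N'@Start INT,@End INT',@Start=@Start,@End=@End
GO</pre>
<p>The same pattern works with two stored procedures instead of two strings; the point is only that the narrow and wide executions never share a plan.</p>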
<p>Until next time happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com12tag:blogger.com,1999:blog-4646137438366687895.post-77941564873709419912010-08-07T13:59:00.001-07:002010-08-07T13:59:44.508-07:00TSQL Challenge 35 Available<div class="wlWriterHeaderFooter" style="float:right; margin:0px; padding:0px 0px 4px 8px;"><script type="text/javascript">digg_url = "http://jahaines.blogspot.com/2010/08/tsql-challenge-35-available.html";digg_title = "TSQL Challenge 35 Available";digg_bgcolor = "#FFFFFF";digg_skin = "normal";</script><script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script><script type="text/javascript">digg_url = undefined;digg_title = undefined;digg_bgcolor = undefined;digg_skin = undefined;</script></div><p><a href="http://beyondrelational.com/tc/" target="_blank">TSQLChallenges.com</a> recently released challenge 35, <a href="http://beyondrelational.com/blogs/tc/archive/2010/07/26/tsql-challenge-35-find-the-total-number-of-full-attendees-in-each-24-hop-session.aspx" target="_blank">“Find the total number of 'Full Attendees' in each 24 HOP Session”</a>.  For those not familiar with TSQL Challenges,  TSQL Challenges is a website that creates and evaluates SQL Server puzzles each and every week.  The goal of TSQL Challenges is to increase TSQL best practice awareness and to showcase solutions to common and sometimes uncommon TSQL problems, using set based programming logic. Not only do you compete in challenges, but more importantly TSQL Challenges gives you the opportunity to interact with your peers.  Essentially it is a mechanism to give back to and learn from the SQL Server community.  If you haven’t had a chance to stop by and checkout TSQL Challenges, I highly recommend you do so, <a href="http://beyondrelational.com/tc/" target="_blank">TSQLChallenges.com</a>.</p> <p>So…. What is the challenge?  
The challenge should you choose to accept it is to count the number of attendees that fully watched each session at <a href="http://www.sqlpass.org/24hours/2010/default.aspx" target="_blank">24 hours of PASS</a>.  Note:  this data is artificial and does not reflect real 24 hours of PASS metrics.  If you love puzzles, TSQL, and PASS this challenge is for you.  </p> <p>Good luck and happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com2tag:blogger.com,1999:blog-4646137438366687895.post-37952925642834889892010-07-17T18:08:00.001-07:002010-07-17T18:08:50.925-07:00Order By Does Not Always Guarantee Sort Order<div class="wlWriterHeaderFooter" style="float:right; margin:0px; padding:0px 0px 4px 8px;"><script type="text/javascript">digg_url = "http://jahaines.blogspot.com/2010/07/order-by-does-not-always-guarantee-sort.html";digg_title = "Order By Does Not Always Guarantee Sort Order";digg_bgcolor = "#FFFFFF";digg_skin = "normal";</script><script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script><script type="text/javascript">digg_url = undefined;digg_title = undefined;digg_bgcolor = undefined;digg_skin = undefined;</script></div><p>A week or so ago, I saw an interesting question on the MSDN SQL Server forums and I thought it would make a great blog post.  The forum question asked about an Order By clause that does not guarantee sort.  The query was really two queries merged together via a UNION.  The OP noticed that when UNION ALL was used the sort order was different than the same query using UNION, even though an ORDER BY clause was used.  If you are familiar with UNION and UNION ALL, you know that UNION has to perform a distinct sort and remove duplicates, while UNION ALL does not.  The query plan between the two queries is identical, other than a sort vs. 
a distinct sort.</p> <p>Here is a small scale repro of the problem.</p> <pre class="csharpcode"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.#t'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> #t;
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> #t(id <span class="kwrd">INT</span>, col <span class="kwrd">CHAR</span>(1),col2 <span class="kwrd">BIT</span>);
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (1,<span class="str">'a'</span>,1)
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (1,<span class="str">'a'</span>,0)
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> id, col, col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> id,col,col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> id, col
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> id,col,col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">UNION</span>
<span class="kwrd">SELECT</span> id,col,col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> id, col
<span class="kwrd">GO</span>
/*
id col col2
---------<span class="rem">-- ---- -----</span>
1 a 1
1 a 0
1 a 1
1 a 0
id col col2
---------<span class="rem">-- ---- -----</span>
1 a 0
1 a 1
*/</pre>
<p>As you can see, the order of col2 is not the same between the two queries. The root of this problem is that I am ordering by columns that contain duplicates, and col2 is not included in the ORDER BY clause.  I can never guarantee the order of the “duplicate” rows because I cannot guarantee how the optimizer will build and execute the query plan.  In this example, the UNION query performs a distinct sort on all columns in the select list, which includes col2, while the UNION ALL query sorts only by id and col. You can guarantee that the order will be  id, col, but the col2 value order may vary between executions.  You will need to add col2 to the ORDER BY clause to guarantee the sort.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> id, col, col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> id,col,col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> id, col, col2
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> id,col,col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">UNION</span>
<span class="kwrd">SELECT</span> id,col,col2
<span class="kwrd">FROM</span> #t
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> id, col, col2
<span class="kwrd">GO</span>
/*
id col col2
---------<span class="rem">-- ---- -----</span>
1 a 0
1 a 0
1 a 1
1 a 1
id col col2
---------<span class="rem">-- ---- -----</span>
1 a 0
1 a 1
*/</pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>I thought this was a good reminder to all that even with an ORDER BY clause specified, the order of the rows may not be what you expect.  You have to use an ORDER BY clause and make sure all the columns you want to sort by are listed in the ORDER BY.</p>
<p>Until next time happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com0tag:blogger.com,1999:blog-4646137438366687895.post-90723330781433122742010-07-08T19:18:00.001-07:002010-07-08T19:19:35.818-07:00Breaking the Print character limit<p>I got some grief regarding my SQL Meme post about PRINT. I specifically stressed that I believe PRINT needs a makeover because of its inability to handle max data types, <a title="http://jahaines.blogspot.com/2010/05/sql-meme-tagged-5-things-sql-server.html" href="http://jahaines.blogspot.com/2010/05/sql-meme-tagged-5-things-sql-server.html">http://jahaines.blogspot.com/2010/05/sql-meme-tagged-5-things-sql-server.html</a>.  I know I am not the only person out there that feels this functionality is a bit antiquated. In this post, I will provide a great alternative to PRINT.  I have been using this method for the past year or so to print really long dynamic SQL.  The concept is very simple.  Instead of printing the dynamic SQL to the messages tab, I will be converting the dynamic SQL to XML.  XML is a great alternative because it keeps the formatting and can hold up to 2 GB of data.  The key component here is naming the column [processing-instruction(x)].  This column name tells SQL Server to return the value as an XML processing instruction, which allows the text to be converted as-is, along with any special characters.   
It should be noted that whatever value you put in parenthesis will be incorporated in the XML tags, in my case “x”.</p> <p>Let’s have a look at how this works.</p> <pre class="csharpcode"><span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> <span class="kwrd">VARCHAR</span>(<span class="kwrd">MAX</span>)
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> =
<span class="kwrd">CAST</span>(REPLICATE(<span class="str">'a'</span>,5000) + <span class="kwrd">CHAR</span>(13) <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(<span class="kwrd">MAX</span>)) +
<span class="kwrd">CAST</span>(REPLICATE(<span class="str">'b'</span>,5000) + <span class="kwrd">CHAR</span>(13) <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(<span class="kwrd">MAX</span>)) +
<span class="kwrd">CAST</span>(REPLICATE(<span class="str">'c'</span>,5000) + <span class="kwrd">CHAR</span>(13) <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(<span class="kwrd">MAX</span>)) +
<span class="str">'d'</span>
<span class="kwrd">SELECT</span> [processing-instruction(<span class="kwrd">x</span>)]=@<span class="kwrd">sql</span> <span class="kwrd">FOR</span> XML <span class="kwrd">PATH</span>(<span class="str">''</span>),TYPE</pre>
<p><a href="http://lh4.ggpht.com/_ayZBUzPGG9A/TDaHA20i9_I/AAAAAAAAAbk/3ooEjb5L8J0/s1600-h/image%5B3%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/TDaHBpasUfI/AAAAAAAAAbo/2A91eD3s9Kg/image_thumb%5B1%5D.png?imgmax=800" width="685" height="125" /></a> <style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style></p>
<p>Pretty simple, right? There really is not much to this technique.  It is straightforward and gets the job done.  If you find yourself getting aggravated with makeshift PRINT solutions, come on over to the dark side and get your XML on.</p>
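<p>If you use this trick often, it can be wrapped in a small helper.  The following is only a sketch: the procedure name dbo.usp_ShowLongText and its @text parameter are made up for illustration, and it assumes you are allowed to create objects in the current database.  It also runs PRINT on the same value so you can compare the truncated Messages output with the full XML result.</p>
<pre class="csharpcode">-- Hypothetical helper (name and parameter are assumptions, not part of SQL Server)
CREATE PROCEDURE dbo.usp_ShowLongText
    @text NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;
    -- Same technique as above: the column name turns the value into a processing instruction
    SELECT [processing-instruction(x)] = @text FOR XML PATH(''), TYPE;
END
GO
DECLARE @sql NVARCHAR(MAX)
-- Build a string that is far longer than PRINT can display
SET @sql = REPLICATE(CAST(N'SELECT 1;' + CHAR(13) AS NVARCHAR(MAX)), 2000)
SELECT LEN(@sql) AS TotalLength
-- PRINT shows only the first 4000 characters of an NVARCHAR value
PRINT @sql
-- The helper returns the whole value as a clickable XML cell in the results grid
EXEC dbo.usp_ShowLongText @text = @sql
GO</pre>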
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com13tag:blogger.com,1999:blog-4646137438366687895.post-66604810487480099182010-06-11T20:50:00.001-07:002010-06-11T20:51:08.122-07:00Why Are Transactions Blocked All Of A Sudden?<p>Have you ever had a query that runs perfectly fine one day and then all of a sudden starts getting bombarded with blocking transactions?  Believe it or not, this is not that uncommon an occurrence and, more interestingly, it can occur when nothing in the schema has changed at all!  Unbeknownst to most, you are susceptible to an imaginary data distribution tipping point that can go south at any point in time, if your application creates a specific type of workload.   Let’s dig deeper to find out what causes this problem.</p> <p>I will start off by creating some sample data.</p> <pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
SomeBit <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 5000
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
<span class="kwrd">CASE</span> <span class="kwrd">WHEN</span> ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) %5 = 0 <span class="kwrd">THEN</span> 1 <span class="kwrd">ELSE</span> 0 <span class="kwrd">END</span> <span class="kwrd">AS</span> SomeBit,
<span class="str">'A'</span> <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
GO</pre>
<p>Nothing new here, just a table with some data.  Now I will begin a transaction and run a simple UPDATE statement. </p>
<pre class="csharpcode"><span class="kwrd">BEGIN</span> <span class="kwrd">TRANSACTION</span>
<span class="kwrd">UPDATE</span> dbo.TestData
<span class="kwrd">SET</span> SomeCode = <span class="str">'B'</span>
<span class="kwrd">WHERE</span> somebit = 0</pre>
<p>Now run a simple select statement against the table in a new query window.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> RowNum <span class="kwrd">FROM</span> dbo.TestData <span class="kwrd">WHERE</span> RowNum = 1000
/*
RowNum
---------<span class="rem">--</span>
1000
*/</pre>
<p>The query returned a resultset, just as we thought it would.  What I wanted to show here is that we currently do not have a blocking problem and users can still access the rows that do not have a SomeBit value of 0.   SQL Server will try to take the most granular lock possible when satisfying a query, so that other queries can still do what they need to.  Obviously there are limitations to this, and SQL Server reacts differently based on system responsiveness and pressures.    You can verify that you cannot access a row with a SomeBit value of 0 by changing the RowNum predicate to a number that is not divisible by 5; that query will wait until the open transaction is committed or rolled back.</p>
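<p>As a quick check of that last point, the sketch below assumes the UPDATE transaction from above is still open in the first window and that the database uses the default read committed behavior.  RowNum 1001 is just one example of a key that is not divisible by 5.  The second query reads sys.dm_exec_requests from a third window to show who is blocking whom.</p>
<pre class="csharpcode">-- Window 2: waits, because row 1001 has SomeBit = 0 and is exclusively locked by the open UPDATE
SELECT RowNum FROM dbo.TestData WHERE RowNum = 1001

-- Window 3: list blocked requests and the session that is blocking each of them
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id &lt;&gt; 0</pre>
<p>Committing or rolling back the transaction in the first window releases the waiting query.</p>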
<p>Okay, big deal. You probably already know this, but let’s suppose that your manager tells you to migrate data from an existing system into this table.    The flat file has a measly 3000 rows in it, so it should not have any real impact on our system, right? Let’s find out.  Please note that this problem can also manifest itself when ordinary data growth crosses the tipping point.  It does not take a huge influx of data to cause this problem, which is why it can appear seemingly out of nowhere.</p>
<p>I will load the data with the same insert statement to mimic our data migration.</p>
<pre class="csharpcode">INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 3000
<span class="rem">-- Offset RowNum by 5000 so the new rows do not collide with the existing primary key values</span>
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) + 5000 <span class="kwrd">AS</span> RowNumber,
<span class="kwrd">CASE</span> <span class="kwrd">WHEN</span> ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) %5 = 0 <span class="kwrd">THEN</span> 1 <span class="kwrd">ELSE</span> 0 <span class="kwrd">END</span> <span class="kwrd">AS</span> SomeBit,
<span class="str">'A'</span> <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
GO</pre>
<p>Now let’s run our same update transaction again.</p>
<pre class="csharpcode"><span class="kwrd">BEGIN</span> <span class="kwrd">TRANSACTION</span>
<span class="kwrd">UPDATE</span> dbo.TestData
<span class="kwrd">SET</span> SomeCode = <span class="str">'B'</span>
<span class="kwrd">WHERE</span> somebit = 0</pre>
<p>Now run the query below in a new window.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> RowNum <span class="kwrd">FROM</span> dbo.TestData <span class="kwrd">WHERE</span> RowNum = 1000</pre>
<p>What you should see is that the query is now blocked.  Keep in mind that nothing changed on this server, except that new data was inserted into the table.  Any ideas why this problem is now occurring?  If you haven’t figured it out yet, this problem is caused by lock escalation, <a title="http://msdn.microsoft.com/en-us/library/ms184286.aspx" href="http://msdn.microsoft.com/en-us/library/ms184286.aspx">http://msdn.microsoft.com/en-us/library/ms184286.aspx</a>.  When SQL Server meets certain thresholds or memory pressure exists, SQL Server will escalate locks.  Lock escalation unfortunately goes from very granular locks to not so granular locks: it goes straight from rid/key or page locks to a table lock.  What does this mean?  It means that SQL Server can save memory by holding one coarse lock, as opposed to a lot of granular locks.  The price is concurrency: the exclusive table lock blocks every other reader, including queries for rows the UPDATE never touched, such as RowNum 1000.  You can look at the transaction locks for each of the UPDATE statements to verify lock escalation is occurring.</p>
<p><em>Note: I removed intent locks from the resultset. Replace xx with the session id of the window that ran the UPDATE.</em></p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> *
<span class="kwrd">FROM</span> sys.[dm_tran_locks]
<span class="kwrd">WHERE</span> [request_session_id] = xx
<span class="kwrd">AND</span> [request_mode] = <span class="str">'X'</span>
<span class="kwrd">AND</span> [request_mode] <span class="kwrd">NOT</span> <span class="kwrd">LIKE</span> <span class="str">'I%'</span></pre>
<h3>Initial Update Query</h3>
<p><a href="http://lh3.ggpht.com/_ayZBUzPGG9A/TBMD9kQds0I/AAAAAAAAAbU/6HrZx0XfKMU/s1600-h/image%5B3%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFyKKyhHe-m4puxX0CXIUMaRYoiiHnjY5oRFA-T2l9d5v1CMWBHLzrBhwWiul5TDqP0Gxla6mfp5-OdWVF7T0JBUCMjehku-5Fs5Ag65C6_dtbFlJDjXVOmM0yaiBD-pYq72lrX2JPUw/?imgmax=800" width="678" height="243" /></a>  </p>
<h3>Lock Escalated Update Query</h3>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwkwpSmTopBwS9BT-W_HUXZrHx5hqMTLj5_jDwPU9a-JpnCzjIZYV1ODekC6WZYLe8d8XISlI3IQN4HZGdWqm7pB5TyO8VTzuqFumVmpzHEXZWBx4q2Ih4hv2T6CYdqgiBixq755xshw/s1600-h/image%5B7%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizlr85MoiKssLv8iBZzLRTjcTJ9r1KebxdvvR5OYyaawfXhSsTX-aVutQXkSZ-FlHjKaJpu7bAVa-yCoxOHqKpOOq5ySWPaNnVpU_Ag2W0N74y5P5U0_dHjfw1KnEVou3hHdKepoHK9g/?imgmax=800" width="617" height="61" /></a> </p>
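<p>When a transaction holds thousands of locks, a summary is often easier to read than the raw rows.  The following sketch groups the same DMV by lock type.  The session id 55 is a placeholder, just like xx above.</p>
<pre class="csharpcode">SELECT resource_type, request_mode, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 55   -- placeholder: session id of the window that ran the UPDATE
GROUP BY resource_type, request_mode
ORDER BY lock_count DESC</pre>
<p>Before escalation you would expect a large count of KEY locks in X mode; after escalation those should collapse into a single OBJECT lock in X mode.</p>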
<p>If you run a lot of big DML transactions in your environment and still require concurrency, you may want to pay careful attention to lock escalation; otherwise, you may experience an abnormally large number of blocks.  While lock escalation  is great in most cases, in others it is less than ideal.</p>
<p>Here are the thresholds as described by BOL, <a title="http://msdn.microsoft.com/en-us/library/ms184286.aspx" href="http://msdn.microsoft.com/en-us/library/ms184286.aspx">http://msdn.microsoft.com/en-us/library/ms184286.aspx</a></p>
<p><em>Lock Escalation Thresholds</em></p>
<hr />
<p><em>Lock escalation is triggered when lock escalation is not disabled on the table by using the ALTER TABLE SET LOCK_ESCALATION option, and when either of the following conditions exists: </em></p>
<ul>
<li>
<p><em>A single Transact-SQL statement acquires at least <strong>5,000</strong> locks on a single nonpartitioned table or index.</em></p>
</li>
<li>
<p><em>A single Transact-SQL statement acquires at least <strong>5,000</strong> locks on a single partition of a partitioned table and the ALTER TABLE SET LOCK_ESCALATION option is set to AUTO.</em></p>
</li>
<li>
<p><em>The number of locks in an instance of the Database Engine exceeds memory or configuration thresholds.</em></p>
</li>
</ul>
<p><em>If locks cannot be escalated because of lock conflicts, the Database Engine periodically triggers lock escalation at every 1,250 new locks acquired.</em></p>
<p>Now that we have identified the problem, what can we do to fix it?  There are a number of options that can be used to solve this problem.  One solution is to use the nolock hint in your query or the read uncommitted isolation level.  This particular solution is not recommended for all OLTP environments and should only be implemented with careful consideration.  The nolock hint and the read uncommitted isolation level can return inconsistent data.  You should consider this as a solution if, and only if, that is okay.  Another solution is to use the read committed snapshot isolation level or the snapshot isolation level.  Both of these solutions add tempdb overhead, but they do return transactionally consistent data.  You can read more about these isolation levels here, <a title="http://msdn.microsoft.com/en-us/library/ms189122.aspx" href="http://msdn.microsoft.com/en-us/library/ms189122.aspx">http://msdn.microsoft.com/en-us/library/ms189122.aspx</a>.  The other approach is to remove lock escalation.  You can remove lock escalation at the instance level (trace flag 1211 disables it completely, while trace flag 1224 disables escalation based on lock count but still allows it under memory pressure) or at the table level in SQL Server 2008, using the ALTER TABLE statement.  Obviously, removing lock escalation should be carefully thought out and tested.  For more information on these trace flags, please visit the storage engine’s <a href="http://blogs.lessthandot.com/index.php/All/?disp=authdir&author=4" target="_blank">blog</a> <a title="http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/05/17/lock-escalation.aspx" href="http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/05/17/lock-escalation.aspx">http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/05/17/lock-escalation.aspx</a>.</p>
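<p>To make those options concrete, here is a minimal sketch of each one against the sample table.  Treat it as illustration rather than a recommendation.  YourDatabase is a placeholder name, since READ_COMMITTED_SNAPSHOT cannot be turned on for tempdb, where the demo table lives, and changing that option requires that no other connections are active in the database.</p>
<pre class="csharpcode">-- Option 1: dirty read for a single query (may return uncommitted data)
SELECT RowNum FROM dbo.TestData WITH (NOLOCK) WHERE RowNum = 1000
GO
-- Option 2: row versioning for read committed readers (YourDatabase is a placeholder)
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON
GO
-- Option 3 (SQL Server 2008 and later): turn off lock escalation for one table
ALTER TABLE dbo.TestData SET (LOCK_ESCALATION = DISABLE)
GO
-- Check the current setting for the table
SELECT name, lock_escalation_desc FROM sys.tables WHERE name = 'TestData'
GO</pre>
<p>With escalation disabled on the table, the UPDATE keeps its individual key locks, so it uses more lock memory but no longer blocks readers of untouched rows.</p>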
<p>There you have it. I have shown you how simply adding data to your table can literally turn a fast application into a blocking nightmare overnight.  Before you go off and start adding nolock hints or changing your isolation level, please understand that you should only take these steps if you are experiencing this problem.  The reality is that an OLTP system should not be holding more than 5000 locks (a lock escalation tipping point), because transactions should be short and efficient.  If you are experiencing this problem, your database is probably OLAP, you are missing indexes, or your queries are not typical OLTP transactions.  For example, you probably have large DML transactions and users trying to query the table concurrently.   </p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com4tag:blogger.com,1999:blog-4646137438366687895.post-59497537038490861042010-05-11T19:51:00.001-07:002010-05-12T06:28:17.995-07:00SQL Meme: Tagged: 5 things SQL Server should drop<p>I have been tagged by Denis Gobo (<a href="http://blogs.lessthandot.com/index.php/All/?disp=authdir&author=4" target="_blank">Blog</a>/<a href="http://twitter.com/DenisGobo" target="_blank">Twitter</a>) in a SQL meme regarding the top 5 things I would drop from SQL Server, <a href="http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/sql-meme-tagged-5-things-sql-server-shou" target="_blank">Denis’s post</a>. I am sure some of you could spit out a list a mile long, but I am going to focus on my 5 biggest pet peeves, or at least the ones that have not been listed yet :^). </p><h4>sp_msforeachdb and sp_msforeachtable</h4><p>First off, these two stored procedures are undocumented, so they can be deprecated or their functionality may change. In my opinion, these two stored procedures are useless. If you look underneath the hood, both stored procedures use basic cursors. Sure, they make fancy work of the “?”, but you can do that with your own cursor. Remove these from your code and roll your own cursors.</p><h4>PRINT</h4><p>PRINT is a bit antiquated when it comes to newer versions of SQL Server. I am not saying drop PRINT altogether, but drop the 8000 varchar/4000 nvarchar print limitation. I have seen this byte (sorry couldn’t resist) people over and over. It does not make sense to allow a developer to store 2 GB worth of data in a variable and then only print 8000 characters. Sure, we can roll our own print procedure, or use XML, but why should we have to work around the issue? Allow PRINT to “print” up to the maximum variable size.</p><h4>ORDER BY Constant in Windowing Function</h4><p>If you try to order by a constant value using a windowing function, such as Row_Number(), you will get a message stating constants are not allowed; however, there is a workaround. The workaround is to use a subquery (with a constant) in the order by clause. This behavior should be removed because it gives developers the idea that the data will be ordered in the order of the table. Before anyone says anything, a table does not have a predefined order. So what we have here is a number sequence that is not guaranteed to be the same each time it is run. In my book, inconsistent behavior = remove the functionality and make the user order by an actual column.</p><p>**** UPDATE ****</p><p>I have been asked to provide a sample of what I am talking about here. Essentially, create a very simple table with a few columns as such:</p><pre class="csharpcode"><span class="kwrd">DECLARE</span> @t <span class="kwrd">TABLE</span>(
Id <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
SomeChar <span class="kwrd">CHAR</span>(1)
);
INSERT <span class="kwrd">INTO</span> @t (SomeChar) <span class="kwrd">VALUES</span> (<span class="str">'a'</span>) ;
INSERT <span class="kwrd">INTO</span> @t (SomeChar) <span class="kwrd">VALUES</span> (<span class="str">'b'</span>) ;
INSERT <span class="kwrd">INTO</span> @t (SomeChar) <span class="kwrd">VALUES</span> (<span class="str">'c'</span>) ;</pre>
<p>Now try each of the following queries against the table. The first query fails with a message saying constants cannot be used, but clearly they can with a little ingenuity, as the second query shows. The problem here is the illusion that the data will be sequenced in the order of the table, without actually sorting it; however, what is really occurring is that the optimizer is generating a sequence whose order may vary from execution to execution. A column should be supplied; otherwise, unexpected results may occur.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> Id,SomeChar,ROW_NUMBER() <span class="kwrd">OVER</span>(<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> <span class="str">'a'</span>) <span class="kwrd">AS</span> seq
<span class="kwrd">FROM</span> @t
/*Windowed functions do <span class="kwrd">not</span> support constants <span class="kwrd">as</span> <span class="kwrd">ORDER</span> <span class="kwrd">BY</span> clause expressions.*/
<span class="kwrd">SELECT</span> Id,SomeChar,ROW_NUMBER() <span class="kwrd">OVER</span>(<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> <span class="str">'a'</span>)) <span class="kwrd">AS</span> seq
<span class="kwrd">FROM</span> @t
/*
Id SomeChar seq
---------<span class="rem">-- -------- --------------------</span>
1 a 1
2 b 2
3 c 3
*/</pre>
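<p>For contrast, this is what supplying a real column looks like. It is a minimal sketch that redeclares the same table variable so it can run on its own; ordering the window by Id ties the sequence to actual data instead of whatever order the rows happen to arrive in.</p>
<pre class="csharpcode">DECLARE @t TABLE(
Id INT IDENTITY(1,1) PRIMARY KEY,
SomeChar CHAR(1)
);
INSERT INTO @t (SomeChar) VALUES ('a') ;
INSERT INTO @t (SomeChar) VALUES ('b') ;
INSERT INTO @t (SomeChar) VALUES ('c') ;
-- The sequence is now defined by the Id column, so it is repeatable
SELECT Id,SomeChar,ROW_NUMBER() OVER(ORDER BY Id) AS seq
FROM @t
ORDER BY Id</pre>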
<h4>Edit Top X Rows</h4>
<p>This is an SSMS feature that I just do not find useful at all; plus, this feature gives non-database professionals the ability to modify/insert/delete data with no understanding of what is occurring in the background. In my opinion, those who use this feature are asking for trouble. I believe all insert/update/delete transactions should be done through CRUD (Create/Read/Update/Delete) stored procedures or TSQL batch operations. If you do not know how to do CRUD through TSQL, you do not need to be doing it at all.</p>
<h4>SELECT *</h4>
<p>I may take a little heat for this one, but SELECT * should be removed from SQL Server. SQL Server intellisense should auto expand “*” into the column list. SELECT * is a prime candidate for performance problems and wasted network traffic. SELECT * affects the optimizer’s ability to use indexes, increases network bytes, and breaks code when column ordinal position is changed or columns are added or removed. Sure, we all use SELECT * for quick ad-hoc queries, but believe me, it also exists in production code. In my opinion, the benefits of expanding the “*” outweigh the cons because it makes developers/DBAs realize how many columns they are selecting, which may tip them off that they should restrict the number of columns being selected. Also, expanding the “*” prevents insert statements from breaking when columns are added or removed.</p>
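<p>The insert problem is easy to reproduce. The sketch below uses two throwaway temp tables (#Source and #Target are names invented for this example) and assumes the batches are run in order in a single session.</p>
<pre class="csharpcode">CREATE TABLE #Source (Id INT, Name VARCHAR(10));
CREATE TABLE #Target (Id INT, Name VARCHAR(10));
INSERT INTO #Source (Id, Name) VALUES (1, 'a');
GO
-- Works today because both tables happen to have the same columns
INSERT INTO #Target SELECT * FROM #Source;
GO
-- Someone adds a column to the source table
ALTER TABLE #Source ADD CreatedOn DATETIME NULL;
GO
-- Now fails: the number of supplied values no longer matches #Target
INSERT INTO #Target SELECT * FROM #Source;
GO
-- An explicit column list keeps working after the schema change
INSERT INTO #Target (Id, Name) SELECT Id, Name FROM #Source;
GO
DROP TABLE #Source;
DROP TABLE #Target;
GO</pre>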
<p>These are the items I would remove from SQL Server given the chance. I am sure I can come up with a lot more, but I will let others take a stab at this. </p>
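<p>Since I told you to roll your own cursors in place of sp_msforeachdb, here is a rough sketch of what that can look like. The per-database command (counting tables) is only an example, and the cursor name and variables are arbitrary.</p>
<pre class="csharpcode">DECLARE @name SYSNAME, @cmd NVARCHAR(MAX);

-- Loop over online databases instead of relying on the undocumented procedure
DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.databases WHERE state_desc = 'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME guards against odd characters in database names
    SET @cmd = N'SELECT ' + QUOTENAME(@name, '''') + N' AS DatabaseName, COUNT(*) AS TableCount FROM '
             + QUOTENAME(@name) + N'.sys.tables;';
    EXEC sys.sp_executesql @cmd;

    FETCH NEXT FROM db_cursor INTO @name;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;</pre>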
<p>Until next time, happy coding.</p>Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com15tag:blogger.com,1999:blog-4646137438366687895.post-17767583742920349482010-05-04T20:47:00.001-07:002010-05-04T20:47:17.981-07:00Performance tuning Case Expressions With Correlated Subqueries<p>Today I wanted to talk about some potential pitfalls that a developer may encounter when using correlated subqueries in a case expression.  As you may recall, I have done a post before on the potential performance pitfalls of using correlated subqueries, <a title="http://jahaines.blogspot.com/2009/06/correlated-sub-queries-for-good-or-evil.html" href="http://jahaines.blogspot.com/2009/06/correlated-sub-queries-for-good-or-evil.html">http://jahaines.blogspot.com/2009/06/correlated-sub-queries-for-good-or-evil.html</a>.  In this post, I will be focusing on case expressions that use correlated subqueries.</p> <p>I will start by creating a sample table.</p> <pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> ,
SomeChar TINYINT
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 1000
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID())%3+1)
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_SomeChar <span class="kwrd">ON</span> dbo.TestData(SomeChar);
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData2'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData2];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData2(
Id <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
RowNum <span class="kwrd">INT</span> <span class="kwrd">unique</span>,
SomeChar TINYINT
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.TestData2
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 500
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID())%3+1)
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_SomeChar <span class="kwrd">ON</span> dbo.TestData2(SomeChar);
<span class="kwrd">GO</span></pre>
<p>A typical correlated subquery in a case expression may look something like this:</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span>
RowNum,SomeChar,
<span class="kwrd">CASE</span> (<span class="kwrd">SELECT</span> SomeChar <span class="kwrd">FROM</span> dbo.TestData2 t2 <span class="kwrd">WHERE</span> t2.RowNum = t1.RowNum)
<span class="kwrd">WHEN</span> 1 <span class="kwrd">THEN</span> <span class="str">'Type1'</span>
<span class="kwrd">WHEN</span> 2 <span class="kwrd">THEN</span> <span class="str">'Type2'</span>
<span class="kwrd">WHEN</span> 3 <span class="kwrd">THEN</span> <span class="str">'Type3'</span>
<span class="kwrd">END</span>
<span class="kwrd">FROM</span> dbo.TestData t1
<span class="kwrd">WHERE</span> [RowNum] <= 500</pre>
<p>Let’s have a look at the execution plan to see what is going on under the hood.</p>
<p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/S-DqPFBV58I/AAAAAAAAAa8/XSzqt9iIRd8/s1600-h/image%5B4%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="312" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2dNDdBUhM5B5p8o4euLUvxSit92WpTBDDxuNsw0RBYjlBAsqMuO2gtnHw_UFy4nLdRQP5XhsB8I2ZGRBH7WrHa6OzwHwxfQZfVe6P1jPKFqQV4LX9dSoFxzGU6-JBG_pmdIqQU4RMyQ/?imgmax=800" width="784" border="0" /></a> </p>
<p>IO:</p>
<pre class="csharpcode">Table <span class="str">'Worktable'</span>. Scan count 442, logical reads 2886, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table <span class="str">'TestData2'</span>. Scan count 3, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table <span class="str">'TestData'</span>. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.</pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Whoa! This query is extremely inefficient. As you can see, the TestData2 table was scanned 3 times, and a worktable was created and scanned 442 times. The problem here is that the optimizer chooses to spool the data from dbo.TestData2 twice. The even bigger problem with this method is scalability: this code does not scale well at all. In the case of this query, the number of index spools the optimizer creates grows with the number of elements in the case expression. The relationship can be defined as Number of Spools = Number of Case Elements - 1. What does this mean? It means that if your case expression has 4 elements you get 3 spools, if your case expression has 5 elements you get 4 spools, and so on. Simply put, query performance decreases as the number of elements in the case expression increases. Take a look at the example below.</p>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/S-DqP171XwI/AAAAAAAAAbE/RpM-1lqGAtA/s1600-h/image%5B12%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="470" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/S-DqQNvnt9I/AAAAAAAAAbI/ObBKEsYrppI/image_thumb%5B6%5D.png?imgmax=800" width="774" border="0" /></a> </p>
<p>IO:</p>
<pre class="csharpcode">Table <span class="str">'Worktable'</span>. Scan count 539, logical reads 4081, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table <span class="str">'TestData2'</span>. Scan count 4, logical reads 16, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table <span class="str">'TestData'</span>. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.</pre>
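<p>One way to picture why every additional WHEN adds work: a simple CASE expression is logically shorthand for a searched CASE in which the input expression is compared in each WHEN branch. The sketch below is a hand-written equivalent of the first query, not output captured from the optimizer, and the column alias TypeName is only an illustrative name. Written this way, the subquery appears once per branch, which lines up with the scan counts above.</p>
<pre class="csharpcode">-- Hand-expanded form of the simple CASE (illustration only, not optimizer output)
SELECT
RowNum,SomeChar,
CASE
WHEN (SELECT SomeChar FROM dbo.TestData2 t2 WHERE t2.RowNum = t1.RowNum) = 1 THEN 'Type1'
WHEN (SELECT SomeChar FROM dbo.TestData2 t2 WHERE t2.RowNum = t1.RowNum) = 2 THEN 'Type2'
WHEN (SELECT SomeChar FROM dbo.TestData2 t2 WHERE t2.RowNum = t1.RowNum) = 3 THEN 'Type3'
END AS TypeName -- TypeName is a hypothetical alias
FROM dbo.TestData t1
WHERE [RowNum] &lt;= 500</pre>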
<p>So how should we change the query to help the optimizer make a better decision?  </p>
<p>One solution is to move the case expression inside the subquery, so the optimizer can compute the value while it is joining the TestData2 table, as shown below.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span>
RowNum,SomeChar,
(<span class="kwrd">SELECT</span> <span class="kwrd">CASE</span> SomeChar <span class="kwrd">WHEN</span> 1 <span class="kwrd">THEN</span> <span class="str">'Type1'</span> <span class="kwrd">WHEN</span> 2 <span class="kwrd">THEN</span> <span class="str">'Type2'</span> <span class="kwrd">WHEN</span> 3 <span class="kwrd">THEN</span> <span class="str">'Type3'</span> <span class="kwrd">END</span> <span class="kwrd">FROM</span> dbo.TestData2 t2 <span class="kwrd">WHERE</span> t2.RowNum = t1.RowNum)
<span class="kwrd">FROM</span> dbo.TestData t1
<span class="kwrd">WHERE</span> [RowNum] <= 500</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiGJNnU-JUyJ0PAHYh61GNW-fmAUbaCSlm0PKuNsq-NBi_DCEwbd1heN5R5-0TpZeLbCsicZOKoEiQLZgtsrTOcXnxwCkKQHbSzoRHYS8gNXYtKMIuxpd-1bdAEEDlwsXy5gMxvUuEMQ/s1600-h/image%5B8%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="222" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/S-DqQ9dEz3I/AAAAAAAAAbQ/1EUSYaie188/image_thumb%5B4%5D.png?imgmax=800" width="798" border="0" /></a> </p>
<p>IO:</p>
<pre class="csharpcode">Table <span class="str">'Worktable'</span>. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table <span class="str">'TestData2'</span>. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table <span class="str">'TestData'</span>. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.</pre>
<p>As you can see, this is a much better query plan. The key here is that the optimizer is able to use a Compute Scalar operator upstream, while joining the tables. Because the computed value is joined to the TestData table, there is no need to spool the data: the worktable reads drop to 0 and TestData2 is scanned only once.</p>
<h3>Conclusion</h3>
<p>In my experience, correlated subqueries can behave inconsistently and often carry performance problems, such as this one. Do not get me wrong, correlated subqueries are not all bad, but they should be thoroughly tested. In my opinion, the best way to write this query is to LEFT OUTER JOIN dbo.TestData2. Because RowNum is unique in TestData2, the join matches at most one row per outer row, just like the subquery, and an outer join will provide more consistent performance.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span>
t1.RowNum,t1.SomeChar,
<span class="kwrd">CASE</span> t2.SomeChar
<span class="kwrd">WHEN</span> 1 <span class="kwrd">THEN</span> <span class="str">'Type1'</span>
<span class="kwrd">WHEN</span> 2 <span class="kwrd">THEN</span> <span class="str">'Type2'</span>
<span class="kwrd">WHEN</span> 3 <span class="kwrd">THEN</span> <span class="str">'Type3'</span>
<span class="kwrd">END</span>
<span class="kwrd">FROM</span> dbo.TestData t1
<span class="kwrd">LEFT</span> <span class="kwrd">JOIN</span> dbo.TestData2 t2 <span class="kwrd">ON</span> t1.RowNum = t2.RowNum
<span class="kwrd">WHERE</span> t1.[RowNum] <= 500</pre>
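<p>A quick way to sanity-check that the join form and the subquery form agree on your own data is to compare the two result sets with EXCEPT. This is only a sketch: it assumes both test tables from this post exist, and TypeName is just an illustrative alias. An empty result means no row from the subquery form is missing from the join form.</p>
<pre class="csharpcode">-- Rows produced by the subquery form that the LEFT JOIN form does not produce
SELECT t1.RowNum, t1.SomeChar,
(SELECT CASE t2.SomeChar WHEN 1 THEN 'Type1' WHEN 2 THEN 'Type2' WHEN 3 THEN 'Type3' END
 FROM dbo.TestData2 t2 WHERE t2.RowNum = t1.RowNum) AS TypeName -- hypothetical alias
FROM dbo.TestData t1
WHERE t1.RowNum &lt;= 500
EXCEPT
SELECT t1.RowNum, t1.SomeChar,
CASE t2.SomeChar WHEN 1 THEN 'Type1' WHEN 2 THEN 'Type2' WHEN 3 THEN 'Type3' END
FROM dbo.TestData t1
LEFT JOIN dbo.TestData2 t2 ON t1.RowNum = t2.RowNum
WHERE t1.RowNum &lt;= 500;</pre>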
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com6tag:blogger.com,1999:blog-4646137438366687895.post-91502503095241669922010-04-13T16:58:00.001-07:002010-04-13T16:58:45.584-07:00T-SQL Tuesday #005 – Creating & Emailing HTML Reports<p>This post is my contribution to the popular TSQL Tuesday event, <a title="http://sqlvariant.com/wordpress/index.php/2010/04/t-sql-tuesday-005-reporting/" href="http://sqlvariant.com/wordpress/index.php/2010/04/t-sql-tuesday-005-reporting/">http://sqlvariant.com/wordpress/index.php/2010/04/t-sql-tuesday-005-reporting/</a>. The creator of this amazing event is <a href="http://sqlblog.com/blogs/adam_machanic/default.aspx" target="_blank">Adam Machanic</a>. What I love most about this event is how it brings the SQL Server community together. The “theme” for this TSQL Tuesday is reporting. As you are aware, reporting is a very broad topic. I will be focusing on creating and emailing HTML reports. Now this process is no substitute for an SSRS report or a cube report. What I am about to show you is a very sleek way of presenting data to managers at a very high level. You do not want to send an entire report as an HTML report, so this process should be limited to dashboards or reports that are small in nature.
If the user needs more detail, or is simply requesting too much data, you may want to add a detail link in the HTML body, as this gives the user the ability to drill through for more detail.</p> <p>Let’s get started by creating a sample table and a couple of views. It should be noted that this process will primarily utilize views to expose data. This code can be further expanded to filter for specific columns, but as it stands now, this process returns all columns in the view or table.</p> <pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.Sales'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.Sales;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.Sales(
SalesId <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
EmployeeId <span class="kwrd">INT</span>,
Amt <span class="kwrd">NUMERIC</span>(9,2),
LocationCd <span class="kwrd">INT</span>
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (1,12.50,1);
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (1,99.99,4);
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (2,45.64,1);
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (3,44.65,2);
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (3,52.89,4);
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (4,250.54,3);
INSERT <span class="kwrd">INTO</span> dbo.Sales <span class="kwrd">VALUES</span> (5,150.00,5);
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.vw_SalesVolumnByLocation'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">VIEW</span> dbo.vw_SalesVolumnByLocation;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">VIEW</span> dbo.vw_SalesVolumnByLocation
<span class="kwrd">AS</span>
<span class="kwrd">SELECT</span> LocationCd, <span class="kwrd">SUM</span>(Amt) <span class="kwrd">AS</span> SalesVolume
<span class="kwrd">FROM</span> dbo.Sales
<span class="kwrd">GROUP</span> <span class="kwrd">BY</span> LocationCd
<span class="kwrd">GO</span>
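<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.vw_SalesBySalesCounselor'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">VIEW</span> dbo.vw_SalesBySalesCounselor;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">VIEW</span> dbo.vw_SalesBySalesCounselor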
<span class="kwrd">AS</span>
<span class="kwrd">SELECT</span> [EmployeeId],[LocationCd],[Amt]
<span class="kwrd">FROM</span> dbo.Sales
GO</pre>
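<p>Before building the report, it can help to confirm what the first view returns. Assuming the seven sample rows above were inserted exactly as shown, a quick check looks like this, with the totals worked out by hand in the comments.</p>
<pre class="csharpcode">USE [tempdb]
GO
-- Preview the data the report will be built from
SELECT LocationCd, SalesVolume
FROM dbo.vw_SalesVolumnByLocation
ORDER BY LocationCd;
-- Totals derived by hand from the sample INSERT statements:
-- LocationCd 1: 12.50 + 45.64 = 58.14
-- LocationCd 2: 44.65
-- LocationCd 3: 250.54
-- LocationCd 4: 99.99 + 52.89 = 152.88
-- LocationCd 5: 150.00</pre>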
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Next, the stored procedure. First and foremost, this code looks a lot worse than it really is. I had to use dynamic SQL because I did not want to create this stored procedure in every database.</p>
<p>The parameter list is pretty massive, but a lot of the parameters have default values, which means you do not have to specify them. The parameters are pretty self-explanatory.</p>
<pre class="csharpcode"><span class="kwrd">USE</span> [master]
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">PROCEDURE</span> usp_Email_HTML_Rpt
@DB <span class="kwrd">VARCHAR</span>(255) = <span class="kwrd">NULL</span>,
@<span class="kwrd">Object</span> <span class="kwrd">VARCHAR</span>(255),
@<span class="kwrd">Schema</span> <span class="kwrd">VARCHAR</span>(255),
@Rec NVARCHAR(255),
@CC NVARCHAR(255) = <span class="kwrd">NULL</span>,
@rpt_Header <span class="kwrd">VARCHAR</span>(50),
@rpt_Header_BGColor <span class="kwrd">VARCHAR</span>(10) = <span class="str">'#FFFFFF'</span>,
@TblHdr_BGColor <span class="kwrd">VARCHAR</span>(10) = <span class="str">'#FFFFFF'</span>,
@Condition1_Col <span class="kwrd">VARCHAR</span>(255) = <span class="kwrd">NULL</span>,
@Condition1_Expression <span class="kwrd">VARCHAR</span>(500) = <span class="kwrd">NULL</span>,
@Condition1_BGColor <span class="kwrd">VARCHAR</span>(10) = <span class="kwrd">NULL</span>,
@Condition2_Col <span class="kwrd">VARCHAR</span>(255) = <span class="kwrd">NULL</span>,
@Condition2_Expression <span class="kwrd">VARCHAR</span>(500) = <span class="kwrd">NULL</span>,
@Condition2_BGColor <span class="kwrd">VARCHAR</span>(10) = <span class="kwrd">NULL</span>,
@AltRowBGColor <span class="kwrd">VARCHAR</span>(10) = <span class="kwrd">NULL</span>,
@Pred_Filter1_Col <span class="kwrd">VARCHAR</span>(255) = <span class="kwrd">NULL</span>,
@Pred_Filter1_Expression <span class="kwrd">VARCHAR</span>(500) = <span class="kwrd">NULL</span>,
@Pred_Filter2_Col <span class="kwrd">VARCHAR</span>(255) = <span class="kwrd">NULL</span>,
@Pred_Filter2_Expression <span class="kwrd">VARCHAR</span>(500) = <span class="kwrd">NULL</span>,
@OrderBy <span class="kwrd">VARCHAR</span>(500) = <span class="kwrd">NULL</span>
<span class="kwrd">AS</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> NVARCHAR(<span class="kwrd">MAX</span>),
@StyleSheet <span class="kwrd">VARCHAR</span>(<span class="kwrd">MAX</span>),
@RtnSQL NVARCHAR(<span class="kwrd">MAX</span>),
@html_email NVARCHAR(<span class="kwrd">MAX</span>)
<span class="kwrd">DECLARE</span> @HTML <span class="kwrd">TABLE</span>(seq TINYINT, Tag <span class="kwrd">VARCHAR</span>(<span class="kwrd">MAX</span>));
--<span class="kwrd">Create</span> a <span class="kwrd">new</span> style sheet <span class="kwrd">if</span> <span class="kwrd">none</span> was passed <span class="kwrd">in</span>
<span class="kwrd">IF</span> @StyleSheet <span class="kwrd">IS</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
--<span class="kwrd">Set</span> the <span class="kwrd">Procedure</span> Stylesheet. You can also supply this <span class="kwrd">as</span> a <span class="kwrd">variable</span>
<span class="kwrd">SET</span> @StyleSheet =
<span class="str">'<head>
<style type="text/css">
th {width:150px;color:"#FFFFFF";font-weight:bold;background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@TblHdr_BGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) +<span class="str">';border:1;border-width:thin; border-style:solid; align:center}
td {width:150px;background-color: "#FFFFFF"; border: 1; border-style:solid;border-width:thin; text-align: "left"}
td.Cond1Met {width:150px;background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@Condition1_BGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) +<span class="str">'; border-style:solid;border-width:thin; text-align: "left"}
td.Cond2Met {width:150px;background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@Condition2_BGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) +<span class="str">'; border-style:solid;border-width:thin; text-align: "left"}
td.AltRowColor {width:150px;background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@AltRowBGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) +<span class="str">'; border: 1; border-style:solid;border-width:thin; text-align: "left"}
td.LegendCond1Met {width:200px;background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@Condition1_BGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) +<span class="str">'; border-style:solid;border-width:thin; text-align: "center"}
td.LegendCond2Met {width:200px;background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@Condition2_BGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) +<span class="str">'; border-style:solid;border-width:thin; text-align: "center"}
th.LegendHdr {width:200px;color:"#FFFFFF"; font-weight:bold; background-color: '</span> + QUOTENAME(<span class="kwrd">COALESCE</span>(@rpt_Header_BGColor,<span class="str">'#FFFFFF'</span>),<span class="str">'"'</span>) + <span class="str">';border: 1;border-width:thin; border-style:solid;text-align: "center"}
td.Legend {width:200px;background-color: "#FFFFFF"; border: 1; border-width:thin; border-style:solid; text-align: "center"}
th.LegendTitle {width:200px;color:black;background-color: "#C0C0C0"; border: 1; border-width:thin; border-style:solid; text-align: "center"}
</style>
<title>'</span> + <span class="kwrd">COALESCE</span>(@rpt_Header,<span class="str">'Report Header'</span>) + <span class="str">'</title>
</head>
'</span>
<span class="kwrd">END</span>
--Build basic html <span class="kwrd">structure</span>
INSERT <span class="kwrd">INTO</span> @HTML (seq,Tag)
<span class="kwrd">VALUES</span> (1,<span class="str">'<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'</span> + <span class="kwrd">CHAR</span>(13) + <span class="str">'<html>'</span> + <span class="kwrd">COALESCE</span>(@StyleSheet,<span class="str">''</span>) + <span class="str">'<body>'</span>);
--<span class="kwrd">If</span> optional conditions exist, build a legend
<span class="kwrd">IF</span> @Condition1_Col <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span> <span class="kwrd">OR</span> @Condition2_Col <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
INSERT <span class="kwrd">INTO</span> @HTML (seq,Tag)
<span class="kwrd">SELECT</span> 2, <span class="str">'<table border="1" align="LEFT">'</span> <span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> 3, <span class="str">'<tr><th class="LegendTitle"COLSPAN=3>Legend</th></tr>'</span> <span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> 4, <span class="str">'<tr><th class="LegendHdr">Variable</th><th class="LegendHdr">Condition Column</th><th class="LegendHdr">Condition Expression</th></tr>'</span> <span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> 5, <span class="str">'<tr><td class="Legend">@Condition1</td><td class="Legend">'</span> + <span class="kwrd">COALESCE</span>(@Condition1_Col,<span class="str">'n/a'</span>) + <span class="str">'</td><td class="LegendCond1Met"> '</span> + <span class="kwrd">COALESCE</span>(@Condition1_Expression,<span class="str">'n/a'</span>) + <span class="str">'</td></tr>'</span> <span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> 6, <span class="str">'<tr><td class="Legend">@Condition2</td><td class="Legend">'</span> + <span class="kwrd">COALESCE</span>(@Condition2_Col,<span class="str">'n/a'</span>) + <span class="str">'</td><td class="LegendCond2Met"> '</span> + <span class="kwrd">COALESCE</span>(@Condition2_Expression,<span class="str">'n/a'</span>) + <span class="str">'</td></tr>'</span> <span class="kwrd">UNION</span> <span class="kwrd">ALL</span>
<span class="kwrd">SELECT</span> 7, <span class="str">'</table><br><br><br><br><br><br><br>'</span> + <span class="str">'<h1>'</span> + <span class="kwrd">COALESCE</span>(@rpt_Header,<span class="str">'Report Header'</span>) + <span class="str">'</h1>'</span> + <span class="str">'<table border="1" align="left" width="25%">'</span>
<span class="kwrd">END</span>
<span class="kwrd">ELSE</span>
<span class="kwrd">BEGIN</span> --<span class="kwrd">No</span> legend <span class="kwrd">is</span> needed, <span class="kwrd">start</span> building the <span class="kwrd">table</span>
INSERT <span class="kwrd">INTO</span> @HTML (seq,Tag)
<span class="kwrd">SELECT</span> 8, <span class="str">'<br>'</span> + <span class="str">'<h1>'</span> + <span class="kwrd">COALESCE</span>(@rpt_Header,<span class="str">'Report Header'</span>) + <span class="str">'</h1>'</span> + <span class="str">'<table border="1" align="left" width="25%">'</span>
<span class="kwrd">END</span>
--<span class="kwrd">Create</span> <span class="kwrd">Table</span> Header
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = N<span class="str">'
SELECT 9,CAST(
(
SELECT CAST('</span><span class="str">'<th>'</span><span class="str">' + COALESCE(c.COLUMN_NAME,'</span><span class="str">''</span><span class="str">') + '</span><span class="str">'</th>'</span><span class="str">' AS XML)
FROM '</span> + <span class="kwrd">COALESCE</span>(QUOTENAME(@DB) + <span class="str">'.'</span>,<span class="str">''</span>) + <span class="str">'[INFORMATION_SCHEMA].[COLUMNS] c
WHERE c.[TABLE_NAME] = @dynObject AND c.[TABLE_SCHEMA] = @dynSchema
FOR XML PATH('</span><span class="str">''</span><span class="str">'),ELEMENTS,ROOT('</span><span class="str">'tr'</span><span class="str">'),TYPE
) AS VARCHAR(MAX))'</span>;
INSERT <span class="kwrd">INTO</span> @HTML (seq,Tag)
<span class="kwrd">EXEC</span> sp_executesql @<span class="kwrd">sql</span>, N<span class="str">'@dynObject VARCHAR(255),@dynSchema VARCHAR(128)'</span>,@dynObject = @<span class="kwrd">Object</span>, @dynSchema=@<span class="kwrd">Schema</span>
--<span class="kwrd">Create</span> <span class="kwrd">SQL</span> <span class="kwrd">Statement</span> <span class="kwrd">to</span> <span class="kwrd">return</span> actual <span class="kwrd">values</span>
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = N<span class="str">'
SELECT
@dynRtnSQL = '</span><span class="str">'SELECT 10,'</span><span class="str">''</span><span class="str">'<tr>'</span><span class="str">''</span><span class="str">'+'</span><span class="str">' + STUFF(
(
SELECT
'</span><span class="str">'+ CASE '</span><span class="str">' +
COALESCE('</span><span class="str">'WHEN '</span><span class="str">' + QUOTENAME(@dynCondition1_Col) + @dynCondition1_Expression
+ '</span><span class="str">' THEN '</span><span class="str">''</span><span class="str">'<td class="Cond1Met">'</span><span class="str">''</span><span class="str">' + CAST('</span><span class="str">' + QUOTENAME(c.COLUMN_NAME) + '</span><span class="str">' AS VARCHAR(MAX))'</span><span class="str">','</span><span class="str">''</span><span class="str">')
+ COALESCE('</span><span class="str">' WHEN '</span><span class="str">' + QUOTENAME(@dynCondition2_Col) + @dynCondition2_Expression
+ '</span><span class="str">' THEN '</span><span class="str">''</span><span class="str">'<td class="Cond2Met">'</span><span class="str">''</span><span class="str">' + CAST('</span><span class="str">' + QUOTENAME(c.COLUMN_NAME) + '</span><span class="str">' AS VARCHAR(MAX))'</span><span class="str">','</span><span class="str">''</span><span class="str">')
+ '</span><span class="str">' WHEN '</span><span class="str">''</span><span class="str">'1'</span><span class="str">''</span><span class="str">'= CASE WHEN ROW_NUMBER() OVER(ORDER BY '</span> + <span class="kwrd">COALESCE</span>(@OrderBy,<span class="str">'(SELECT NULL)'</span>) + <span class="str">') % 2 = 0 THEN 1 ELSE 0 END'</span><span class="str">'
+ '</span><span class="str">' THEN '</span><span class="str">''</span><span class="str">'<td class="AltRowColor">'</span><span class="str">''</span><span class="str">' + CAST('</span><span class="str">' + QUOTENAME(c.COLUMN_NAME) + '</span><span class="str">' AS VARCHAR(MAX))'</span><span class="str">'
+ '</span><span class="str">' ELSE '</span><span class="str">''</span><span class="str">'<td>'</span><span class="str">''</span><span class="str">' + CAST('</span><span class="str">' + QUOTENAME(c.COLUMN_NAME) + '</span><span class="str">' AS VARCHAR(MAX))'</span><span class="str">'
+ '</span><span class="str">' END'</span><span class="str">'
+ '</span><span class="str">' + '</span><span class="str">''</span><span class="str">'</td>'</span><span class="str">''</span><span class="str">''</span><span class="str">'
FROM '</span> + <span class="kwrd">COALESCE</span>(QUOTENAME(@DB) + <span class="str">'.'</span>,<span class="str">''</span>) + <span class="str">'[INFORMATION_SCHEMA].[Columns] c
WHERE c.[TABLE_NAME] = @dynObject AND c.[TABLE_SCHEMA] = @dynSchema
FOR XML PATH('</span><span class="str">''</span><span class="str">'),TYPE
).value('</span><span class="str">'.'</span><span class="str">','</span><span class="str">'VARCHAR(MAX)'</span><span class="str">')
,1,1,'</span><span class="str">''</span><span class="str">') + '</span><span class="str">'+'</span><span class="str">''</span><span class="str">'</tr>'</span><span class="str">''</span><span class="str">' FROM '</span> + <span class="kwrd">COALESCE</span>(QUOTENAME(@DB) + <span class="str">'.'</span>,<span class="str">''</span>) + <span class="str">''</span><span class="str">' + QUOTENAME(@dynSchema) + '</span><span class="str">'.'</span><span class="str">' + QUOTENAME(@dynObject) +
'</span><span class="str">'WHERE 1=1 '</span> + <span class="kwrd">COALESCE</span>(<span class="str">' AND'</span> + QUOTENAME(@Pred_Filter1_Col) + <span class="kwrd">SPACE</span>(1) + @Pred_Filter1_Expression,<span class="str">''</span>) + <span class="str">''</span>
+ <span class="kwrd">COALESCE</span>(<span class="str">' AND'</span> + QUOTENAME(@Pred_Filter2_Col) + <span class="kwrd">SPACE</span>(1) + @Pred_Filter2_Expression,<span class="str">''</span>)
+ <span class="kwrd">COALESCE</span>(<span class="str">' ORDER BY '</span> + @OrderBy,<span class="str">''</span>) + <span class="str">''</span><span class="str">''</span>
--<span class="kwrd">Create</span> a <span class="kwrd">variable</span> <span class="kwrd">to</span> hold the newly created <span class="kwrd">dynamic</span> <span class="kwrd">sql</span> <span class="kwrd">statement</span>
<span class="kwrd">--PRINT</span> @<span class="kwrd">sql</span>
<span class="kwrd">EXEC</span> sp_executesql
@<span class="kwrd">sql</span>,
N<span class="str">'@dynCondition1_Col VARCHAR(255), @dynCondition1_Expression VARCHAR(500), @dynCondition2_Col VARCHAR(255), @dynCondition2_Expression VARCHAR(500), @dynSchema VARCHAR(255), @dynObject VARCHAR(255), @dynRtnSQL NVARCHAR(MAX) OUTPUT'</span>,
@dynCondition1_Col = @Condition1_Col,
@dynCondition1_Expression = @Condition1_Expression,
@dynCondition2_Col = @Condition2_Col,
@dynCondition2_Expression = @Condition2_Expression,
@dynSchema = @<span class="kwrd">Schema</span>,
@dynObject = @<span class="kwrd">Object</span>,
@dynRtnSQL = @RtnSQL <span class="kwrd">OUTPUT</span>
--<span class="kwrd">PRINT</span> @RtnSQL
--<span class="kwrd">Execute</span> the newly created <span class="kwrd">dynamic</span> TSQL statement.
INSERT <span class="kwrd">INTO</span> @HTML (seq,Tag)
<span class="kwrd">EXEC</span> sp_executesql @RtnSQL
--<span class="kwrd">Close</span> <span class="kwrd">all</span> report HTML tags
INSERT <span class="kwrd">INTO</span> @HTML (seq,Tag)
<span class="kwrd">SELECT</span> 11, <span class="str">'</table></body></html>'</span>
--<span class="kwrd">SELECT</span> Tag <span class="kwrd">FROM</span> @HTML <span class="kwrd">ORDER</span> <span class="kwrd">BY</span> seq <span class="rem">-- return HTML in the correct order</span>
<span class="kwrd">SELECT</span> @HTML_Email = <span class="kwrd">COALESCE</span>(@HTML_Email,<span class="str">''</span>) + Tag <span class="kwrd">FROM</span> @HTML <span class="kwrd">ORDER</span> <span class="kwrd">BY</span> seq <span class="rem">-- return HTML in the correct order</span>
--<span class="kwrd">PRINT</span> @HTML_Email
<span class="kwrd">EXEC</span> msdb.dbo.sp_send_dbmail
@recipients = @rec,
@copy_recipients = @CC,
@subject = @rpt_Header,
@body = @HTML_Email,
@body_format = <span class="str">'HTML'</span>,
@importance = <span class="str">'Normal'</span>
<span class="kwrd">END</span>
<span class="kwrd">GO</span></pre>
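<p>The header-building step is easier to follow outside of the dynamic SQL. The following is a simplified, static sketch of the same FOR XML PATH technique, hard-coded to the dbo.vw_SalesVolumnByLocation view in tempdb. The ORDER BY on ORDINAL_POSITION and the HeaderRow alias are my additions for the illustration and are not part of the procedure.</p>
<pre class="csharpcode">USE [tempdb]
GO
-- Build one &lt;th&gt; element per column and wrap them all in a single &lt;tr&gt;
SELECT CAST(
(
SELECT CAST('&lt;th&gt;' + c.COLUMN_NAME + '&lt;/th&gt;' AS XML)
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_NAME = 'vw_SalesVolumnByLocation' AND c.TABLE_SCHEMA = 'dbo'
ORDER BY c.ORDINAL_POSITION -- assumption: keep columns in their defined order
FOR XML PATH(''),ROOT('tr'),TYPE
) AS VARCHAR(MAX)) AS HeaderRow; -- HeaderRow is a hypothetical alias</pre>
<p>For this view the query should return a single row containing &lt;tr&gt;&lt;th&gt;LocationCd&lt;/th&gt;&lt;th&gt;SalesVolume&lt;/th&gt;&lt;/tr&gt;, which is the string the procedure stores under seq 9.</p>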
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
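<p>A side note on the concatenation step near the end of the procedure: the SELECT @HTML_Email = COALESCE(@HTML_Email,'') + Tag ... ORDER BY seq pattern usually returns the rows in seq order, but that ordering is not a documented guarantee.  If the tags ever arrive out of order, a FOR XML PATH based concatenation is one possible alternative.  The sketch below is only an illustration under assumptions: the sample rows and the VARCHAR(MAX) column type are made up, and the TYPE/.value() step is there so the angle brackets are not entitized.</p>
<pre class="csharpcode">-- Hypothetical stand-in for the @HTML table variable used in the procedure
DECLARE @HTML TABLE (seq INT, Tag VARCHAR(MAX));
INSERT INTO @HTML (seq, Tag) SELECT 1, '<html><body><table>';
INSERT INTO @HTML (seq, Tag) SELECT 2, '<tr><td>Sample row</td></tr>';
INSERT INTO @HTML (seq, Tag) SELECT 11, '</table></body></html>';

DECLARE @HTML_Email VARCHAR(MAX);

-- FOR XML PATH('') concatenates the rows in the ORDER BY order;
-- TYPE plus .value() turns the XML back into plain text so < and > survive
SELECT @HTML_Email =
    (SELECT Tag AS [text()]
     FROM @HTML
     ORDER BY seq
     FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)');

PRINT @HTML_Email;</pre>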
<p>Now let’s see this stored procedure in action.  The code is flexible and gives you a variety of ways to slice and dice data.  I have provided two conditional filters that highlight rows meeting the criteria in a specified color.  I have also included sort and filter parameters to help reduce the amount of data returned.  As I stated before, not all of the parameters are required.  One of my favorite parameters is @AltRowBGColor, which accepts an HTML color and applies it to alternating rows of the HTML table.</p>
<p>Execute the following code: (AltRowBGColor is commented out for this demo)</p>
<pre class="csharpcode"><span class="kwrd">EXECUTE</span> [dbo].[usp_Email_HTML_Rpt]
@DB =<span class="str">'tempdb'</span>
,@Rec = <span class="str">'ahaines@stei.com'</span> <span class="rem">--Change to your email address</span>
,@<span class="kwrd">Object</span> = <span class="str">'vw_SalesVolumnByLocation'</span>
,@<span class="kwrd">Schema</span> = <span class="str">'dbo'</span>
,@rpt_Header = <span class="str">'Sales Volumn By Location'</span>
,@rpt_Header_BGColor = <span class="str">'#87AFC7'</span>
,@TblHdr_BGColor = <span class="str">'#87AFC7'</span>
,@Condition1_Col = <span class="str">'SalesVolume'</span>
,@Condition1_Expression = <span class="str">'<100'</span>
,@Condition1_BGColor = <span class="str">'#E55451'</span>
,@Condition2_Col = <span class="str">'SalesVolume'</span>
,@Condition2_Expression = <span class="str">'>200'</span>
,@Condition2_BGColor = <span class="str">'#00FF00'</span>
--,@AltRowBGColor = <span class="str">'#A0CFEC'</span>
,@OrderBy = <span class="str">'[SalesVolume] DESC'</span></pre>
<p>You will get an email similar to the one below.  Note that Database Mail must be enabled for this code to work.  Because conditional filters were supplied, a legend was generated.  The legend contains the details of the supplied parameters.  In the case below, locations with a sales volume < 100 are considered subpar, hence the red color, and locations with a sales volume > 200 are shown in green.  As you can see, this is a great way to visualize your data.  I use these types of reports in my environment to monitor backups, jobs, and their corresponding metrics.  </p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiD-faeJoiakT9vKyWJIlkNn3ngsIMP3DtiqW9irE6MIRKglwEzDw-xZZX7WWJEUY4KW7vpOXtM3RnK1uCCOHQ5Xwk3zV737akX7NfBfA2Jj1oiL3ANJR6735_6IbvOZRE9NCPgco-5Q/s1600-h/image%5B3%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="392" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjljwHf3y_HgNIY4QK_Ms8WCD8SxQY6AIRtY14ob4UeiM_-gjcAkEPkbh8bVEvJOSD-FvJ4UYx8bh7rFRu94GuwKtsbQ4eNDMthHk_Fa59euKgWDzzRZ_JFapSQvDxgnoA4NSw9p1NZYQ/?imgmax=800" width="452" border="0" /></a> <style type="text/css">
</style></p>
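<p>If Database Mail has not been set up yet, the sketch below shows one way to turn the feature on and check for a mail profile.  It assumes you have sysadmin rights.  Because the procedure calls sp_send_dbmail without @profile_name, it also assumes a default profile is available to the caller; creating the profile and its SMTP account is not shown here.</p>
<pre class="csharpcode">-- Enable the Database Mail extended stored procedures (requires sysadmin)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Database Mail XPs', 1;
RECONFIGURE;

-- List the Database Mail profiles that already exist
EXEC msdb.dbo.sysmail_help_profile_sp;

-- Show which principals can use each profile and whether it is their default
EXEC msdb.dbo.sysmail_help_principalprofile_sp;</pre>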
<p>Now, I will execute the stored procedure with fewer parameters and use the @AltRowBGColor parameter.  Note that no legend is generated because no conditional formatting was supplied.</p>
<pre class="csharpcode"><span class="kwrd">EXECUTE</span> [master].[dbo].[usp_Email_HTML_Rpt]
@DB =<span class="str">'tempdb'</span>
,@Rec = <span class="str">'ahaines@stei.com'</span>
,@<span class="kwrd">Object</span> = <span class="str">'vw_SalesBySalesCounselor'</span>
,@<span class="kwrd">Schema</span> = <span class="str">'dbo'</span>
,@rpt_Header = <span class="str">'Sales Volume By Sales Counselor'</span>
,@rpt_Header_BGColor = <span class="str">'#87AFC7'</span>
,@TblHdr_BGColor = <span class="str">'#87AFC7'</span>
,@AltRowBGColor = <span class="str">'#A0CFEC'</span>
,@OrderBy = <span class="str">'[Amt] DESC'</span></pre>
<p><a href="http://lh3.ggpht.com/_ayZBUzPGG9A/S8UFKUoI7ZI/AAAAAAAAAa0/P4AYrw4iPXo/s1600-h/image%5B7%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="398" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEis3mSQMAIbucY6ymsd1ZE4tf3pL1_d_vGVvl5tPGzw1aVrrRVe5ww_3z6jgkqq9m4N_C8n8T741wkaWTMoibbwMGNhT455A7bNEoouatROGGb0IoOhh5vhf5lRUfJaB7rYLXLs-DiNNQ/?imgmax=800" width="458" border="0" /></a> </p>
<p>This type of reporting is very good for quick and dirty analysis, like dashboarding.  It is also very easy to implement and gives developers/DBAs quick turnaround for reporting.  The alternative would be to open BIDS (or another reporting tool) and build a report, which takes a lot more time than executing a stored procedure.  If you need automation, you can schedule this procedure to execute via a SQL Server Agent job.  There are a lot of modifications that this stored procedure can undergo.  It is by no means perfect, but it does get the job done.  I am planning on enhancing a lot of the features provided here, but for the time being I am satisfied.  I personally believe that checks on the parameter values need to be put into place and a show/hide legend bit should be introduced.  Someone more versed in HTML might find it better to import a style sheet.  When I get a little more time, I will formally update this post with more complete code.  The idea here was to present a concept and show you the power of TSQL and Database Mail.</p>
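<p>As a rough sketch of the scheduling idea, the script below creates a SQL Server Agent job that runs the report every morning.  The job name, schedule name, start time, and recipient address are made up for illustration, and the script assumes the procedure lives in master as in the second example above.</p>
<pre class="csharpcode">-- Hypothetical job and schedule names; adjust the command to your own report
EXEC msdb.dbo.sp_add_job
     @job_name = N'Email Sales Counselor Report';

EXEC msdb.dbo.sp_add_jobstep
     @job_name = N'Email Sales Counselor Report',
     @step_name = N'Send HTML report',
     @subsystem = N'TSQL',
     @database_name = N'master',
     @command = N'EXECUTE [master].[dbo].[usp_Email_HTML_Rpt]
      @DB = ''tempdb''
     ,@Rec = ''you@example.com''
     ,@Object = ''vw_SalesBySalesCounselor''
     ,@Schema = ''dbo''
     ,@rpt_Header = ''Sales Volume By Sales Counselor''
     ,@rpt_Header_BGColor = ''#87AFC7''
     ,@TblHdr_BGColor = ''#87AFC7''
     ,@AltRowBGColor = ''#A0CFEC''
     ,@OrderBy = ''[Amt] DESC'';';

EXEC msdb.dbo.sp_add_schedule
     @schedule_name = N'Daily 7 AM',
     @freq_type = 4,              -- daily
     @freq_interval = 1,          -- every 1 day
     @active_start_time = 070000; -- HHMMSS

EXEC msdb.dbo.sp_attach_schedule
     @job_name = N'Email Sales Counselor Report',
     @schedule_name = N'Daily 7 AM';

-- Without @server_name the job targets the local server
EXEC msdb.dbo.sp_add_jobserver
     @job_name = N'Email Sales Counselor Report';</pre>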
<p>I hope that you find this stored procedure useful and I invite you to modify the code to work for your environment.  If you have ideas on how to optimize the code or make a cool add-on, please keep me informed, so I can update this post.</p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com45tag:blogger.com,1999:blog-4646137438366687895.post-36182349063511720872010-03-24T19:59:00.001-07:002010-03-25T13:19:30.863-07:00SQL Server Management Studio Tips And Tricks<p>This week I decided to take a step back from my performance tuning series and present material that DBAs and developers may not know.  I will be focusing on SQL Server Management Studio (SSMS) tips and tricks.  </p> <h5>Keyboard Shortcuts:</h5> <p>SQL Server Management Studio supports keyboard shortcuts that can help you save time and increase efficiency.  Keyboard shortcuts allow you to execute TSQL commands at the push of a button.  The best part is that a shortcut can pass the highlighted text as parameters.  I will start with some of the built-in keyboard shortcuts.</p> <p>To open the keyboard shortcuts menu, click <strong>Tools –> Options –> Environment –> Keyboard</strong>.  You will notice that some keyboard shortcuts are defined by default, such as sp_help (Alt+F1) and sp_who (Ctrl+1).  You can put just about any code you want in a keyboard shortcut.  For example, you can put the query “select * from sys.dm_exec_requests” directly into the text box.  The beauty of this is that you can open a new query window and just hit the shortcut keys to execute the query.  
This saves time and makes life a bit easier.  Below is a screenshot of my shortcut list, including the query I posted above.</p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhckDdNa298lSmqa-ElVMCIYB45FTH2JEW6tQ9dvQEWHesubDhhhVupCAS-B0TI4ZdLR_ljh91HK-NMjeaISYVuaWIvVBGHWDuZq0PvO16rElJgspSPSIs1kxWW2sj4AqVGwikARYnldw/s1600-h/image%5B7%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="259" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/S6rRiebKRbI/AAAAAAAAAZ4/rZHKYtcoK-s/image_thumb%5B3%5D.png?imgmax=800" width="431" border="0" /></a> </p> <p>Well, that is all good and nice, but what else can shortcuts do?  The best part about shortcuts, in my opinion, is that they can be used to execute stored procedures and supply highlighted values as parameters.</p> <p>Let’s run sp_help without ever typing the text sp_help.  Open a new query window and create a new table.</p> <pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.MyTable(ID <span class="kwrd">INT</span>);</pre>
<p>Now type MyTable in the query window and highlight the text.  Once the text is highlighted, hold <strong>Alt and press F1</strong>.  You should see the results of sp_help displayed for the highlighted object, “MyTable.”</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi51cSd93_Q45dJT_CGFL4dMsmqsgGZcRsPl90rU6mgaKjITAUP2voD_JOUDUunZ4ST7jDg-sRTQk7XcNtchv21VVi1eFB7y9s7Dim-Q4OcMjLPHfM1xT23fh_VOiYvXxa8w1Q_EF1FFA/s1600-h/image%5B11%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="294" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/S6rRjasKm6I/AAAAAAAAAaA/ZGR581-IJa0/image_thumb%5B5%5D.png?imgmax=800" width="608" border="0" /></a> <style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style></p>
<p><em>Note: If you need to specify an object in a different schema, you have to use the two-part name “schema.Object” and put it in single quotes.  For example, ‘dbo.MyTable’.</em></p>
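<p>For reference, highlighting a name and pressing Alt+F1 behaves roughly like running sp_help by hand with the highlighted text as its argument.  A minimal sketch, assuming the dbo.MyTable table created above:</p>
<pre class="csharpcode">-- Roughly what Alt+F1 does when MyTable is highlighted
EXEC sp_help MyTable;

-- A two-part name contains a period, so it has to be quoted
EXEC sp_help 'dbo.MyTable';</pre>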
<p>Now that is pretty cool, right?  What I will show you next is even cooler!  I will execute a user-defined stored procedure with parameters.  Open the keyboard shortcut menu and assign the stored procedure below to a shortcut key.  </p>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/S6rRj_xT2sI/AAAAAAAAAaE/umXBUCph1wU/s1600-h/image%5B15%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="271" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/S6rRkOBtlwI/AAAAAAAAAaI/M1V_B3SxmJs/image_thumb%5B7%5D.png?imgmax=800" width="452" border="0" /></a> </p>
<p>Open a new query window and create the below stored procedure.</p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">PROCEDURE</span> dbo.KeyBoardShortcut(@Id <span class="kwrd">INT</span>,@Col <span class="kwrd">CHAR</span>(1))
<span class="kwrd">AS</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">SELECT</span> @Id,@Col
<span class="kwrd">END</span>
GO</pre>
<p>Now type the text below and highlight it.  Once highlighted, hit <strong>Ctrl+F1</strong>. <em>(Note: If you assigned the procedure to a different shortcut, you will need to use that shortcut)</em></p>
<pre class="csharpcode">1,<span class="str">'a'</span></pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiomc23JBSAEs0oVuMjRa0V_52bRmjVQP22RK-b55WYKcvpGmj4OKUeBMbrI4bEPaR3wj_9iX58bnChhmGy7HkbVqY2_QxhSU9lvSwetVshtsGyY8zFGs01_KRLZCXjo_ltADIbj3x-Tw/s1600-h/image%5B19%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="98" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUdxZusxFj031ELZdk8WKKAwR_NT0Swl0kDc-w04iJSfsaOtfIIBAN7ExuIPdfF8W2-ZEkksenLVTn33ezaE-5-01rIBkhCW5W9rE9feRukbpyl2m73jx4nGt0-GOwv0al_3OZ4Io47g/?imgmax=800" width="331" border="0" /></a> </p>
<p>As you can see, the stored procedure was executed with our highlighted parameters!  This is an awesome feature that can save a lot of time, especially when you have custom code that you need to access frequently.  It increases efficiency and puts more information at your fingertips.  One of my favorite shortcuts is sp_helptext.  You can create a shortcut for sp_helptext, highlight any stored procedure, function, view, or trigger, and get the create script of that object.</p>
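<p>To give an idea of what such a shortcut returns, here is a small sketch that calls sp_helptext directly against the dbo.KeyBoardShortcut procedure created above.  The OBJECT_DEFINITION query is an alternative I am adding for comparison; it returns the same text as a single value instead of one row per line.</p>
<pre class="csharpcode">-- What an sp_helptext shortcut runs when dbo.KeyBoardShortcut is highlighted and quoted
EXEC sp_helptext 'dbo.KeyBoardShortcut';

-- Alternative: the whole definition as one value
SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.KeyBoardShortcut')) AS ObjectDefinition;</pre>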
<h5>Listing All Table Columns</h5>
<p>This is a request that I see very often.  It usually comes from developers who have a lot of columns in a table and have to select most of them.  As you can imagine, typing each column name can be a very tedious process.  There are two methods that I use to accomplish this task when I am feeling a little lazy… BUT the key is to work smarter, not harder, right :^) !</p>
<p>The first method is to right-click the table and click <strong>Script Table as –> SELECT To –> New Query Editor Window</strong>.  Voila, we now have a SELECT query that lists all the columns in the table.  You can perform the same steps to get a list of all columns in a view.</p>
<p>The next method is to expand the table in Object Explorer and drag the Columns folder into the query window.  This generates a comma-delimited list of the column names.</p>
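<p>If you prefer to stay in the query window, the same comma-delimited list can be built with a query.  This is a sketch of one possible approach rather than anything SSMS does internally; it assumes the dbo.MyTable table from earlier and uses the FOR XML PATH concatenation trick.</p>
<pre class="csharpcode">-- Build a comma-delimited, bracketed column list in column order
SELECT STUFF((
        SELECT ', ' + QUOTENAME(c.name)
        FROM sys.columns AS c
        WHERE c.object_id = OBJECT_ID(N'dbo.MyTable')
        ORDER BY c.column_id
        FOR XML PATH('')
       ), 1, 2, '') AS ColumnList;</pre>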
<h5>Scripting Multiple Objects</h5>
<p>I am sure many of you have tried to script multiple objects from Management Studio, but have had little luck.  Now, we could choose to <strong>right-click the database –> Tasks –> Generate Scripts</strong>, but where is the fun in that?  There is an easier way that does not make you jump through hoops or follow a wizard.  The trick is to click on the folder containing the objects you want to script and then look at the Object Explorer Details pane.  Inside the Object Explorer Details pane you can Ctrl-click or Shift-click multiple objects.  Once they are selected, you can <strong>right-click –> Script <Object Type> as –> CREATE To –> New Query Editor Window</strong>.</p>
<p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/S6rRlp3Vl7I/AAAAAAAAAaU/sOOWSRKP6Dw/s1600-h/image%5B23%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="287" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdTldvEQL5bTFfnlOey9Ymtbzo3R2tJtXQqaZYTGejdDVL3uPrq8yDG9jh743O3Rj7ZKiEQSqBA7JJErGcINzjgclB2cPhFjgNf91vFkkRhaKocfJ1jMx25b9pm7kciosFZaiqGj0nkw/?imgmax=800" width="505" border="0" /></a> </p>
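<p>For code objects, a query against the catalog views can serve a similar purpose.  The sketch below is an alternative I am adding for illustration, not what the Script As menu runs; it covers only procedures, views, functions, and triggers (not tables), and long definitions may be truncated in the results grid.</p>
<pre class="csharpcode">-- Return the CREATE text of user-defined code objects in the current database
SELECT  OBJECT_SCHEMA_NAME(m.object_id) AS SchemaName,
        OBJECT_NAME(m.object_id)        AS ObjectName,
        o.type_desc                     AS ObjectType,
        m.definition                    AS CreateScript
FROM sys.sql_modules AS m
INNER JOIN sys.objects AS o
        ON o.object_id = m.object_id
WHERE o.is_ms_shipped = 0
  AND o.type IN ('P', 'V', 'FN', 'IF', 'TF', 'TR')
ORDER BY SchemaName, ObjectName;</pre>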
<h5>Creating a TSQL Toolkit</h5>
<p>One of the most underrated or unknown features of SQL Server Management Studio is Template Explorer.  You may be thinking that Template Explorer is used specifically for TSQL templates, but I will show you a couple of ways to make Template Explorer function as a TSQL toolkit.  Press <strong>CTRL+ALT+T</strong> or go to <strong>View –> Template Explorer</strong> to open the window.</p>
<p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/S6rRmkkNuII/AAAAAAAAAac/UjCFTTmEgi4/s1600-h/image%5B27%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="386" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/S6rRnDNymCI/AAAAAAAAAag/rRj7wqWqsjk/image_thumb%5B13%5D.png?imgmax=800" width="155" border="0" /></a> </p>
<p>Now this does not look that impressive from the get-go, but trust me, it can and does get better.  <strong>Right-click SQL Server Templates and choose New –> Folder</strong>.  Give the folder a name; I will be using the name Test_Scripts.  Next, create a new template in the Test_Scripts folder.  I named the new template sys.dm_exec_requests.  Your Template Explorer should now look like this.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7WJ5aVb0PXGTX5K83ikzklVTFpTL_bzSroZMvRZ_4sVmNsePKwCByQAH7BacXisf4CDjKG4AcKf2sRgcggQztTLtV5rxGQR1qxL46hdOKheTUJR3cH_8Dpf5Ha41rSYJkS6Tt92XrDg/s1600-h/image%5B30%5D.png"><img title="image" style="border-top-width: 0px; display: inline; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="59" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJCxxTw6lCunjBhpl5RcNfm6AfLA2dP_VeZ8QsPVoaD_N5q0JqLe8bgc2JcsAfQHi91Ee_SN-VoqTwq0lFpGvXJUpS8vtxfyIgBfCSMdrCSETaWffGetjdcDvVCwKGPuaIm3dp7DzXyA/?imgmax=800" width="218" border="0" /></a> </p>
<p>Okay, well that is great… How does this help me?  Well, let’s add some TSQL code to that new template.  Once we add code to the template, we can drag the template into any query window and SSMS will automatically paste the code into the open query window.  Basically, we can use Template Explorer as our personal toolkit.  We can create any folder structure we want, and SSMS does a great job of keeping all our scripts organized.  Let’s see this in action.</p>
<p>Right-click the <strong>sys.dm_exec_requests</strong> template and choose edit.  Paste the code below and click save.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> sys.dm_exec_requests</pre>
<p>Open a new query window and drag the <strong>sys.dm_exec_requests</strong> template into the query window.  Voila!  The TSQL is automatically scripted into our new query window.  As you can see, we can use Template Explorer to save our scripts and make them easily accessible.  This alleviates the need to open Windows Explorer or browse the file system for our scripts.  Plus, all the scripts are saved and managed in one place.  If you need to copy your scripts out of Template Explorer, they already exist on the file system, in a directory under your user profile.  On my SQL Server 2008 instance, my templates are stored here: C:\Documents and Settings\ahaines\Application Data\Microsoft\Microsoft SQL Server\100\Tools\Shell\Templates\Sql.  I don’t know about you, but I for one love this feature.  It gives me the ability to quickly get data without having to leave SSMS, and I do not have to waste time searching the file system.  Give it a try and see what you think.</p>
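<p>As an illustration of the kind of script that might live in a toolkit template, here is a slightly richer variation of the sys.dm_exec_requests query.  The column selection is just my assumption of what is useful at a glance; adjust it to taste.</p>
<pre class="csharpcode">-- Currently executing requests, with their statement text, excluding this session
SELECT  r.session_id,
        r.status,
        r.command,
        r.wait_type,
        r.total_elapsed_time,
        t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID
ORDER BY r.total_elapsed_time DESC;</pre>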
<p>That is all the tips I have for now.  I hope that you have learned something new and that some of these tips can help you.  Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com74tag:blogger.com,1999:blog-4646137438366687895.post-66435833427730771912010-03-17T21:12:00.001-07:002010-06-03T20:29:37.794-07:00Performance Tuning 101 – What You Will Not Learn In The Class Room (Part 2)<p>In my last post, <a title="http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html" href="http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html">http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html</a>, I talked about performance tuning queries that appear to be well tuned. There are a lot of optimization techniques, unknown to most developers, that do not require indexes or radical code changes. These are the optimizations that I will be talking about in this post. There is no way I could go over every possible optimization technique, but I will present as much content as I can here today and will cover other techniques in future posts.</p><p>I will start things off by talking about a challenge that Ramesh Meyyappan presented in his webcast, <a href="http://www.sqlworkshops.com/">http://www.sqlworkshops.com/</a>. Ramesh’s challenge was to solve the TOP 101 phenomenon using SQL Server 2005. To get started, I will create a sample table with data.</p><pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2000)
);
<span class="kwrd">GO</span>
;<span class="kwrd">WITH</span>
L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span>
,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)
,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)
,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)
,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)
,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)
,Number <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5)
INSERT <span class="kwrd">INTO</span> dbo.TestData (RowNum, SomeId, SomeCode)
<span class="kwrd">SELECT</span>
N <span class="kwrd">AS</span> RowNum,
ABS(CHECKSUM(NEWID()))%1000000+1 <span class="kwrd">AS</span> SomeId ,
REPLICATE(<span class="str">'a'</span>,2000) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span> Number
<span class="kwrd">WHERE</span> [N] <= 50000
<span class="kwrd">GO</span>
<span class="kwrd">UPDATE</span> <span class="kwrd">STATISTICS</span> dbo.[TestData] <span class="kwrd">WITH</span> FULLSCAN;
GO</pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Next, I will create a query that uses TOP and an ORDER BY clause to return 100 rows. </p>
<pre class="csharpcode">--Fast
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 100 [RowNum],[SomeId],[SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [SomeId]
<span class="kwrd">OPTION</span> (MAXDOP 1)
/*
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 1 ms.
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 78 ms, elapsed <span class="kwrd">time</span> = 102 ms.
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
*/</pre>
<p>Now watch what happens when I change the TOP operator to 101. You will notice that I did not change anything else in the query other than increasing the number of rows returned by 1.</p>
<pre class="csharpcode">--Slow
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 101 [RowNum],[SomeId],[SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [SomeId]
<span class="kwrd">OPTION</span>(MAXDOP 1)
/*
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 312 ms, elapsed <span class="kwrd">time</span> = 1690 ms.
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
*/</pre>
<p>Wow…. By elapsed time, TOP 101 is over 16 times slower than TOP 100 (1690 ms versus 102 ms), and all I changed is the number of rows in the TOP operator! So why does SQL Server take so much longer to execute a query using TOP 101, as opposed to TOP 100? The short answer is the memory requirements. The TOP 101 query requires a lot more query memory than TOP 100, which translates into tempdb sorting. As you may recall, I addressed some techniques to solve the tempdb sorting problem in my last post, <a title="http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html" href="http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html">http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html</a>. If you are using SQL 2008, you can use the same optimization techniques presented in my prior post, but SQL 2005 is a completely different animal. To make the TOP 101 query faster, we need to first understand why it is slower. Let’s take a look at what is different when we run the TOP 100 and the TOP 101 query.</p>
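<p>The timings in the comments above look like SET STATISTICS TIME output.  Assuming that is how they were captured, the sketch below shows one way to reproduce the comparison against the dbo.TestData table; your numbers will differ depending on hardware and available memory.</p>
<pre class="csharpcode">USE [tempdb];
GO
-- Report parse/compile and execution times on the Messages tab
SET STATISTICS TIME ON;
GO
SELECT TOP 100 [RowNum],[SomeId],[SomeCode]
FROM dbo.[TestData]
WHERE [RowNum] < 30000
ORDER BY [SomeId]
OPTION (MAXDOP 1);
GO
SELECT TOP 101 [RowNum],[SomeId],[SomeCode]
FROM dbo.[TestData]
WHERE [RowNum] < 30000
ORDER BY [SomeId]
OPTION (MAXDOP 1);
GO
SET STATISTICS TIME OFF;
GO</pre>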
<p>Let’s start by looking at the memory SQL Server grants to each query. Open two different query windows and execute each TOP query within a WHILE loop. We can then use sys.dm_exec_query_memory_grants to get the memory each query requires. </p>
<p>Here is a sample of how to run the TOP query in a while loop.</p>
<pre class="csharpcode"><span class="kwrd">WHILE</span> 1=1
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 100 [RowNum],[SomeId],[SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [SomeId]
<span class="kwrd">OPTION</span>(MAXDOP 1)</pre>
<p>In a new query window, run the following query to get the memory specifications. Replace 58 with the session id of the window that is running the loop.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> [granted_memory_kb],[required_memory_kb],[max_used_memory_kb] <span class="kwrd">FROM</span> sys.dm_exec_query_memory_grants <span class="kwrd">WHERE</span> [session_id] = 58</pre>
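<p>The hard-coded session id is easy to get wrong, so here is a minimal sketch of an alternative that lists every active memory grant along with its statement text. It assumes you have VIEW SERVER STATE permission, and the aliases are my own. If you still want to filter on one session, run SELECT @@SPID in the window that runs the loop before you start it.</p>
<pre class="csharpcode">-- List all current query memory grants with the text of the statement.
SELECT mg.[session_id],
       mg.[granted_memory_kb],
       mg.[required_memory_kb],
       mg.[max_used_memory_kb],
       st.[text] AS query_text
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.[sql_handle]) AS st
WHERE mg.[session_id] &lt;&gt; @@SPID -- leave out this monitoring query itself
ORDER BY mg.[granted_memory_kb] DESC;</pre>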
<p>Here are my results:</p>
<pre class="csharpcode">     granted_memory_kb    required_memory_kb   max_used_memory_kb
---- -------------------- -------------------- --------------------
FAST 1024                 216                  216
SLOW 6040                 512                  6040</pre>
<p>The results are simply astonishing. The maximum memory used increases nearly 28 times (from 216 KB to 6040 KB) when I use TOP 101 instead of TOP 100. I do not have a formal explanation of why TOP 101 consumes more memory than TOP 100. Brad Schulz, <a title="http://bradsruminations.blogspot.com/" href="http://bradsruminations.blogspot.com/">http://bradsruminations.blogspot.com/</a>, has contacted Conor Cunningham about this issue and believes that 101 is an arbitrary threshold. Brad is working on an in-depth post involving the TOP operator. Keep an eye out for this one, as it should be really good. Anyway, once the 101 threshold is reached, the optimizer uses different calculations to optimize a query, which can effectively bloat the memory requirements for the query. This memory bloat forces the sort operation to spill into tempdb. This is where the TOP 101 bottleneck exists. To verify this problem, open Profiler and select the Sort Warnings event; you will see that the slow query raises a sort warning, while the fast query does not.</p>
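<p>For reference, the ratios behind these numbers can be checked with a little arithmetic. This is just a sketch using the values from my results above; the column aliases are made up for illustration.</p>
<pre class="csharpcode">-- Ratios of the SLOW (TOP 101) figures to the FAST (TOP 100) figures.
SELECT 6040.0 / 1024 AS granted_ratio,   -- a little under 6
       512.0  / 216  AS required_ratio,  -- roughly 2.4
       6040.0 / 216  AS max_used_ratio;  -- just under 28</pre>
<p>The jump of nearly 28 times is in max_used_memory_kb; the granted memory grows by a factor of about 6.</p>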
<p>Now that I have identified the problem, how do I solve it? I will start by attempting the methods that I used in the previous article, <a title="http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html" href="http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html">http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html</a>.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 101 [RowNum],[SomeId],<span class="kwrd">CAST</span>([SomeCode] <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(4200))
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [SomeId]
<span class="kwrd">OPTION</span>(MAXDOP 1)
/*
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 344 ms, elapsed <span class="kwrd">time</span> = 3385 ms.
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
*/</pre>
<p>Bloating the estimated row size still did not help our situation. Next I will try shrinking the row size.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 101 [RowNum],[SomeId],RTRIM(<span class="kwrd">CAST</span>([SomeCode] <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(2000)))
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [SomeId]
<span class="kwrd">OPTION</span>(MAXDOP 1)
/*
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 344 ms, elapsed <span class="kwrd">time</span> = 2461 ms.
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
*/</pre>
<p>Hmm. Still no luck. How can I reduce the row size of the input passed into the sort operator? When you really sit back and think about the problem, the answer is simple. To reduce the row size, all you have to do is reduce the columns involved in the sort. I like to use the TOP inside a derived table, making sure to only use the RowNum and SomeId columns. We can then join back onto the TestData table. This gives us a fast sort and an ultra-fast index seek on the 101 rows we are returning. </p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> t.[RowNum],t.[SomeId], t.[SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData] t
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span>(
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 101 [RowNum],[SomeId]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [SomeId]
) <span class="kwrd">AS</span> t2
<span class="kwrd">ON</span> t.[RowNum] = t2.[RowNum]
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t.[SomeId]
<span class="kwrd">OPTION</span> (MAXDOP 1)
/*
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 31 ms, elapsed <span class="kwrd">time</span> = 104 ms.
<span class="kwrd">SQL</span> Server parse <span class="kwrd">and</span> compile <span class="kwrd">time</span>:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
*/</pre>
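<p>If you prefer common table expressions to derived tables, the same idea can be written as a CTE. This is a sketch of the same technique rather than a different optimization. The TopKeys name is my own, and while I would expect the same plan as the derived table version, I have not timed it separately.</p>
<pre class="csharpcode">-- Sort only the narrow columns, then join back for the wide column.
;WITH TopKeys AS
(
    SELECT TOP 101 [RowNum], [SomeId]
    FROM dbo.[TestData]
    WHERE [RowNum] &lt; 30000
    ORDER BY [SomeId]
)
SELECT t.[RowNum], t.[SomeId], t.[SomeCode]
FROM TopKeys AS k
INNER JOIN dbo.[TestData] AS t
    ON t.[RowNum] = k.[RowNum]
ORDER BY k.[SomeId]
OPTION (MAXDOP 1);</pre>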
<p>Alternatively, we can use correlated subqueries or the cross apply operator.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 101
t.[RowNum],
(<span class="kwrd">SELECT</span> t2.[SomeId] <span class="kwrd">FROM</span> dbo.[TestData] t2 <span class="kwrd">WHERE</span> t2.[RowNum] = t.[RowNum]) <span class="kwrd">AS</span> SomeId,
(<span class="kwrd">SELECT</span> t2.[SomeCode] <span class="kwrd">FROM</span> dbo.[TestData] t2 <span class="kwrd">WHERE</span> t2.[RowNum] = t.[RowNum]) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span> dbo.[TestData] t
<span class="kwrd">WHERE</span> t.[RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t.[SomeId]
<span class="kwrd">OPTION</span> (MAXDOP 1)
<span class="rem">-- Same approach using CROSS APPLY</span>
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 101
t.[RowNum],
t2.SomeId,
t2.SomeCode
<span class="kwrd">FROM</span> dbo.[TestData] t
<span class="kwrd">CROSS</span> APPLY(<span class="kwrd">SELECT</span> t2.SomeId, t2.SomeCode <span class="kwrd">FROM</span> dbo.[TestData] t2 <span class="kwrd">WHERE</span> t2.[RowNum] = t.[RowNum]) <span class="kwrd">AS</span> t2
<span class="kwrd">WHERE</span> t.[RowNum] < 30000
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t.[SomeId]
<span class="kwrd">OPTION</span> (MAXDOP 1)</pre>
<p>
It should be noted that the correlated subquery method will produce more IO because it uses two subqueries. As you can see, the solution to this challenge is quite simple, but it requires an understanding of what is occurring under the hood of SQL Server. </p>
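<p>To compare the IO of each variant yourself, turn on STATISTICS IO before running them. Here is a minimal sketch that uses the CROSS APPLY variant as the example; the inner alias x is my own.</p>
<pre class="csharpcode">SET STATISTICS IO ON;
GO
SELECT TOP 101 t.[RowNum], t2.[SomeId], t2.[SomeCode]
FROM dbo.[TestData] t
CROSS APPLY (SELECT x.[SomeId], x.[SomeCode]
             FROM dbo.[TestData] x
             WHERE x.[RowNum] = t.[RowNum]) AS t2
WHERE t.[RowNum] &lt; 30000
ORDER BY t.[SomeId]
OPTION (MAXDOP 1);
GO
SET STATISTICS IO OFF;
GO</pre>
<p>Run each variant the same way and compare the logical reads reported in the Messages tab.</p>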
<p><em>Note: It is still possible that some of the sorting will be sent to tempdb, but you should see an elapsed time that rivals TOP 100.</em></p>
<p>The next optimization technique I will be demonstrating addresses a predicate pushing problem. Unbeknownst to many developers, SQL Server 2005 has a problem with predicate pushing in views. A lot of these issues have been resolved in SQL Server 2008, but they are still worth knowing about. I will be demonstrating a very simple example, using a ranking function. Ranking functions are relatively new to SQL Server and were introduced in SQL Server 2005. I am sure there are other scenarios that cause predicate pushing problems, but I will only be addressing the ranking problem in this post.</p>
<p>Let’s start by creating a small sample table.</p>
<pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> Test(
ID <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
FName <span class="kwrd">VARCHAR</span>(50),
LName <span class="kwrd">VARCHAR</span>(50)
);
INSERT <span class="kwrd">INTO</span> dbo.Test <span class="kwrd">VALUES</span> (<span class="str">'Adam'</span>,<span class="str">'Haines'</span>);
INSERT <span class="kwrd">INTO</span> dbo.Test <span class="kwrd">VALUES</span> (<span class="str">'John'</span>,<span class="str">'Smith'</span>);
INSERT <span class="kwrd">INTO</span> dbo.Test <span class="kwrd">VALUES</span> (<span class="str">'Jane'</span>,<span class="str">'Doe'</span>);
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_LName <span class="kwrd">ON</span> dbo.Test(LName) <span class="kwrd">INCLUDE</span>(FName);
GO</pre>
<p>As you can see, the table is relatively simple. The idea is to present an easy to understand example that demonstrates potential performance problems with views.</p>
<p>Here is my simple query that shows an index seek on LName.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> Id,FName,LName,ROW_NUMBER() <span class="kwrd">OVER</span>(PARTITION <span class="kwrd">BY</span> LName <span class="kwrd">ORDER</span> <span class="kwrd">BY</span> Id) <span class="kwrd">AS</span> seq
<span class="kwrd">FROM</span> dbo.Test
<span class="kwrd">WHERE</span> LName = <span class="str">'Smith'</span>
GO</pre>
<p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/S6GoIUzrVMI/AAAAAAAAAZc/wejeWU8BuTU/s1600-h/image%5B3%5D.png"><img style="BORDER-BOTTOM: 0px; BORDER-LEFT: 0px; DISPLAY: inline; BORDER-TOP: 0px; BORDER-RIGHT: 0px" title="image" border="0" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/S6GoJEU6F8I/AAAAAAAAAZg/sJLbt4WiWRs/image_thumb%5B1%5D.png?imgmax=800" width="606" height="116" /></a></p>
<p>Let’s see what happens when I put the logic into a view, with no predicate. The predicate will be applied from outside the view and should be pushed down into the view, as a view is expanded into its underlying query when the statement is compiled.</p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">VIEW</span> dbo.vw_Test
<span class="kwrd">AS</span>
<span class="kwrd">SELECT</span> Id,FName,LName,ROW_NUMBER() <span class="kwrd">OVER</span>(PARTITION <span class="kwrd">BY</span> LName <span class="kwrd">ORDER</span> <span class="kwrd">BY</span> Id) <span class="kwrd">AS</span> seq
<span class="kwrd">FROM</span> dbo.Test
GO</pre>
<p>I will now query the view using the same predicate as the original query.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> Id,FName,LName,seq
<span class="kwrd">FROM</span> dbo.vw_Test
<span class="kwrd">WHERE</span> LName = <span class="str">'Smith'</span>
GO</pre>
<p><a href="http://lh4.ggpht.com/_ayZBUzPGG9A/S6GoJa6BkaI/AAAAAAAAAZk/ctVSsJJVFdg/s1600-h/image%5B7%5D.png"><img style="BORDER-BOTTOM: 0px; BORDER-LEFT: 0px; DISPLAY: inline; BORDER-TOP: 0px; BORDER-RIGHT: 0px" title="image" border="0" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/S6GoKGtndaI/AAAAAAAAAZo/VzqbA9h9n48/image_thumb%5B3%5D.png?imgmax=800" width="566" height="133" /></a></p>
<p>The problem here is that the optimizer decided to filter the results of the query <strong>AFTER</strong> the table “Test” has been scanned. One would expect the optimizer to seek on the LName column because it should push the predicate; however, SQL Server 2005 does not do a great job of this. SQL Server 2008 will appropriately push the predicate deep into the plan to get the index seek. How do we solve this problem? Unfortunately, there is not a whole lot you can do to make the view’s plan work more efficiently. The best option, in my opinion, is to use an inline TVF to parameterize the query. </p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">FUNCTION</span> dbo.fn_Test(@LName <span class="kwrd">VARCHAR</span>(50))
<span class="kwrd">RETURNS</span> <span class="kwrd">TABLE</span>
<span class="kwrd">RETURN</span>(
<span class="kwrd">SELECT</span> Id,FName,LName,ROW_NUMBER() <span class="kwrd">OVER</span>(PARTITION <span class="kwrd">BY</span> LName <span class="kwrd">ORDER</span> <span class="kwrd">BY</span> Id) <span class="kwrd">AS</span> seq
<span class="kwrd">FROM</span> dbo.Test
<span class="kwrd">WHERE</span> LName = @LName
)
GO</pre>
<p>Now execute a select against the TVF using the same predicate.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> Id,FName,LName,seq
<span class="kwrd">FROM</span> dbo.fn_Test(<span class="str">'Smith'</span>)</pre>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/S6GoKheNGOI/AAAAAAAAAZs/aQEhHjn_9U0/s1600-h/image%5B11%5D.png"><img style="BORDER-BOTTOM: 0px; BORDER-LEFT: 0px; DISPLAY: inline; BORDER-TOP: 0px; BORDER-RIGHT: 0px" title="image" border="0" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/S6GoMCRJW-I/AAAAAAAAAZw/M0DmwmM5Jt0/image_thumb%5B5%5D.png?imgmax=800" width="551" height="139" /></a></p>
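<p>If you would rather compare the three plans as text than as screenshots, SHOWPLAN_TEXT returns the estimated plan for each statement without executing it. The following is a sketch that assumes the Test table, vw_Test view and fn_Test function created above. On SQL Server 2005 I would expect the view to show a scan followed by a filter, while the other two show a seek.</p>
<pre class="csharpcode">SET SHOWPLAN_TEXT ON; -- must be the only statement in its batch
GO
-- Direct query
SELECT Id,FName,LName,ROW_NUMBER() OVER(PARTITION BY LName ORDER BY Id) AS seq
FROM dbo.Test
WHERE LName = 'Smith';
GO
-- View with the predicate applied from outside
SELECT Id,FName,LName,seq
FROM dbo.vw_Test
WHERE LName = 'Smith';
GO
-- Inline TVF with the predicate passed as a parameter
SELECT Id,FName,LName,seq
FROM dbo.fn_Test('Smith');
GO
SET SHOWPLAN_TEXT OFF;
GO</pre>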
<p>There you have it. I have demonstrated a few optimization techniques that I have used to solve performance problems. I have only scratched the surface here. There are many more optimization techniques available. Stay tuned for future posts, where I will explore even more optimization techniques including a shocking example demonstrating how an index rebuild can introduce fragmentation and how to avoid it.</p>
<p>Until next time, happy coding.</p>Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com16tag:blogger.com,1999:blog-4646137438366687895.post-7566881161710837522010-03-07T22:46:00.001-08:002010-03-07T22:58:10.977-08:00Performance Tuning 101 – What You Will Not Learn In The Classroom<p>I was reading an article last week and saw a snippet of code that claimed to make fully optimized code even more efficient.  The article itself was not written by the original content creator.  With a little investigative work, I came across the website and author who first developed the presented performance tuning techniques.  The website that hosts the content is <a title="http://www.sqlworkshops.com/" href="http://www.sqlworkshops.com/">http://www.sqlworkshops.com/</a> and the original author is Ramesh Meyyappan, <a href="mailto:rmeyyappan@sqlworkshops.com">rmeyyappan@sqlworkshops.com</a>.  Ramesh is a SQL Server consultant and professional trainer who has previously worked with Microsoft and has real-world knowledge of performance tuning techniques.  On the SQLWorkshops website, Ramesh offers 3 free webcasts that demonstrate how to performance tune queries.  He focuses on query memory, parallelism, MAXDOP (max degree of parallelism), and wait stats.  I recommend that all database professionals view these webcasts.  
There is a lot of great content presented in these webcasts that you cannot find in a classroom.  In this post, I will take a few of the methods provided and explain them.  I will also provide additional methods that were not covered in the webcasts.  On a side note, if you have not figured out the challenge Ramesh presented, I will show you a few of the solutions I came up with in my next installment.  </p> <p>The first optimization technique I will explore deals with data type mismanagement.  The two data types I will be evaluating in this post are VARCHAR and CHAR.  There are a lot of do’s and don’ts regarding these two data types on the Internet.  I will not engage the pros and cons in this post.  The main point is that a performance penalty can occur if the wrong data type is chosen.  You should always choose a data type that most closely resembles the size and structure of your data.  Experienced database professionals know to do this; however, inexperienced database professionals often fail to see the long-term impact of poor design choices.  Unfortunately, it is quite common for database professionals to use a catch-all data type like VARCHAR(8000) or CHAR(2000) without fully understanding the data itself.  In most cases, choosing the wrong data type results in wasted storage and slower query performance, as I will show in this post.  I will get this party started by creating my sample table and data. </p> <pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2000)
);
<span class="kwrd">GO</span>
;<span class="kwrd">WITH</span>
L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span>
,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)
,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)
,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)
,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)
,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)
,Number <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5)
INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span>
N <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID()))%1000000+1 <span class="kwrd">AS</span> SomeId ,
REPLICATE(<span class="str">'a'</span>,ABS(CHECKSUM(NEWID())) % 2000 + 1) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span> Number
<span class="kwrd">WHERE</span> [N] <= 50000
<span class="kwrd">GO</span>
<span class="kwrd">UPDATE</span> <span class="kwrd">STATISTICS</span> dbo.[TestData] <span class="kwrd">WITH</span> FULLSCAN;
GO</pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>The first thing I want to illustrate is the average size of our SomeCode column.  We are using a CHAR(2000) data type; however, we are storing much less data than this on average.  </p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> <span class="kwrd">AVG</span>(LEN(SomeCode)) <span class="kwrd">AS</span> AvgLen <span class="kwrd">FROM</span> dbo.TestData <span class="kwrd">WHERE</span> [RowNum] < 35000
/*
AvgLen
---------<span class="rem">--</span>
997
*/</pre>
<p>As you can see from the query above, the average SomeCode length is 997 characters; however, we are storing 2000 bytes per row.  This means we are consuming roughly twice the storage we need for this column, so about half of it is wasted padding.  Using a CHAR data type is typically faster than using a VARCHAR data type; however, the cost of storing CHAR data usually offsets the performance gain.  If I were to choose a VARCHAR(2000) data type, I would consume less storage, which translates into faster table scans, smaller tables, faster backups, faster restores, etc.  Performance will increase because less storage directly correlates to fewer pages.  The bottom line is less is faster.</p>
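<p>A quick way to see the padding is to compare LEN, which ignores trailing spaces, with DATALENGTH, which reports the bytes actually stored. This sketch assumes the default ANSI_PADDING ON setting was in effect when the table was created, and that you run it in tempdb where the table lives; the aliases are my own.</p>
<pre class="csharpcode">SELECT AVG(LEN(SomeCode))        AS AvgUsedChars,   -- data without the trailing spaces
       AVG(DATALENGTH(SomeCode)) AS AvgStoredBytes  -- padded width of the CHAR(2000) column
FROM dbo.TestData
WHERE [RowNum] &lt; 35000;
GO
-- Overall space used by the table
EXEC sp_spaceused 'dbo.TestData';
GO</pre>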
<p>The query I will be demonstrating returns a range of rows, where the sort order is different from the clustered primary key order.  The IO stats will be the same across all variants of this query, so IO will not illustrate a proper delta.  I will be using elapsed time to illustrate the delta between queries with varying predicates.  </p>
<p>Let’s turn on STATISTICS TIME and NOCOUNT.</p>
<pre class="csharpcode"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> <span class="kwrd">TIME</span> <span class="kwrd">ON</span>;
GO</pre>
<p>Now run the following query.  Make sure query results are sent to text (Results to Text).  Note: I assign the columns to local variables to avoid the results being printed.  Run the query a couple of times to warm the cache and get a baseline CPU and elapsed time.</p>
<pre class="csharpcode"><span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">CHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = [SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3000 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId</pre>
<p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/S5SdN6byd_I/AAAAAAAAAYI/4oGfpTHkP0g/s1600-h/image%5B3%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="149" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-HsxoRPfXOoLi78mifDhiIM6L40aI9zxXgCSRjUPukTfvlxapl663Y2hTJnVkHRZSGLbkOR-EGJL_t6ZywrKpN6P1E_6T7Jh0v5qlKdo561_IELI0HCY5MjsNUrqmvGMRruSU_rtxRg/?imgmax=800" width="328" border="0" /></a> </p>
<p>The query is very fast and performs within our SLA of 2 seconds.  Let’s see what happens when I increase the number of rows in the predicate to 3500.</p>
<pre class="csharpcode"><span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">CHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = [SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3500 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId</pre>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/S5SdO7hNL8I/AAAAAAAAAYQ/BOeQGqCr-48/s1600-h/image%5B8%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="137" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHmkzcjsALIkTAqBtdOa_gENQ0_T9gme-gRCOVDsy1-WotXoTBfxQHOr4Amwyj_BXwn9Jb7Ooy8kYQpbne-CSSaGlSkx6f96huaRsIVKCgtQSS25QQDUrho6k7Sri2gHkn94jku36Uhw/?imgmax=800" width="322" border="0" /></a> </p>
<p>Wow… by returning 500 more rows, the query is over 14 times slower!  You may be thinking that the speed difference is negligible, but on production systems the elapsed time delta can be much larger, at which point performance is a real problem.  Let’s check the execution plan for each query side by side.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxn8fUUa3P1uV5Z7M9zmV8PjOWZlIFwC0bjcnP44WmR9f17wyQVeWAtfN8tKy0kaij1L6blK8cEv1NnzCxtSj7CGRpNielL8ENnZ3bEWRoPq0QChknLwmoCV6rf2zzYGdKar8ew9cytA/s1600-h/image%5B13%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="262" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJF_tLFxtYXoQHqt7Jp73eTQsJnhmS6eHbyvaF271pko40uoCGo_glQ1PDg441Ysl6aiZYizCNajHEFWWMJB1Ic_WTnWtNL1RANP_YNZlGV4ab3WccRnCQwBRkMUHyIgDjXT6sFYX0gQ/?imgmax=800" width="704" border="0" /></a> </p>
<p>Well, that is not much help… both of our query plans look good.  There is really not much you can do to the plan itself to make the query perform better.  The index is doing a great job of returning the data, but that pesky sort is degrading performance.  The first thing I need to investigate is the detail of the query plan.  The first query plan attribute I will look at is the estimated row size of the clustered index seek.  The estimated row size is derived from statistics and some internal calculations.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijgDws55KdSNLy-7ZhD-z0eUNspXVIK2U030IhkvLH5WL2LOoc_My5uJ-Tbnz-T0mBBZ8Mi2k-JVD1Y-m81yTehPCDRQekMcjUlIbHWUAXAEG3G0FH5kDOcgLfRGzLXv2aTZcz8UCcRQ/s1600-h/image%5B21%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="305" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/S5SdRG6egbI/AAAAAAAAAYk/LhR2SGGyinY/image_thumb%5B11%5D.png?imgmax=800" width="346" border="0" /></a> </p>
<p>The row size is estimated to be 2015 bytes for each row in the clustered index seek.  The estimate is derived from the declared data types.  We are selecting a CHAR(2000) column and two integer columns, which makes the data size 2000 (1 byte per char) + 4 (int) + 4 (int) = 2008 bytes.  As you can see, 2008 is less than 2015.  The additional 7 bytes are attributed to row overhead and the NULL bitmap.  The sort operator also has an estimated row size of 2015 bytes.  All of these estimates play a part in dictating how much memory will be granted to the query.  Next I am going to put the query into a while loop, so I can see the memory grants.  Open two new query windows and paste the code below into each.  Make sure to change the RowNum predicate to 3000 in one of the query windows.  Take note of the session id of each window.</p>
<pre class="csharpcode"><span class="kwrd">WHILE</span> 1 = 1
<span class="kwrd">BEGIN</span>
<span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">CHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = [SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3500 <span class="rem">--change 3500 to 3000 in one of the two windows</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId
END</pre>
<p>In a third query window, paste the DMV query below.  This query returns the memory grants for each of the sessions.  The key columns are max_used_memory_kb and granted_memory_kb.  If the maximum used memory and the granted memory are the same, it is likely that the query is consuming all of the memory granted to it and SQL Server has to rely on tempdb to help with some operations, such as the sort.</p>
<pre class="csharpcode"><span class="rem">--Replace 56 and 60 with the session ids of your two query windows</span>
<span class="kwrd">SELECT</span> [granted_memory_kb],[required_memory_kb],[max_used_memory_kb] <span class="kwrd">FROM</span> sys.dm_exec_query_memory_grants <span class="kwrd">WHERE</span> [session_id] = 56
<span class="kwrd">SELECT</span> [granted_memory_kb],[required_memory_kb],[max_used_memory_kb] <span class="kwrd">FROM</span> sys.dm_exec_query_memory_grants <span class="kwrd">WHERE</span> [session_id] = 60</pre>
<p>All the execution plan details look fine, so I will dig a bit deeper into tempdb to unearth the real problem.  Let’s have a look at the tempdb IO stats to see if either query is using tempdb to perform any operations.</p>
<p>I will start by executing the “fast” query.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> sys.[dm_io_virtual_file_stats](2,1) <span class="rem">--database_id 2 = tempdb, file_id 1 = its primary data file</span>
--FAST
<span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">CHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = [SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3000 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId
<span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> sys.[dm_io_virtual_file_stats](2,1)
GO</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2JIlbiXrwLZ1lS4XOqYaGx1jmEs2OvIj-linNW-9SG8R4guZ7315ollrg1fM6hRmqM3SXETSt4Y0lVZ-C7O2ijEelv1wLJJ-kBXtfBR3DcXXwgRId3SDuit_571DH0SBZiqABSLA2Sw/s1600-h/image%5B27%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="211" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDxV4Xfo27tsnPmMDttfITTQ_BW5ftVbMMlD-EhND6lJRbdYbvYRe_qotmj9LTSxrR8FHjXLMXfT67WWF3I-wschC7Mn8a6o6m_hUdFrjg0lsmbULX1puNHUfi1Js-FSvP1hGDvDrDzA/?imgmax=800" width="737" border="0" /></a> </p>
<p>The important columns to look at are num_of_reads, num_of_bytes_read, num_of_writes, and num_of_bytes_written.  The number of reads and writes before and after the query are exactly the same.  This means the query did not have to use tempdb.  Next, I will execute the slow query, which uses a predicate of less than 3500.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> sys.[dm_io_virtual_file_stats](2,1)
--SLOW
<span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">CHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = [SomeCode]
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3500 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId
<span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> sys.[dm_io_virtual_file_stats](2,1)
GO</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbJvymwrNe-xDLAbVxSUE_iSy1MfScv5xtFlMIse7FzItTzJLy7eY-Ujp3I_1Y2oIHJ5iRULxPRjAcmUkNQNb5qJSkEIdySjCvU2H_mcnqUlljMxoxkaBn8Dl7a_q_INTygv_ianFHQg/s1600-h/image%5B32%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="209" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7WMFs2flelqoW5Wmc0kdYohuMXeS1WQpLIJylIuToaJRo7aZEUB3fE4bWvONrRZZV5vF2O-C-zTMDHMv7wcdTNUouX6BC8JuUOm8SFyqJ97CbfjGDkiZFtF-4kFXD3ADJht89PyhAPg/?imgmax=800" width="750" border="0" /></a> </p>
<p>By George… I think we have something!  Why does the second query have more reads and writes in tempdb?  The answer is that the sort is done in memory for the “fast” query and spills to tempdb for the “slow” query.  Another method you can use to identify tempdb sort operations is Sort Warnings.  When tempdb is used to perform sort operations, the sort warnings counter will display it.  Below is a screenshot of the counter and the description.</p>
<p><a href="http://lh4.ggpht.com/_ayZBUzPGG9A/S5SdTnXVwDI/AAAAAAAAAY8/VUyep-s9Nb8/s1600-h/image%5B37%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="435" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/S5SdUB-RxxI/AAAAAAAAAZA/dKw9U1mBMIs/image_thumb%5B21%5D.png?imgmax=800" width="674" border="0" /></a> </p>
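<p>The same before-and-after comparison of the tempdb file stats can be collapsed into a single batch that reports only the difference.  This is a minimal sketch of that idea rather than part of the original test; the variable names and column aliases are made up for illustration, and because the DMV counters are server-wide, other sessions using tempdb at the same time will inflate the numbers.</p>
<pre class="csharpcode">--Snapshot the tempdb data file counters (database_id 2, file_id 1) before the query
DECLARE @ReadsBefore BIGINT, @WritesBefore BIGINT
SELECT @ReadsBefore = [num_of_reads], @WritesBefore = [num_of_writes]
FROM sys.[dm_io_virtual_file_stats](2,1)

--Query under test
DECLARE @RowNum INT,@SomeInt INT, @SomeCode CHAR(2000)
SELECT @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = [SomeCode]
FROM dbo.[TestData]
WHERE [RowNum] < 3500
ORDER BY SomeId

--Report how much tempdb I/O happened while the query ran
SELECT [num_of_reads] - @ReadsBefore AS TempdbReadsDelta,
       [num_of_writes] - @WritesBefore AS TempdbWritesDelta
FROM sys.[dm_io_virtual_file_stats](2,1)
GO</pre>
<p>On a quiet test server, deltas of zero suggest the sort stayed in memory, while non-zero deltas suggest a spill.</p>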
<p>Now that I have identified the problem, how do I solve it?  There are a couple of ways to solve this problem.  One solution is to increase the estimated row size, so the optimizer grants more memory to the query, which in turn allows the sort to fit into memory.  The second option is to decrease the actual row size, so the sort fits into the memory already granted.  Let’s see the second option in action.</p>
<pre class="csharpcode"><span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">VARCHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = <span class="kwrd">CAST</span>([SomeCode] <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(2000))
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3500 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId</pre>
<p>This method uses CAST to convert the CHAR(2000) column to a VARCHAR(2000), which reduces the estimated row size.  The optimizer estimates the size of a VARCHAR column as 50% of its declared maximum, because it does not know how full each value really is and 50% is a safe middle ground.  Generally speaking, VARCHAR columns rarely use 100% of their declared length anyway, so this helps SQL Server save resources.  In our case, a VARCHAR(2000) is estimated to be approximately 1000 bytes.  Let’s have a look at the execution plan.</p>
<p><a href="http://lh3.ggpht.com/_ayZBUzPGG9A/S5SdUrHcbAI/AAAAAAAAAZE/eQ92LFNnW5c/s1600-h/image%5B41%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="289" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiamkxSeoCsh4a89c3bsXRYhcn3HqsRmx38bOGmLg0NFkUWvMxS2YehkmvnJy7nkrkZTL4nkwxQw7AiHefKjm28rCLBX4plikYmRN1SV-8tfPE2jLPoJFGQ_VwonOrgFS3ZqosK_lMJXA/?imgmax=800" width="564" border="0" /></a> </p>
<p>Unfortunately the conversion by itself is not enough to optimize this query.  Let’s also trim the trailing spaces with RTRIM, so the actual row size is decreased.  The interesting thing here is that the optimizer still reports the same estimated row size; however, the sort now occurs in memory.  The sort fits because the actual row size is now significantly less than the estimated 1019 bytes per row, so the optimizer is unintentionally granting more memory than the query needs.  Let’s see this in action.</p>
<pre class="csharpcode"><span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">VARCHAR</span>(2000)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = RTRIM(<span class="kwrd">CAST</span>([SomeCode] <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(2000)))
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3500 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiOC0HQ2io0nVCXn7YxJ5i17ygn8yQS2m6mCs2BrV1APzlYL08MLvGfbsyKeET9wUOVWGfHufw7zf3n93-oHm94DVzSUcwXotNFrWyW7Cpm8HCSgBdhQaQ_Gm_K0LbcD0dVGZxTVxg2w/s1600-h/image%5B45%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="138" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXwaW_HP5HIMOD4gxzk-apVwODAagCG-6Z5qvfeVN9_n9l7nkF7SNxnn2kNbfHOMmO9Hw5PXo_9zmWGgG0u2-DBxhjf5RKVxWeJmNp5YjiBOXoe0-KftLfuXxHlIuTVahyACEItQH7jg/?imgmax=800" width="396" border="0" /></a> </p>
<p>That’s more like it.  If you run the query with sys.dm_io_virtual_file_stats, you will see the query no longer uses tempdb to sort the rows.  This method works well for queries where the predicate is less than 3500.  If the returned range is significantly larger than 3500 rows, it is likely that the sort will spill into tempdb.  The idea here is to cater to everyday usage patterns and let the exceptions take longer.  It should also be noted that if the CHAR(2000) column is nearly full for every row, the RTRIM method will not have the same impact.  When the RTRIM method is not cutting the mustard, it is time to go with another option… such as bloating the estimated row size.</p>
<p>This is the solution presented in Ramesh’s webcast.  To do this we will use the same cast conversion, to introduce a compute scalar into the query plan, but this time we will make the value larger.  When choosing a larger number it is important to use a number that is not too large, as this could have an adverse effect on query performance, or other queries running on the server.  More information can be obtained via the webcast.</p>
<pre class="csharpcode"><span class="kwrd">DECLARE</span> @RowNum <span class="kwrd">INT</span>,@SomeInt <span class="kwrd">INT</span>, @SomeCode <span class="kwrd">VARCHAR</span>(4200)
<span class="kwrd">SELECT</span> @RowNum = [RowNum], @SomeInt = [SomeId], @SomeCode = <span class="kwrd">CAST</span>([SomeCode] <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(4200))
<span class="kwrd">FROM</span> dbo.[TestData]
<span class="kwrd">WHERE</span> [RowNum] < 3500 <span class="rem">--3000 sorts in memory; 3500 sorts in tempdb</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> SomeId</pre>
<pre class="csharpcode"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1InfcyiiHKpy6TjAQRBrdryEM1RKcua6XuRR-aK9akrJZX4Ue0jlr3d9pnetWD5hlHu8zVUlIceExZXFqkB5jrobFASCw3CGPWvr9Hg-B6TWnTxYNfb2zqvpiHGHiYEYnPpNGejJ_lg/s1600-h/image%5B49%5D.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="147" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvE_glKOMG24lfyi8DovVdiKJ6Kd1nbWz6gclCwHxpFbs04zcKZtvBHL846Awql-qS-vD4A8DfzZJ8YBI36Bg6Nwlp419vswbBJy39K4q3lAoWIz9pcMS8APfUjBxV56GQKJI_RH8Olw/?imgmax=800" width="408" border="0" /></a> </pre>
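<p>The effect of the two casts on the estimate can be checked with some quick arithmetic.  This is a sketch that assumes the 50% rule for variable-length columns described earlier and the two 4-byte integer columns; it counts data bytes only, so the estimated row sizes quoted above (2015 and 1019 bytes) are slightly larger because of row overhead.  The column aliases are made up for illustration.</p>
<pre class="csharpcode">SELECT
    2000 + 4 + 4       AS Char2000DataBytes,    --2008: CHAR is estimated at its full declared size
    (2000 / 2) + 4 + 4 AS Varchar2000DataBytes, --1008: about half of the CHAR estimate
    (4200 / 2) + 4 + 4 AS Varchar4200DataBytes  --2108: slightly more than the CHAR estimate</pre>
<p>Casting to VARCHAR(4200) therefore nudges the estimate just above the original CHAR(2000) figure, which is the point of choosing a larger number that is not too large.</p>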
<p><em>Note: I am using SQL Server 2008 to perform my tests, but I have  tested both of these solutions in SQL Server 2005 and obtained similar results. </em></p>
<p>There you have it. This has been an introduction to performance tuning queries that seemingly have an ideal execution plan.  A lot of the concepts presented here were taken from <a href="http://sqlworkshops.com">http://sqlworkshops.com</a>.  SQLWorkshops.com provides 3 webcasts that present a wealth of performance tuning tips that will benefit all levels of database professionals.  Stay tuned for my next post! I will be dissecting more query optimization techniques, including two solutions to the TOP 101 challenge. </p>
<p>Until next time, happy coding.</p> <p>Posted by <a href="http://www.blogger.com/profile/16288608920551626835">Adam Haines</a></p> <h4>Optimizing SQL Server Joins - Part 2</h4> <p>Published 2010-02-17, updated 2010-02-18</p> <p>In my last post, I talked about covering the query and the join expression with a single index, <a title="http://jahaines.blogspot.com/2010/01/optimizing-sql-server-joins.html" href="http://jahaines.blogspot.com/2010/01/optimizing-sql-server-joins.html">http://jahaines.blogspot.com/2010/01/optimizing-sql-server-joins.html</a>. Covering the join expression allows the optimizer to seek on the joining column and has the benefit of removing the very costly Key/RID lookup. In this post, I will be talking about optimizing join order.</p> <p>Let’s start with a very simplistic definition of join order. Join order is the order in which the optimizer processes the tables involved in the join clause. When tables are processed, they are placed in either the top or bottom input, which dictates how the data is searched and how many rows are affected. There is plenty of misinformation out there stating that the order you specify your joins in can directly impact query performance. While this is partially true, it is more the exception than the rule. This misinformation came about because the optimizer is a cost-based optimizer that has to determine a good plan in the shortest amount of time. The optimizer does not have all the time in the world to test each and every join order or permutation. Because the optimizer cannot always evaluate every possible permutation, it is possible for it to miss a more optimal join order, but this would only happen when a query has an excessive number of joins. Like I said before, this is not a regular occurrence. In almost all cases, the join order is of negligible concern. Let’s see this in action. 
</p> <p>First I will create my sample tables and load them with data.</p> <pre class="csharpcode"><span class="kwrd">USE</span> tempdb
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.Customers'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.Customers;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.Customers(
CustId <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
FName <span class="kwrd">VARCHAR</span>(50),
LName <span class="kwrd">VARCHAR</span>(50)
);
<span class="kwrd">GO</span>
;<span class="kwrd">WITH</span>
L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span>
,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)
,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)
,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)
,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)
,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)
,Nums <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5)
INSERT dbo.Customers(FName,LName)
<span class="kwrd">SELECT</span>
<span class="str">'FName'</span> + <span class="kwrd">CAST</span>(N <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(10)),
<span class="str">'LName'</span> + <span class="kwrd">CAST</span>(N <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(10))
<span class="kwrd">FROM</span> Nums
<span class="kwrd">WHERE</span> N<=1000;
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.Orders'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.Orders;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.Orders(
OrderId <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
CustId <span class="kwrd">INT</span>
);
<span class="kwrd">GO</span>
;<span class="kwrd">WITH</span>
L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span>
,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)
,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)
,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)
,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)
,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)
,Nums <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5)
INSERT dbo.Orders(CustId)
<span class="kwrd">SELECT</span>
ABS(CHECKSUM(NEWID())%250+65)
<span class="kwrd">FROM</span> Nums
<span class="kwrd">WHERE</span> N<=100;
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_CustId <span class="kwrd">ON</span> dbo.Orders(CustId);
GO</pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<div id="codeSnippetWrapper">First and foremost, I do not recommend using hints such as FORCEPLAN or OPTION(FORCE ORDER), or plan guides, except as a last resort. In most cases, performance problems can be attributed to non-scalable code and/or missing indexes; therefore, the first step in solving performance problems is to optimize the code. In this post, I will use FORCEPLAN to illustrate how join order can affect performance. I will start by running a pair of simple queries: one without FORCEPLAN and one with it. FORCEPLAN makes the optimizer process joins in the order the tables appear in the FROM clause. One of the biggest drawbacks of FORCEPLAN is that a nested loop join will likely be used, which may or may not be efficient for your query. The optimizer can still use other join types if other hints request them, or if another join type is required to satisfy the query. You can find more information on FORCEPLAN here, <a title="http://msdn.microsoft.com/en-us/library/ms188344.aspx" href="http://msdn.microsoft.com/en-us/library/ms188344.aspx">http://msdn.microsoft.com/en-us/library/ms188344.aspx</a>. SQL Server 2005 has more forcing options available that do not have the same join type limitation, such as OPTION(FORCE ORDER), plan hints, and plan guides; however, these can be just as bad for performance if used excessively or inappropriately.</div>
<div id="codeSnippetWrapper"> </div>
<pre class="csharpcode"><span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">ON</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> o.CustId,c.FName,c.LName
<span class="kwrd">FROM</span> dbo.Orders o
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.Customers c
<span class="kwrd">ON</span> c.CustId = o.CustId
<span class="kwrd">WHERE</span> c.LName = <span class="str">'LName151'</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">OFF</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">OFF</span>;
<span class="kwrd">GO</span>
/*
<span class="kwrd">Table</span> <span class="str">'Orders'</span>. Scan <span class="kwrd">count</span> 1, logical <span class="kwrd">reads</span> 2, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.
<span class="kwrd">Table</span> <span class="str">'Customers'</span>. Scan <span class="kwrd">count</span> 1, logical <span class="kwrd">reads</span> 7, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.
<span class="kwrd">SELECT</span> o.CustId,c.FName,c.LName <span class="kwrd">FROM</span> dbo.Orders o <span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.Customers c <span class="kwrd">ON</span> c.CustId = o.CustId <span class="kwrd">WHERE</span> c.LName = <span class="str">'LName151'</span>
|--Nested Loops(<span class="kwrd">Inner</span> <span class="kwrd">Join</span>, <span class="kwrd">OUTER</span> <span class="kwrd">REFERENCES</span>:([c].[CustId]))
|--<span class="kwrd">Clustered</span> <span class="kwrd">Index</span> Scan(<span class="kwrd">OBJECT</span>:([tempdb].[dbo].[Customers].[PK__Customers__7E6CC920] <span class="kwrd">AS</span> [c]), <span class="kwrd">WHERE</span>:([tempdb].[dbo].[Customers].[LName] <span class="kwrd">as</span> [c].[LName]=<span class="str">'LName151'</span>))
|--<span class="kwrd">Index</span> Seek(<span class="kwrd">OBJECT</span>:([tempdb].[dbo].[Orders].[ncl_idx_CustId] <span class="kwrd">AS</span> [o]), SEEK:([o].[CustId]=[tempdb].[dbo].[Customers].[CustId] <span class="kwrd">as</span> [c].[CustId]) ORDERED FORWARD)
*/
<span class="kwrd">GO</span></pre>
<div> </div>
<div>As you can see, the Customers table is used as the OUTER table and the Orders table is the INNER table, which supports my point that the optimizer is free to rearrange the join order. For each row returned from the OUTER table, the INNER table has to be probed (here, with an index seek on Orders). If you use FORCEPLAN, the optimizer will make Orders the OUTER table and Customers the INNER table. The first problem I encounter is that the Orders table has no filter in the WHERE clause, so the index seek on Orders goes away and is replaced by an index scan, although the optimizer is still able to seek Customers on CustId. Additionally, Orders is the smaller table, which might suggest fewer passes over the INNER table (Customers); however, each of the 100 Orders rows now drives a seek into Customers, so the query has to read a lot more data to return the same results. Using FORCEPLAN can really degrade performance, and in this case it results in over 20 times more logical reads (202 versus 9).</div>
<div id="codeSnippetWrapper"> </div>
<pre class="csharpcode"><span class="kwrd">SET</span> FORCEPLAN <span class="kwrd">ON</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">ON</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> o.CustId,c.FName,c.LName
<span class="kwrd">FROM</span> dbo.Orders o
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.Customers c
<span class="kwrd">ON</span> c.CustId = o.CustId
<span class="kwrd">WHERE</span> c.LName = <span class="str">'LName151'</span>
<span class="kwrd">SET</span> FORCEPLAN <span class="kwrd">OFF</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">OFF</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">OFF</span>;
<span class="kwrd">GO</span>
/*
<span class="kwrd">Table</span> <span class="str">'Customers'</span>. Scan <span class="kwrd">count</span> 0, logical <span class="kwrd">reads</span> 200, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.
<span class="kwrd">Table</span> <span class="str">'Orders'</span>. Scan <span class="kwrd">count</span> 1, logical <span class="kwrd">reads</span> 2, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.
<span class="kwrd">SELECT</span> o.CustId,c.FName,c.LName <span class="kwrd">FROM</span> dbo.Orders o <span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.Customers c <span class="kwrd">ON</span> c.CustId = o.CustId <span class="kwrd">WHERE</span> c.LName = <span class="str">'LName151'</span>
|--Nested Loops(<span class="kwrd">Inner</span> <span class="kwrd">Join</span>, <span class="kwrd">OUTER</span> <span class="kwrd">REFERENCES</span>:([o].[CustId]))
|--<span class="kwrd">Index</span> Scan(<span class="kwrd">OBJECT</span>:([tempdb].[dbo].[Orders].[ncl_idx_CustId] <span class="kwrd">AS</span> [o]))
|--<span class="kwrd">Clustered</span> <span class="kwrd">Index</span> Seek(<span class="kwrd">OBJECT</span>:([tempdb].[dbo].[Customers].[PK__Customers__7E6CC920] <span class="kwrd">AS</span> [c]), SEEK:([c].[CustId]=[tempdb].[dbo].[Orders].[CustId] <span class="kwrd">as</span> [o].[CustId]), <span class="kwrd">WHERE</span>:([tempdb].[dbo].[Customers].[LName] <span class="kwrd">as</span> [c].[LName]=<span class="str">'LName151'</span>) ORDERED FORWARD)
*/</pre>
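<div>For comparison, here is a minimal sketch of the statement-level alternative mentioned earlier, OPTION(FORCE ORDER) (SQL Server 2005+). It assumes the same dbo.Orders and dbo.Customers tables used above. No output is shown because this variant was not part of the original test, so run it with STATISTICS IO on and compare the reads yourself. Unlike SET FORCEPLAN, the hint applies only to the one statement and does not, by itself, push the optimizer toward nested loop joins.</div>
<pre class="csharpcode">-- Sketch only: same query, with the join order forced for this statement alone
SET STATISTICS IO ON;
GO
SELECT o.CustId, c.FName, c.LName
FROM dbo.Orders o
INNER JOIN dbo.Customers c
    ON c.CustId = o.CustId
WHERE c.LName = 'LName151'
OPTION (FORCE ORDER); -- joins are processed in FROM-clause order: Orders, then Customers
GO
SET STATISTICS IO OFF;
GO</pre>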
<div>
<br /></div>
<div>As you can see, adding or misusing query hints can really degrade query performance. In my opinion, the worst part about using hints is that you prohibit the optimizer from logically deducing the best way to solve a query. Developers often neglect the long-term impact of using hints, as data distribution, indexes, and business requirements change over time. One of the best things you can do is leave the optimizer alone and let it go to work. A lot of times developers think they know better than the optimizer or that they can trick it; however, this almost always ends in bad performance. Query hints are quite often band-aids for much larger problems; however, under certain circumstances there can be a benefit in using query hints, and this is why they exist. For more information on how to help persuade the optimizer to do what you want, you can use the following link, <a title="http://msdn.microsoft.com/en-us/library/cc917694.aspx" href="http://msdn.microsoft.com/en-us/library/cc917694.aspx">http://msdn.microsoft.com/en-us/library/cc917694.aspx</a>. There are more options in SQL 2005+ that give developers and DBAs more control over query execution plans; however, I still recommend that you solve performance problems by optimizing code and/or adding indexes. Only when you have evaluated all other options should you consider using hints or guides. All in all, the best way to optimize join order is usually to leave it alone.</div>
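<div>As a rough illustration of "adding indexes" instead of reaching for a hint, the sketch below adds a nonclustered index that should let the example query seek on LName rather than scan the Customers clustered index. The index name is hypothetical, and the column list assumes the query only needs the CustId, FName and LName columns shown above; treat it as a starting point to test, not a verified recommendation.</div>
<pre class="csharpcode">-- Sketch only: the index name is made up for illustration
-- LName is the key column because it is the column in the WHERE clause.
-- FName is included so the query does not need a lookup.
-- CustId is the clustered index key (per the plans above), so it is carried in the index automatically.
CREATE NONCLUSTERED INDEX ncl_idx_Customers_LName
ON dbo.Customers(LName) INCLUDE (FName);
GO</pre>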
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com5tag:blogger.com,1999:blog-4646137438366687895.post-42635922884095521272010-01-27T09:39:00.001-08:002010-02-17T19:32:47.393-08:00Optimizing SQL Server Joins<p>Today a colleague asked me a question about performance optimization regarding joins. I gave him a pretty detailed answer over the phone, but I do not think I really made my message stick, which is the reason for this post. The first and most important thing to remember is that the optimizer can only choose one physical operator per table. The only exception to this is when the optimizer decides to use index intersection, <a title="http://www.sqlmag.com/Articles/ArticleID/94116/94116.html?Ad=1" href="http://www.sqlmag.com/Articles/ArticleID/94116/94116.html?Ad=1">http://www.sqlmag.com/Articles/ArticleID/94116/94116.html?Ad=1</a> (SQL 2005+). Index intersection occurs when the optimizer creates its own join and uses two indexes to satisfy the predicate. While index intersection is still aligned with what I said, it does add an additional table reference to the query plan, which should be noted. If the same table is referenced multiple times, the execution plan will have more than one physical operator for that table, at varying stages of the execution plan. When tables are joined together, SQL Server creates a constrained Cartesian product, which is nothing more than matching rows based on a given join expression. To do this, SQL Server uses a join operator (Merge, Hash or Nested Loop) and creates an INNER (Bottom) and OUTER (Top) set. I use the word set here because the INNER does not have to be a table. INNER can actually be a constrained Cartesian product. Essentially, the optimizer chooses a base (OUTER) set and will then filter the INNER set based on the results from the OUTER set. The optimizer then continues to filter the set for each join, until the query is satisfied.  
It is important to know that the optimizer is under no obligation to join tables in the order you have specified.  The optimizer is free to rearrange joins as it sees fit.  Let’s start by creating our sample objects.</p> <pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.State'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[<span class="kwrd">State</span>];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.[<span class="kwrd">State</span>](
State_Cd <span class="kwrd">CHAR</span>(2),
Descr <span class="kwrd">VARCHAR</span>(150)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.[<span class="kwrd">State</span>] ([State_Cd],[Descr]) <span class="kwrd">VALUES</span> (<span class="str">'AL'</span>,<span class="str">'Alabama'</span>);
INSERT <span class="kwrd">INTO</span> dbo.[<span class="kwrd">State</span>] ([State_Cd],[Descr]) <span class="kwrd">VALUES</span> (<span class="str">'LA'</span>,<span class="str">'Louisiana'</span>);
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.City'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[City];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.[City](
State_Cd <span class="kwrd">CHAR</span>(2),
City_Cd <span class="kwrd">VARCHAR</span>(100)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.[City] ([State_Cd],[City_Cd]) <span class="kwrd">VALUES</span> (<span class="str">'AL'</span>,<span class="str">'Mobile'</span>);
INSERT <span class="kwrd">INTO</span> dbo.[City] ([State_Cd],[City_Cd]) <span class="kwrd">VALUES</span> (<span class="str">'LA'</span>,<span class="str">'New Orleans'</span>);
INSERT <span class="kwrd">INTO</span> dbo.[City] ([State_Cd],[City_Cd]) <span class="kwrd">VALUES</span> (<span class="str">'LA'</span>,<span class="str">'Luling'</span>);
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.Zip'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[Zip];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.Zip(
City_Cd <span class="kwrd">VARCHAR</span>(100),
Zip_Cd <span class="kwrd">VARCHAR</span>(10)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.Zip ([City_Cd],[Zip_Cd]) <span class="kwrd">VALUES</span> (<span class="str">'Mobile'</span>,<span class="str">'36601'</span>);
INSERT <span class="kwrd">INTO</span> dbo.Zip ([City_Cd],[Zip_Cd]) <span class="kwrd">VALUES</span> (<span class="str">'New Orleans'</span>,<span class="str">'70121'</span>);
INSERT <span class="kwrd">INTO</span> dbo.Zip ([City_Cd],[Zip_Cd]) <span class="kwrd">VALUES</span> (<span class="str">'Luling'</span>,<span class="str">'70070'</span>);
GO</pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Now that we have our tables, let’s execute a very simple query.</p>
<p><em>Note: There are no indexes present on any of the tables, so all queries will result in a table scan.</em></p>
<pre class="csharpcode">--Qry 1
<span class="kwrd">SELECT</span> s.[Descr]
<span class="kwrd">FROM</span> dbo.[<span class="kwrd">State</span>] s</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMfagXZG8RaYtWhKefZ2Qw0a3K7PFUpwK9lqe6yY20ZWq2SKjTfTadHrAHJRHiOD6cDazu5WlJxp8tOlfBdSJr0dk5E6cCJfaNtH1sCPMrbLwmFPsCFjC1IkvDTh0-nt29xDqQhNxLkA/s1600-h/image%5B3%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiOwl4UH8hdIUswmpOJyAPdWoqQOt0e3iR1X7X5yRa0ihyPNtDaZIaQ3aOLN-ROZKgRhA2DrpUqBRNjScdeTzhSNTynJds53LWvfstLm4YNZsmFkXFZz6WCqPIkFQuE6AUDkhisStrOQ/?imgmax=800" width="371" height="136" /></a> </p>
<p>Now let’s add a second table to the query to see what happens.  What we should see is a nested loop join and two table scans.  </p>
<pre class="csharpcode">--Qry 2
<span class="kwrd">SELECT</span> s.[Descr],c.City_Cd
<span class="kwrd">FROM</span> dbo.[<span class="kwrd">State</span>] s
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[City] c <span class="kwrd">ON</span> s.[State_Cd] = c.[State_Cd]</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj71I7iYNEgwEKViJOy7RHhqg-Fd2tLyrDg0T52bgN5m-keSiHS7BHQhlRjLDI3iSjeIp1UwNuaFv6WN653f_i0i_gbPPYBPfnQI9Xg8LijLD68-BlENTK87YnlFzR0r_uwF-hFksql5g/s1600-h/image%5B7%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikK8Ya600zd417TuJpmIhemYVdcUwXjExIueD8ZAtGSSEZWb_egJFvaV-1kQD7HddnmJGBB4yWT-VzqgvxsiftPqPub62WhMKIDJeUmHXaccbynWyjLueXPcyGzUMzV209xEMGzmEbtA/?imgmax=800" width="370" height="197" /></a> </p>
<p>The screenshot above shows that the State table has been chosen as the base (OUTER) set. As you can see, each table is represented by a single Table Scan operator. Let’s add one more table to the mix to see how the optimizer will react.</p>
<pre class="csharpcode">--Qry 3
<span class="kwrd">SELECT</span> s.[Descr],c.City_Cd,z.Zip_Cd
<span class="kwrd">FROM</span> dbo.[<span class="kwrd">State</span>] s
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[City] c <span class="kwrd">ON</span> s.[State_Cd] = c.[State_Cd]
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[Zip] z <span class="kwrd">ON</span> z.[City_Cd] = c.[City_Cd]</pre>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/S2B6TjWST_I/AAAAAAAAAXY/qRnc4mBukyI/s1600-h/image%5B11%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/S2B6T4N_jjI/AAAAAAAAAXc/N3IRInVsE3c/image_thumb%5B5%5D.png?imgmax=800" width="403" height="218" /></a> </p>
<p></p>
<p>The optimizer decided to make State the first OUTER table and then decided to make the INNER a constrained Cartesian product of City and Zip. This is especially important when you are optimizing code because it is a lot easier to fix problems when you know how indexes and joins really work. For this example, I see that we are scanning Zip and City, which tells me that I am missing an index on these two tables. Remember, the general rule of thumb is to have indexes on all columns participating in the join expression. I will start the optimization process by adding an index to the Zip table, on the City_Cd column.</p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_Zip_City_Cd <span class="kwrd">ON</span> dbo.[Zip](City_Cd);</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ03bzKSqU_BK9XlcGEwE12tMvWYjrfDvT4FZ4F1bxOcZwYo0neoXRCD8APIhkpTU9oVyZtJLk2obiBXembSPd-g4uCfNLMzk9l1G3IhgLFfGxIFYXZXh42fD6PQ7A7XM7CrmTkWu2IQ/s1600-h/image%5B15%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/S2B6UcOWKRI/AAAAAAAAAXk/R0EE5xnsoeM/image_thumb%5B7%5D.png?imgmax=800" width="415" height="218" /></a> </p>
<p>Wow, look at how the query plan changed. We now see our index seek on Zip, but we also see a RID lookup. Lookups can become performance bottlenecks really quickly and can sometimes cause blocking or, even worse, deadlocks. Key Lookups can cause blocking and deadlocks because the query has to take a shared lock on the clustered index to get the data that is missing from the nonclustered index, and this conflicts with an insert/update/delete, which requires an exclusive lock on the clustered index. In our example, we have an index on Zip that contains only the City_Cd column; however, we are selecting Zip_Cd. Because Zip_Cd does not exist in the index, SQL Server has to go back to the heap to get the remaining column data. To solve this problem we need to add Zip_Cd to the index. I will be adding the Zip_Cd column via the INCLUDE clause. I chose INCLUDE because I am not using this column in the predicate, and using the INCLUDE clause keeps the index relatively small because the value is only stored at the leaf level of the nonclustered index. You may want to add the column to the index key instead if you use the column in a lot of predicates, because SQL Server maintains statistics on index key columns, but not on columns in the INCLUDE clause. Statistics are used by SQL Server to estimate cardinality. Better cardinality estimates allow the optimizer to make better decisions about which operators are best for the given query. Essentially, better cardinality estimates can be the difference between a scan and a seek. You have to weigh the cost of index maintenance against performance when deciding which method to choose. For more information regarding INCLUDE please visit this link, <a title="http://msdn.microsoft.com/en-us/library/ms190806.aspx" href="http://msdn.microsoft.com/en-us/library/ms190806.aspx">http://msdn.microsoft.com/en-us/library/ms190806.aspx</a>.</p>
<p><em>Note: If you are not sure which columns need to be added to the index, you can hover your mouse over the lookup and look at the output list. The index that you need to add the columns to will always be the seek operator to the right of the lookup operator.</em></p>
<pre class="csharpcode"><span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.indexes <span class="kwrd">WHERE</span> name = <span class="str">'ncl_idx_Zip_City_Cd'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">INDEX</span> ncl_idx_Zip_City_Cd <span class="kwrd">ON</span> dbo.Zip;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_Zip_City_Cd <span class="kwrd">ON</span> dbo.[Zip](City_Cd)<span class="kwrd">INCLUDE</span>([Zip_Cd]);
GO</pre>
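<p>For comparison, the key-column alternative described above might look like the following sketch. The index name is hypothetical, and it is meant as a replacement for the INCLUDE version rather than something to create alongside it, so skip it if you are following along with the rest of this post. The trade-off is the one already mentioned: Zip_Cd becomes part of the index key and is stored at every level of the index, in exchange for being usable in predicates.</p>
<pre class="csharpcode">-- Sketch only: Zip_Cd as a second key column instead of an included column
-- (the index name is made up for illustration)
CREATE NONCLUSTERED INDEX ncl_idx_Zip_City_Cd_Zip_Cd ON dbo.[Zip](City_Cd, Zip_Cd);
GO</pre>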
<p>Now execute the query again.</p>
<pre class="csharpcode">--Qry 3
<span class="kwrd">SELECT</span> s.[Descr],c.City_Cd,z.Zip_Cd
<span class="kwrd">FROM</span> dbo.[<span class="kwrd">State</span>] s
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[City] c <span class="kwrd">ON</span> s.[State_Cd] = c.[State_Cd]
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[Zip] z <span class="kwrd">ON</span> z.[City_Cd] = c.[City_Cd]</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCjWJz0TpdyT-b8MrJa2YtsDwJaVgPlb006QrEFVXDhvDEVwz_Kz-BCfBExmpDnQHDL9Q1lVj2lYh8kvk2klP7jUqMv_fcrW84DmUiKWOGbxW811lejqE0iHV6Ebxb1MEpWuJ_JAuP-Q/s1600-h/image%5B19%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/S2B6VFbqnwI/AAAAAAAAAXs/nLYmx8ptAfE/image_thumb%5B9%5D.png?imgmax=800" width="421" height="216" /></a> </p>
<p>Now that is a little better, but let’s make this query even faster. Next, I will add an index to the City table on State_Cd, making sure to include City_Cd.</p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_City_State_Cd <span class="kwrd">ON</span> dbo.City(State_Cd)<span class="kwrd">INCLUDE</span>([City_Cd]);
GO</pre>
<p>Now execute the query again.</p>
<pre class="csharpcode">--Qry 3
<span class="kwrd">SELECT</span> s.[Descr],c.City_Cd,z.Zip_Cd
<span class="kwrd">FROM</span> dbo.[<span class="kwrd">State</span>] s
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[City] c <span class="kwrd">ON</span> s.[State_Cd] = c.[State_Cd]
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.[Zip] z <span class="kwrd">ON</span> z.[City_Cd] = c.[City_Cd]</pre>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNUpPEhGRVor2401RO3BlN5EMuRtcFYuXpNbnvTCmue3GhdGWhjE6tGXzLCNDCqRDYQwTHijZWBsb_Y58XXe0zns5hnx23jzaPS_HfalGtSRGmMe7_zzGZPGknkJzhIvC122DsW9J7Eg/s1600-h/image%5B23%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/S2B6VkrMzcI/AAAAAAAAAX0/cchIPpjn9qw/image_thumb%5B11%5D.png?imgmax=800" width="426" height="186" /></a> </p>
<p>How about them apples?  With the proper indexes in place, we now get index seeks on City and Zip.  It is important to note that we cannot seek State because we have no predicate filter on any columns in the State table, so the optimizer has to scan.  When you are trying to optimize queries, the first place to look is the execution plan.  If you see a lot of scans, you have a lot of optimization potential.  Remember that for an index to cover a query, every column the query references in the select list, the joins and the where clause must be present in that index.  Please do not tell your boss that you need to create indexes to cover every query in your environment. You will not be able to fully cover every query, but the important thing is to optimize and cover the queries that are really expensive or causing problems. Who knows, you may be able to cover multiple queries by creating or modifying a single index.</p>
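<p>As a rough sketch of what covering a query looks like, consider the example below. The #Sales table, its columns and the index name are hypothetical and are not part of the State/City/Zip schema used in this post. The key column serves the where clause and the included columns serve the select list, so the optimizer should be able to answer the query from the index alone.</p>
<pre class="csharpcode">--Hypothetical table, used only for this illustration
IF OBJECT_ID('tempdb.dbo.#Sales') IS NOT NULL
BEGIN
DROP TABLE #Sales;
END
GO
CREATE TABLE #Sales(
SaleId INT PRIMARY KEY CLUSTERED,
CustomerId INT,
SaleDate DATETIME,
TotalDue MONEY
);
GO
INSERT INTO #Sales VALUES (1,42,'20100101',10.00);
INSERT INTO #Sales VALUES (2,42,'20100102',25.50);
INSERT INTO #Sales VALUES (3,7,'20100103',99.99);
GO
--CustomerId is the seek key; SaleDate and TotalDue are included so no lookup is needed
CREATE NONCLUSTERED INDEX ncl_idx_Sales_CustomerId ON #Sales(CustomerId) INCLUDE(SaleDate,TotalDue);
GO
--Every column this query references is present in the index
SELECT SaleDate,TotalDue
FROM #Sales
WHERE CustomerId = 42;
GO</pre>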
<p>That’s it for now.  I hope I have cleared up how the optimizer handles joins and given you greater insight into how to optimize them. Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com2tag:blogger.com,1999:blog-4646137438366687895.post-72309912697074483472010-01-20T13:45:00.001-08:002010-01-20T19:23:57.178-08:00SQL Server Myths Debunked (Part 2)<p><a href="http://lh4.ggpht.com/_ayZBUzPGG9A/S1d5TembKZI/AAAAAAAAAWg/V142FGOGRmY/s1600-h/question-mark%5B4%5D.jpg"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; margin-left: 0px; border-left-width: 0px; margin-right: 0px" title="question-mark" border="0" alt="question-mark" align="left" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuI3ZlLmeRtptW-1JwcY4JUSqc0YJew2B0kBKFnoqrt76j5BXpDOwjv1LpZq8dJ8xlgnRUylSuFEHBlHhyphenhyphenKFqYXVvduI_v7iMAlrmDVQ_WVn9-q2byB3-gaeldXrtmz2CCLnDokGusWw/?imgmax=800" width="244" height="216" /></a> Last time I went over some of the most common SQL Server myths, <a title="http://jahaines.blogspot.com/2010/01/sql-server-myths-debunked-part-1.html" href="http://jahaines.blogspot.com/2010/01/sql-server-myths-debunked-part-1.html">http://jahaines.blogspot.com/2010/01/sql-server-myths-debunked-part-1.html</a>.  In this post I will be finishing the remaining items, and I encourage you to post comments about any myths you have seen or have questions about.  
</p> <p> </p> <p> </p> <p> </p> <p> </p> <p>Below is the list of items I will be addressing in this post:</p> <ul> <li>Table data can be stored separately of the clustered index </li> <li>Columns just need to exist in the index to be used </li> <li>While loops are faster/better than cursors </li> <li>Sp_executesql is always better than exec </li> <li>Tables have a guaranteed sort order </li> </ul> <h3>Table data can be stored separately of the clustered index</h3> <p>The first myth I will be talking about is the myth that table data can exist separately from the clustered index.  This myth is absolutely false.  The leaf level of the clustered index is the table data, so the data always lives wherever the clustered index lives.  If you create the clustered index on a different filegroup than the one the table was created on, the rows move with the clustered index and the original filegroup no longer holds them.  It should be noted that any existing nonclustered indexes remain on the filegroup where they were created.  Let’s see this in action.</p> <pre class="csharpcode"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>
<span class="kwrd">GO</span>
<span class="kwrd">USE</span> master
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.databases <span class="kwrd">WHERE</span> NAME=<span class="str">'CL_Idx'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">DATABASE</span> [CL_Idx];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">DATABASE</span> [CL_Idx] <span class="kwrd">ON</span> <span class="kwrd">PRIMARY</span>
( NAME = N<span class="str">'CL_Idx'</span>, FILENAME = N<span class="str">'C:\Program Files\Microsoft SQL Server\MSSQL10.DEV\MSSQL\DATA\CL_Idx.mdf'</span> , <span class="kwrd">SIZE</span> = 4096KB , FILEGROWTH = 0 ),
FILEGROUP [FG1]
( NAME = N<span class="str">'CL_Idx2'</span>, FILENAME = N<span class="str">'C:\Program Files\Microsoft SQL Server\MSSQL10.DEV\MSSQL\DATA\CL_Idx2.ndf'</span> , <span class="kwrd">SIZE</span> = 4096KB , FILEGROWTH = 0 ),
FILEGROUP [FG2]
( NAME = N<span class="str">'CL_Idx3'</span>, FILENAME = N<span class="str">'C:\Program Files\Microsoft SQL Server\MSSQL10.DEV\MSSQL\DATA\CL_Idx3.ndf'</span> , <span class="kwrd">SIZE</span> = 4096KB , FILEGROWTH = 0 )
LOG <span class="kwrd">ON</span>
( NAME = N<span class="str">'CL_Idx_log'</span>, FILENAME = N<span class="str">'C:\Program Files\Microsoft SQL Server\MSSQL10.DEV\MSSQL\DATA\CL_Idx_log.ldf'</span> , <span class="kwrd">SIZE</span> = 4096KB , FILEGROWTH = 10%)
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">NOT</span> <span class="kwrd">EXISTS</span> (<span class="kwrd">SELECT</span> name <span class="kwrd">FROM</span> sys.filegroups <span class="kwrd">WHERE</span> is_default=1 <span class="kwrd">AND</span> name = N<span class="str">'FG1'</span>)
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> [CL_Idx] <span class="kwrd">MODIFY</span> FILEGROUP [FG1] <span class="kwrd">DEFAULT</span>
<span class="kwrd">GO</span>
<span class="kwrd">USE</span> [CL_Idx]
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'[CL_Idx].dbo.TestData'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.TestData;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2)
) <span class="kwrd">ON</span> [FG1];
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 100000
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID()))%2500+1 <span class="kwrd">AS</span> SomeId,
<span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)
+ <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span>
<span class="kwrd">CONVERT</span>(<span class="kwrd">varchar</span>(10),s.NAME) <span class="kwrd">AS</span> FlName,
<span class="kwrd">CASE</span> <span class="kwrd">WHEN</span> s.name = <span class="str">'CL_Idx2'</span> <span class="kwrd">THEN</span> <span class="str">'FG1'</span> <span class="kwrd">ELSE</span> <span class="str">'FG2'</span> <span class="kwrd">END</span> <span class="kwrd">AS</span> FG,
(FILEPROPERTY(s.name,<span class="str">'spaceused'</span>) * 8)/1024. <span class="kwrd">AS</span> [FG_Size(MB)]
<span class="kwrd">FROM</span> sys.master_files s
<span class="kwrd">WHERE</span>
database_id = DB_ID()
<span class="kwrd">AND</span> type = 0
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">UNIQUE</span> <span class="kwrd">CLUSTERED</span> <span class="kwrd">INDEX</span> unq_cl_idx_Row_Num <span class="kwrd">ON</span> dbo.TestData(RowNum) <span class="kwrd">ON</span> FG2;
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span>
<span class="kwrd">CONVERT</span>(<span class="kwrd">varchar</span>(10),s.NAME) <span class="kwrd">AS</span> FlName,
<span class="kwrd">CASE</span> <span class="kwrd">WHEN</span> s.name = <span class="str">'CL_Idx2'</span> <span class="kwrd">THEN</span> <span class="str">'FG1'</span> <span class="kwrd">ELSE</span> <span class="str">'FG2'</span> <span class="kwrd">END</span> <span class="kwrd">AS</span> FG,
(FILEPROPERTY(s.name,<span class="str">'spaceused'</span>) * 8)/1024. <span class="kwrd">AS</span> [FG_Size(MB)]
<span class="kwrd">FROM</span> sys.master_files s
<span class="kwrd">WHERE</span>
database_id = DB_ID()
<span class="kwrd">AND</span> type = 0
GO</pre>
<p>Results:</p>
<img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/S1d5Utu3vAI/AAAAAAAAAWs/OuF8X221Xyo/image_thumb%5B1%5D.png?imgmax=800" width="231" height="310" /></a> <style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
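<p>If you want to confirm where the data lives without relying on file sizes, a query along these lines should work. It is a minimal sketch that assumes the CL_Idx database and the dbo.TestData table from the script above; sys.indexes and sys.data_spaces are standard catalog views.  Run before the clustered index is created, it should report the heap on FG1; run afterwards, it should report the clustered index on FG2.</p>
<pre class="csharpcode">USE [CL_Idx]
GO
--List each index (or heap) on the table and the filegroup that holds it
SELECT
i.name AS IndexName,
i.type_desc AS IndexType,
ds.name AS FileGroupName
FROM sys.indexes i
INNER JOIN sys.data_spaces ds ON i.data_space_id = ds.data_space_id
WHERE i.object_id = OBJECT_ID('dbo.TestData');
GO</pre>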
<h3>Columns just need to exist in the index to be used</h3>
<p>The next myth is one of the primary causes of poor performance.  Some believe that simply having a column in the index means the optimizer can use that index to seek rows.  This belief is very far from the truth.  In reality, the column has to be the leftmost column in the index key for the index to be used to seek rows.  This means the query predicate has to reference the leftmost index key column; otherwise, the best you will get is an index scan. Before I show you an example, here is a link that describes index basics, <a title="http://jahaines.blogspot.com/2009/07/real-sql-pages-beginners-guide-to.html" href="http://jahaines.blogspot.com/2009/07/real-sql-pages-beginners-guide-to.html">http://jahaines.blogspot.com/2009/07/real-sql-pages-beginners-guide-to.html</a>.  Let’s get to the example.</p>
<pre class="csharpcode"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.#Customers'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> #Customers;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> #Customers(
Id <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
FName <span class="kwrd">VARCHAR</span>(25),
LName <span class="kwrd">VARCHAR</span>(25)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> #Customers <span class="kwrd">VALUES</span> (1,<span class="str">'Adam'</span>,<span class="str">'Haines'</span>);
INSERT <span class="kwrd">INTO</span> #Customers <span class="kwrd">VALUES</span> (2,<span class="str">'John'</span>,<span class="str">'Smith'</span>);
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_FName <span class="kwrd">ON</span> [#Customers](FName,LName);
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> FName,LName
<span class="kwrd">FROM</span> [#Customers]
<span class="kwrd">WHERE</span> LName = <span class="str">'Haines'</span>
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">OFF</span>;
<span class="kwrd">GO</span>
<span class="kwrd">DROP</span> <span class="kwrd">INDEX</span> ncl_idx_FName <span class="kwrd">ON</span> [#Customers];
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_FName <span class="kwrd">ON</span> [#Customers](LName,FName);
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">ON</span>;
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> FName,LName
<span class="kwrd">FROM</span> [#Customers]
<span class="kwrd">WHERE</span> LName = <span class="str">'Haines'</span>
<span class="kwrd">GO</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> PROFILE <span class="kwrd">OFF</span>;
GO</pre>
<p>Results:</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUL0oAD_Cgs0yx1yD34ZrQ4XjP5To7-4KaVzgyK3Wx3P2a0uKhbe_bL2aU0eRi4T1-oxquqkYAVfXVctpOjZ4ckxs1PoGIyn0ACumHVIkHzCN2uwaRvGxqjVubQ3imxJchFM2HKLnZdg/s1600-h/image%5B11%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/S1d5VXQZRAI/AAAAAAAAAW0/pdYpM8s-14s/image_thumb%5B5%5D.png?imgmax=800" width="474" height="347" /></a> </p>
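<p>With the index keyed on (FName,LName), the first query can only scan the index because LName is not the leading column.  By switching the order of the index columns to (LName,FName), the same query gets an index seek.</p>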
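<p>The same rule applies to other combinations of the key columns. The sketch below assumes the #Customers table and the (LName,FName) index from the script above still exist in your session. A predicate that includes the leading column LName can seek; a predicate on FName alone cannot, so I would expect a scan.</p>
<pre class="csharpcode">SET STATISTICS PROFILE ON;
GO
--Leading key column is in the predicate, so a seek is possible
SELECT FName,LName
FROM [#Customers]
WHERE LName = 'Haines' AND FName = 'Adam'
GO
--Only the second key column is in the predicate, so expect an index scan
SELECT FName,LName
FROM [#Customers]
WHERE FName = 'Adam'
GO
SET STATISTICS PROFILE OFF;
GO</pre>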
<h3>While loops are faster/better than cursors</h3>
<p>The next myth is that while loops are faster/better than cursors.  This is occasionally true, but not in all scenarios.  The performance depends entirely on what you are doing and how you are doing it.  I won’t post an example for this because there are too many variations, and I would have to dedicate an entire post to the differences. I will say that you should avoid both of these options, if possible.  Both solutions are iterative in nature, which goes against how SQL Server is designed to operate.  In my experience, roughly 90 percent of the time you can replace iterative logic with set based logic, often with something like a 10 fold increase in performance.  The takeaway here is that while loops are not ALWAYS better; however, there are usually better SET based solutions out there.</p>
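<p>To make the set based point concrete, here is a small sketch. The #Invoices table and the 7 percent tax rate are made up for this illustration. Both versions leave the table in the same state, but the loop issues one UPDATE per row while the set based version issues a single statement.</p>
<pre class="csharpcode">--Hypothetical table, used only for this illustration
IF OBJECT_ID('tempdb.dbo.#Invoices') IS NOT NULL
BEGIN
DROP TABLE #Invoices;
END
GO
CREATE TABLE #Invoices(
InvoiceId INT PRIMARY KEY CLUSTERED,
Amount MONEY,
Tax MONEY NULL
);
GO
INSERT INTO #Invoices VALUES (1,100.00,NULL);
INSERT INTO #Invoices VALUES (2,250.00,NULL);
INSERT INTO #Invoices VALUES (3,75.00,NULL);
GO
--Iterative version: one UPDATE per row
DECLARE @Id INT;
SELECT @Id = MIN(InvoiceId) FROM #Invoices;
WHILE @Id IS NOT NULL
BEGIN
UPDATE #Invoices SET Tax = Amount * 0.07 WHERE InvoiceId = @Id;
--MIN returns NULL when no rows remain, which ends the loop
SELECT @Id = MIN(InvoiceId) FROM #Invoices WHERE InvoiceId &gt; @Id;
END
GO
--Set based version: one UPDATE for all rows, same end result
UPDATE #Invoices SET Tax = Amount * 0.07;
GO</pre>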
<h3>Sp_executesql is always better than exec</h3>
<p>The next myth deals with dynamic SQL and the best way to execute it.  A lot of the time, developers will say that they are using sp_executesql, so the query is optimized.  This belief shows a lack of understanding of how sp_executesql really works and where it really helps.  The short answer is that sp_executesql is not always better.  Under some circumstances sp_executesql and exec will produce the same query plan, and the query will not be able to benefit from query plan reuse unless an exact textual match already exists in cache.  This happens when the developer concatenates the parameter values into the SQL string instead of passing them to sp_executesql as parameters.  Only parameters that are declared in the sp_executesql parameter list and referenced inside the statement take part in parameterization.  Let’s have a look at this.</p>
<pre class="csharpcode"><span class="kwrd">IF</span> object_id(<span class="str">'tempdb.dbo.#t'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> #t;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> #t(
id <span class="kwrd">INT</span>,
col <span class="kwrd">CHAR</span>(1)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (1,<span class="str">'a'</span>);
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (2,<span class="str">'b'</span>);
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (3,<span class="str">'c'</span>);
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> NVARCHAR(<span class="kwrd">MAX</span>),
@param <span class="kwrd">VARCHAR</span>(25)
<span class="kwrd">SET</span> @param = <span class="str">'b'</span>
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = N<span class="str">'SELECT Id,col FROM #t WHERE col='</span> + QUOTENAME(@param,<span class="str">''</span><span class="str">''</span>)
<span class="kwrd">EXEC</span> sp_executesql @<span class="kwrd">sql</span>
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> NVARCHAR(<span class="kwrd">MAX</span>),
@param <span class="kwrd">VARCHAR</span>(25)
<span class="kwrd">SET</span> @param = <span class="str">'c'</span>
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = <span class="str">'SELECT Id,col FROM #t WHERE col='</span> + QUOTENAME(@param,<span class="str">''</span><span class="str">''</span>)
<span class="kwrd">EXEC</span>(@<span class="kwrd">sql</span>)
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> NVARCHAR(<span class="kwrd">MAX</span>),
@param <span class="kwrd">VARCHAR</span>(25)
<span class="kwrd">SET</span> @param = <span class="str">'d'</span>
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = N<span class="str">'SELECT Id,col FROM #t WHERE col='</span> + QUOTENAME(@param,<span class="str">''</span><span class="str">''</span>)
<span class="kwrd">EXEC</span> sp_executesql @<span class="kwrd">sql</span>
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> NVARCHAR(<span class="kwrd">MAX</span>),
@param <span class="kwrd">VARCHAR</span>(25)
<span class="kwrd">SET</span> @param = <span class="str">'e'</span>
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = N<span class="str">'SELECT Id,col FROM #t WHERE col= @dyn_param'</span>
<span class="kwrd">EXEC</span> sp_executesql @<span class="kwrd">sql</span>, N<span class="str">'@dyn_param CHAR(1)'</span>,@dyn_param=@param
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> NVARCHAR(<span class="kwrd">MAX</span>),
@param <span class="kwrd">VARCHAR</span>(25)
<span class="kwrd">SET</span> @param = <span class="str">'f'</span>
<span class="kwrd">SET</span> @<span class="kwrd">sql</span> = N<span class="str">'SELECT Id,col FROM #t WHERE col= @dyn_param'</span>
<span class="kwrd">EXEC</span> sp_executesql @<span class="kwrd">sql</span>, N<span class="str">'@dyn_param CHAR(1)'</span>,@dyn_param=@param
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> text, [execution_count]
<span class="kwrd">FROM</span> sys.dm_exec_query_stats qs
<span class="kwrd">CROSS</span> APPLY sys.[dm_exec_sql_text](qs.[sql_handle])
<span class="kwrd">WHERE</span> text <span class="kwrd">LIKE</span> <span class="str">'%SELECT Id,col FROM #t%'</span> <span class="kwrd">AND</span> text <span class="kwrd">NOT</span> <span class="kwrd">LIKE</span> <span class="str">'%cross apply%'</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [last_execution_time] <span class="kwrd">DESC</span>
GO</pre>
<p>Results:</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiD3OrfKdefv9KYdYK2NYpVqSBP4WaVfsmbho4qWPnxA6lkchV2GGolZElrhRiN4c7LqPF8d2EVu4Rde88iX9kVdI3km5ynUPL9F437g8mJKkNPSiZEIpVv7FF2Uo3fQajzQYkYjEXLYg/s1600-h/image%5B23%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKYiqxQvjJc8Uq6l5XGbu4Rh7VLpJSBX-wqCrY9eLdrX9rNVWtnAAWwyTzsLaMWZubSExqNi15Virq0izklSy9pwrWbAZwrpAz1rCDLNNN3l2ZGDvztbNaK3gRyX3ke1mbqBBzSJYc0A/?imgmax=800" width="473" height="129" /></a> </p>
<p>As you can see, the only query that took advantage of query plan reuse is the inline parameterized query.  When dealing with dynamic SQL, you need to use sp_executesql with inline parameters to take full advantage of query plan reuse.  Also, fully parameterizing dynamic SQL reduces the risk of SQL injection attacks.</p>
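<p>Inline parameters are not limited to input values. As a minimal sketch, assuming the #t table from the script above still exists in your session, the call below passes one input parameter and reads a count back through an OUTPUT parameter. The parameter names @min_id and @cnt and the variable names are made up for this illustration.</p>
<pre class="csharpcode">DECLARE @sql NVARCHAR(MAX),
@lowest_id INT,
@row_count INT
SET @lowest_id = 2
SET @sql = N'SELECT @cnt = COUNT(*) FROM #t WHERE id &gt;= @min_id'
--The second argument declares the parameters; the remaining arguments map variables to them
EXEC sp_executesql @sql, N'@min_id INT, @cnt INT OUTPUT',@min_id=@lowest_id,@cnt=@row_count OUTPUT
SELECT @row_count AS RowsFound
GO</pre>
<p>Because the statement text never changes, repeated calls with different @lowest_id values should be able to reuse the same cached plan.</p>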
<h3>Tables have a guaranteed sort order</h3>
<p>The final myth I will be talking about deals with tables and their sort order.  By definition, tables do not have an order.  A table is an unordered set of rows.  Confusion occurs because clustered indexes are supposed to order the rows.  While clustered indexes do sort the rows by the index key when they are built, there are no guarantees that the clustered index will be used, so you may not get clustered index order.  There are many execution plan operators that can influence the order in which data is returned. Operators such as stream aggregates, partitions, parallelism and even the use of nonclustered indexes can change the order of returned results.  The only guarantee that exists is that there is no guaranteed order without the use of an ORDER BY clause.  Let’s see this in action.</p>
<pre class="csharpcode"><span class="kwrd">IF</span> object_id(<span class="str">'tempdb.dbo.#t'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> #t;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> #t(
Id <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
Col <span class="kwrd">CHAR</span>(1)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (<span class="str">'a'</span>);
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (<span class="str">'b'</span>);
INSERT <span class="kwrd">INTO</span> #t <span class="kwrd">VALUES</span> (<span class="str">'c'</span>);
<span class="kwrd">GO</span>
--<span class="kwrd">Clustered</span> <span class="kwrd">Index</span> scan yields <span class="kwrd">clustered</span> <span class="kwrd">index</span> <span class="kwrd">order</span>
<span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> #t
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">UNIQUE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_Id_Col <span class="kwrd">ON</span> [#t](Id <span class="kwrd">DESC</span>,Col);
<span class="kwrd">GO</span>
--<span class="kwrd">NONClustered</span> <span class="kwrd">Index</span> scan yields <span class="kwrd">NONClustered</span> <span class="kwrd">index</span> <span class="kwrd">order</span>
<span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> [#t]</pre>
<p></p>
<p>Results:</p>
<p><a href="http://lh4.ggpht.com/_ayZBUzPGG9A/S1d5Wtxe39I/AAAAAAAAAXA/gWq0NH0z548/s1600-h/image%5B19%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOY3qp81atPfrh5piTkWoVtFPe0xGEPFj-wTUotl7JeeyMZNlVuixU7Xk4cPTmZMY7FivY1uRBkiWauPs5WQn7_kH_oH5RNMxrzR3OcUuR9K7rQGN2siNiGh5XH09HOnyzNbPUHsqvuw/?imgmax=800" width="217" height="279" /></a> </p>
<p>As you can see, many things can influence the order of the result set.  If you need the data returned in a specific order, you MUST specify an ORDER BY clause; otherwise, you risk writing code that only works some of the time.</p>
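<p>For completeness, here is the fix, assuming the #t table from the script above still exists in your session. With an explicit ORDER BY, the rows come back in Id order regardless of which index the optimizer chooses to read.</p>
<pre class="csharpcode">--ORDER BY is the only way to guarantee the order of the result set
SELECT Id,Col
FROM [#t]
ORDER BY Id ASC;
GO</pre>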
<h3>Wrap-up</h3>
<p>That’s it. I have gone through the myths that I have encountered over the years.  If you have any more ideas, or any myths you want to debunk, post comments here.  I hope that you all have learned something new and hopefully the spread of these myths will stop here.</p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com3tag:blogger.com,1999:blog-4646137438366687895.post-79348731137333215892010-01-11T19:53:00.001-08:002010-01-13T19:26:36.485-08:00SQL Server Myths Debunked (Part 1)<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPlM1Di4C29v1_SSw38O9lbX78lVDXrhF9co6mAC3gywC0cdagBV9nzTjbMSx97dj5-VnlYkJMJQ1oWI9aoNpeKC8s7O5eyAUIqdhdZsu2hFDDhGoVOo_5dnrgCxd8W1oQ3battk8tIg/s1600-h/MythBusters%5B24%5D.jpg"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; margin-left: 0px; border-left-width: 0px; margin-right: 0px" title="MythBusters" border="0" alt="MythBusters" align="left" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHXUS577mKCyv1gTpWIqi0eqCuh2zTFZqZe_TA0YyyGEc4E_hJRXLlsjzRpQf6huCNcPDPnvKFD6sSpSsGqcbNIVevovl-NimQwSKEsG-wFB4EfOwtZh_KM7jx-WNFLnIXkiBzzQyBtw/?imgmax=800" width="244" height="139" /></a> Today I wanted to talk about some of the common misconceptions or myths that I have encountered over the years.  A lot of these myths are so widespread because of the sheer amount of misinformation available.  Myths are born in a number of ways, but typically SQL Server myths are brought to life by misinformation from blogs, articles and other unreliable sources that spreads through the community like wildfire. 
</p> <p> </p> <p>Below is a list of myths that I have encountered and will be discussing in this series.</p> <ul> <li><a href="#Link1">Table Variables do not exist in TempDB </a></li> <li><a href="#Link2">Table variables cannot have indexes </a></li> <li><a href="#Link3">Putting a db in simple recovery stops the log from growing </a></li> <li><a href="#Link4">Shrinking a log breaks the log chain  </a></li> <li><a href="#Link5">Backing up the transaction log frees up OS storage </a></li> <li><a href="#Link6">Table data can be stored separately of the clustered index </a></li> <li><a href="#Link7">Columns just need to exist in the index to be used </a></li> <li><a href="#Link8">While loops are faster than cursors </a></li> <li><a href="#Link9">sp_executesql is always better than exec </a></li> <li><a href="#Link10">tables have a guaranteed sort order </a></li> </ul> <a name="Link1"> <h2>Table Variables do not exist in TempDB </h2> <p>Let’s start with one of my favorite myths.  There is a very common misconception that table variables do not exist in TempDB, but only in memory.  This is absolutely false.  Table variables exist in memory when they are small enough to fit, but they always consume storage in TempDB and make entries into the TempDB log.</p> <p>The example below demonstrates how you can view the table variable’s definition in tempdb.sys.tables.</p> <pre class="csharpcode">--Make the master db the <span class="kwrd">current</span> db
<span class="kwrd">USE</span> [master]
<span class="kwrd">GO</span>
--<span class="kwrd">Declare</span> a <span class="kwrd">table</span> <span class="kwrd">variable</span>
<span class="kwrd">DECLARE</span> @t <span class="kwrd">TABLE</span>(
Id <span class="kwrd">INT</span>,
Col <span class="kwrd">CHAR</span>(1)
);
--<span class="kwrd">Get</span> <span class="kwrd">Table</span> <span class="kwrd">Variable</span> Definition
<span class="kwrd">SELECT</span> t.NAME,c.name,ty.name, [c].[max_length]
<span class="kwrd">FROM</span> tempdb.sys.tables t
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> tempdb.sys.columns c
<span class="kwrd">ON</span> [t].[object_id] = [c].[object_id]
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> tempdb.sys.types ty
<span class="kwrd">ON</span> [c].[system_type_id] = [ty].[system_type_id]
<span class="kwrd">WHERE</span> t.name <span class="kwrd">LIKE</span> <span class="str">'#%'</span></pre>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style></a><a name="Link2">
<h2>Table variables cannot have indexes </h2>
<p>The next myth also deals with table variables.  It is often thought that table variables cannot have indexes, which is absolutely false.  Table variables can have indexes, as long as you define them when the variable is declared.  In SQL Server 2005 and 2008 these indexes come from PRIMARY KEY and UNIQUE constraints, so the only restriction is that you cannot create non-unique indexes.  It is important to remember that even though an index exists on the table variable, SQL Server still does not maintain statistics on table variables.  This means the optimizer will typically estimate one row, even when a scan touches many more.</p>
</a>
<pre class="csharpcode"><span class="kwrd">DECLARE</span> @t <span class="kwrd">TABLE</span>(
Id <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,
Col <span class="kwrd">CHAR</span>(1) <span class="kwrd">UNIQUE</span>
);
--<span class="kwrd">Get</span> <span class="kwrd">Table</span> <span class="kwrd">Variable</span> Definition
<span class="kwrd">SELECT</span> i.name, i.type_desc,i.is_unique,i.is_primary_key,i.[is_unique_constraint]
<span class="kwrd">FROM</span> tempdb.sys.tables t
<span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> tempdb.sys.indexes i
<span class="kwrd">ON</span> i.[object_id] = t.[object_id]
<span class="kwrd">WHERE</span> t.name <span class="kwrd">LIKE</span> <span class="str">'#%'</span></pre>
<p>Results:</p>
<p></p>
<a href="http://lh5.ggpht.com/_ayZBUzPGG9A/S0vyIVM0JjI/AAAAAAAAAWY/FkHTd8CVVVw/s1600-h/image3.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfz_tBqYbxg9C1R1UT0hn0yuTwZBLx91YqzIYOGJ-_ZPIdHAEbh59c-2p3INNkuBHPlKAif46EwvsF4r3LAt3MPeEuXM6WnsIvRns8V3_IeVDrL1W7zMlJskOJt7WtQx8GaZBpeAUIjA/?imgmax=800" width="505" height="85" /></a> <a name="Link3">
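<p>If you need what is effectively a non-unique index on a table variable, one common workaround is to append a unique column to the key, so the constraint stays unique while the leading column is still searchable.  The sketch below assumes the same two-column table as in the table variable example above, and the sample rows are invented for illustration.  OPTION (RECOMPILE) is a separate technique: it lets the optimizer see the actual row count at execution time instead of the one-row guess.</p>
<pre class="csharpcode">DECLARE @t TABLE(
Id INT PRIMARY KEY CLUSTERED,
Col CHAR(1),
--Col alone may repeat; pairing it with Id keeps the constraint unique
UNIQUE NONCLUSTERED (Col, Id)
);

INSERT INTO @t (Id, Col) VALUES (1, 'A');
INSERT INTO @t (Id, Col) VALUES (2, 'A');
INSERT INTO @t (Id, Col) VALUES (3, 'B');

--Duplicate values of Col are accepted, and the index on (Col, Id) can support this predicate
SELECT Id, Col
FROM @t
WHERE Col = 'A'
OPTION (RECOMPILE);</pre>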
<h2>Putting a db in simple recovery stops the log from growing </h2>
<p>One of the most misunderstood topics in the SQL Server realm is how the simple recovery model actually works.  Some believe that changing the recovery model to simple prevents the transaction log from growing.  This is wrong on so many levels.  If the transaction log were not able to grow, how would a large transaction be rolled back?  In the simple recovery model the inactive portion of the log becomes reusable after a checkpoint, but the log records of an open transaction stay active until it completes, so the log still has to grow to accommodate large transactions.  Let’s have a look at this in action.</p>
<pre class="csharpcode"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>
<span class="kwrd">GO</span>
<span class="kwrd">USE</span> [master]
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> db_id(<span class="str">'TestSimpleRecovery'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
--<span class="kwrd">Create</span> DB
<span class="kwrd">CREATE</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery;
<span class="kwrd">GO</span>
--Change Recovery Model <span class="kwrd">To</span> Simple
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery <span class="kwrd">SET</span> RECOVERY SIMPLE
<span class="kwrd">GO</span>
--Change the FileGrowth <span class="kwrd">To</span> 100MB
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery
<span class="kwrd">MODIFY</span> <span class="kwrd">FILE</span>(NAME = TestSimpleRecovery_Log,FILEGROWTH=100MB);
<span class="kwrd">GO</span>
--Switch DB Context
<span class="kwrd">USE</span> [TestSimpleRecovery]
<span class="kwrd">GO</span>
--<span class="kwrd">Get</span> <span class="kwrd">Current</span> Log <span class="kwrd">Size</span> (<span class="kwrd">before</span> XACT)
<span class="kwrd">SELECT</span> ((<span class="kwrd">size</span> * 8) / 1024.) / 1024. <span class="kwrd">AS</span> SizeMBsBeforeTransaction
<span class="kwrd">FROM</span> sys.[master_files]
<span class="kwrd">WHERE</span> type = 1 <span class="kwrd">AND</span> [database_id] = db_id(<span class="str">'TestSimpleRecovery'</span>)
<span class="kwrd">GO</span>
--<span class="kwrd">Drop</span> Sample <span class="kwrd">Table</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2)
);
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @i <span class="kwrd">INT</span>
<span class="kwrd">SET</span> @i = 1
<span class="kwrd">BEGIN</span> <span class="kwrd">TRANSACTION</span>
<span class="kwrd">WHILE</span> @i < 100
<span class="kwrd">BEGIN</span>
INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 1000
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID()))%2500+1 <span class="kwrd">AS</span> SomeId,
<span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)
+ <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
<span class="kwrd">SET</span> @i = @i + 1
<span class="kwrd">END</span>
<span class="kwrd">COMMIT</span> <span class="kwrd">TRANSACTION</span>
<span class="kwrd">GO</span>
--<span class="kwrd">Get</span> <span class="kwrd">Current</span> Log <span class="kwrd">Size</span> (<span class="kwrd">After</span> XACT)
<span class="kwrd">SELECT</span> ((<span class="kwrd">size</span> * 8) / 1024.) / 1024. <span class="kwrd">AS</span> SizeMBsAfterTransaction
<span class="kwrd">FROM</span> sys.[master_files]
<span class="kwrd">WHERE</span> type = 1 <span class="kwrd">AND</span> [database_id] = db_id(<span class="str">'TestSimpleRecovery'</span>)
<span class="kwrd">GO</span>
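/*
SizeMBsBeforeTransaction
-------------------------------------<span class="rem">--</span>
0.00054931640
SizeMBsAfterTransaction
-------------------------------------<span class="rem">--</span>
0.09820556640
Note: size is stored in 8 KB pages, so dividing by 1024 twice yields GB, not MB.
The log grew from roughly 0.56 MB to roughly 100.56 MB.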
*/</pre>
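<p>What the simple recovery model does change is when log space becomes reusable.  The following sketch assumes the TestSimpleRecovery database from the script above still exists, and shows two built-in ways to see how full the log is and what, if anything, is preventing reuse.</p>
<pre class="csharpcode">--Recovery model and the reason (if any) log space cannot currently be reused
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'TestSimpleRecovery';
GO
--Log size and percentage in use for every database on the instance
DBCC SQLPERF(LOGSPACE);
GO</pre>
<p>While a large transaction is still open, log_reuse_wait_desc would typically report ACTIVE_TRANSACTION.  Once it commits and a checkpoint runs, the percentage in use should drop, even though the file itself stays at its grown size.</p>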
</a><a name="Link4">
<h2>Shrinking a log breaks the log chain  </h2>
<p>The next myth I will be discussing concerns shrinking the transaction log.  It is often claimed that shrinking the transaction log breaks the log chain.  This is absolutely false!  This is a prime example of how confusion can cause misinformation.  It is true that backing up the log with the TRUNCATE_ONLY option breaks the log chain, but this is completely different from shrinking the log.  Shrinking releases unused space from a data or log file.  Shrinking the log file does not affect the active portion of the log at all, which means the backup chain is fully intact.  For more information, you should read this article, </p>
<a title="http://msdn.microsoft.com/en-us/library/ms178037.aspx" href="http://msdn.microsoft.com/en-us/library/ms178037.aspx">http://msdn.microsoft.com/en-us/library/ms178037.aspx</a>.</a> </a>
<pre class="csharpcode"><span class="kwrd">IF</span> <span class="kwrd">NOT</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.databases <span class="kwrd">WHERE</span> name = <span class="str">'ShrinkLog_Test'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">CREATE</span> <span class="kwrd">DATABASE</span> [ShrinkLog_Test];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">BACKUP</span> <span class="kwrd">DATABASE</span> [ShrinkLog_Test]
<span class="kwrd">TO</span> <span class="kwrd">DISK</span> = <span class="str">'c:\ShrinkLog_Test.bak'</span> <span class="kwrd">WITH</span> INIT, STATS=10
<span class="kwrd">GO</span>
<span class="kwrd">BACKUP</span> LOG [ShrinkLog_Test]
<span class="kwrd">TO</span> <span class="kwrd">DISK</span> = <span class="str">'c:\ShrinkLog_Test.trn'</span> <span class="kwrd">WITH</span> INIT, STATS=10
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 1 b.type, b.first_lsn, b.last_lsn
<span class="kwrd">FROM</span> msdb..backupset b
<span class="kwrd">WHERE</span> b.[database_name] = <span class="str">'ShrinkLog_Test'</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [backup_start_date] <span class="kwrd">DESC</span>
/*
type first_lsn last_lsn
--<span class="rem">-- --------------------------------------- ---------------------------------------</span>
L 41000000005400064 41000000009000001
*/
<span class="kwrd">USE</span> [ShrinkLog_Test]
<span class="kwrd">GO</span>
<span class="kwrd">DBCC</span> SHRINKFILE(<span class="str">'ShrinkLog_Test_log'</span>,1,TRUNCATEONLY)
<span class="kwrd">GO</span>
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 1 b.type, b.first_lsn, b.last_lsn
<span class="kwrd">FROM</span> msdb..backupset b
<span class="kwrd">WHERE</span> b.[database_name] = <span class="str">'ShrinkLog_Test'</span>
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> [backup_start_date] <span class="kwrd">DESC</span>
/*
type first_lsn last_lsn
--<span class="rem">-- --------------------------------------- ---------------------------------------</span>
L 41000000005400064 41000000009000001
*/
<span class="kwrd">USE</span> master
<span class="kwrd">GO</span>
<span class="kwrd">RESTORE</span> <span class="kwrd">DATABASE</span> ShrinkLog_Test <span class="kwrd">FROM</span> <span class="kwrd">DISK</span> = <span class="str">'c:\ShrinkLog_Test.bak'</span> <span class="kwrd">WITH</span> norecovery, REPLACE
<span class="kwrd">GO</span>
<span class="kwrd">RESTORE</span> log ShrinkLog_Test <span class="kwrd">FROM</span> <span class="kwrd">DISK</span> = <span class="str">'c:\ShrinkLog_Test.trn'</span> <span class="kwrd">WITH</span> recovery, REPLACE
<span class="kwrd">GO</span>
/*
Processed 168 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'ShrinkLog_Test'</span>, <span class="kwrd">file</span> <span class="str">'ShrinkLog_Test'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
Processed 3 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'ShrinkLog_Test'</span>, <span class="kwrd">file</span> <span class="str">'ShrinkLog_Test_log'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
<span class="kwrd">RESTORE</span> <span class="kwrd">DATABASE</span> successfully processed 171 pages <span class="kwrd">in</span> 0.061 seconds (21.780 MB/sec).
Processed 0 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'ShrinkLog_Test'</span>, <span class="kwrd">file</span> <span class="str">'ShrinkLog_Test'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
Processed 5 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'ShrinkLog_Test'</span>, <span class="kwrd">file</span> <span class="str">'ShrinkLog_Test_log'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
<span class="kwrd">RESTORE</span> LOG successfully processed 5 pages <span class="kwrd">in</span> 0.043 seconds (0.794 MB/sec).
*/</pre>
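<p>The restore above replays a log backup that was taken before the shrink.  A stricter check is to take a second log backup after the shrink and restore both in sequence.  The sketch below is meant to be run after the DBCC SHRINKFILE step, in place of the restore above; the file name ShrinkLog_Test_2.trn is hypothetical, and msdb may also list backups from earlier runs.</p>
<pre class="csharpcode">USE [master]
GO
--Take a second log backup after the shrink (hypothetical file name)
BACKUP LOG [ShrinkLog_Test]
TO DISK = 'c:\ShrinkLog_Test_2.trn' WITH INIT
GO
--In an unbroken chain, each log backup's first_lsn equals the previous log backup's last_lsn
SELECT b.backup_start_date, b.first_lsn, b.last_lsn
FROM msdb..backupset b
WHERE b.[database_name] = 'ShrinkLog_Test'
AND b.type = 'L'
ORDER BY b.backup_start_date
GO
--Restore the full backup, then both log backups in order
RESTORE DATABASE ShrinkLog_Test FROM DISK = 'c:\ShrinkLog_Test.bak' WITH NORECOVERY, REPLACE
GO
RESTORE LOG ShrinkLog_Test FROM DISK = 'c:\ShrinkLog_Test.trn' WITH NORECOVERY
GO
RESTORE LOG ShrinkLog_Test FROM DISK = 'c:\ShrinkLog_Test_2.trn' WITH RECOVERY
GO</pre>
<p>If the shrink had broken the chain, the second RESTORE LOG would fail with an LSN mismatch error.</p>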
<a name="Link5">
<h2>Backing up the transaction log frees up OS storage</h2>
<p>The next myth is one of the most problematic ones, because those who believe it are baffled when they back up their transaction log and the space is not given back to the OS.  A log backup only marks the inactive portion of the log as reusable; the log file itself stays the same size on disk.  I would like to point out that I do not recommend shrinking data files unless you are absolutely crunched for space.  Shrinking the log is the only way to return storage to the OS.</p>
<pre class="csharpcode"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>
<span class="kwrd">GO</span>
<span class="kwrd">USE</span> [master]
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> db_id(<span class="str">'TestSimpleRecovery'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery;
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
--<span class="kwrd">Create</span> DB
<span class="kwrd">CREATE</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery;
<span class="kwrd">GO</span>
--Change Recovery Model <span class="kwrd">To</span> Full
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery <span class="kwrd">SET</span> RECOVERY <span class="kwrd">FULL</span>
<span class="kwrd">GO</span>
--Change the FileGrowth <span class="kwrd">To</span> 100MB
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery
<span class="kwrd">MODIFY</span> <span class="kwrd">FILE</span>(NAME = TestSimpleRecovery_Log,FILEGROWTH=100MB);
<span class="kwrd">GO</span>
<span class="kwrd">BACKUP</span> <span class="kwrd">DATABASE</span> TestSimpleRecovery <span class="kwrd">TO</span> <span class="kwrd">DISK</span> = <span class="str">'C:\Test.bak'</span>
--Switch DB Context
<span class="kwrd">USE</span> [TestSimpleRecovery]
<span class="kwrd">GO</span>
--<span class="kwrd">Get</span> <span class="kwrd">Current</span> Log <span class="kwrd">Size</span> (<span class="kwrd">before</span> XACT)
<span class="kwrd">SELECT</span> ((<span class="kwrd">size</span> * 8) / 1024.) / 1024. <span class="kwrd">AS</span> SizeMBsBeforeTransaction
<span class="kwrd">FROM</span> sys.[master_files]
<span class="kwrd">WHERE</span> type = 1 <span class="kwrd">AND</span> [database_id] = db_id(<span class="str">'TestSimpleRecovery'</span>)
<span class="kwrd">GO</span>
--<span class="kwrd">Drop</span> Sample <span class="kwrd">Table</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(
RowNum <span class="kwrd">INT</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2)
);
<span class="kwrd">GO</span>
<span class="kwrd">DECLARE</span> @i <span class="kwrd">INT</span>
<span class="kwrd">SET</span> @i = 1
<span class="kwrd">BEGIN</span> <span class="kwrd">TRANSACTION</span>
<span class="kwrd">WHILE</span> @i < 100
<span class="kwrd">BEGIN</span>
INSERT <span class="kwrd">INTO</span> dbo.TestData
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 1000
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID()))%2500+1 <span class="kwrd">AS</span> SomeId,
<span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)
+ <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
<span class="kwrd">SET</span> @i = @i + 1
<span class="kwrd">END</span>
<span class="kwrd">COMMIT</span> <span class="kwrd">TRANSACTION</span>
<span class="kwrd">GO</span>
--<span class="kwrd">Get</span> <span class="kwrd">Current</span> Log <span class="kwrd">Size</span> (<span class="kwrd">After</span> XACT)
<span class="kwrd">SELECT</span> ((<span class="kwrd">size</span> * 8) / 1024.) / 1024. <span class="kwrd">AS</span> SizeMBsAfterTransaction
<span class="kwrd">FROM</span> sys.[master_files]
<span class="kwrd">WHERE</span> type = 1 <span class="kwrd">AND</span> [database_id] = db_id(<span class="str">'TestSimpleRecovery'</span>)
<span class="kwrd">GO</span>
<span class="kwrd">BACKUP</span> log TestSimpleRecovery <span class="kwrd">TO</span> <span class="kwrd">DISK</span> = <span class="str">'c:\test.trn'</span>
<span class="kwrd">GO</span>
--<span class="kwrd">Get</span> <span class="kwrd">Current</span> Log <span class="kwrd">Size</span> (<span class="kwrd">After</span> Log <span class="kwrd">Backup</span>)
<span class="kwrd">SELECT</span> ((<span class="kwrd">size</span> * 8) / 1024.) / 1024. <span class="kwrd">AS</span> SizeMBsAfterBackup
<span class="kwrd">FROM</span> sys.[master_files]
<span class="kwrd">WHERE</span> type = 1 <span class="kwrd">AND</span> [database_id] = db_id(<span class="str">'TestSimpleRecovery'</span>)
<span class="kwrd">GO</span>
/*
Processed 168 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'TestSimpleRecovery'</span>, <span class="kwrd">file</span> <span class="str">'TestSimpleRecovery'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
Processed 2 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'TestSimpleRecovery'</span>, <span class="kwrd">file</span> <span class="str">'TestSimpleRecovery_log'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
<span class="kwrd">BACKUP</span> <span class="kwrd">DATABASE</span> successfully processed 170 pages <span class="kwrd">in</span> 0.361 seconds (3.677 MB/sec).
SizeMBsBeforeTransaction
-------------------------------------<span class="rem">--</span>
0.00054931640
SizeMBsAfterTransaction
-------------------------------------<span class="rem">--</span>
0.09820556640
Processed 1407 pages <span class="kwrd">for</span> <span class="kwrd">database</span> <span class="str">'TestSimpleRecovery'</span>, <span class="kwrd">file</span> <span class="str">'TestSimpleRecovery_log'</span> <span class="kwrd">on</span> <span class="kwrd">file</span> 1.
<span class="kwrd">BACKUP</span> LOG successfully processed 1407 pages <span class="kwrd">in</span> 3.002 seconds (3.659 MB/sec).
SizeMBsAfterBackup
-------------------------------------<span class="rem">--</span>
0.09820556640
Note: size is stored in 8 KB pages, so dividing by 1024 twice yields GB, not MB.
The log file is roughly 100.56 MB both before and after the log backup.
*/</pre>
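<p>To actually hand the space back to the OS, the log file has to be shrunk after the log backup.  The sketch below assumes the TestSimpleRecovery database from the script above and the default logical log file name TestSimpleRecovery_log; the 1 MB target size is an arbitrary choice for the demo.</p>
<pre class="csharpcode">USE [TestSimpleRecovery]
GO
--Release unused log space back to the OS; 1 is the target size in MB
DBCC SHRINKFILE('TestSimpleRecovery_log', 1)
GO
--Log size on disk after the shrink (size is stored in 8 KB pages)
SELECT (size * 8) / 1024. AS SizeMBsAfterShrink
FROM sys.[master_files]
WHERE type = 1 AND [database_id] = db_id('TestSimpleRecovery')
GO</pre>
<p>If the file does not get smaller, the active portion of the log may be sitting at the end of the file.  In that case, taking another log backup and running the shrink again usually does the trick.</p>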
<p>I am trying to keep this series manageable, so I will stop here and pick up the remaining items in part 2 of this series.  If you have any myths that you would like to see busted or have myths you would like to share, please feel free to leave me comments.  Stay tuned for part 2!</p>
<p>Until next time, happy coding.</p>
</a>
</li> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com6tag:blogger.com,1999:blog-4646137438366687895.post-18984555059417549222009-12-15T19:16:00.001-08:002009-12-15T19:57:26.377-08:00SQL Server 2005 – How To Move 10 Millions Rows In 1 Millisecond<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/SyhQ89fZi1I/AAAAAAAAAWI/pXXfq-p4R1o/s1600-h/StarWars%5B4%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; margin-left: 0px; border-left-width: 0px; margin-right: 0px" title="StarWars" border="0" alt="StarWars" align="left" src="http://lh5.ggpht.com/_ayZBUzPGG9A/SyhQ9ZWLZSI/AAAAAAAAAWM/5It3EPFe_8o/StarWars_thumb%5B2%5D.png?imgmax=800" width="244" height="146" /></a> This blog post is more of a tip that I picked up while at PASS 2009.  Have you ever had the need to move the contents of an entire table into another table?  Traditionally speaking, we as developers will use a SELECT INTO or an INSERT INTO statement to load a destination table.  This is still a great way of accomplishing the task at hand, but it is not nearly as fast as what I am about to show you.  The method I am about to show you is not for all scenarios, but it can be very handy.  I do not know how many Oracle guys are reading this, but I have one question for you, “can your RDBMS move 10 million rows of data in 1 millisecond or less?”  I would be willing to bet that most will answer no, but I have to admit that this is not really a fair fight.  Why is this a little unfair?  Let’s have a look at how this works under the hood.  </p> <p>This method derives its power from the partitioning functionality that is new in SQL Server 2005.  If you have used partitioning in SQL Server 2005, you probably have a good idea where I am going with this.  If not, SQL Server 2005 has built-in functionality that allows tables to be split or divided into what I will call virtual tables, whose values are dependent on predefined boundaries. 
When the partitioning column is used in a query predicate, the optimizer knows which partition the data resides in, which makes queries more IO efficient.  This is amazing functionality because it does not require application changes and significantly reduces the amount of data SQL Server has to sift through. The partitioning feature I will be focusing on is the one that allows SQL Server to switch or trade partitions out, aptly named SWITCH.  This is commonly used for situations where you want to move data to a different partition, either because the boundaries have changed or because you need to phase data out.  The real benefit of using SWITCH is that SQL Server does not actually move the data; it only updates the metadata pointers to the data.  A nonpartitioned table is treated as a single partition, which is why the technique also works for the plain tables in the example below.  Because I am not actually moving data, I am able to move data around nearly instantaneously, regardless of the number of rows. This is why I said it is not fair, but hey, what in life is fair :^) </p> <p>Okay, let’s see an example.  I will start by creating a sample table.</p> <pre class="csharpcode"><span class="kwrd">USE</span> [tempdb]
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> [dbo].[TestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [dbo].[TestData](
RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2)
);
<span class="kwrd">GO</span>
INSERT <span class="kwrd">INTO</span> [dbo].[TestData]
<span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 10000000
ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,
ABS(CHECKSUM(NEWID()))%2500+1 <span class="kwrd">AS</span> SomeId,
<span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)
+ <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode
<span class="kwrd">FROM</span>
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
<span class="kwrd">GO</span>
<span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'NewTestData'</span>)
<span class="kwrd">BEGIN</span>
<span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> [dbo].[NewTestData];
<span class="kwrd">END</span>
<span class="kwrd">GO</span>
--<span class="kwrd">Create</span> <span class="kwrd">New</span> <span class="kwrd">Table</span> <span class="kwrd">To</span> Move <span class="kwrd">Data</span> <span class="kwrd">To</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [dbo].[NewTestData](
RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,
SomeId <span class="kwrd">INT</span>,
SomeCode <span class="kwrd">CHAR</span>(2)
);
GO</pre>
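<p>Before running the switch, it can be worth confirming the main prerequisites.  Among other things, SWITCH requires the target to be empty, to have the same columns and indexes as the source, and to sit on the same filegroup.  The query below is a minimal sketch for checking the row counts and filegroups, assuming both tables were created in tempdb as shown above.</p>
<pre class="csharpcode">USE [tempdb]
GO
--Row count and filegroup for each index partition of the source and target tables
SELECT OBJECT_NAME(p.[object_id]) AS TableName,
p.index_id,
p.partition_number,
p.rows,
ds.name AS DataSpaceName
FROM sys.partitions p
INNER JOIN sys.indexes i
ON i.[object_id] = p.[object_id] AND i.index_id = p.index_id
INNER JOIN sys.data_spaces ds
ON ds.data_space_id = i.data_space_id
WHERE p.[object_id] IN (OBJECT_ID('dbo.TestData'), OBJECT_ID('dbo.NewTestData'))
GO</pre>
<p>At this point NewTestData should report zero rows, and both tables should show the same DataSpaceName.</p>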
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Now the fun part... behold the power of SQL Server!</p>
<pre class="csharpcode">--Move <span class="kwrd">data</span> <span class="kwrd">to</span> the <span class="kwrd">new</span> <span class="kwrd">table</span>
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> <span class="kwrd">TIME</span> <span class="kwrd">ON</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">ON</span>;
<span class="kwrd">ALTER</span> <span class="kwrd">TABLE</span> [dbo].[TestData] SWITCH <span class="kwrd">to</span> [dbo].[NewTestData];
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> <span class="kwrd">TIME</span> <span class="kwrd">OFF</span>;
<span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">OFF</span>;
<span class="kwrd">GO</span>
/*
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 0 ms.
<span class="kwrd">SQL</span> Server Execution Times:
CPU <span class="kwrd">time</span> = 0 ms, elapsed <span class="kwrd">time</span> = 1 ms.
*/</pre>
<p>Next, I will verify the results.</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> <span class="kwrd">COUNT</span>(*) <span class="kwrd">FROM</span> [dbo].[TestData]; --0
<span class="kwrd">SELECT</span> <span class="kwrd">COUNT</span>(*) <span class="kwrd">FROM</span> [dbo].[NewTestData]; --10,000,000
/*
---------<span class="rem">--</span>
0
(1 <span class="kwrd">row</span>(s) affected)
---------<span class="rem">--</span>
10000000
(1 <span class="kwrd">row</span>(s) affected)
*/</pre>
<p>There you have it.  I have successfully moved 10 million rows into a new table in 1 ms, and STATISTICS IO reported no IO; in reality some IO has to be incurred to update the metadata, although it should be minimal.  This method has limited use, but it can be extremely advantageous when it applies.  There are stipulations that have to be met for SWITCH, such as the target table must be empty and both tables must have the same schema.  For a comprehensive list of requirements, please refer to this link: <a title="http://technet.microsoft.com/en-us/library/ms191160.aspx" href="http://technet.microsoft.com/en-us/library/ms191160.aspx">http://technet.microsoft.com/en-us/library/ms191160.aspx</a>.  </p>
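<p>Because SWITCH fails outright when the target table holds rows, it can be worth guarding the statement.  Here is a minimal sketch, assuming the same [dbo].[TestData] and [dbo].[NewTestData] tables used above.  The guard and the error message are my own illustration and cover only the "empty target" requirement; the other requirements in the linked list still apply.</p>
<pre class="csharpcode">--Hypothetical guard: only attempt the switch when the target table is empty
IF NOT EXISTS(SELECT 1 FROM [dbo].[NewTestData])
BEGIN
    ALTER TABLE [dbo].[TestData] SWITCH TO [dbo].[NewTestData];
END
ELSE
BEGIN
    --Severity 16 surfaces this as an ordinary user error
    RAISERROR('NewTestData is not empty; SWITCH requires an empty target.', 16, 1);
END
GO</pre>
<p>The same pattern works in reverse: once [dbo].[TestData] is empty, switching [dbo].[NewTestData] back into it should be just as cheap, since only metadata changes hands.</p>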
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com20tag:blogger.com,1999:blog-4646137438366687895.post-33629965541584410012009-12-09T19:09:00.001-08:002009-12-12T14:25:30.608-08:00Splitting A Delimited String (Part 2)<p>This is part two of a two-part series.  In part 1 of this series I demonstrated the most popular methods used to parse/split a delimited string of values, <a title="http://jahaines.blogspot.com/2009/11/splitting-delimited-string-part-1.html" href="http://jahaines.blogspot.com/2009/11/splitting-delimited-string-part-1.html">http://jahaines.blogspot.com/2009/11/splitting-delimited-string-part-1.html</a>.  In this article I will be focusing on the performance implications of each method presented.  I will start off by giving a disclaimer that your results may vary from the results presented in this article.  Although the numbers may differ, the data trend should be somewhat consistent with my results.  I will be tracking three key performance counters: CPU, Duration, and Reads against varying string sizes and data loads. In addition to varying the delimited string's length and the load, I ran each query 10 times and took the average.  I did this to get the most accurate results I could. Enough talk; let’s dive into our first test.</p> <p><em>Note: I am not going to walk through how I did each test, but I will link all my scripts at the bottom of this post</em></p> <p>The first test measures how CPU usage differs between methods.  I tested a delimited string of exactly 10 Ids over a table with 10,000 rows, 100,000 rows and 1,000,000 rows.  As you can see, the permanent numbers table TVF is by far the best solution.  The CPU is highest on the inline Numbers TVF.  
The inline numbers table is the most expensive because SQL Server has to do a lot of processing and calculation on the fly, whereas the permanent numbers table mostly has to read data, which means its IO will be much higher.  Both XML methods perform much better than the inline numbers table, but are nearly twice as slow as the permanent numbers TVF because they require more CPU-intensive processing, which comes from SQL Server converting the data to XML and transforming the XML back to a relational format.  It takes more processing power to encode and decode the XML than to simply convert it, so you should use the encode/decode method only if your data contains special XML characters, <a title="http://msdn.microsoft.com/en-us/library/aa226544(SQL.80).aspx" href="http://msdn.microsoft.com/en-us/library/aa226544(SQL.80).aspx">http://msdn.microsoft.com/en-us/library/aa226544(SQL.80).aspx</a>.</p> <p><strong>Winner of Round 1: Numbers Table Split Function</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaoAV6-P_gD6Osn2JQWDrsrhKInSD69h3zO007cXkemq-Bty5l20CFrLbH7CQyd9nC8dTnIuzPzYte1JiUSUyMI6l_r5J0cM3f0XlTgjNbjMfmZMR5vJ_E7Lz9wE-qCpr5Oywti3cTIg/s1600-h/image%5B6%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/SyBmUgTBOxI/AAAAAAAAASs/IAo01mVKGvg/image_thumb%5B2%5D.png?imgmax=800" width="406" height="252" /></a> </p> <p>The next test was performed on tables with 10,000, 100,000, and 1,000,000 rows, using a string consisting of 100 Ids.  If you look at the chart below, you will see a trend.  As the number of Ids increases, so does the cost of the XML method.  CPU usage actually increases at an exponential rate, which makes it the least scalable solution.  Obviously the Numbers Table TVF is the clear winner here.  
</p> <p><strong>Winner of Round 2: Numbers Table Split Function</strong></p> <p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/SyBmU-0adjI/AAAAAAAAAS4/5rJO_L-UPPY/s1600-h/image%5B14%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiNcH0BKZdU6q92pHpca3Pr244yb9cvkN6nYQ-ioKTwPkfD-d3Zk3Bhu2IQcrdKZtY_aaGdKtrc_I9_JLlsbVso7VeZcPq_GGyC4bYjcZGZ0cn_GkZ4xCQzDXVp8jVV0ZPCur9WBANu8Q/?imgmax=800" width="405" height="251" /></a> </p> <p></p> <p>The final test was taken over the same load, but I supplied a string of 1000 Ids.  As you can see, with 1000 Ids the results show more of the same behavior.  The big takeaway is that the XML method should be used only when the number of values in the delimited string is relatively small. </p> <p><strong>Winner of Round 3: Numbers Table Split Function</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhl5vK-Ip2FX8HetS4KPbT1UpvY2q3ny-UG7xrk4bXkHsY0g2UdmLaqohFjDSo3RwDDN-LjX-aOTQAFTwHNzhULfl546SNlXFraa6VObsnkrarPWzkI-y61E6DvDl9ukknybhggP1jK3Q/s1600-h/image%5B3%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/SyGZuyQ9IrI/AAAAAAAAAV0/xXf6dHtgIRo/image_thumb%5B1%5D.png?imgmax=800" width="408" height="253" /></a> </p> <p>The next counter I will be focusing on is Duration.  Duration is not a very reliable counter, as it is very dependent on other processes running on the machine; however, it does provide insight into performance.  The first test will be done over the same load.  I will begin with 10 Ids again.</p> <p>The results are a little more aligned in this test.  The Numbers Table Split TVF is the best performing on average, followed by the XML methods.  
Again, there is a higher performance cost to encode XML, so do so only when necessary.  Duration does give you a general idea about performance, but these results definitely do not carry as much weight as the other counters.</p> <p><strong>Winner of Round 4: Numbers Table Split Function</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLGq1VS9MkhXBCIq5xXzA6LXTxTCN8DgGreyqfSCWmiEQjQX8uDO-HGvm-8ilvhflpV6fgOgGcz1nVaPSrkwySnY3LavMlHUS2-tbMJQNHDKOFshxpCHbmu8Xfe6Z6-5r9qpga3xbVug/s1600-h/image%5B25%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWGvcBttQgxf6Yyd_UPJtesmGHORqNNoRuXF1V7NBf2ljZQ9AWcWv_JlGiMn-uYp_ovIe7jrN3Q9G2hEw6b1g50T_fI74HhxmMpt9Jz9NStwfmETX-mOHJ0HxMFcKhz6QC0Yk-pR5avA/?imgmax=800" width="405" height="251" /></a> </p> <p>The next step is to increase the number of Ids to 100.  I won't repeat the same stuff over again, I promise.  This test yields more of the same.</p> <p><strong>Winner of Round 5: Numbers Table Split Function</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivXU1DEvN_fnuJ3L9oLn9D0sa-tXIbmkL5shGX37bbryyg8UmK4XegzJlWbyWZ5xy7F_RrqZUoZ1N3T-VU1vWTlvqGWz_OJBzGTdMDdDCBOqLUN8Ec2pdqoPushRP86L1LjSA87tNR0Q/s1600-h/image%5B33%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1fj4iilp2LMTvMc1NdG9F7hYcspEriq8S8d98FraZr2tHJtyuZCnAC8sWvVBqYPkzkulci4rjDNw2KwK44f2ieuSO744JvDvc2WZdzOk5ljDQP9nr-H4QcOau_bwnw0rPxfEQ67OKsQ/?imgmax=800" width="412" height="256" /></a> </p> <p></p> <p>Next, I bump the number of Ids to 1000.  Here we go… a different result :).  
In this example, the numbers table actually performed worse than the inline numbers TVF.  You may be wondering why the duration is worse for the numbers table split function.  I suspect the answer is that the number of reads makes the query take longer to execute; however, there could have been something running on my machine when Profiler captured the data.  This is a prime example of why duration can be an unreliable counter. I cannot tell if the query is actually worse or if an environmental factor on my laptop may have skewed the timing.  I will take the high road :^) and assume the reads impacted the timing because I had no known programs running.   The XML results are just disturbing… For as much as I recommend XML split solutions on the forums, these results are just scary.</p> <p><strong>Winner of Round 6: Inline Numbers TVF Split</strong></p> <p><a href="http://lh6.ggpht.com/_ayZBUzPGG9A/SyGZvPfgBII/AAAAAAAAAV4/LIPZ_cpoOyk/s1600-h/image%5B7%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/SyGZvibFcXI/AAAAAAAAAV8/bUh0TcnGUFM/image_thumb%5B3%5D.png?imgmax=800" width="423" height="246" /></a> </p> <p>The final counter I will be testing is reads.  This is by far one of the most important counters because it impacts so many facets of SQL Server performance. Do not let the number of reads for the Numbers Table TVF persuade you to avoid it. A permanent numbers table TVF is going to have more reads.  Essentially you are reading table values from cache/disk instead of calculating them, so the number of reads is greater.  
The obvious choice for this test is the inline numbers table TVF.</p> <p><strong>Winner of Round 7: Inline Numbers TVF Split</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6NfycPJjureDwKUbNYjlU82oN1sKC0_pwLyxcVZURcF3P3Q2D10E0AWkFydMNMgP_ajtbEDBQVJEKFYEGR0cIEyJNk-da4EnlLoGt46e_RFm0DeAEaRl_ZGRJoiZuN4KGCxUjK23tUw/s1600-h/image%5B45%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://lh5.ggpht.com/_ayZBUzPGG9A/SyBmbIs8x_I/AAAAAAAAAU8/wYck6251Wug/image_thumb%5B23%5D.png?imgmax=800" width="415" height="264" /></a> </p> <p>The next test increases the number of Ids to 100.  As for the results, we see more of the same.</p> <p><strong>Winner of Round 8: Inline Numbers TVF Split</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKVDdiGTGDWX3o9RNBZedMgAM-6iw5O57Xb3XYS6qNfl21W1Du_TxkNUyqZB6txs2NM5kz7KtJv9kA2NnLbHTmvNtEVG-TI3wDUaieNI2X9kNCTLiowuaAKtj4WmUaJVn2T5a2LP22sQ/s1600-h/image%5B49%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_PNBAe1MPkbC6IIlyqjjOD5th7oIcKD6Q5wVqlj5vzPUjjvFHzCCcLmDN__OVMpoHLhB87SowOUDZlmoHn30TMTNRC9WNvsG-ZjA8qzyl_CSx6YV1DYdf6YaBnhiz1kjAuHifOGZS5Q/?imgmax=800" width="417" height="280" /></a> </p> <p>Finally, I will increase the number of Ids to 1000. Again, more of the same.  The number of reads stays relatively consistent across all solutions.  The XML solution does better than the numbers table TVF here, but it just does not scale well at any other level.  
I used to be a firm believer in the XML method, but I think I am going to start primarily recommending the number table TVF or an inline number table TVF.</p> <p><strong>Winner of Round 9: Inline Numbers TVF Split</strong></p> <p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3keqi9d2axPGwBHob0S6n3jqkupR_L9rd3tVWGz_jQyI7emmC_mjsmA_U1jL7MBIxeIDN6S9PHj-7zflrCAC44awShbWDVLCvX3GV3RekQ17VuhlfmS-LdXFJB_wvYaDH5UsFuDWaDA/s1600-h/image%5B11%5D.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioNIFIzb6t_yw-HW3PT-Eun3iYBsnvjBFN-XK5_9iyj3XnVqT0ouCM9x2kBghnJTE3qoqoJtZBYBe1uZit4HPohW8HxQDpVyazzre6CuUldCTLnXrRhd9Xcdc8yFh64IDlnvN1bYbzTA/?imgmax=800" width="428" height="266" /></a> </p> <h3>The verdict</h3> <p>So what is the verdict?  Well, again I cannot reiterate enough that no one solution is always better than another.  Different solutions work best in different situations and environments.  In my tests, there is a clear winner, the permanent numbers table TVF.  Even though this is the “winner”, I have overwhelming evidence that says I should be using a numbers table to split a string, regardless of the numbers table being permanent or inline.  I am happy with these results because the permanent numbers table split function performs well and is very easy to implement.  Another benefit of the permanent numbers table TVF solution is that it works in all versions of SQL.  Why would you not want to use the permanent numbers table?  You may be willing to accept more CPU consumption to reduce IO, or perhaps you do not want to maintain another table.  If this is the case, an inline numbers table solution is the way to go.  All-in-all, the XML method really did surprise me  and I find it sad that this method does not perform well.  
There is just not enough incentive to use an XML solution when an easier and better performing solution exists.  Remember that you should test each method in your own environment to see which works best.  </p> <p>I hope that you have learned something and can use this information to make more informed decisions when choosing a method to split delimited strings.</p> <p>**** UPDATE *****</p> <p>I did have some bugs in the code for the 1000 Ids test.  I guess my relaxed brain never came home from vacation.  A special thanks goes out to Brad Schulz for spotting the bug.  I am glad that the end result is for the most part the same.  The only real deviation occurred in the reads category.  The number of reads should be comparable for all methods because the same amount of data should be returned, but in my original result set they were not.  I resolved the bug and have since updated the scripts and the results.</p> <p>****UPDATE***** </p> <p>A good friend and fellow SQL Server enthusiast Brad Schulz recently posted a great entry on parsing delimited strings with XML.  His findings show how using the XML method in certain ways can cause the query to really bomb; however, you can avoid some performance penalties by casting and storing the delimited string in an XML data type, instead of casting and parsing the XML inline.  I will not go into detail about why the inline XML is slower because I want you to read it right from the horse’s mouth, <a title="http://bradsruminations.blogspot.com/2009/12/delimited-string-tennis-anyone.html" href="http://bradsruminations.blogspot.com/2009/12/delimited-string-tennis-anyone.html">http://bradsruminations.blogspot.com/2009/12/delimited-string-tennis-anyone.html</a>.   When I changed my code to use an XML variable, the XML methods were just as performant as the numbers table methods.  I do not know about you, but I am ecstatic.  
I for one love the XML method and am very excited to see that it can be just as performant when used in the right context.</p> <p>Download the script files: <a title="http://cid-6f041c9a994564d8.skydrive.live.com/self.aspx/.Public/Split%20Delimited%20String/BlogPost^_UnpackDelimitedString.zip" href="http://cid-6f041c9a994564d8.skydrive.live.com/self.aspx/.Public/Split%20Delimited%20String/BlogPost^_UnpackDelimitedString.zip">http://cid-6f041c9a994564d8.skydrive.live.com/self.aspx/.Public/Split%20Delimited%20String/BlogPost^_UnpackDelimitedString.zip</a></p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com10tag:blogger.com,1999:blog-4646137438366687895.post-48068532133066542712009-11-15T15:53:00.001-08:002009-11-15T15:53:17.000-08:00Splitting A Delimited String (Part 1)<p>In a previous post, I demonstrated how to split a finite array of elements, using XML, <a title="http://jahaines.blogspot.com/2009/06/converting-delimited-string-of-values.html" href="http://jahaines.blogspot.com/2009/06/converting-delimited-string-of-values.html">http://jahaines.blogspot.com/2009/06/converting-delimited-string-of-values.html</a>.  The method presented in my previous post can only be used when there is a known number of elements.  In this post, I will be focusing on the methods most commonly used today to parse an array when the number of elements is unknown. This is part one of a two-part series where I look at the differing methods used to split an array of strings, using SQL Server 2005 and 2008.  There are three primary methods to parse an array.  The first method takes advantage of a numbers table to quickly parse the string.  The second method uses the new XML functionality built into SQL Server 2005.  The final method uses a TVF, without a permanent numbers table.  </p> <p>Let’s get started by creating our sample table.</p> <div class="csharpcode"> <pre class="alt"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'t'</span>)</pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[t];</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.t(</pre>
<pre class="alt">SomeId <span class="kwrd">INT</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,</pre>
<pre>SomeCode <span class="kwrd">CHAR</span>(2)</pre>
<pre class="alt">);</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>;<span class="kwrd">WITH</span> </pre>
<pre class="alt"> L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span></pre>
<pre> ,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)</pre>
<pre class="alt"> ,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)</pre>
<pre> ,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)</pre>
<pre class="alt"> ,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)</pre>
<pre> ,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)</pre>
<pre class="alt"> ,Nums <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5) </pre>
<pre>INSERT dbo.t (SomeId,SomeCode)</pre>
<pre class="alt"><span class="kwrd">SELECT</span> </pre>
<pre> N,</pre>
<pre class="alt"> <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)</pre>
<pre> + <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode </pre>
<pre class="alt"><span class="kwrd">FROM</span> Nums</pre>
<pre><span class="kwrd">WHERE</span> N<=10000;</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">NONCLUSTERED</span> <span class="kwrd">INDEX</span> ncl_idx_SomeCode <span class="kwrd">ON</span> dbo.t(SomeCode);</pre>
<pre>GO</pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>The first method I will demonstrate is the Numbers table method.  This is probably the most efficient method of all the methods listed, but performance does vary among environments.  Another great benefit of this method is that it works with SQL Server 2000 and greater.</p>
<p>The first step in using this method is to create a table of numbers.  A numbers table is just what it sounds like: a table of natural numbers starting from 1 and going to n, where n is the maximum number you want in the table.  This method performs really well because of the clustered index on the number column, which allows for very fast index seeks. Here is the code I use to generate my numbers table.</p>
<div class="csharpcode">
<pre class="alt">--=============================================================================</pre>
<pre><span class="rem">-- Setup</span></pre>
<pre class="alt">--=============================================================================</pre>
<pre><span class="kwrd">USE</span> [tempdb]</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span> </pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>--=============================================================================</pre>
<pre class="alt"><span class="rem">-- Create and populate a Numbers table</span></pre>
<pre>--=============================================================================</pre>
<pre class="alt">--===== Conditionally <span class="kwrd">drop</span> </pre>
<pre><span class="kwrd">IF</span> OBJECT_ID(<span class="str">'dbo.Numbers'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span> </pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.Numbers;</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.[Numbers](</pre>
<pre class="alt">N <span class="kwrd">INT</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre>);</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt">;<span class="kwrd">WITH</span> </pre>
<pre> L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span></pre>
<pre class="alt"> ,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)</pre>
<pre> ,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)</pre>
<pre class="alt"> ,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)</pre>
<pre> ,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)</pre>
<pre class="alt"> ,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)</pre>
<pre> ,Nums <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5) </pre>
<pre class="alt">INSERT dbo.Numbers (N)</pre>
<pre><span class="kwrd">SELECT</span> N <span class="kwrd">FROM</span> Nums</pre>
<pre class="alt"><span class="kwrd">WHERE</span> N<=10000;</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">ALTER</span> <span class="kwrd">TABLE</span> dbo.Numbers <span class="kwrd">ADD</span> <span class="kwrd">CONSTRAINT</span> PK_N</pre>
<pre class="alt"><span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span> ([N])<span class="kwrd">WITH</span>(<span class="kwrd">FILLFACTOR</span> = 100);</pre>
<pre><span class="kwrd">GO</span></pre>
</div>
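<p>Before building anything on top of the numbers table, a quick sanity check does not hurt.  This is a small sketch of my own, assuming the script above ran unchanged; the column aliases are only illustrative.  Because N is the primary key, a row count equal to the maximum value, together with a minimum of 1, means there are no gaps.</p>
<pre class="csharpcode">--Sanity check (illustrative): expect 10,000 rows covering 1 through 10,000
SELECT COUNT(*) AS RowCnt, MIN(N) AS MinN, MAX(N) AS MaxN
FROM dbo.Numbers;
/*
RowCnt      MinN        MaxN
----------- ----------- -----------
10000       1           10000
*/</pre>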
<p></p>
<p>Next, we will need to create an inline TVF (table-valued function) to split the array.  This function was written by SQL Server guru Itzik Ben-Gan.  This split function is very fast and very scalable.  </p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">IF</span> OBJECT_ID(<span class="str">'dbo.fn_split'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre><span class="kwrd">DROP</span> <span class="kwrd">FUNCTION</span> dbo.fn_split;</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">FUNCTION</span> dbo.fn_split(@arr <span class="kwrd">AS</span> NVARCHAR(2000), @sep <span class="kwrd">AS</span> <span class="kwrd">NCHAR</span>(1))</pre>
<pre class="alt"><span class="kwrd">RETURNS</span> <span class="kwrd">TABLE</span></pre>
<pre><span class="kwrd">AS</span></pre>
<pre class="alt"><span class="kwrd">RETURN</span></pre>
<pre><span class="kwrd">SELECT</span></pre>
<pre class="alt">(n - 1) - LEN(REPLACE(<span class="kwrd">LEFT</span>(@arr, n-1), @sep, N<span class="str">''</span>)) + 1 <span class="kwrd">AS</span> pos,</pre>
<pre><span class="kwrd">SUBSTRING</span>(@arr, n, CHARINDEX(@sep, @arr + @sep, n) - n) <span class="kwrd">AS</span> element</pre>
<pre class="alt"><span class="kwrd">FROM</span> dbo.Numbers</pre>
<pre><span class="kwrd">WHERE</span> n <= LEN(@arr) + 1</pre>
<pre class="alt"><span class="kwrd">AND</span> <span class="kwrd">SUBSTRING</span>(@sep + @arr, n, 1) = @sep;</pre>
<pre>GO</pre>
</div>
<p>This function’s logic is fairly straightforward, but I will walk through how it works.  The numbers table is used to step through each character position in the array.  The first step is to pad the beginning of the string with the delimiter, so that every element, including the first, is preceded by a delimiter.  With the delimiters in place, the WHERE clause keeps only the positions where an element starts.  For each of those positions, the code uses the CHARINDEX() and SUBSTRING() system functions to extract the element, and counts the delimiters to its left to determine the element’s ordinal position.</p>
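<p>To make the walkthrough above concrete, here is a minimal sketch that runs the same expressions against a short sample string.  It is an illustration only: the sample string and the inline list of numbers (used so the example does not depend on dbo.Numbers) are my own assumptions, and the VALUES row constructor requires SQL Server 2008 or later.</p>
<pre class="csharpcode">DECLARE @arr NVARCHAR(2000), @sep NCHAR(1);
SET @arr = N'10,20,30';   -- hypothetical sample array
SET @sep = N',';

SELECT
    n,   -- character position where an element starts
    (n - 1) - LEN(REPLACE(LEFT(@arr, n - 1), @sep, N'')) + 1 AS pos,      -- delimiters to the left, plus one
    SUBSTRING(@arr, n, CHARINDEX(@sep, @arr + @sep, n) - n) AS element    -- text up to the next delimiter
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9)) AS Nums(n)              -- stand-in for dbo.Numbers
WHERE n &lt;= LEN(@arr) + 1
  AND SUBSTRING(@sep + @arr, n, 1) = @sep;</pre>
<p>Only positions 1, 4 and 7 pass the filter, because those are the places where the padded string holds a delimiter.  They yield the elements 10, 20 and 30 with pos values 1, 2 and 3.</p>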
<p>Now that we know how this code works, let’s see it in action.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">DECLARE</span> @Ids <span class="kwrd">VARCHAR</span>(1000)</pre>
<pre><span class="kwrd">SET</span> @Ids = <span class="str">'1,500,5439,9999,7453'</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">SELECT</span> t.*</pre>
<pre class="alt"><span class="kwrd">FROM</span> dbo.t</pre>
<pre><span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.fn_split(@Ids,<span class="str">','</span>) <span class="kwrd">AS</span> fn</pre>
<pre class="alt"> <span class="kwrd">ON</span> t.SomeId = fn.Element</pre>
<pre> </pre>
<pre class="alt">/*</pre>
<pre>SomeId SomeCode</pre>
<pre class="alt">---------<span class="rem">-- --------</span></pre>
<pre>1 SQ</pre>
<pre class="alt">500 BO</pre>
<pre>5439 ZV</pre>
<pre class="alt">9999 RD</pre>
<pre>7453 IG</pre>
<pre class="alt">*/</pre>
</div>
<p></p>
<p>Before we move on to the next method, I would like to point out that some developers prefer the EXISTS clause in this situation, especially when no columns are needed from the TVF; however, EXISTS may degrade performance.  I am planning an in-depth post on the differences between INNER JOIN and EXISTS.  To give you an idea of how EXISTS can degrade performance, have a look at the screenshot below.  Please note that performance is not always black and white, and not all execution plans will deviate as much as the one below.</p>
<p>Execution Plan:</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjE1t2rBDPH3hBDXJLuC3I5DGhImKkyKrssurp7oBLxpJ7iUmsdczkOK8HDKnH2s_9L42W3NZjOAysI35QuvXPESqMJYBaMbtCFDN2_ZgEBtQVmWjzP8Uupe74NbOrMI4Z1ogafTertuA/s1600-h/image%5B4%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/SwCUa2-rhSI/AAAAAAAAASU/jK46gbQyey0/image_thumb%5B2%5D.png?imgmax=800" width="559" height="313" /></a> </p>
<p>IO Stats:</p>
<div class="csharpcode">
<pre class="alt">********************* <span class="kwrd">inner</span> <span class="kwrd">join</span> *************************</pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">Table</span> <span class="str">'t'</span>. Scan <span class="kwrd">count</span> 0, logical <span class="kwrd">reads</span> 14, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.</pre>
<pre><span class="kwrd">Table</span> <span class="str">'Numbers'</span>. Scan <span class="kwrd">count</span> 1, logical <span class="kwrd">reads</span> 3, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.</pre>
<pre class="alt"> </pre>
<pre>********************* <span class="kwrd">exists</span> *************************</pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">Table</span> <span class="str">'Worktable'</span>. Scan <span class="kwrd">count</span> 2, logical <span class="kwrd">reads</span> 20018, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.</pre>
<pre class="alt"><span class="kwrd">Table</span> <span class="str">'Numbers'</span>. Scan <span class="kwrd">count</span> 2, logical <span class="kwrd">reads</span> 6, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.</pre>
<pre><span class="kwrd">Table</span> <span class="str">'t'</span>. Scan <span class="kwrd">count</span> 3, logical <span class="kwrd">reads</span> 46, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.</pre>
</div>
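<p>As you can see, the difference between the inner join and EXISTS query plans is night and day: the inner join needs only 14 logical reads against t and 3 against Numbers, while the EXISTS version adds a worktable with 20,018 logical reads.  I will post on this at a later date, but I wanted to make you aware of possible performance problems.</p>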
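<p>For reference, a comparison along these lines can be reproduced with the two queries below.  This is only a sketch that assumes the dbo.t table and the dbo.fn_split function from above; the exact EXISTS query behind the screenshot is not reproduced here, so treat this form as an assumption.</p>
<pre class="csharpcode">DECLARE @Ids VARCHAR(1000);
SET @Ids = '1,500,5439,9999,7453';

SET STATISTICS IO ON;

-- Inner join form, as used earlier
SELECT t.*
FROM dbo.t
INNER JOIN dbo.fn_split(@Ids, ',') AS fn
    ON t.SomeId = fn.element;

-- EXISTS form (assumed shape): no columns are taken from the TVF
SELECT t.*
FROM dbo.t
WHERE EXISTS (
    SELECT 1
    FROM dbo.fn_split(@Ids, ',') AS fn
    WHERE fn.element = t.SomeId
);

SET STATISTICS IO OFF;</pre>
<p>Comparing the logical reads reported for each statement, together with the actual execution plans, is usually enough to tell whether the two forms behave differently on your data.</p>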
<p>The next method I will discuss is the XML nodes method.  This method does not require an ancillary table, but it does require SQL Server 2005 or later.  As written, it also handles XML special characters. I picked up this encoding/decoding method from SQL Server enthusiast Brad Schulz, <a title="http://bradsruminations.blogspot.com/" href="http://bradsruminations.blogspot.com/">http://bradsruminations.blogspot.com/</a>.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">DECLARE</span> @Ids <span class="kwrd">VARCHAR</span>(1000)</pre>
<pre><span class="kwrd">SET</span> @Ids = <span class="str">'1,500,5439,9999,7453'</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">SELECT</span> t.*</pre>
<pre class="alt"><span class="kwrd">FROM</span> dbo.t</pre>
<pre><span class="kwrd">INNER</span> <span class="kwrd">JOIN</span>(</pre>
<pre class="alt"> <span class="kwrd">SELECT</span> x.i.<span class="kwrd">value</span>(<span class="str">'.'</span>,<span class="str">'INT'</span>) <span class="kwrd">AS</span> SomeId</pre>
<pre> <span class="kwrd">FROM</span>(<span class="kwrd">SELECT</span> XMLEncoded=(<span class="kwrd">SELECT</span> @Ids <span class="kwrd">AS</span> [*] <span class="kwrd">FOR</span> XML <span class="kwrd">PATH</span>(<span class="str">''</span>))) <span class="kwrd">AS</span> EncodeXML</pre>
<pre class="alt"> <span class="kwrd">CROSS</span> APPLY (<span class="kwrd">SELECT</span> NewXML=<span class="kwrd">CAST</span>(<span class="str">'&lt;i&gt;'</span>+REPLACE(XMLEncoded,<span class="str">','</span>,<span class="str">'&lt;/i&gt;&lt;i&gt;'</span>)+<span class="str">'&lt;/i&gt;'</span> <span class="kwrd">AS</span> XML)) CastXML</pre>
<pre> <span class="kwrd">CROSS</span> APPLY NewXML.nodes(<span class="str">'/i'</span>) x(i)</pre>
<pre class="alt">) <span class="kwrd">AS</span> Ids</pre>
<pre> <span class="kwrd">ON</span> Ids.SomeId = T.SomeId</pre>
</div>
<p>The XML method looks a lot more complex than it really is.  Essentially, I am creating an XML structure that contains each element of the array.  For example, the array “1,500,5439,9999,7453” becomes “&lt;i&gt;1&lt;/i&gt;&lt;i&gt;500&lt;/i&gt;&lt;i&gt;5439&lt;/i&gt;&lt;i&gt;9999&lt;/i&gt;&lt;i&gt;7453&lt;/i&gt;.” The first step is to encode the array by using FOR XML PATH, which escapes any XML special characters. Once the array is in an encoded string format, I wrap each element in &lt;i&gt; tags and explicitly cast the string to the XML data type.  Once I have a true XML value, I use the XML nodes() method to shred the elements into a relational format, and the value() method decodes each one. For more information about how this method works, you can view the following blog post by Brad Schulz, <a title="http://bradsruminations.blogspot.com/2009/10/un-making-list-or-shredding-of-evidence.html" href="http://bradsruminations.blogspot.com/2009/10/un-making-list-or-shredding-of-evidence.html">http://bradsruminations.blogspot.com/2009/10/un-making-list-or-shredding-of-evidence.html</a>.  This post does a great job of breaking down the inner workings of this method.</p>
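<p>To see the encoding and decoding steps in isolation, here is a small sketch that splits a list of strings containing XML special characters.  The sample list and the VARCHAR(100) target type are assumptions chosen purely for illustration.</p>
<pre class="csharpcode">DECLARE @List VARCHAR(1000);
SET @List = 'A&amp;B,C&lt;D,E';   -- hypothetical list containing an ampersand and a less-than sign

SELECT x.i.value('.', 'VARCHAR(100)') AS element          -- value() decodes the entities again
FROM (SELECT XMLEncoded = (SELECT @List AS [*] FOR XML PATH(''))) AS EncodeXML   -- step 1: encode
CROSS APPLY (SELECT NewXML = CAST('&lt;i&gt;' + REPLACE(XMLEncoded, ',', '&lt;/i&gt;&lt;i&gt;') + '&lt;/i&gt;' AS XML)) AS CastXML   -- step 2: wrap and cast
CROSS APPLY NewXML.nodes('/i') AS x(i);                   -- step 3: shred into rows</pre>
<p>The three elements come back in their original form, with the ampersand and the less-than sign intact.  Without the FOR XML PATH encoding step, the CAST to XML would fail on those raw characters.</p>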
<p>The final method I will demonstrate uses a TVF with a virtual table of numbers.  This method does not require an ancillary table because the numbers table is generated on the fly.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">IF</span> OBJECT_ID(<span class="str">'dbo.fn_TVF_Split'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre><span class="kwrd">DROP</span> <span class="kwrd">FUNCTION</span> dbo.fn_TVF_Split;</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">FUNCTION</span> dbo.fn_TVF_Split(@arr <span class="kwrd">AS</span> NVARCHAR(2000), @sep <span class="kwrd">AS</span> <span class="kwrd">NCHAR</span>(1))</pre>
<pre><span class="kwrd">RETURNS</span> <span class="kwrd">TABLE</span></pre>
<pre class="alt"><span class="kwrd">AS</span></pre>
<pre><span class="kwrd">RETURN</span></pre>
<pre class="alt"><span class="kwrd">WITH</span> </pre>
<pre> L0 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">UNION</span> <span class="kwrd">ALL</span> <span class="kwrd">SELECT</span> 1) --2 <span class="kwrd">rows</span></pre>
<pre class="alt"> ,L1 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L0 <span class="kwrd">AS</span> A, L0 <span class="kwrd">AS</span> B) --4 <span class="kwrd">rows</span> (2x2)</pre>
<pre> ,L2 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L1 <span class="kwrd">AS</span> A, L1 <span class="kwrd">AS</span> B) --16 <span class="kwrd">rows</span> (4x4)</pre>
<pre class="alt"> ,L3 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L2 <span class="kwrd">AS</span> A, L2 <span class="kwrd">AS</span> B) --256 <span class="kwrd">rows</span> (16x16)</pre>
<pre> ,L4 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L3 <span class="kwrd">AS</span> A, L3 <span class="kwrd">AS</span> B) --65536 <span class="kwrd">rows</span> (256x256)</pre>
<pre class="alt"> ,L5 <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> 1 <span class="kwrd">AS</span> C <span class="kwrd">FROM</span> L4 <span class="kwrd">AS</span> A, L4 <span class="kwrd">AS</span> B) --4,294,967,296 <span class="kwrd">rows</span> (65536x65536)</pre>
<pre> ,Nums <span class="kwrd">AS</span> (<span class="kwrd">SELECT</span> row_number() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> (<span class="kwrd">SELECT</span> 0)) <span class="kwrd">AS</span> N <span class="kwrd">FROM</span> L5) </pre>
<pre class="alt"><span class="kwrd">SELECT</span></pre>
<pre>(n - 1) - LEN(REPLACE(<span class="kwrd">LEFT</span>(@arr, n-1), @sep, N<span class="str">''</span>)) + 1 <span class="kwrd">AS</span> pos,</pre>
<pre class="alt"><span class="kwrd">SUBSTRING</span>(@arr, n, CHARINDEX(@sep, @arr + @sep, n) - n) <span class="kwrd">AS</span> element</pre>
<pre><span class="kwrd">FROM</span> Nums</pre>
<pre class="alt"><span class="kwrd">WHERE</span> </pre>
<pre> n <= LEN(@arr) + 1</pre>
<pre class="alt"> <span class="kwrd">AND</span> <span class="kwrd">SUBSTRING</span>(@sep + @arr, n, 1) = @sep</pre>
<pre> <span class="kwrd">AND</span> N<=1000</pre>
<pre class="alt">GO</pre>
</div>
<p></p>
<p>Now that the function is in place, you can do the same join as before.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">DECLARE</span> @Ids <span class="kwrd">VARCHAR</span>(1000)</pre>
<pre><span class="kwrd">SET</span> @Ids = <span class="str">'1,500,5439,9999,7453'</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">SELECT</span> t.*</pre>
<pre class="alt"><span class="kwrd">FROM</span> dbo.t</pre>
<pre><span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.fn_TVF_split(@Ids,<span class="str">','</span>)</pre>
<pre class="alt"> <span class="kwrd">ON</span> t.SomeId = Element</pre>
</div>
<p>This method is essentially the same as the first one, except that it generates its numbers on the fly instead of reading them from a permanent numbers table.  </p>
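<p>The on-the-fly numbers table can also be tried on its own.  The sketch below is a cut-down version of the CTE chain used in fn_TVF_Split; stopping at L2 (16 rows) and returning the first ten numbers are choices made only to keep the illustration small.</p>
<pre class="csharpcode">WITH
  L0   AS (SELECT 1 AS C UNION ALL SELECT 1)                -- 2 rows
 ,L1   AS (SELECT 1 AS C FROM L0 AS A CROSS JOIN L0 AS B)   -- 4 rows (2x2)
 ,L2   AS (SELECT 1 AS C FROM L1 AS A CROSS JOIN L1 AS B)   -- 16 rows (4x4)
 ,Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS N FROM L2)
SELECT N
FROM Nums
WHERE N &lt;= 10
ORDER BY N;</pre>
<p>Each level cross joins the previous one with itself, so the row count is squared at every step.  ROW_NUMBER() then turns those rows into the sequence 1, 2, 3 and so on, which plays the same role as the n column of dbo.Numbers.</p>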
<p>I have shown the three most common and best performing methods for splitting a delimited string.  Which method should you use?  As you can imagine, the answer depends on the distribution of data, the size of the tables, indexes, etc.  One method is not always going to be better than another, so my recommendation is to test each method and choose the one that makes the most sense for your environment. In part two of this series, I will dig into how each of these methods performs on tables and strings of varying sizes.  This should give you a better idea of which method to choose based on your data, but as stated before, the results may vary depending on several environmental factors.</p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com3tag:blogger.com,1999:blog-4646137438366687895.post-77381180979741658092009-11-11T21:04:00.001-08:002009-11-11T21:28:49.268-08:00SSRS - Should I Use Embedded TSQL Or A Stored Procedure?<p>SSRS is becoming a highly scalable and performant reporting solution for most environments.  It is also becoming more and more popular because of its price tag; it is really tough to compete with free.  The only cost consideration is SQL Server licensing.  With all the excitement around BI and SharePoint, SSRS is becoming the premier reporting platform.  As database professionals, we need to consider the performance consequences of all the code that executes on our production/reporting databases.  SSRS is a great reporting tool, but if left unchecked it can be quite problematic.  This post will focus strictly on the eternal question: should I use a stored procedure, or should I use embedded TSQL?  I will be addressing this question from the DBA perspective, which is often neglected.  </p> <p>I will be very frank and say that I really do not see any benefits to using embedded TSQL from the DBA perspective.  Embedded TSQL has a lot of cons that should deter any database professional from using it.  So what are some of the problems with using embedded TSQL?  Let’s name a few of the cons.</p> <h5>The Cons Of Embedded TSQL:</h5> <ul> <li>Harder to manage security </li> <li>Report may break when schema changes </li> <li>Difficult to make changes to embedded TSQL </li> <li>Causes procedure cache to bloat </li> </ul> <p><em>Note: I did not list the pros of using stored procedures, but that list is the inverse of the embedded TSQL list.  </em></p> <p>As you can see, there are a lot of problems with choosing embedded TSQL.  The first con is security.  It is extremely difficult to manage security without the use of stored procedures.  
When code logic is encapsulated in a stored procedure, the DBA can easily grant permission on the stored procedure without elevating permissions on the underlying objects.  If embedded TSQL is used, the person executing the report must have permissions on every object referenced in the embedded TSQL.  This makes embedded TSQL complicated to maintain, because you really have no idea what code is actually being executed against your database.  To find out what permissions are needed to execute a report, you have to open the report or run a trace to get the TSQL. </p> <p>Embedding TSQL in an SSRS report can also become a problem when the underlying database schema changes.  The SSRS report has no tracked dependencies on the database schema, so any change can break a report.  In this scenario, there may be additional downtime just to figure out and fix the problem.  In most cases, the DBA has no idea that a schema change broke the report. Typically the problem does not surface until customers start complaining, which leads to the problem of changing the embedded TSQL.  This is not to say that reports will never break when stored procedures are used, but SSMS offers better dependency checks to determine which objects are dependent, which decreases the likelihood of a report outage.  </p> <p>One of the biggest problems with embedded TSQL is modifying the TSQL code.  To modify the embedded TSQL, the developer has to download the report RDL and then make the change.  Once the change has been made, the developer has to redeploy the report.  These steps take a lot of time and incur additional downtime.  If a stored procedure is used, the only downtime incurred is the time taken to modify the stored procedure.  
Another benefit of a stored procedure is that a developer or DBA can more easily test the report query within the confines of SSMS, instead of having to use BIDS.</p> <p>The absolute worst aspect of using embedded TSQL is that it can bloat the procedure cache, which can severely degrade server performance. SSRS does a good job of parameterizing most TSQL, but certain aspects of SSRS cause the procedure cache to bloat.  This is where I want to focus most of my attention, because it is often the most overlooked aspect of embedded TSQL.  There are two scenarios that I am currently aware of that can directly cause the procedure cache to bloat.  The first scenario occurs when a multi-value parameter is used in conjunction with the IN clause.  Multi-value parameters are treated differently than standard parameters in SSRS.  When a multi-value parameter is used with the IN clause, the SSRS engine submits the query to SQL Server using literal values in the IN clause.  When literal values are used in the IN clause, the query is not considered parameterized, so the optimizer has to create a new query plan unless an exact binary match already exists. Let’s see this in action.</p> <p>First, let’s create the sample table and populate it with data.</p> <div class="csharpcode"> <pre class="alt"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">USE</span> [tempdb]</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">IF</span> object_id(<span class="str">'tempdb.dbo.SSRS_Cache_Bloat'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre><span class="kwrd">BEGIN</span></pre>
<pre class="alt"> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.SSRS_Cache_Bloat;</pre>
<pre><span class="kwrd">END</span></pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> SSRS_Cache_Bloat(</pre>
<pre>ID <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1) <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span> <span class="kwrd">CLUSTERED</span>,</pre>
<pre class="alt">ColA <span class="kwrd">VARCHAR</span>(10),</pre>
<pre>ColB <span class="kwrd">BIT</span></pre>
<pre class="alt">);</pre>
<pre> </pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> dbo.SSRS_Cache_Bloat <span class="kwrd">VALUES</span> (<span class="str">'Adam'</span>,0);</pre>
<pre>INSERT <span class="kwrd">INTO</span> dbo.SSRS_Cache_Bloat <span class="kwrd">VALUES</span> (<span class="str">'Bob'</span>,1);</pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> dbo.SSRS_Cache_Bloat <span class="kwrd">VALUES</span> (<span class="str">'Chad'</span>,0);</pre>
<pre>INSERT <span class="kwrd">INTO</span> dbo.SSRS_Cache_Bloat <span class="kwrd">VALUES</span> (<span class="str">'Dave'</span>,1);</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">IF</span> object_id(<span class="str">'tempdb.dbo.Lookup_Vals'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre><span class="kwrd">BEGIN</span></pre>
<pre class="alt"> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.Lookup_Vals;</pre>
<pre><span class="kwrd">END</span></pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.Lookup_Vals(</pre>
<pre>ColA <span class="kwrd">CHAR</span>(4)</pre>
<pre class="alt">);</pre>
<pre> </pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> dbo.Lookup_Vals <span class="kwrd">VALUES</span> (<span class="str">'Adam'</span>);</pre>
<pre>INSERT <span class="kwrd">INTO</span> dbo.Lookup_Vals <span class="kwrd">VALUES</span> (<span class="str">'Bob'</span>);</pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> dbo.Lookup_Vals <span class="kwrd">VALUES</span> (<span class="str">'Chad'</span>);</pre>
<pre>INSERT <span class="kwrd">INTO</span> dbo.Lookup_Vals <span class="kwrd">VALUES</span> (<span class="str">'Dave'</span>);</pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Next, create a new SSRS report with two datasets and a parameter, as defined below.</p>
<p>DataSet1 is the main report dataset.</p>
<pre class="csharpcode"><span class="kwrd">select</span> Id, ColA, ColB <span class="kwrd">from</span> SSRS_Cache_Bloat <span class="kwrd">where</span> ColA <span class="kwrd">in</span>(@var)</pre>
<p>DataSet2 is the parameter dataset.  You will need to make sure your parameter derives its values from this dataset.   </p>
<pre class="csharpcode"><span class="kwrd">select</span> colA <span class="kwrd">from</span> dbo.Lookup_Vals</pre>
<p>Make sure the parameter is set to allow multiple values, as shown below.</p>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/SvuXM5rOeTI/AAAAAAAAARA/7Kl9U5dTXYU/s1600-h/image%5B55%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbBrCjwwVltTHbRIrhUDN7U5diK8u6ccFHhhe1KPuRTJTlrFW4icSJQmgNMmJRemV0nn6y44sO6dmdFLEFpVTTw8TqizWPviY_L3H1ZNe_vAql3ciOXKqYClsKsLNatdqgUSKre5Nrwg/?imgmax=800" width="331" height="279" /></a> <a href="http://lh3.ggpht.com/_ayZBUzPGG9A/SvuXNoepDII/AAAAAAAAARI/1U61TiVKJs0/s1600-h/image%5B51%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/SvuXPFWQ53I/AAAAAAAAARM/65KRwHrQDtg/image_thumb%5B31%5D.png?imgmax=800" width="328" height="274" /></a></p>
<p>Once you have all the datasets configured, preview the report in SSRS.  When selecting your parameter values, make sure to select more than one value.  Here is a screenshot of my report.</p>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/SvuXPcvQKzI/AAAAAAAAARQ/attwK0zJAb0/s1600-h/image%5B3%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaL6sBpk2YdP2c2OjcvmQJQ6oTPkpScl_FeeXuuNJkMvJHCwIHK-5HguZ4zzFD2fsXNiI5dmggKd_fvu2eSPOuj6IPoHXajTGSZg5TUcWugSWve_ncYViCA4lR7CWzgpcltfuYCqodFg/?imgmax=800" width="314" height="219" /></a> </p>
<p></p>
<p>As you can see, the report returned a row for each of the three selected parameter values.  Let’s look at the query execution stats to see what SQL actually executed.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 10</pre>
<pre> qs.execution_count,</pre>
<pre class="alt"> qt.text,</pre>
<pre> qt.dbid, dbname=db_name(qt.dbid),</pre>
<pre class="alt"> qt.objectid </pre>
<pre><span class="kwrd">FROM</span> sys.dm_exec_query_stats qs</pre>
<pre class="alt"><span class="kwrd">cross</span> apply sys.dm_exec_sql_text(qs.sql_handle) <span class="kwrd">as</span> qt</pre>
<pre><span class="kwrd">ORDER</span> <span class="kwrd">BY</span> qs.[last_execution_time] DESC</pre>
</div>
<p>You should see an entry similar to the screenshot below.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhK_64I3ZXvUbtNLazWu2zFpvBcugAILLZQPrinmYUmPqzTRY9nuMfgmbf6W0KPAm-ONILDhUjms_2UZ7LIIDt-k7csiPfL_EV1Qdu-6fuU8JWcWHddsmcewMif0nYQ7hqcl3_A4nsuFQ/s1600-h/image%5B12%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9uhdDnn1g0mGEoQeu5JmlKHh5frn-N93GKjE92p_bEjM2H8xkyG2JS7k8CWRGPyqT-OTztmB0BSPu3GgDLvtlKowZGvGS347YufF7pDxfC2SL3zD_X-0XthSaorUNIeHfRGo9Hi05ew/?imgmax=800" width="634" height="129" /></a> </p>
<p>As you can see, SSRS submitted TSQL with literal values specified in the IN clause.  What do you think will happen if we preview the report again with different parameter values?  If you guessed that we will get a completely new plan, you would be right.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIjOpGWDKKWrD-4DUtal0xyS-XfUFC-y2S98Z9V1tzXNXVjQCNFWQr43fCEIr999cDGkBYztWnc8LqNnNc4LZyNlh49x119zAxh2N3x3AM5dP9WJ1o8Y-HPHZamWJU1boLbIzyHblMng/s1600-h/image%5B15%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh4.ggpht.com/_ayZBUzPGG9A/SvuXRXVf9gI/AAAAAAAAARk/56h7nBs72O8/image_thumb%5B9%5D.png?imgmax=800" width="635" height="149" /></a> </p>
<p>Can you imagine what happens when you have hundreds or thousands of differing options and hundreds or thousands of users?  The plan cache will take a beating because a separate plan has to be stored for each distinct combination of values.  When that many plans exist in the procedure cache, less memory is left to keep data pages in cache.  Ultimately, nothing good comes of a bloated procedure cache.  Multi-value parameters are not the only cause of cache bloat.  The next scenario that bloats the procedure cache is using hard coded values in a parameter list.</p>
<p>Using hard coded values in a parameter list seems harmless, but in reality the SSRS engine guesses at the size of the string, which directly impacts whether an existing plan can be reused.  In our SSRS report, change the dataset that queries the Lookup table to use fixed values, as shown below.</p>
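<p>To get a rough sense of how much of the cache is taken up by plans like these, a query along the following lines can help.  This is only a sketch: it reads the sys.dm_exec_cached_plans DMV, and treating ad hoc or prepared plans with a use count of one as “bloat” is a simplifying assumption made here for illustration, not something SSRS reports directly.</p>
<div class="csharpcode">
<pre>-- Sketch: summarize single-use plans currently held in the plan cache
-- Assumption: a plan that has been used only once is unlikely to be reused
SELECT
    cp.objtype,
    COUNT(*) AS single_use_plans,
    SUM(CAST(cp.size_in_bytes AS BIGINT)) / 1024 AS total_size_kb
FROM sys.dm_exec_cached_plans AS cp
WHERE cp.objtype IN ('Adhoc', 'Prepared')
    AND cp.usecounts = 1
GROUP BY cp.objtype
ORDER BY total_size_kb DESC;</pre>
</div>
<p>If the counts keep climbing as users preview the report with different values, the report queries are a likely contributor.</p>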
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGueRCt8jHgD2KTMi0_z5rfn87k_w8v0etJ0hpLoy-CihCz6S3qiQ4Om4COg2N1Le8Ak5xf02nlxODtdCi9sYYVEO-FF0yL-BuAJqAFul1aZE84Q5ZWz3zdNQCQbt3QKRTh_lerPxCDg/s1600-h/image%5B20%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdA8WzIGRrXnjHkslNFQMyAhZA2kQ1-dncwVCHmEbn9l0wX6OHIrejuZsM9F8G7bZ2SgVoLuFMyyp9p1X8H9TgtRhyuiyONCPwXA2GbuszJ2ROm7pJcLHtSF1MiJ9SX9f92SQVEUpnag/?imgmax=800" width="426" height="353" /></a> <a href="http://lh3.ggpht.com/_ayZBUzPGG9A/SvuXSf8Of4I/AAAAAAAAARw/HMxGk7yFuxo/s1600-h/image%5B27%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNGIcZuLtL87caHEBKFZd4_CqE5AEAGPoizWTc9nAcu3fpHZ388wL23r7_xfKSza6USJbGkOB8Z2ajDVB7OlfnkM09s_EYyOVSWvh6SpdIrj7seEwTu2-dPe5br4UwqSMLaWfXXP_Gkw/?imgmax=800" width="428" height="357" /></a> </p>
<p><em>Note: I added x to some of the values so that the length varies among strings.</em></p>
<p>Let’s preview the report to see what happens.  I used the value “Adam” for the first execution and “Bob” for the second.  You should see entries like the ones below in your procedure cache.</p>
<p><a href="http://lh3.ggpht.com/_ayZBUzPGG9A/SvuXTKsH4AI/AAAAAAAAAR4/8RoZUFaaE0w/s1600-h/image%5B32%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMjyNloRH-vt70Ph4kHr53sMxiwL731HIKNF4MD1n9DsvY5DrK_5AvcgGn6fIHUN6et2ZfQ03mzvIGR1rt0hTYvnLvcJ1k3nMFbVm0PB7pdW5gdzey2lzWknjDSWGs-BB7aMxMXqgLkA/?imgmax=800" width="690" height="202" /></a> </p>
<p></p>
<p>The primary difference between the two execution plans is the size of the declared variable.  In the case of “Adam” the variable was declared as an nvarchar(4), and for “Bob” as an nvarchar(3).  Because the size is different, a new query plan was created.  </p>
<p>These are the two scenarios that I am currently aware of that cause the plan cache to behave in this manner.  I am sure there are other quirks that can cause this problem.  So the big question left on the table is: how do I fix this problem?  The answer is to use stored procedures.  </p>
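<p>The same effect can be reproduced outside of SSRS.  The sketch below assumes the statement is submitted through sp_executesql and mimics the two parameter declarations described above; the statement text is reconstructed from the table used in this post and may not match what SSRS sends character for character.</p>
<div class="csharpcode">
<pre>-- First execution: the parameter is declared as nvarchar(4) to fit 'Adam'
EXEC sp_executesql
    N'SELECT id, ColA, colB FROM dbo.SSRS_Cache_Bloat WHERE ColA IN (@var)',
    N'@var nvarchar(4)',
    @var = N'Adam';

-- Second execution: same statement text, but the parameter is declared as nvarchar(3)
EXEC sp_executesql
    N'SELECT id, ColA, colB FROM dbo.SSRS_Cache_Bloat WHERE ColA IN (@var)',
    N'@var nvarchar(3)',
    @var = N'Bob';</pre>
</div>
<p>Because the parameter declaration is part of what gets cached, re-running the execution stats query from earlier should show two separate entries, each with an execution count of one.</p>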
<p>I will start by fixing scenario two.  Create the following procedure in the database.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">PROCEDURE</span> usp_Fix_Scenario2(@var <span class="kwrd">varchar</span>(10))</pre>
<pre><span class="kwrd">AS</span> </pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">SELECT</span> id,ColA,[colB]</pre>
<pre class="alt"> <span class="kwrd">FROM</span> dbo.SSRS_Cache_Bloat</pre>
<pre> <span class="kwrd">WHERE</span> ColA <span class="kwrd">IN</span>(@var)</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre>GO</pre>
</div>
<p></p>
<p>Point the report dataset at this procedure and preview the report with “Adam” and then “Bob”.  You will see a single entry in the procedure cache for both values.  As you can see, the execution count is at two, which means SQL Server reused the existing plan.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_AyacTCfbmO4MH-r1BxJtUybt08juaW90R5DO9GVnjmIOyPj2pDfYdYyU1YNmHiJ1y8CIxtksxDZjrT4C4G8thuKDZ1syk4FF-SbR2Ro-TEhZ8lUdnRUOsqmc2XGHq6rrIlTZBE_k0g/s1600-h/image%5B36%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/SvuXUQUlfJI/AAAAAAAAASE/gQzB-UmGckY/image_thumb%5B22%5D.png?imgmax=800" width="690" height="134" /></a> </p>
<p>Now let’s fix scenario one.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">PROCEDURE</span> usp_Fix_Scenario1(@var <span class="kwrd">varchar</span>(100))</pre>
<pre><span class="kwrd">AS</span> </pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DECLARE</span> @x XML</pre>
<pre class="alt"> <span class="kwrd">SET</span> @x = <span class="str">'&lt;i&gt;'</span> + REPLACE(@var,<span class="str">','</span>,<span class="str">'&lt;/i&gt;&lt;i&gt;'</span>) + <span class="str">'&lt;/i&gt;'</span></pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">SELECT</span> id,ColA,[colB]</pre>
<pre> <span class="kwrd">FROM</span> dbo.SSRS_Cache_Bloat</pre>
<pre class="alt"> <span class="kwrd">WHERE</span> ColA <span class="kwrd">IN</span>(</pre>
<pre> <span class="kwrd">SELECT</span> x.i.<span class="kwrd">value</span>(<span class="str">'.'</span>,<span class="str">'varchar(10)'</span>)</pre>
<pre class="alt"> <span class="kwrd">FROM</span> @x.nodes(<span class="str">'/i'</span>) x(i)</pre>
<pre> )</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre>GO</pre>
</div>
<p></p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjicAiwWbCPIEgVq4vDHil8lJLm_B9Hpc0WDaRxNRqYaZkmvwXxjTDyuoopbqK5EmJSsIUbQscyk-9Zc3QgRR_qmS3kgYHo-b0E-5kLt6bsLw7ZQkZmHD73uHA1Oj4Q1ctatEUAIqzhwA/s1600-h/image%5B40%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh6.ggpht.com/_ayZBUzPGG9A/SvuXVEHWGOI/AAAAAAAAASM/H1V1x3gBkgg/image_thumb%5B24%5D.png?imgmax=800" width="692" height="101" /></a>
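<p>A quick way to try the procedure outside of SSRS is to pass it a comma-separated string directly.  This is a minimal sketch: the values are the ones used earlier in this post, and the follow-up query relies on sys.dm_exec_procedure_stats (SQL Server 2008 and later) and assumes the procedure lives in the current database under your default schema.</p>
<div class="csharpcode">
<pre>-- Call the procedure with two different value lists
EXEC usp_Fix_Scenario1 @var = 'Adam,Bob';
EXEC usp_Fix_Scenario1 @var = 'Bob';

-- Check how many times the cached procedure plan has been executed
SELECT
    OBJECT_NAME(ps.object_id, ps.database_id) AS procedure_name,
    ps.execution_count
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.database_id = DB_ID()
    AND ps.object_id = OBJECT_ID('usp_Fix_Scenario1');</pre>
</div>
<p>A single row whose execution count grows with each call indicates that the same plan is being reused regardless of which values are passed in.</p>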
<p>As you can see, using stored procedures is by far the best practice.  Stored procedures offer the greatest security, flexibility, manageability, and performance.  I really cannot see a reason to use embedded TSQL at all, and hopefully at this point you feel the same way.  All in all, we learned a valuable lesson in this post: you cannot always trust that easier is better, even if Microsoft says it is okay.  SSRS is a tool written with the developer in mind, and the DBA perspective is neglected.  If DBAs knew what SSRS is really doing behind the scenes, embedded TSQL would be outlawed.  We as DBAs have to know and expose potential performance problems for all applications, including Microsoft applications.  I hope that exposing these flaws within SSRS will help you and your environment adhere to better SSRS practices.</p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com8tag:blogger.com,1999:blog-4646137438366687895.post-52412048613563003292009-10-28T17:38:00.001-07:002009-11-11T21:28:49.269-08:00Locking A Table, While It Is Being Loaded, And Minimizing Down Time<p>I came across an interesting thread in the MVP newsgroups a few weeks back and thought I would share its content here.  The thread was about providing a reliable method to load a table while keeping the downtime to a minimum.  As an added bonus, the users still want to be able to query the old data while you are loading the new data.  I would like to point out that I did not come up with this solution.  I got this solution from Aaron Bertrand, <a title="http://sqlblog.com/blogs/aaron_bertrand/" href="http://sqlblog.com/blogs/aaron_bertrand/">http://sqlblog.com/blogs/aaron_bertrand/</a>.  His solution is absolutely fantastic.  Aaron's solution is by far the best way to attack this problem, in my opinion.</p> <p>Instinctively, the first solution that comes to most of our minds is an insert into/select statement with locking hints.  This solution does not scale and does not meet our business requirements, because readers are blocked while the load runs.  I will start by creating a sample table and then we can dig into how to solve this problem.</p> <div class="csharpcode"> <pre class="alt"><span class="kwrd">USE</span> [tempdb]</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> name = <span class="str">'TestData'</span>)</pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> [dbo].[TestData];</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [dbo].[TestData](</pre>
<pre class="alt">RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,</pre>
<pre>SomeCode <span class="kwrd">CHAR</span>(2)</pre>
<pre class="alt">);</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>INSERT <span class="kwrd">INTO</span> [dbo].[TestData] (RowNum,SomeCode)</pre>
<pre class="alt"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 10000</pre>
<pre> ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,</pre>
<pre class="alt"> <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)</pre>
<pre> + <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode</pre>
<pre class="alt"><span class="kwrd">FROM</span> </pre>
<pre> [Master].[dbo].[SysColumns] t1,</pre>
<pre class="alt"> [Master].[dbo].[SysColumns] t2</pre>
<pre><span class="kwrd">GO</span></pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Okay, with our table out of the way, we can start to really think about how to solve this problem.  The first step is to create two new schemas.  The first schema is called “Holder”; it will be a container for the table that we will be loading with new data. </p>
<div class="csharpcode">
<pre class="alt">--<span class="kwrd">create</span> Holder <span class="kwrd">schema</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">SCHEMA</span> [Holder];</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
</div>
<p>The Holder schema does just what the name implies… it holds the table that I will be inserting into.  The next step is to create a table in the Holder schema with the same definition as the one above.  </p>
<div class="csharpcode">
<pre class="alt">--<span class="kwrd">Create</span> TestData <span class="kwrd">table</span> <span class="kwrd">in</span> the Holder <span class="kwrd">schema</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [Holder].[TestData](</pre>
<pre class="alt">RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,</pre>
<pre>SomeCode <span class="kwrd">CHAR</span>(2)</pre>
<pre class="alt">);</pre>
<pre><span class="kwrd">GO</span></pre>
</div>
<p>With our schema and table created, I only have one other schema to create.  The last schema is the Switch schema.  The Switch schema is used as an intermediary schema to house the current source table (dbo.TestData) while the loaded table in the Holder schema (Holder.TestData) is transferred to the dbo schema (dbo.TestData).  </p>
<div class="csharpcode">
<pre class="alt">--<span class="kwrd">Create</span> Switch <span class="kwrd">schema</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">SCHEMA</span> [Switch];</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
</div>
<p>This solution adheres to all of our business rules and reduces downtime to the amount of time required to perform a schema metadata operation, which is nearly instantaneous.  This is a very scalable solution because the loading of the data is completely transparent to the users, who can keep querying the stale data all the while.  Let’s have a look at the final solution:</p>
<div class="csharpcode">
<pre class="alt">--<span class="kwrd">Create</span> <span class="kwrd">procedure</span> <span class="kwrd">to</span> <span class="kwrd">load</span> the <span class="kwrd">table</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">PROCEDURE</span> [dbo].[usp_LoadTestData]</pre>
<pre class="alt"><span class="kwrd">AS</span></pre>
<pre><span class="kwrd">BEGIN</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span>;</pre>
<pre class="alt"> </pre>
<pre>--<span class="kwrd">Truncate</span> holder <span class="kwrd">table</span></pre>
<pre class="alt"><span class="kwrd">TRUNCATE</span> <span class="kwrd">TABLE</span> [Holder].[TestData];</pre>
<pre> </pre>
<pre class="alt">--<span class="kwrd">load</span> <span class="kwrd">new</span> <span class="kwrd">data</span> <span class="kwrd">into</span> holder <span class="kwrd">table</span></pre>
<pre>INSERT <span class="kwrd">INTO</span> [Holder].[TestData] (RowNum,SomeCode)</pre>
<pre class="alt"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 500</pre>
<pre> ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,</pre>
<pre class="alt"> <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65)</pre>
<pre> + <span class="kwrd">CHAR</span>(ABS(CHECKSUM(NEWID()))%26+65) <span class="kwrd">AS</span> SomeCode</pre>
<pre class="alt"><span class="kwrd">FROM</span> </pre>
<pre> [Master].[dbo].[SysColumns] t1,</pre>
<pre class="alt"> [Master].[dbo].[SysColumns] t2</pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">BEGIN</span> <span class="kwrd">TRANSACTION</span></pre>
<pre> <span class="rem">-- move "live" table into the switch schema</span></pre>
<pre class="alt"> <span class="kwrd">ALTER</span> <span class="kwrd">SCHEMA</span> [Switch] TRANSFER [dbo].[TestData];</pre>
<pre> </pre>
<pre class="alt"> <span class="rem">-- move holder populated table into the "live" or dbo schema</span></pre>
<pre> <span class="kwrd">ALTER</span> <span class="kwrd">SCHEMA</span> [dbo] TRANSFER [Holder].[TestData];</pre>
<pre class="alt"> </pre>
<pre> <span class="rem">-- Move the prior table to the holder schema</span></pre>
<pre class="alt"> <span class="kwrd">ALTER</span> <span class="kwrd">SCHEMA</span> [Holder] TRANSFER [Switch].[TestData];</pre>
<pre><span class="kwrd">COMMIT</span> <span class="kwrd">TRANSACTION</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">END</span></pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
</div>
<p></p>
<p>Let’s see how it works.  Execute the code and then query the dbo.TestData table.  You will see that the table now contains 500 rows instead of the 10000 I started with.  </p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">EXEC</span> dbo.[usp_LoadTestData];</pre>
<pre><span class="kwrd">SELECT</span> * <span class="kwrd">FROM</span> dbo.TestData;</pre>
</div>
<p><em>Note: If you are interested in seeing how the solution handles locks, you can strip all the code out of the stored procedure and run everything but the commit transaction.  You can then open another window and try to query the table, which will result in a wait until the schema transfer is committed.</em>  </p>
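<p>The note above can also be tried without touching the procedure.  The sketch below is meant to be run in two separate query windows; the session labels are only there for illustration, and the statements are the same schema transfers used in the procedure.</p>
<div class="csharpcode">
<pre>-- Session 1: start the swap, but leave the transaction open
BEGIN TRANSACTION;
    ALTER SCHEMA [Switch] TRANSFER [dbo].[TestData];
    ALTER SCHEMA [dbo] TRANSFER [Holder].[TestData];
    ALTER SCHEMA [Holder] TRANSFER [Switch].[TestData];

-- Session 2: this query waits until session 1 commits or rolls back
SELECT COUNT(*) FROM dbo.TestData;

-- Session 1: finish the swap, which lets session 2 continue
COMMIT TRANSACTION;</pre>
</div>
<p>Once the commit runs, the query in the second window should return immediately with the row count of the newly swapped-in table.</p>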
<p>That’s it!!! This is by far the best method I have seen to solve this business problem.  It is very scalable, has very little downtime, and is not affected by the NOLOCK hint.  I am really happy and thankful that Aaron shared his solution.  This solution has given me a lot of insight into solving this problem and similar problems.  Hopefully this post will have the same effect on you.</p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com2tag:blogger.com,1999:blog-4646137438366687895.post-19532016469953436192009-10-23T22:33:00.001-07:002009-11-11T21:28:49.270-08:00Converting A Scalar User Defined Function To A Inline Table Valued Function<p>In my last post <a title="http://jahaines.blogspot.com/2009/10/io-stats-what-are-you-missing.html" href="http://jahaines.blogspot.com/2009/10/io-stats-what-are-you-missing.html">http://jahaines.blogspot.com/2009/10/io-stats-what-are-you-missing.html</a>, I talked about the performance problems associated with scalar user defined functions and how SSMS may report invalid IO statistics for scalar UDFs.  In this post, I will focus on how to transform those pesky scalar UDFs into more scalable functions.  When developing user defined functions, you have to keep a few things in mind.  Firstly, scalar UDFs are evaluated for each row returned by the query.  Additionally, SQL Server is not able to maintain statistics and optimize any function, except an inline table valued function.  Lastly, most code logic does not necessarily need to be encapsulated in a function.  You may get better performance if you choose to use a derived table instead of a function; however, the biggest problem with a derived table is that it can’t be encapsulated and reused across an application.  An inline TVF is really useful if you need to encapsulate business logic and reuse it throughout an application.  Let’s start by creating the sample table DDL.</p> <div class="csharpcode"> <pre class="alt"><span class="kwrd">USE</span> [tempdb]</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">IF</span> <span class="kwrd">EXISTS</span>(<span class="kwrd">SELECT</span> 1 <span class="kwrd">FROM</span> sys.tables <span class="kwrd">WHERE</span> NAME = <span class="str">'TestData'</span>)</pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> dbo.[TestData];</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>--<span class="kwrd">Create</span> Sample <span class="kwrd">Table</span></pre>
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> dbo.TestData(</pre>
<pre>RowNum <span class="kwrd">INT</span> <span class="kwrd">PRIMARY</span> <span class="kwrd">KEY</span>,</pre>
<pre class="alt">SomeId <span class="kwrd">INT</span>,</pre>
<pre>SomeDate DATETIME</pre>
<pre class="alt">);</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>INSERT <span class="kwrd">INTO</span> dbo.TestData </pre>
<pre class="alt"><span class="kwrd">SELECT</span> <span class="kwrd">TOP</span> 1000 </pre>
<pre> ROW_NUMBER() <span class="kwrd">OVER</span> (<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> t1.NAME) <span class="kwrd">AS</span> RowNumber,</pre>
<pre class="alt"> ABS(CHECKSUM(NEWID()))%250 <span class="kwrd">AS</span> SomeId, </pre>
<pre> DATEADD(<span class="kwrd">DAY</span>,ABS(CHECKSUM(NEWID()))%1000,<span class="str">'2008-01-01'</span>) <span class="kwrd">AS</span> SomeDate</pre>
<pre class="alt"><span class="kwrd">FROM</span> </pre>
<pre> Master.dbo.SysColumns t1</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>Next, I am going to create two functions.  One function will be a scalar UDF and the other will be an inline table valued function.  If you do not know what an inline TVF is, think of it as a parameterized view that is subject to the same restrictions as a view.  For more information, you can read the following article in BOL, <a title="http://msdn.microsoft.com/en-us/library/ms189294.aspx" href="http://msdn.microsoft.com/en-us/library/ms189294.aspx">http://msdn.microsoft.com/en-us/library/ms189294.aspx</a>. </p>
<div class="csharpcode">
<pre class="alt">--<span class="kwrd">Create</span> Scalar <span class="kwrd">Function</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">FUNCTION</span> dbo.fn_SomeFunction(@SomeId <span class="kwrd">INT</span>)</pre>
<pre class="alt"><span class="kwrd">RETURNS</span> DATETIME</pre>
<pre><span class="kwrd">AS</span></pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DECLARE</span> @dt DATETIME</pre>
<pre class="alt"> <span class="kwrd">SELECT</span> @dt = <span class="kwrd">MAX</span>(SomeDate)</pre>
<pre> <span class="kwrd">FROM</span> dbo.TestData</pre>
<pre class="alt"> <span class="kwrd">WHERE</span> SomeId = @SomeId</pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">RETURN</span> @dt</pre>
<pre><span class="kwrd">END</span></pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt">--<span class="kwrd">Create</span> Inline <span class="kwrd">Table</span> Valued <span class="kwrd">Function</span></pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">FUNCTION</span> dbo.fn_SomeInlineFunction()</pre>
<pre class="alt"><span class="kwrd">RETURNS</span> <span class="kwrd">TABLE</span></pre>
<pre><span class="kwrd">RETURN</span>(</pre>
<pre class="alt"> <span class="kwrd">SELECT</span> SomeId, <span class="kwrd">MAX</span>(SomeDate) <span class="kwrd">AS</span> SomeDate</pre>
<pre> <span class="kwrd">FROM</span> dbo.TestData</pre>
<pre class="alt"> <span class="kwrd">GROUP</span> <span class="kwrd">BY</span> SomeId</pre>
<pre>)</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
</div>
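<p>Because an inline TVF behaves like a parameterized view, it can also take the filter value as a parameter and be applied per row with CROSS APPLY.  The function below is a hypothetical variant (the name fn_SomeInlineFunctionById is not part of the example above) and is shown only to illustrate the idea.</p>
<div class="csharpcode">
<pre>-- Hypothetical parameterized variant of the inline TVF
CREATE FUNCTION dbo.fn_SomeInlineFunctionById(@SomeId INT)
RETURNS TABLE
AS
RETURN(
    SELECT MAX(SomeDate) AS SomeDate
    FROM dbo.TestData
    WHERE SomeId = @SomeId
)
GO

-- Apply the function once per outer row, much like the scalar UDF call
SELECT t.SomeId, max_dt.SomeDate
FROM dbo.TestData AS t
CROSS APPLY dbo.fn_SomeInlineFunctionById(t.SomeId) AS max_dt
GO</pre>
</div>
<p>Unlike the scalar UDF, the body of this function is expanded into the calling query, so the optimizer can consider it as part of the overall plan.</p>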
<p>All of our DDL is in place.  All that is left is to test the performance.  If you read my last post, you should be expecting the inline TVF to outperform the scalar UDF.  Let’s see what actually transpires.  Discard the query results to the grid by clicking Tools –> Options –> Query Results –> SQL Server –> Results To Grid –> Discard results after execution.  Next, open SQL Server Profiler and use the Standard template.  Run the code below to capture the performance counters in Profiler.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">SELECT</span> SomeId,dbo.fn_SomeFunction(SomeId)</pre>
<pre><span class="kwrd">FROM</span> dbo.[TestData]</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">SELECT</span> TestData.SomeId,max_dt.SomeDate</pre>
<pre><span class="kwrd">FROM</span> dbo.[TestData]</pre>
<pre class="alt"><span class="kwrd">INNER</span> <span class="kwrd">JOIN</span> dbo.fn_SomeInlineFunction() <span class="kwrd">AS</span> max_dt</pre>
<pre> <span class="kwrd">ON</span> max_dt.SomeId = TestData.SomeId</pre>
<pre class="alt">GO</pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>The results of the queries should look similar to my results below.  The things of note are the reads and the CPU required to satisfy each query.  The number of reads and the CPU required to satisfy the scalar UDF query are astronomically greater than those of the inline TVF query.</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicrPLoPKGhLy1i2Pq5qnKyEzRX1RTYii4-vly2AZehUPmzcIm9uMsC8vEkOLUomNYwtDbr9CNBj5PaDJsg-IMuIz0m-8mTJA4DNMe7Gi6e2xDxGqaYwruxR6fevSiu-FIDZU8qdJtYhA/s1600-h/image%5B4%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirAhWOk3Z053PG3G9Mw5mC8yrkl6lO2vt9RqvKSl1vHfVXL5V1Xrmc86sST-GH69RJdGcu_Z3cnKWAYSA0JmWrHLttMYs-5bHJncDLASs65pJdok8DGREuiewwaABshdrWvBXHO3F45g/?imgmax=800" width="814" height="138" /></a> </p>
<p>If the above screenshot is not enough to discourage you from using scalar UDFs, I do not know what will.  The point is that there are all kinds of great alternatives for encapsulating code logic without the use of scalar functions.  Inline TVFs offer a set-based approach for encapsulating business logic; plus, the optimizer is able to use existing statistics and indexes to optimize inline TVFs.  It is my recommendation that you try to convert all scalar UDFs to inline TVFs.  I know this is not always possible, but it is a good start.  I typically try to stay away from scalar and multi-statement UDFs, unless absolutely necessary.  I hope that you have learned something new and that you can use this example to get the needed signoff to change those problematic scalar UDFs into inline TVFs.</p>
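<p>If you want to keep the "pass in an id, get back a value" calling style of the scalar UDF, one option is a parameterized inline TVF called with CROSS APPLY.  The code below is only a sketch: the function name fn_SomeInlineFunctionById is hypothetical, and I am assuming SomeId is an INT.  Because the aggregate has no GROUP BY, the function always returns exactly one row, so CROSS APPLY does not drop any rows from TestData.</p>
<div class="csharpcode">
<pre class="alt">--Hypothetical parameterized inline TVF (assumes SomeId is an INT)</pre>
<pre>CREATE FUNCTION dbo.fn_SomeInlineFunctionById(@SomeId INT)</pre>
<pre class="alt">RETURNS TABLE</pre>
<pre>AS</pre>
<pre class="alt">RETURN(</pre>
<pre>    SELECT MAX(SomeDate) AS SomeDate</pre>
<pre class="alt">    FROM dbo.TestData</pre>
<pre>    WHERE SomeId = @SomeId</pre>
<pre class="alt">)</pre>
<pre>GO</pre>
<pre class="alt"> </pre>
<pre>--The function body is expanded into the outer query, much like a view</pre>
<pre class="alt">SELECT t.SomeId, max_dt.SomeDate</pre>
<pre>FROM dbo.TestData AS t</pre>
<pre class="alt">CROSS APPLY dbo.fn_SomeInlineFunctionById(t.SomeId) AS max_dt</pre>
<pre>GO</pre>
</div>
<p>As always, compare the reads and CPU of this form against the JOIN version on your own data before settling on one.</p>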
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com2tag:blogger.com,1999:blog-4646137438366687895.post-8766098267602861882009-10-20T19:17:00.001-07:002009-11-11T21:28:49.270-08:00IO Stats – What Are You Missing?<p>In this post, I will address a common misconception: that IO statistics are always reliable.  The truth of the matter is that IO stats can sometimes yield incomplete information, which in turn may encourage bad coding habits.  I have had developers tell me that the scalar UDF query is better because it has less IO than the set-based TVF query.  In situations like these, it is important to get across two messages: you must account for more than IO when optimizing, and in some cases the reported IO itself is misleading.  The question on the table is, “How can IO statistics be wrong?”  The short answer is that IO stats are mostly correct, and only under certain circumstances are they misrepresented in SSMS.  So what are these magical circumstances?  The IO statistics become unreliable anytime a scalar UDF is used.  SET STATISTICS IO only accounts for the base table and not for any of the IO incurred inside the scalar UDF, which understates the true IO of the query.  Let’s look at an example.</p> <p>First I will create the tables, with data.</p> <div class="csharpcode"> <pre class="alt"><span class="kwrd">USE</span> [tempdb]</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.t1'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> tempdb.dbo.t1;</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> t1(</pre>
<pre class="alt">id <span class="kwrd">INT</span>,</pre>
<pre>col <span class="kwrd">CHAR</span>(1)</pre>
<pre class="alt">);</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>INSERT <span class="kwrd">INTO</span> t1 <span class="kwrd">VALUES</span> (1,<span class="str">'a'</span>);</pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> t1 <span class="kwrd">VALUES</span> (2,<span class="str">'b'</span>);</pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">IF</span> OBJECT_ID(<span class="str">'tempdb.dbo.t2'</span>) <span class="kwrd">IS</span> <span class="kwrd">NOT</span> <span class="kwrd">NULL</span></pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> <span class="kwrd">DROP</span> <span class="kwrd">TABLE</span> tempdb.dbo.t2;</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> t2(</pre>
<pre class="alt">t2_id <span class="kwrd">INT</span> <span class="kwrd">IDENTITY</span>(1,1),</pre>
<pre>t1_id <span class="kwrd">INT</span>,</pre>
<pre class="alt">col <span class="kwrd">CHAR</span>(1)</pre>
<pre>);</pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre> </pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> t2 <span class="kwrd">VALUES</span> (1,<span class="str">'c'</span>);</pre>
<pre>INSERT <span class="kwrd">INTO</span> t2 <span class="kwrd">VALUES</span> (1,<span class="str">'d'</span>);</pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> t2 <span class="kwrd">VALUES</span> (1,<span class="str">'e'</span>);</pre>
<pre>INSERT <span class="kwrd">INTO</span> t2 <span class="kwrd">VALUES</span> (1,<span class="str">'f'</span>);</pre>
<pre class="alt">INSERT <span class="kwrd">INTO</span> t2 <span class="kwrd">VALUES</span> (2,<span class="str">'d'</span>);</pre>
<pre>INSERT <span class="kwrd">INTO</span> t2 <span class="kwrd">VALUES</span> (2,<span class="str">'g'</span>);</pre>
<pre class="alt">GO</pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p></p>
<p>The next step is to create our scalar UDF.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">FUNCTION</span> dbo.fn_ConcatenateCols(@id <span class="kwrd">INT</span>)</pre>
<pre><span class="kwrd">RETURNS</span> <span class="kwrd">VARCHAR</span>(8000)</pre>
<pre class="alt"><span class="kwrd">AS</span></pre>
<pre><span class="kwrd">BEGIN</span></pre>
<pre class="alt"> <span class="kwrd">DECLARE</span> @rtn <span class="kwrd">varchar</span>(8000)</pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">SELECT</span> @rtn = <span class="kwrd">COALESCE</span>(@rtn + <span class="str">','</span>,<span class="str">''</span>) + t2.col</pre>
<pre> <span class="kwrd">FROM</span> dbo.t2</pre>
<pre class="alt"> <span class="kwrd">WHERE</span> t2.t1_id = @id</pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">RETURN</span> @rtn</pre>
<pre><span class="kwrd">END</span></pre>
<pre class="alt">GO</pre>
</div>
<p>Now that I have all my sample DDL in place, we can run a simple test to measure our IO.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">SET</span> NOCOUNT <span class="kwrd">ON</span> </pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"><span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">ON</span></pre>
<pre><span class="kwrd">GO</span></pre>
<pre class="alt"> </pre>
<pre>--Missing I/O</pre>
<pre class="alt"><span class="kwrd">SELECT</span> id,dbo.fn_ConcatenateCols(id)</pre>
<pre><span class="kwrd">FROM</span> [dbo].[t1];</pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">SET</span> <span class="kwrd">STATISTICS</span> IO <span class="kwrd">OFF</span></pre>
<pre class="alt"><span class="kwrd">GO</span></pre>
<pre>/*</pre>
<pre class="alt">id </pre>
<pre>---------<span class="rem">-- --------</span></pre>
<pre class="alt">1 c,d,e,f</pre>
<pre>2 d,g</pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">Table</span> <span class="str">'t1'</span>. Scan <span class="kwrd">count</span> 1, logical <span class="kwrd">reads</span> 1, physical <span class="kwrd">reads</span> 0, <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0, lob logical <span class="kwrd">reads</span> 0, lob physical <span class="kwrd">reads</span> 0, lob <span class="kwrd">read</span>-ahead <span class="kwrd">reads</span> 0.</pre>
<pre class="alt">*/</pre>
</div>
<p>So what is missing from the results above?  If you look closely you will see that <strong>t2</strong> is nowhere in the IO stats.  Let’s start a Profiler trace and run the same query again.  Open SQL Server Profiler and use the standard template.  Once the trace is running, run the same query again.  If you want the most accurate number of reads, make sure to discard the query results: Tools –> Options –> Query Results –> SQL Server –> Results To Grid –> Discard Results After Execution.  This time around you will see that the number of reads is 9, as shown below.</p>
<p><a href="http://lh5.ggpht.com/_ayZBUzPGG9A/St5vOpsIM_I/AAAAAAAAAQw/mY2d8wIJPC0/s1600-h/image%5B3%5D.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://lh3.ggpht.com/_ayZBUzPGG9A/St5vPTi_pJI/AAAAAAAAAQ0/2Y6DCLs7WmY/image_thumb%5B1%5D.png?imgmax=800" width="434" height="313" /></a> </p>
<p>As you can see, the actual number of logical reads is a lot different from the number reported by STATISTICS IO.  This behavior can be a huge surprise to many unsuspecting victims.  Yes, I did use the word victim :).  I say victim because this usually happens to an individual who expects IO to be presented correctly and trusts Microsoft enough to not question their information.  This “victim” never thinks to question the information returned, which can hide a huge performance problem.  </p>
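<p>If you cannot run a Profiler trace, the plan cache is another place to look for the reads that STATISTICS IO leaves out.  The query below is a rough sketch rather than a definitive diagnostic: it assumes you are connected to the database that holds the function (tempdb in this example), that you have VIEW SERVER STATE permission, and that the function's plan is still in cache.  The counters are cumulative since the plan was cached, so they will not match a single execution unless the plan was just compiled.</p>
<div class="csharpcode">
<pre class="alt">--Cumulative reads recorded for the statement(s) inside the scalar UDF</pre>
<pre>SELECT OBJECT_NAME(st.objectid, st.dbid) AS ObjectName,</pre>
<pre class="alt">       qs.execution_count,</pre>
<pre>       qs.total_logical_reads</pre>
<pre class="alt">FROM sys.dm_exec_query_stats AS qs</pre>
<pre>CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st</pre>
<pre class="alt">WHERE st.objectid = OBJECT_ID('dbo.fn_ConcatenateCols')</pre>
<pre>  AND st.dbid = DB_ID();</pre>
</div>
<p>The execution_count column is worth a look too, since it shows how many times the function body ran, which is typically once per row of the outer query.</p>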
<p>The takeaway is that developers should always be careful when using scalar functions, because they can really degrade performance, and that you should never trust anyone’s word, not even Microsoft’s or mine. <strong>Always</strong> test for yourself.  If you have not done so, I recommend reading my post on correlated subqueries, as it applies to functions as well, <a title="http://jahaines.blogspot.com/2009/06/correlated-sub-queries-for-good-or-evil.html" href="http://jahaines.blogspot.com/2009/06/correlated-sub-queries-for-good-or-evil.html">http://jahaines.blogspot.com/2009/06/correlated-sub-queries-for-good-or-evil.html</a>.  In my next post, I will show you how to get rid of scalar functions and use inline TVFs to optimize performance, while still encapsulating code logic. </p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com0tag:blogger.com,1999:blog-4646137438366687895.post-31339346268423732942009-10-16T23:19:00.001-07:002009-11-11T21:28:49.271-08:00Exporting Binary Files To The File System<p>In my last post I demonstrated how to use SSIS to load binary files into a SQL Server 2005 VARBINARY(MAX) column, <a title="http://jahaines.blogspot.com/2009/10/ssis-importing-binary-files-into.html" href="http://jahaines.blogspot.com/2009/10/ssis-importing-binary-files-into.html">http://jahaines.blogspot.com/2009/10/ssis-importing-binary-files-into.html</a>.  This post will focus on recreating the binary documents on the file system.  I will be using a combination of TSQL and the BCP utility to perform the export, <a title="http://msdn.microsoft.com/en-us/library/ms162802.aspx" href="http://msdn.microsoft.com/en-us/library/ms162802.aspx">http://msdn.microsoft.com/en-us/library/ms162802.aspx</a>.  I will be using the same table and data from the last post.  I will start by creating the TSQL to dynamically create a BCP command.</p> <p>Below is the stored procedure I will use to export the data.  You will see that I have decided to use a cursor to process all documents.  The procedure also accepts a DocID, which limits the export to a single document.  A cursor is fine here because we are limited to executing a single BCP command; however, you could create an SSIS package that executes the stored procedure across multiple streams, if you need parallel processing.</p> <div class="csharpcode"> <pre class="alt"><span class="kwrd">CREATE</span> <span class="kwrd">PROCEDURE</span> usp_ExportBinaryFiles(</pre>
<pre> @DocID <span class="kwrd">INT</span> = <span class="kwrd">NULL</span>,</pre>
<pre class="alt"> @OutputFilePath <span class="kwrd">VARCHAR</span>(500) = <span class="str">'C:\'</span></pre>
<pre>)</pre>
<pre class="alt"><span class="kwrd">AS</span> </pre>
<pre><span class="kwrd">BEGIN</span></pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">DECLARE</span> @<span class="kwrd">sql</span> <span class="kwrd">VARCHAR</span>(8000)</pre>
<pre class="alt"> </pre>
<pre><span class="kwrd">IF</span> @DocID <span class="kwrd">IS</span> <span class="kwrd">NULL</span> --<span class="kwrd">Open</span> <span class="kwrd">Cursor</span> <span class="kwrd">to</span> export <span class="kwrd">all</span> images</pre>
<pre class="alt"><span class="kwrd">BEGIN</span></pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">DECLARE</span> curExportBinaryDocs <span class="kwrd">CURSOR</span> FAST_FORWARD <span class="kwrd">FOR</span></pre>
<pre> <span class="kwrd">SELECT</span> <span class="str">'BCP "SELECT Doc FROM [tempdb].[dbo].[Documents] WHERE DocId ='</span> </pre>
<pre class="alt"> + <span class="kwrd">CAST</span>(DocId <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(11)) + <span class="str">'" queryout '</span> + @OutputFilePath </pre>
<pre> + DocName + <span class="str">'.'</span> + DocType + <span class="str">' -S A70195\Dev -T -fC:\Documents\Documents.fmt'</span></pre>
<pre class="alt"> <span class="kwrd">FROM</span> dbo.Documents</pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">OPEN</span> curExportBinaryDocs</pre>
<pre> <span class="kwrd">FETCH</span> <span class="kwrd">NEXT</span> <span class="kwrd">FROM</span> curExportBinaryDocs <span class="kwrd">INTO</span> @<span class="kwrd">sql</span></pre>
<pre class="alt"> </pre>
<pre> <span class="kwrd">WHILE</span> <span class="preproc">@@FETCH_STATUS</span> = 0</pre>
<pre class="alt"> <span class="kwrd">BEGIN</span></pre>
<pre> --<span class="kwrd">PRINT</span> @<span class="kwrd">sql</span></pre>
<pre class="alt"> <span class="kwrd">EXEC</span> xp_cmdshell @<span class="kwrd">sql</span>,NO_OUTPUT</pre>
<pre> </pre>
<pre class="alt"> <span class="kwrd">FETCH</span> <span class="kwrd">NEXT</span> <span class="kwrd">FROM</span> curExportBinaryDocs <span class="kwrd">INTO</span> @<span class="kwrd">sql</span></pre>
<pre> <span class="kwrd">END</span></pre>
<pre class="alt"> </pre>
<pre> <span class="kwrd">CLOSE</span> curExportBinaryDocs</pre>
<pre class="alt"> <span class="kwrd">DEALLOCATE</span> curExportBinaryDocs</pre>
<pre><span class="kwrd">END</span></pre>
<pre class="alt"><span class="kwrd">ELSE</span> --Export a single image</pre>
<pre><span class="kwrd">BEGIN</span></pre>
<pre class="alt"> <span class="kwrd">SELECT</span> @<span class="kwrd">sql</span> = <span class="str">'BCP "SELECT Doc FROM [tempdb].[dbo].[Documents] WHERE DocId ='</span> </pre>
<pre> + <span class="kwrd">CAST</span>(DocId <span class="kwrd">AS</span> <span class="kwrd">VARCHAR</span>(11)) + <span class="str">'" queryout '</span> + @OutputFilePath </pre>
<pre class="alt"> + DocName + <span class="str">'.'</span> + DocType + <span class="str">' -S A70195\Dev -T -fC:\Documents\Documents.fmt'</span></pre>
<pre> <span class="kwrd">FROM</span> dbo.Documents</pre>
<pre class="alt"> <span class="kwrd">WHERE</span> DocID = @DocID</pre>
<pre> </pre>
<pre class="alt"> --<span class="kwrd">PRINT</span> @<span class="kwrd">sql</span></pre>
<pre> <span class="kwrd">EXEC</span> xp_cmdshell @<span class="kwrd">sql</span>,NO_OUTPUT</pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre> </pre>
<pre class="alt"><span class="kwrd">END</span></pre>
<pre>GO</pre>
</div>
<style type="text/css">
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>The above stored procedure dynamically builds a BCP command.  Currently the stored procedure has two parameters, DocID and OutputFilePath.  DocID is used when you want to export a single document; if no value is supplied, the stored procedure will export all documents.  OutputFilePath is the directory where the files will be exported; it must end with a backslash, because the file name is appended directly to it.  You will note that I have hard-coded my server name and format file path.  You can add additional parameters if you need these attributes to be dynamic.  The next item on my list is to actually create the format file.  Create a .fmt file somewhere on your file system.  I put mine in the same folder as my other documents.  Copy the code below into the format file.</p>
<div class="csharpcode">
<pre class="alt">9.0 </pre>
<pre>1 </pre>
<pre class="alt">1 SQLBINARY 0 0 <span class="str">""</span> 1 Doc <span class="str">""</span> </pre>
</div>
<p>You may be asking yourself why the format file looks so skimpy.  The format file is short because it only contains what we need: a single field for the VARBINARY(MAX) column, with a prefix length of 0 and no terminator, so that BCP writes the raw bytes of the document and nothing else. You can find more info on format files here, <a title="http://msdn.microsoft.com/en-us/library/ms191516.aspx" href="http://msdn.microsoft.com/en-us/library/ms191516.aspx">http://msdn.microsoft.com/en-us/library/ms191516.aspx</a>.</p>
<p>Once the format file is in place, we can execute our stored procedure to process all documents.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">EXEC</span> dbo.usp_ExportBinaryFiles @OutputFilePath = <span class="str">'C:\Documents\BCP_Out\'</span></pre>
</div>
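<p>One prerequisite worth calling out: xp_cmdshell is disabled by default in SQL Server 2005 and later, so the procedure will fail until a sysadmin turns it on.  The snippet below is a minimal sketch of enabling it and then exporting a single document.  The DocID value of 1 is just an assumption for illustration; use an id that exists in your Documents table.  Enabling xp_cmdshell has security implications, so check with whoever owns the server before doing this outside of a development box.</p>
<div class="csharpcode">
<pre class="alt">--Enable xp_cmdshell (requires sysadmin; disabled by default)</pre>
<pre>EXEC sp_configure 'show advanced options', 1;</pre>
<pre class="alt">RECONFIGURE;</pre>
<pre>EXEC sp_configure 'xp_cmdshell', 1;</pre>
<pre class="alt">RECONFIGURE;</pre>
<pre>GO</pre>
<pre class="alt"> </pre>
<pre>--Export one document; DocID = 1 is a hypothetical id</pre>
<pre class="alt">EXEC dbo.usp_ExportBinaryFiles @DocID = 1, @OutputFilePath = 'C:\Documents\BCP_Out\';</pre>
<pre>GO</pre>
</div>
<p>If nothing shows up in the output folder, uncomment the PRINT statement in the procedure and run the generated BCP command from a command prompt to see the error it returns.</p>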
<p>That’s it!  It is that easy to output VARBINARY data onto the file system.  SQL Server 2005 has a slew of tools that make working with binary data very simple.  I hope that you have learned something new. Please stay tuned, as I plan to focus more on TSQL concepts and performance considerations.</p>
<p>Until next time, happy coding.</p> Adam Haineshttp://www.blogger.com/profile/16288608920551626835noreply@blogger.com16