DBPedias

Your Database Knowledge Community

Brad Schulz

  1. T-SQL Tuesday #024: The Roundup

    T-SQL TuesdayThanx to everyone for their contributions to T-SQL Tuesday #024: Prox ‘n’ Funx.

    The roundup is finally here… Well, shux, I’ve been so busy that I almost let it fall through the crax, but before weex and weex went by, I finally read through the stax of submissions. I certainly had a lot of kix reading them… you could hear shriex of delight coming out of my office window.

    (Okay, I admit it… The roundup was late because I spent a lot of time looking up words that end with a -ks sound… It loox like you can count me among the ranx of freax ‘n’ geex ‘n’ punx ‘n’ jerx ‘n’ dorx ‘n’ quax who carry something a little too far… It stinx, doesn’t it?).

    Methinx it was a success! The SQL Community rox!

    But don’t take it from me… Read the submissions for yourself. Click on the linx below for some good reading.

    declare @Subject varchar(100)='T-SQL Tuesday #024: Prox ''n'' Funx';
     
    insert [Brad Schulz].Blog 
    select Subject='T-SQL Tuesday #024 Roundup'
          ,Content
    from 
      (
      select Content from [Rob Farley].Blog where Subject=@Subject
    union all
      select Content from [Noel McKinney].Blog where Subject=@Subject
    union all
      select Content from [Greg Lucas].Blog where Subject=@Subject
    union all
      select Content from [Michael J. Swart].Blog where Subject=@Subject
    union all
      select Content from [Kent Chenery].Blog where Subject=@Subject
    union all
      select Content from [Steve Wales].Blog where Subject=@Subject
    union all
      select Content from [Merrill Aldrich].Blog where Subject=@Subject
    union all
      select Content from [Aaron Bertrand].Blog where Subject=@Subject
    union all
      select Content from [Pinal Dave].Blog where Subject=@Subject
    union all
      select Content from [Rich Brown].Blog where Subject=@Subject
    union all
      select Content from [Bob Pusateri].Blog where Subject=@Subject
    union all
      select Content from [Kerry Tyler].Blog where Subject=@Subject
    union all
      select Content from [Jes Schultz Borland].Blog where Subject=@Subject
    union all
      select Content from [Thomas Rushton].Blog where Subject=@Subject
    union all
      select Content from [Jason Brimhall].Blog where Subject=@Subject
    union all
      select Content from [Nancy Hidy Wilson].Blog where Subject=@Subject
    union all
      select Content from [David Howard].Blog where Subject=@Subject
    union all
      select Content from [Brad Schulz].Blog where Subject=@Subject
    ) Contributions
    
    Rob Farley contributes a wonderfully detailed post comparing inline functions and non-inline functions, filled with lots of examples and query plans. Rob categorically states that “if it’s not inline, it’s rubbish.” I couldn’t agree more, but then again, I’m always a sucker for any blog post that uses the word “rubbish”.

    Noel McKinney talks about a situation concerning the abuse and overuse of stored procedures. Imagine a database that is completely devoid of data and yet consumes truckloads of disk space… because it houses 27,000 stored procedures! A good story to keep in the back of your mind in designing solutions.

    Greg Lucas gives us a detailed explanation of a useful ExceptionHandler procedure that is designed to be used in the CATCH block of a TRY…CATCH construct, logging detailed information about the error in a table. It encapsulates and standardizes error handling so that the developer can save time in coding.

    Michael J. Swart talks about some interesting numbers related to stored procedures. At first I thought it would be the usual numbers like “maximum parameters allowed” and “maximum nested levels”, but if you know Michael, then you know he wouldn’t only spout out dry statistics like that. By the time you finish his list, you’ll be struggling to catch your breath from laughing.

    Kent Chenery joins T-SQL Tuesday for the first time, contributing a CLR routine for calculating Easter in any given year. That’s pretty cool… I never even thought about the fact that an algorithm existed for calculating Easter. Make sure you read the Wikipedia article he references… the detail behind the algorithm is fascinating.

    Steve Wales is another first-time contributor to T-SQL Tuesday. He talks about the differences in coding procedures and functions between SQL Server and that other database created by that company in my neck of the woods, Oracle. He also provides a link to an earlier post of his that warns about the fact that you have to be aware of the compatibility mode when using built-in functions.

    Merrill Aldrich offers his ode to composable solutions, illustrating how table-valued functions (and views) offer maximum composability and flexibility (and that stored procedures don’t offer that same flexibility). As a rabid fan of TVF’s, I couldn’t agree more.

    Aaron Bertrand wrote a terrific post on the shortcomings of INFORMATION_SCHEMA views earlier this month, but he waited for T-SQL Tuesday to post his pièce de résistance, plunging the knife deeper in talking about INFORMATION_SCHEMA.ROUTINES (and syscomments and sp_helptext) and how they just don’t cut it in getting the full procedure definition.

    Pinal Dave talks about the new CUME_DIST() analytic function in SQL2012, which provides a really cool easy-to-implement percentile-calculating capability. Pinal’s post is the first of a series of blog posts about the new SQL2012 analytic functions… make sure you add them to your must-read list.

    Rich Brown tells us about the benefits of using stored procedures for Reporting Services datasets, abstracting away the data layer from the presentation layer. He also brings up an interesting idea of putting Reporting Services procedures into their own schema.

    Bob Pusateri gives an overview of the pros and cons of both procedures and functions. Speaking of procedures, though, I have to also mention that I found Bob’s post of October 18th to be very interesting in terms of using a cool trick of using sp_executesql to create a view in another database besides the current one.

    Kerry Tyler discusses scenarios (and some myths) regarding transactions being left open by procedures that produce errors or experience timeouts, concluding with a discussion on XACT_ABORT.

    Jes Schultz Borland expresses her frustration about the fact that a Reporting Services dataset only recognizes the first result set from a procedure that produces multiple result sets. It is confusing, being that the term dataset has different meanings across platforms (in .NET the dataset class truly is a collection of datatables).

    Thomas Rushton makes a “Hell yes” case in answer to the question, “Should I learn to use stored procedures?” And he follows up with a second contribution talking about the best practice in releasing stored procedure scripts (or scripts for any kind of object for that matter).

    Jason Brimhall talks about functions… specifically functions that he is thankful for, functions he can’t live without, and functions that he had fun with in delving into SQL Server internals. I’m going to set aside some time to read more about those.

    Nancy Hidy Wilson reminds us of the myriad of information we can glean from the good old SERVERPROPERTY() function… and she reminds us to check out DATABASEPROPERTY() and DATABASEPROPERTYEX() as well.

    David Howard discussed bad plans resulting from parameter sniffing and gave a list of techniques to solve the problem.

    And, finally, my contribution talked about a procedure that I use to deal with a client system that contains no stored procedures at all… it finds the original text of ad-hoc queries in a Forced Parameterization database.

    Thanx again for your contributions!
  2. T-SQL Tuesday #024: A Procedure To Deal With No Procedures

    T-SQL TuesdayOh my… Here it is Wednesday, and I’m a day late for T-SQL Tuesday #024, which is hosted this month by some joker named Brad Schulz. Hopefully he’ll understand and accept my late entry… If he doesn’t, then I’ll just have to make him understand.

    The theme for this month is Prox ‘n’ Funx (aka Procedures and Functions)… and I’m going to talk about a procedure that I wrote to help me deal with the lack of procedures.

    You probably had to re-read that latest sentence, didn’t you? Well, this whole blog post might turn your brain inside-out a bit, so you might want to read it slowly or be prepared to re-read it after you finish it. My apologies in advance.

    Here’s the situation… purely hypothetical, you understand… (wink, wink).

    Imagine yourself faced with a SQL Server Database that is just a big dumb file cabinet. All it holds is data, and there isn’t a single procedure or function to be found… every single query that goes to the system is an ad-hoc query constructed within an enormous .NET application.

    For example, a typical query in the C# code may be put together like this:

    string SQL = "";
    SQL += " SELECT soh.SalesOrderNumber ";
    SQL += "       ,soh.ShipDate ";
    SQL += "       ,soh.CustomerID ";
    SQL += "       ,s.Name ";
    SQL += "       ,soh.TotalDue ";
    SQL += " FROM Sales.SalesOrderHeader soh ";
    SQL += " JOIN Person.Address a ON soh.ShipToAddressID=a.AddressID ";
    SQL += " JOIN Person.StateProvince sp ON a.StateProvinceID=sp.StateProvinceID ";
    SQL += " JOIN Sales.Store s ON soh.CustomerID=s.CustomerID ";
    SQL += " WHERE soh.SalesOrderID IN ";
    SQL += "         (SELECT SalesOrderID ";
    SQL += "          FROM Sales.SalesOrderDetail ";
    SQL += "          WHERE ProductID=" + SQLUtils.SQLValue(ProdID) + ") ";
    SQL += "   AND soh.ShipDate>=" + SQLUtils.SQLValue(FromDate);
    SQL += "   AND soh.ShipDate<" + SQLUtils.SQLValue(ToDate);
    SQL += "   AND a.City=" + SQLUtils.SQLValue(CityName);
    SQL += "   AND sp.Name=" + SQLUtils.SQLValue(StateProvinceName);
    DataTable dtResult = SQLUtils.ExecSQLQuery(SQL)
    
    Never mind the thoughts that are going through your head regarding readability. Never mind the maintenance nightmare. Never mind the potential for SQL Injection. Never mind the…

    Oh never mind.

    So hundreds of thousands of ad-hoc queries get sent to the system all day long.

    My job? Make things go faster.

    Okay, fine. However, there’s one other wrinkle. This particular database operates under Forced Parameterization. This may reduce the frequency of query compilations, but it makes my job a little more complicated.

    Let me illustrate… Let’s set the AdventureWorks database to use Forced Parameterization:

    alter database AdventureWorks set parameterization forced
    
    With that in place, every ad-hoc query sent to the system is parameterized… In other words, any literal within the ad-hoc query is converted into a parameter and the query as a whole is parameterized so that its plan can be reused by any subsequent ad-hoc query that has the same “code shape”.

    Let’s execute the following query in AdventureWorks:

    SELECT soh.SalesOrderNumber
    ,soh.ShipDate 
    ,soh.CustomerID
    ,s.Name
    ,soh.TotalDue
    FROM Sales.SalesOrderHeader soh 
    JOIN Person.Address a ON soh.ShipToAddressID=a.AddressID 
    JOIN Person.StateProvince sp ON a.StateProvinceID=sp.StateProvinceID 
    JOIN Sales.Store s ON soh.CustomerID=s.CustomerID 
    WHERE soh.SalesOrderID IN (SELECT SalesOrderID 
    FROM Sales.SalesOrderDetail 
    WHERE ProductID=897)
      AND soh.ShipDate>='20030801'
      AND soh.ShipDate<'20030901'
      AND a.City='Milsons Point'
      AND sp.Name='New South Wales'
    
    (Those of you who read my blog regularly may be shocked by my use of UPPER CASE keywords… I never do that… But I’m doing it here to illustrate a point).

    As you may know, there is a DMV called sys.dm_exec_cached_plans, which contains information about all the query plans that have been cached, along with a plan_handle column that you can use to acquire the actual text of the query that created the plan. There’s also a DMV called sys.dm_exec_query_stats, which contains performance statistics information for cached plans, but its plan_handle (or sql_handle) column points to a different kind of text when you’re dealing with Forced Parameterization.

    Here is a simplified version of a Performance Statistics Query that I run that uses sys.dm_exec_query_stats to acquire performance information so that I can find plans that perform a lot of reads (the ones with the most reads are at the top).:

    select Code=(select [processing-instruction(q)]=':'+nchar(13)+qt.text+nchar(13)
                 for xml path(''),type)
          ,qp.Query_Plan
    ,[#Reads]=qs.total_logical_reads
    ,[#Execs]=qs.execution_count
    from sys.dm_exec_query_stats qs
    cross apply sys.dm_exec_sql_text(plan_handle) qt
    cross apply sys.dm_exec_query_plan(plan_handle) qp
    where qt.text not like '%sys.dm\_%' escape '\'
    order by [#Reads] desc
    
    Here’s what the result looks like in SSMS:



    In case you’re wondering about that “processing-instruction” and XML gibberish for the generation of the first column, you can read more about that here. But, in short, you can see that it creates a hyperlink to the query text. When I click on that link, I can see that the text looks like this:

    (@0 int,@1 varchar(8000),@2 varchar(8000),
    @3 varchar(8000),@4 varchar(8000))select soh . SalesOrderNumber , 
    soh . ShipDate , soh . CustomerID , s . Name , soh . TotalDue 
    from Sales . SalesOrderHeader soh join Person . Address a on 
    soh . ShipToAddressID = a . AddressID join Person . StateProvince sp 
    on a . StateProvinceID = sp . StateProvinceID join Sales . Store s 
    on soh . CustomerID = s . CustomerID where soh . SalesOrderID in 
    ( select SalesOrderID from Sales . SalesOrderDetail where ProductID 
    = @0 ) and soh . ShipDate > = @1 and soh . ShipDate < @2 and 
    a . City = @3 and sp . Name = @4
    
    I have split it into multiple lines for this blog post… In reality it’s one looooonnnng string. But you can see that this is NOT the actual text of the query that I executed. In parameterizing this query, SQL converted all my UPPER CASE keywords into lower case, and it got rid of any extraneous white space (multiple spaces, tabs, carriage returns, line feeds), and it inserted a single space between all commas and periods/full-stops and operators.

    Now when another query comes along that is exactly like the first query (except querying for different values)…

    SELECT soh.SalesOrderNumber
    ,soh.ShipDate 
    ,soh.CustomerID
    ,s.Name
    ,soh.TotalDue
    FROM Sales.SalesOrderHeader soh 
    JOIN Person.Address a ON soh.ShipToAddressID=a.AddressID 
    JOIN Person.StateProvince sp ON a.StateProvinceID=sp.StateProvinceID 
    JOIN Sales.Store s ON soh.CustomerID=s.CustomerID 
    WHERE soh.SalesOrderID IN (SELECT SalesOrderID 
    FROM Sales.SalesOrderDetail 
    WHERE ProductID=870)
      AND soh.ShipDate>='20030101'
      AND soh.ShipDate<'20040101'
      AND a.City='London'
      AND sp.Name='England'
    
    …and we once again execute our Performance Statistics Query to look at the plan cache...



    …we see that the same parameterized query has been reused (note the #Execs is equal to 2).

    (Those of you with a quick eye may have also noticed that the #Reads skyrocketed… I’ll talk about that in a later post).

    So my problem is this: In order to improve the performance of the application, I can use my Performance Statistics Query in order to find those plans that exhibited the most reads and see if they can be tuned via either the addition of an index or via a recommendation of rewriting the query a different way.

    But unfortunately all I get is that parameterized version of the query text. I have no indication what the ACTUAL ad-hoc queries were that were sent to the system. Yes, I can see the text of the ad-hoc queries via the sys.dm_exec_cached_plans DMV, but how do I find the exact ones that have to do with this parameterized query? It’s like looking for a needle in a haystack.

    So I wrote a procedure to do that for me. It essentially takes the parameterized query text and creates a LIKE pattern out of it so that I can find its original ad-hoc code via the plan_handle of the sys.dm_exec_cached_plans DMV.

    I create this procedure in a database called DBATools (which houses other utilities and data that I use for monitoring the server). It starts like so, receiving the text of the parameterized query code and putting it into a local @Text variable:

    use DBATools
    go
     
    create procedure GetAdhocQueries
       @ParameterizedQuery nvarchar(max)
    as
     
    declare @Text nvarchar(max)
    set @Text=@ParameterizedQuery
    
    Its first step is to get rid of the parameter declarations at the beginning of the query. Well, luckily, all these parameterized queries are of the pattern “(@0 … )xxxxx”, where “xxxxx” is one of the following: SELECT or UPDATE or INSERT or DELETE or MERGE or WITH (in the case of a CTE). So let’s use that knowledge to strip off the parameter list:

    /*Get rid of parameter list*/
    select @Text=substring(@Text,isnull(min(Pos),0)+1,len(@Text))
    from (select charindex(N')select ',@Text)
          union all
          select charindex(N')insert ',@Text)
          union all
          select charindex(N')update ',@Text)
          union all
          select charindex(N')delete ',@Text)
          union all
          select charindex(N')merge ',@Text)
          union all
          select charindex(N')with ',@Text)) x(Pos)
    where Pos>0
    
    So that makes our @Text variable looks like the following, stripped of its parameter list (again, I’m artificially word-wrapping the text here… it’s just one loonng string):

    select soh . SalesOrderNumber , soh . ShipDate , 
    soh . CustomerID , s . Name , soh . TotalDue 
    from Sales . SalesOrderHeader soh join Person . Address a on 
    soh . ShipToAddressID = a . AddressID join Person . StateProvince sp 
    on a . StateProvinceID = sp . StateProvinceID join Sales . Store s 
    on soh . CustomerID = s . CustomerID where soh . SalesOrderID in 
    ( select SalesOrderID from Sales . SalesOrderDetail where ProductID 
    = @0 ) and soh . ShipDate > = @1 and soh . ShipDate < @2 and 
    a . City = @3 and sp . Name = @4
    
    Next I need to find all the parameters in the text. They are all surrounded with a single space, so they are easy to find because they have the pattern “ @[0-9] ” (for a single digit parameter) or “ @[0-9][0-9] ” (for a two-digit parameter), etc. So the following will find all 4-digit, 3-digit, 2-digit, and 1-digit parameters in the text and substitute them with a single percent sign (%):

    /*Substitute all parameters (i.e. @0 or @1 or ...) with Percents (%)*/
    declare @NumDigs tinyint
           ,@Pos int
    set @NumDigs=5
    while @NumDigs>1
    begin
      set @NumDigs=@NumDigs-1
    while 1=1
    begin  --Continue substituting until there are no more
        set @Pos=patindex(N'% @'+replicate(N'[0-9]',@NumDigs)+N' %',@Text)
        if @Pos=0 break  --No more... we're done with the loop
        set @Text=stuff(@Text,@Pos,3+@NumDigs,N' % ')
      end
    end
    
    And now our @Text variable looks like this:

    select soh . SalesOrderNumber , soh . ShipDate , 
    soh . CustomerID , s . Name , soh . TotalDue 
    from Sales . SalesOrderHeader soh join Person . Address a on 
    soh . ShipToAddressID = a . AddressID join Person . StateProvince sp 
    on a . StateProvinceID = sp . StateProvinceID join Sales . Store s 
    on soh . CustomerID = s . CustomerID where soh . SalesOrderID in 
    ( select SalesOrderID from Sales . SalesOrderDetail where ProductID 
    = % ) and soh . ShipDate > = % and soh . ShipDate < % and 
    a . City = % and sp . Name = %
    
    Now, I need to take care of those spaces that had been inserted by the optimizer between every operator and keyword in the query. So I replace all those spaces with percent signs. At the same time, I also replace any normal pattern-matching characters (like a carat or left square bracket or right square bracket) with percent signs. I don’t want them screwing up my final query into the plan cache. Finally, I place a percent sign at the beginning and the end of the text:

    /*Replace all spaces and pattern characters (i.e. "[", "]", "^") with Percents (%)*/
    /*Also place a Percent at the beginning and the end*/
    set @Text=N'%'
             +replace(replace(replace(replace(@Text
    ,N' '
                                             ,N'%')
                                     ,N'['
                                     ,N'%')
                             ,N']'
                             ,N'%')
                     ,N'^'
                     ,N'%')
             +N'%'
    
    At this point our @Text variable looks like this:

    %select%soh%.%SalesOrderNumber%,%soh%.%ShipDate%,%
    soh%.%CustomerID%,%s%.%Name%,%soh%.%TotalDue%
    from%Sales%.%SalesOrderHeader%soh%join%Person%.%Address%a%on%
    soh%.%ShipToAddressID%=%a%.%AddressID%join%Person%.%StateProvince%sp%
    on%a%.%StateProvinceID%=%sp%.%StateProvinceID%join%Sales%.%Store%s%
    on%soh%.%CustomerID%=%s%.%CustomerID%where%soh%.%SalesOrderID%in%
    (%select%SalesOrderID%from%Sales%.%SalesOrderDetail%where%ProductID%
    =%%%)%and%soh%.%ShipDate%>%=%%%and%soh%.%ShipDate%<%%%and%
    a%.%City%=%%%and%sp%.%Name%=%%%
    
    Just to be neat and tidy, I get rid of all the multiple percent signs and replace them with a single one:

    /*Transform all multiple Percents into single ones*/
    set @Text=replace(@Text,N'%%%',N'%%')
    set @Text=replace(@Text,N'%%',N'%')
    
    And that results in this:

    %select%soh%.%SalesOrderNumber%,%soh%.%ShipDate%,%
    soh%.%CustomerID%,%s%.%Name%,%soh%.%TotalDue%
    from%Sales%.%SalesOrderHeader%soh%join%Person%.%Address%a%on%
    soh%.%ShipToAddressID%=%a%.%AddressID%join%Person%.%StateProvince%sp%
    on%a%.%StateProvinceID%=%sp%.%StateProvinceID%join%Sales%.%Store%s%
    on%soh%.%CustomerID%=%s%.%CustomerID%where%soh%.%SalesOrderID%in%
    (%select%SalesOrderID%from%Sales%.%SalesOrderDetail%where%ProductID%
    =%)%and%soh%.%ShipDate%>%=%and%soh%.%ShipDate%<%and%
    a%.%City%=%and%sp%.%Name%=%
    
    Finally, since LIKE patterns are not allowed to be more than 4000 characters, I truncate the @Text variable to that length if needed:

    /*Truncate to 4000 characters max*/
    if len(@Text)>4000 set @Text=left(@Text,3999)+N'%' 
    
    So finally, at this point, we can now find all the ad-hoc queries in the cache whose text is LIKE our @Text variable:

    /*Find our Adhoc queries that match the pattern*/
    select Code=(select [processing-instruction(q)]=N':'+nchar(13)+qt.text+nchar(13)
                 for xml path(''),type)
          ,DB=db_name(QueryDatabaseID)
          ,[#Bytes]=cp.size_in_bytes
    ,cp.usecounts 
    from sys.dm_exec_cached_plans cp
    cross apply (select QueryDatabaseID=convert(int,value)
                 from sys.dm_exec_plan_attributes(cp.plan_handle)
                 where attribute='dbid') F_DB
    cross apply sys.dm_exec_sql_text(cp.plan_handle) qt
    where objtype='Adhoc'
      and qt.text like @Text
    order by cp.usecounts desc
    
    Here is the result:



    And when we click on one of the hyperlinks, we can see the original ad-hoc query in all its glory, just as it was originally submitted.

    <?q :
    SELECT soh.SalesOrderNumber
          ,soh.ShipDate 
          ,soh.CustomerID
          ,s.Name
          ,soh.TotalDue
    FROM Sales.SalesOrderHeader soh 
    JOIN Person.Address a ON soh.ShipToAddressID=a.AddressID 
    JOIN Person.StateProvince sp ON a.StateProvinceID=sp.StateProvinceID 
    JOIN Sales.Store s ON soh.CustomerID=s.CustomerID 
    WHERE soh.SalesOrderID IN (SELECT SalesOrderID 
                               FROM Sales.SalesOrderDetail 
                               WHERE ProductID=870)
      AND soh.ShipDate>='20030101'
      AND soh.ShipDate<'20040101'
      AND a.City='London'
      AND sp.Name='England'
    
    ?>
    
    So, to take advantage of this new procedure, I just incorporate it into my original Performance Statistics Query, like so:

    select Code=(select [processing-instruction(q)]=':'+nchar(13)+qt.text+nchar(13)
                 for xml path(''),type)
          ,qp.Query_Plan
    ,[#Reads]=qs.total_logical_reads
    ,[#Execs]=qs.execution_count
    ,AdHocStmts
    from sys.dm_exec_query_stats qs
    cross apply sys.dm_exec_sql_text(plan_handle) qt
    cross apply sys.dm_exec_query_plan(plan_handle) qp
    cross apply
      (select AdHocStmts=case
                           when qt.text like '(@0 %'
                           then N'exec DBATools.dbo.GetAdhocQueries '''
                               +replace(qt.text,N'''',N'''''')+N' '''
                           else null
                          end) F_Adhoc
    where qt.text not like '%sys.dm\_%' escape '\'
    order by [#Reads] desc
    
    In short, if the query text (in sys.dm_exec_sql_text) starts with a “(@0 “, I know it’s a parameterized query, so I will have the AdHocStmts column populated with the text to call my GetAdhocQueries procedure. When I execute the above query, I get the following result with the new AdHocStmts column:



    And I can copy/paste the AdHocStmts column into a query window and execute it (again, I only word-wrap it here for clarity… it’s actually a looonng single line of text).

    exec DBATools.dbo.GetAdhocQueries '(@0 int,@1 varchar(8000),@2 varchar(8000),
    @3 varchar(8000),@4 varchar(8000))select soh . SalesOrderNumber , soh . 
    ShipDate , soh . CustomerID , s . Name , soh . TotalDue from Sales . 
    SalesOrderHeader soh join Person . Address a on soh . ShipToAddressID = 
    a . AddressID join Person . StateProvince sp on a . StateProvinceID = 
    sp . StateProvinceID join Sales . Store s on soh . CustomerID = s . 
    CustomerID where soh . SalesOrderID in ( select SalesOrderID from 
    Sales . SalesOrderDetail where ProductID = @0 ) and soh . ShipDate 
    > = @1 and soh . ShipDate < @2 and a . City = @3 and sp . Name = @4 '
    
    And that will find the ad-hoc queries that are associated with that particular parameterized query.

    So this GetAdhocQueries procedure allows me to find the ad-hoc query needles in the plan cache haystack. Now if I only had a way to find those $#&@ queries in the C# code as easily… oh well, that’s somebody else’s problem.
  3. T-SQL Tuesday #023: Flip Side of the JOIN

    T-SQL TuesdayUnbelievable… It’s been almost 5 months since I last posted something here. I had a lot going on since that last post: A daughter graduating from college, a son graduating from high school, a few family vacation trips, moving my son into college, and between all of that, I was juggling an overwhelming amount of work from 3 demanding clients (and still am, quite frankly).

    But I’m back now, ready to finally re-JOIN the SQL blogging world once again.

    And that is very apt, because this post is part of the October T-SQL Tuesday on the subject of JOINs, hosted by Stuart Ainsworth.

    So, let’s not waste any time… Let’s plunge in…

    We are all aware of the various JOINs that are available to us in the T-SQL syntax: INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, and CROSS JOIN. But what I wanted to talk about today are two other kinds of JOINs: The (LEFT or RIGHT) Semi JOIN and the (LEFT or RIGHT) Anti Semi JOIN.

    These are not available to us directly in the syntactical sense, but they are employed in a Query Plan when you use certain types of queries.

    A LEFT Semi JOIN returns rows from the left side that have at least one matching row on the right side. At first glance, this seems very much like a regular INNER JOIN, except for one thing. With a LEFT Semi JOIN, the rows from the left table will be returned at most once. Even if the table on the right side contains hundreds of matches for the row on the left side, only one copy of the left-hand row will be returned.

    For example, let’s find Handlebar Products (SubCategory=4) in the Products table that appeared on at least one Order. If we use a regular INNER JOIN…

    select p.ProductID
    from Production.Product p
    join Sales.SalesOrderDetail sod on p.ProductID=sod.ProductID 
    where ProductSubcategoryID=4
    /*
    ProductID
    ---------
          808
          808
          808
        . . .
          809
          809
          809
        . . .
          947
          947
          947
    (1531 row(s) affected)
    */
    
    …We get tons of duplicate Product IDs. The (actual) execution plan for the above query looks like so:



    Note that the data flow arrow from the Products table showed that 8 rows were processed, and the total rows that were requested by the Nested Loops operator from the SalesOrderDetail Index Seek for those 8 product rows were 1531.

    In order to eliminate all the duplicates, we have to introduce a DISTINCT to the query:

    select distinct p.ProductID
    from Production.Product p
    join Sales.SalesOrderDetail sod on p.ProductID=sod.ProductID 
    where ProductSubcategoryID=4
    /*
    ProductID
    ---------
          808
          809
          810
          811
          813
          946
          947
    */
    
    Interestingly enough, if we look at the actual execution plan for that query, you would probably expect to find the same plan as before except with an Aggregate operator at the top of the tree to eliminate the duplicates and shrink the 1531 rows down to 7, but instead, this is what we find:



    Note that the Nested Loops operator is a LEFT Semi JOIN. Somehow the optimizer is “smart enough” to realize (because of our INNER JOIN and the DISTINCT and our result set only consisting of columns from one side of the JOIN) that it would be more expensive to do the full INNER JOIN and then eliminate the duplicates, and so it employed a LEFT Semi JOIN. This is much more efficient because the Nested Loops operator only needs to request a single row from the SalesOrderDetail Index Seek operator for each Product processed… If a single row exists in SalesOrderDetails, then it can release the Product row to the Select operator for output. If no row exists, then it tosses out the Product row and moves on. If you hover over the data flow arrow coming out of the Index Seek, you’ll see that only 7 rows were passed along (because one of the 8 Product rows did not have a match).

    I don’t know about you, but the hairs on the back of my neck stand up whenever I see a DISTINCT in a query. This same solution could be employed (more clearly in my opinion) via three other types of queries, all of which (by nature) involve a Semi JOIN… an INTERSECT query or an EXISTS query or an IN query:

    select ProductID
    from Production.Product
    where ProductSubcategoryID=4
    intersect
    select ProductID
    from Sales.SalesOrderDetail 
    
    select ProductID
    from Production.Product p
    where ProductSubcategoryID=4
    and exists (select * 
    from Sales.SalesOrderDetail 
    where ProductID=p.ProductID)
    
    select ProductID
    from Production.Product
    where ProductSubcategoryID=4
    and ProductID in (select ProductID
    from Sales.SalesOrderDetail)
    
    All three of the above queries produce the exact same plan, which is the very efficient Semi JOIN plan that we just examined.

    It’s really a matter of style as to which approach that you use. I prefer EXISTS or IN. The INTERSECT operator is kind of cool, but it is very limiting. For example, let’s say we wanted the result set to include the Name of the Product as well. In the EXISTS and IN queries (and for that matter in our original INNER JOIN/DISTINCT query), we simply add the Name column to the SELECT clause and we’re done… And the query plan would remain unchanged except for the fact that an extra column will come from the Product table.

    But with the INTERSECT query, we have to introduce the Name column to both sides of the INTERSECT, meaning that we have to add an extra JOIN to get the Name:

    select ProductID
    ,Name
    from Production.Product
    where ProductSubcategoryID=4
    intersect
    select sod.ProductID
    ,p.Name
    from Sales.SalesOrderDetail sod
    join Production.Product p on sod.ProductID=p.ProductID 
    /*
    ProductID Name
    --------- ----------------------
          808 LL Mountain Handlebars
          809 ML Mountain Handlebars
          810 HL Mountain Handlebars
          811 LL Road Handlebars
          813 HL Road Handlebars
          946 LL Touring Handlebars
          947 HL Touring Handlebars
    */
    
    And the execution plan now involves a lot more work:



    We could also try to re-work the INTERSECT query using a CTE or a derived table to just get the ProductID’s and then JOIN the result to the Products table to get the Name column like so…

    with ProductsInOrders as
    (
      select ProductID
    from Production.Product
    where ProductSubcategoryID=4
    intersect
      select ProductID
    from Sales.SalesOrderDetail 
    )
    select pio.ProductID
    ,p.Name
    from ProductsInOrders pio
    join Production.Product p on pio.ProductID=p.ProductID 
    /*
    ProductID Name
    --------- ----------------------
          808 LL Mountain Handlebars
          809 ML Mountain Handlebars
          810 HL Mountain Handlebars
          811 LL Road Handlebars
          813 HL Road Handlebars
          946 LL Touring Handlebars
          947 HL Touring Handlebars
    */
    
    …But we still can’t get around the fact that we have to access the Products table multiple times. In fact, the execution plan for the above query is quite amusing when you look at it:



    For each Product row acquired in the Clustered Index Scan, it does a SEEK into the same Clustered Index to get the Name! What a waste of resources.

    Now on to Anti Semi JOINs…

    A LEFT Anti Semi JOIN returns rows from the left side that have no matching rows on the right side… It’s the exact opposite of the Semi JOIN. As you can probably guess, this kind of JOIN is employed when you execute a query using EXCEPT or NOT EXISTS or NOT IN:

    select ProductID
    from Production.Product
    where ProductSubcategoryID=4
    except
    select ProductID
    from Sales.SalesOrderDetail 
    /*
    ProductID
    ---------
          812
    */
    
    select ProductID
    from Production.Product p
    where ProductSubcategoryID=4
    and not exists (select * 
    from Sales.SalesOrderDetail 
    where ProductID=p.ProductID)
    /*
    ProductID
    ---------
          812
    */
    
    select ProductID
    from Production.Product
    where ProductSubcategoryID=4
    and ProductID not in (select ProductID
    from Sales.SalesOrderDetail)
    /*
    ProductID
    ---------
          812
    */
    
    All three of the above queries produce the exact same execution plan using a LEFT Anti Semi JOIN:



    So, for each of the 8 Product rows, the Nested Loops operator requests a row from the SalesOrderDetail table. If one exists, then it tosses the Product row aside and moves on. If one does not exist, then it releases the Product row up to the Select operator for output.

    The EXCEPT operator has the same limitations as was described for the INTERSECT operator and therefore is not as useful as the NOT EXISTS or NOT IN types of queries.

    One important note about NOT IN. It is only equivalent to the NOT EXISTS query if the column being checked is non-nullable. If the ProductID in Sales.SalesOrderDetail allowed NULLs, then the NOT IN query plan would look like this:



    There’s a lot of logic employed in the plan to handle the fact that there may be NULLs in SalesOrderDetail. We can return back to our more simplified query, however, by adding a WHERE IS NOT NULL predicate to our IN subquery:

    select ProductID
    from Production.Product
    where ProductSubcategoryID=4
    and ProductID not in (select ProductID
    from Sales.SalesOrderDetail
    where ProductID is not null)
    
    So if you prefer the NOT IN style over the NOT EXISTS style of querying, it’s a good idea to get in the habit of including a WHERE IS NOT NULL predicate to the subquery.

    By the way, many people in the past have tried to emulate the Anti Semi JOIN behavior by doing a LEFT JOIN and adding a WHERE IS NULL to the query to only find rows that have no match on the right side, like so:

    select p.ProductID
    from Production.Product p
    left join Sales.SalesOrderDetail sod on p.ProductID=sod.ProductID 
    where ProductSubcategoryID=4
    and sod.ProductID is null
    
    But this actually produces a plan that LEFT JOINs everything and then employs a Filter to only allow the IS NULL non-matches, creating a lot of unnecessary work and a much more inefficient query than the true Anti Semi JOIN:



    I’ve seen people in the past saying that the LEFT JOIN/IS NULL approach is faster than the NOT EXISTS approach, but frankly, I can’t see it. If anyone has an example to offer, I’d certainly like to take a look.
  4. CTE: Coolest T-SQL Enhancement

    T-SQL TuesdayThis blog entry is participating in T-SQL Tuesday #018, hosted this month by Bob Pusateri.

    You are invited to visit his blog and join the party and read more blogs participating in this month’s theme: CTEs.

    CTE stands for Common Table Expression, but it should stand for Coolest T-SQL Enhancement. In fact, this T-SQL Tuesday topic is a perfect followup to last month’s topic of APPLY, because APPLY and CTEs (and Window Functions) were all added in SQL2005 as new indispensable enhancements to the language.

    The APPLY Operator is an incredibly versatile tool that helps you to create “functions on-the-fly” and do column manipulations. Similarly, a CTE is a tool that helps you create “views on-the-fly” and do row manipulations.

    There are many things that you can do with CTEs that you could do with APPLY as well; however, what sets CTEs apart are that they can be recursive and that they can be re-used (in the same query). You can’t re-use an APPLY operation.

    I’m sure there will be many posts about recursion this month, but I did a big treatise on that subject last year, so I’m going to focus on a couple of other cool features of CTEs.

    Column Aliasing

    One aspect of CTEs that are not used that often (and that surprisingly not many people even know about) is the fact that you can provide column aliases in its definition.

    For example, let’s say you have a simple CTE query like the following using the NorthWind database (I’m taking a break from AdventureWorks):

    with MadridCusts as
    (
      select ID=CustomerID
    ,Company=CompanyName
    ,Contact=ContactName
    ,Phone
    from Customers
    where City='Madrid'
    )
    select ID,Company,Contact,Phone
    from MadridCusts
    /*
    ID    Company                              Contact          Phone
    ----- ------------------------------------ ---------------- --------------
    BOLID Bolido Comidas preparadas            Martin Sommer    (91) 555 22 82
    FISSA FISSA Fabrica Inter. Salchichas S.A. Diego Roel       (91) 555 94 44
    ROMEY Romero y tomillo                     Alejandra Camino (91) 745 6200
    */
    
    As you can see, I’m defining new column aliases for each of the columns (except Phone) within the CTE. But in the WITH clause, I can override those columns aliases with different aliases if I want to. (In fact, you didn’t hear it here, but this is a hilarious trick you can play on your colleagues when they move away from their desk for a few moments):

    with MadridCusts(Phone,ID,Company,Contact) as
    (
      select ID=CustomerID
    ,Company=CompanyName
    ,Contact=ContactName
    ,Phone
    from Customers
    where City='Madrid'
    )
    select ID,Company,Contact,Phone
    from MadridCusts
    /*
    ID                                   Company          Contact         Phone
    ------------------------------------ ---------------- --------------- -----
    Bolido Comidas preparadas            Martin Sommer    (91) 555 22 82  BOLID
    FISSA Fabrica Inter. Salchichas S.A. Diego Roel       (91) 555 94 44  FISSA
    Romero y tomillo                     Alejandra Camino (91) 745 6200   ROMEY
    */
    
    Diabolical, isn’t it?

    Multiple CTEs Building Upon Each Other

    You can define several CTEs that build upon each other. In fact, last year, I wrote a blog post about a query that plays Poker. Via about a dozen CTEs, it creates the deck of cards, deals them out to 10 people, evaluates the hands, and shows the winner… all in one single query.

    The following is a ridiculously simple (and useless) example of CTEs building upon each other:

    with USACustomers as
    (
      select CustomerID,CompanyName,ContactName,Region,City,Phone,PostalCode
    from Customers
    where Country='USA'
    )
    ,USACustomersInOregon as
    (
      select *
      from USACustomers
    where Region='OR'
    )
    ,OregonCustomersInPortland as
    (
      select *
      from USACustomersInOregon
    where City='Portland'
    )
    select * from OregonCustomersInPortland
    
    Each CTE uses the result of the previous to continue to narrow down the result set. But since each CTE is treated as a view, the query optimizer is able to “push” the predicates into a single one. It is exactly the same in every aspect as the following query:

    select CustomerID,CompanyName,ContactName,Region,City,Phone,PostalCode
    from Customers
    where Country='USA'
      and Region='OR'
      and City='Portland'
    

    Updating and Deleting

    Another aspect of CTEs that many people don’t realize is that they are updatable.

    Let’s make a temporary copy of the Customers table and Orders table add a new column to them called UpdateColumn:

    if object_id('tempdb..#Custs','U') is not null drop table #Custs
    select *, cast(null as char(1)) as UpdateColumn
    into #Custs
    from Customers
    if object_id('tempdb..#Orders','U') is not null drop table #Orders
    select *, cast(null as char(1)) as UpdateColumn
    into #Orders
    from Orders
    
    Now we can do our same OregonCustomersInPortland CTE query, but this time we will update that new UpdateColumn with an ‘X’ for each of those Portland Customers. Note that we have to introduce the UpdateColumn to the SELECT list of the initial query:

    with USACustomers as
    (
      select CustomerID,CompanyName,ContactName,Region,City,Phone,PostalCode,UpdateColumn
    from #Custs
    where Country='USA'
    )
    ,USACustomersInOregon as
    (
      select *
      from USACustomers
    where Region='OR'
    )
    ,OregonCustomersInPortland as
    (
      select *
      from USACustomersInOregon
    where City='Portland'
    )
    update OregonCustomersInPortland 
    set UpdateColumn='X'
    /*
    (2 row(s) affected)
    */
    
    As long as the column(s) you want to update are defined in the CTE, you can update it. In fact, you can introduce the UpdateColumn multiple times within the CTEs and you can update any one of them and it will work fine. (Note that you can’t update more than one of them, because SQL will not allow you to update a column more than once:

    with USACustomers as
    (
      select CustomerID,CompanyName,ContactName,Region,City,Phone,PostalCode,UpdateColumn
    from #Custs
    where Country='USA'
    )
    ,USACustomersInOregon as
    (
      select *,UpdateColumnAgain=UpdateColumn
    from USACustomers
    where Region='OR'
    )
    ,OregonCustomersInPortland as
    (
      select *,UpdateColumnStillAgain=UpdateColumn
    from USACustomersInOregon
    where City='Portland'
    )
    update OregonCustomersInPortland 
    set UpdateColumnStillAgain='X'
    /*
    (2 row(s) affected)
    */
    
    Using CTEs to UPDATE or DELETE can perhaps make things a little clearer when you’re JOINing in other tables or involving more complicated query methods. For example, let’s say Howard Snyder, a contact for some Customer in Northwind, calls us and wants to change his last open order so that it ships via Speedy Express rather than the usual Federal Shipping.

    So we can use a CTE to find open orders belonging to a customer whose contact is named Howard Snyder that are set up for Federal Shipping, get the most recent one (based on Order Date) and update its Shipping Method to Speedy Express instead:

    with HowardSnyderOpenOrdersViaFedShipping as
    (
      select o.ShipVia
    ,RowNum=row_number() over (order by o.OrderDate desc)
      from #Orders o
    join Customers c on o.CustomerID=c.CustomerID
    join Shippers s on o.ShipVia=s.ShipperID 
    where o.ShippedDate is null  --Open Orders
        and c.ContactName='Howard Snyder'
        and s.CompanyName='Federal Shipping'
    )
    update HowardSnyderOpenOrdersViaFedShipping 
    set ShipVia=(select ShipperID 
    from Shippers 
    where CompanyName='Speedy Express')
    where RowNum=1  --Only most recent order
    /*
    (1 row(s) affected)
    */
    
    Coo-ul, huh?

    Generating Numbers

    This has been seen in hundreds of blogs and books, but I still marvel at the following method of generating a table of numbers (from 1 to 1,000,000). It’s elegant and brief and fast:

    with 
      L0(c) as (select 0 from (values (0),(0),(0)) f(c)) --3 Rows
     ,L1(c) as (select 0 from L0 a,L0 b,L0 c)            --27 Rows (3x3x3)
     ,L2(c) as (select 0 from L1 a,L1 b,L1 c)            --19683 Rows (27x27x27)
     ,L3(c) as (select 0 from L2 a,L2 b)                 --387,420,489 Rows (19683x19683)
     ,NN(n) as (select row_number() over (order by (select 0)) from L3)
    select n into #Nums from NN where n<=1000000
    
    That’s just too cool. Note that because of the VALUES row constructor syntax, the above will only work in SQL2008. To make it work in SQL2005 or earlier, just change the first CTE to use UNION ALLs instead:

    with 
      L0(c) as (select 0 union all select 0 union all select 0) --3 Rows
    

    Step-By-Step Clarity

    Like the APPLY operator, CTEs are terrific for step-by-step self-documentation. They make your code easier to follow and understand what’s going on.

    Here’s an example of seeing whether the current date/time falls within the Daylight Savings Time (as defined by the United States, unless you live in Hawaii or Arizona). Again, this will only work in SQL2008… Change the VALUES row constructor syntax to UNION ALL syntax to make it work in SQL2005:

    with 
      ZeroThruSix as
    (
           select N from (values (0),(1),(2),(3),(4),(5),(6)) F(N)
        )
     ,FirstDayInMarch(FirstDayInMarch) as
    (
           select convert(datetime,str(year(getdate()),4)+'0301 02:00')
        )
     ,FirstSevenDaysInMarch(MarchDate) as
    (
           select dateadd(day,N,FirstDayInMarch)
           from FirstDayInMarch
    cross join ZeroThruSix
    )
     ,SecondSundayInMarch(SecondSundayInMarch) as 
    (
           select dateadd(day,7,MarchDate)
           from FirstSevenDaysInMarch
    where datename(weekday,MarchDate)='Sunday'
        )
     ,FirstDayInNovember(FirstDayInNovember) as
    (
           select convert(datetime,str(year(getdate()),4)+'1101 02:00')
        )
     ,FirstSevenDaysInNovember(NovemberDate) as
    (
           select dateadd(day,N,FirstDayInNovember)
           from FirstDayInNovember 
    cross join ZeroThruSix 
    )
     ,FirstSundayInNovember(FirstSundayInNovember) as
    (
           select NovemberDate
    from FirstSevenDaysInNovember 
    where datename(weekday,NovemberDate)='Sunday'
        )
    select IsDST=convert(bit,case
                               when getdate()>=SecondSundayInMarch
    and getdate()<FirstSundayInNovember 
    then 1
    else 0
    end)
    from SecondSundayInMarch
    cross join FirstSundayInNovember 
    
    Cute, huh? Note how the ZeroThruSix CTE was used more than once? That’s the reusability I was mentioning earlier. You can also see how I incorporated column aliasing in most of the CTEs as well.

    You can follow what’s going on step-by-step in the above query. Before SQL2005 and CTEs you would have to write the query like so:

    select IsDST=convert(bit,case
                               when getdate()>=SecondSundayInMarch
    and getdate()<FirstSundayInNovember 
    then 1
    else 0
    end)
    from 
      (
        select dateadd(day,N+7,convert(datetime,str(year(getdate()),4)+'0301 02:00'))
        from (select 0 union all select 1 union all select 2 union all select 3
    union all select 4 union all select 5 union all select 6) F(N)
        where datename(weekday
                      ,dateadd(day,N,convert(datetime,str(year(getdate()),4)+'0301 02:00')))
              ='Sunday'
      ) SecondSundayInMarch(SecondSundayInMarch)
    cross join
      (
        select dateadd(day,N,convert(datetime,str(year(getdate()),4)+'1101 02:00'))
        from (select 0 union all select 1 union all select 2 union all select 3
    union all select 4 union all select 5 union all select 6) F(N)
        where datename(weekday
                      ,dateadd(day,N,convert(datetime,str(year(getdate()),4)+'1101 02:00')))
              ='Sunday' 
    ) FirstSundayInNovember(FirstSundayInNovember)
    
    Not quite as clear, is it? But both queries produce the exact same query plan.

    I hope you’ve enjoyed these CTE examples and appreciate their power. I couldn’t live without them.
  5. The Index Tuning Detective

    A Shocking Revelation!Part of tuning queries is being a detective in figuring out what indexes need to be created… and figuring out what indexes may need to be tweaked a bit.

    So to all you Nancy Drews and Hardy Boys out there: Get your magnifying glass and let’s unravel some mysteries!

    Our first adventure will be The Case Of The Missing Indexes.

    This has been talked about before in other blogs, because SQL Server already provides some tools to find missing indexes, but hopefully I’ll go just a little bit farther with the concept.

    But it’s the second adventure, The Case Of The Key Lookup Killer, that I’m looking forward to sharing with you.

    It may help you uncover some shocking revelations about your queries.

    Anyway, please read on…



    The Case of the Missing Indexes

    Consider the following query in AdventureWorks:

    select SalesOrderID,OrderDate 
    from Sales.SalesOrderHeader 
    where PurchaseOrderNumber is not null
    
    When we look at the Estimated Execution Plan for this query in SQL2008, we get a helpful hint about a missing index. It says that we can cut down the cost of the query by 94.2762% if we were to add the index that it suggests. And we can even right-click on the plan and choose Missing Index Details and it will provide a code window with the code necessary to create that index:

    SQL2008 Showing Missing Index

    /*
    Missing Index Details from SQLQuery6.sql-BRADPC\SQL08.AdventureWorks (BRADPC\Brad (53))
    The Query Processor estimates that implementing the following index 
    could improve the query cost by 94.2762%.
    */
    
    /*
    USE [AdventureWorks]
    GO
    CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
    ON [Sales].[SalesOrderHeader] ([PurchaseOrderNumber])
    INCLUDE ([SalesOrderID],[OrderDate])
    GO
    */
    
    Of course, the index is not a perfect suggestion because it suggests we INCLUDE the SalesOrderID column, which is ridiculous because SalesOrderID is the Clustered Index Key, so it would be part of the index automatically anyway.

    But never mind that… This is still pretty cool stuff.

    And it’s not limited to SQL2008 either. Yes, SQL2008 will provide the helpful hint when you look at the Estimated Plan, but SQL2005 still has the same information behind the scenes. If you look at the Execution Plan XML (by right-clicking on the Plan and choosing Show Execution Plan XML), you will find the Missing Index information buried in there… usually towards the top, but not always… there can potentially be many Missing Index sections in the XML if the plan is for a multiple-statement batch (particularly a batch with IF conditions).

    /*
    <?xml version="1.0" encoding="utf-16"?>
    <ShowPlanXML ... xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
    ...
    <MissingIndexes>
      <MissingIndexGroup Impact="94.2762">
      <MissingIndex Database="[AdventureWorks]" Schema="[Sales]" Table="[SalesOrderHeader]">
         <ColumnGroup Usage="INEQUALITY">
            <Column Name="[PurchaseOrderNumber]" ColumnId="9" />
         </ColumnGroup>
         <ColumnGroup Usage="INCLUDE">
            <Column Name="[SalesOrderID]" ColumnId="1" />
            <Column Name="[OrderDate]" ColumnId="3" />
         </ColumnGroup>
       </MissingIndex>
      </MissingIndexGroup>
    </MissingIndexes>
    ...
    </ShowPlanXML>
    */
    
    This is great if you happen to be looking at a specific plan, but what if you want to know about missing indexes across many plans?

    There is a collection of DMV’s that report missing indexes for queries that have been executed on the server. The following is a query used by many DBA’s in order to find indexes that they can potentially add in order to speed up their queries:

    select index_advantage=user_seeks*avg_total_user_cost*(avg_user_impact*0.01)
          ,migs.last_user_seek
    ,TableName=mid.statement
          ,mid.equality_columns
    ,mid.inequality_columns 
    ,mid.included_columns 
    ,migs.unique_compiles 
    ,migs.user_seeks 
    ,migs.avg_total_user_cost 
    ,migs.avg_user_impact 
    from sys.dm_db_missing_index_group_stats migs with (nolock)
    join sys.dm_db_missing_index_groups mig with (nolock) 
    on migs.group_handle=mig.index_group_handle 
    join sys.dm_db_missing_index_details mid with (nolock) 
    on mig.index_handle=mid.index_handle 
    order by index_advantage desc 
    /*
     index_advantage last_user_seek          TableName                                  
    ---------------- ----------------------- -------------------------------------------
    3.23713751033778 2011-04-29 08:58:11.960 [AdventureWorks].[Sales].[SalesOrderHeader]
    
    equality_columns inequality_columns   included_columns            
    ---------------- -------------------- --------------------------- 
    NULL             [PurchaseOrderNumber [SalesOrderID], [OrderDate] 
    
    unique_compiles user_seeks avg_total_user_cost avg_user_impact
    --------------- ---------- ------------------- ---------------
                  1          6   0.572255959259259           94.28
    */
    
    You can see our familiar Impact figure of 94.28% in the last column, but since the DMV’s record the number of seeks that could have potentially been performed (based on the number of times the query was executed on the server), this query calculates a theoretical “advantage” figure using that information, and orders the data in descending order of that “advantage”.

    Again, this is really cool that this information is recorded and we can effortlessly find out suggestions for indexes in our databases.

    But these indexes shouldn’t be created blindly… they should be created after some consideration of how they will impact the system. If you know your database backwards and forwards, you probably have a good idea of what kind of queries would benefit from the creation of these indexes. But if you’re a consultant coming in cold, you don’t really know much about the queries in the system.

    However, all is not lost!

    We can look directly inside the query cache (via the sys.dm_exec_cached_plans DMV), hunting for queries that are running right now (or have run recently), and pick out those that have missing index suggestions in their Execution XML data. This can be used in concert with the Missing Index DMV query above to gain more knowledge about the queries that will benefit.

    I put together the following query to do just that. The comments within the code should help you to figure out what is going on. Use the XML snippet shown earlier to follow how the code hunts for what it wants.

    Note that I only wanted to pay attention to cached queries that had been used at least 5 times. I also wanted to limit the number of columns making up the key to 5 and the number of INCLUDEd columns to 5. I also only wanted to pay attention to Missing Indexes that have an Impact of at least 50%. You can fiddle with these quantities in the WHERE clause to match your needs.

    The query provides the Key Column list and the INCLUDE Column List and the command you can use to actually create the index if you wish. It also provides a hyperlink to the code of the batch or procedure that would benefit from the index so you can see the query involved. There is also a hyperlink to the Execution Plan so you can investigate that as well.

    The query is ORDERed BY the Impact figure in descending order, so the ones with the biggest benefit will be at the top.

    with xmlnamespaces 
    (
      default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
    )
    select Impact
    ,TableName=IxDB+'.'+IxSchema+'.'+IxTable
    ,KeyCols
    ,IncludeCols
    ,IndexCommand
    ,usecounts
    ,size_in_bytes
    ,objtype
    ,BatchCode
    ,QueryPlan=qp.query_plan 
    from sys.dm_exec_cached_plans qs 
    cross apply 
    --Get the Query Text
      sys.dm_exec_sql_text(qs.plan_handle) qt             
    cross apply 
    --Get the Query Plan
      sys.dm_exec_query_plan(qs.plan_handle) qp
    cross apply
      --Get the Code for the Batch in Hyperlink Form
      (select BatchCode
    =(select [processing-instruction(q)]=':'+nchar(13)+qt.text+nchar(13)
                  for xml path(''),type)
      ) F_Code
    cross apply
      --Find the Missing Indexes Group Nodes in the Plan
      qp.query_plan.nodes('//MissingIndexes/MissingIndexGroup') F_GrpNodes(GrpNode)
    cross apply 
    --Pull out the Impact Figure
      (select Impact=GrpNode.value('(./@Impact)','float')) F_Impact
    cross apply 
    --Get the Missing Index Nodes from the Group
      GrpNode.nodes('(./MissingIndex)') F_IxNodes(IxNode)
    cross apply 
    --Pull out the Database,Schema,Table of the Missing Index
      (select IxDB=IxNode.value('(./@Database)','sysname')
             ,IxSchema=IxNode.value('(./@Schema)','sysname')
             ,IxTable=IxNode.value('(./@Table)','sysname')
      ) F_IxInfo
    cross apply 
    --How many INCLUDE columns are there;
      --And how many EQUALITY/INEQUALITY columns are there?
      (select NumIncludes
    =IxNode.value('count(./ColumnGroup[@Usage="INCLUDE"]/Column)','int')
             ,NumKeys
    =IxNode.value('count(./ColumnGroup[@Usage!="INCLUDE"]/Column)','int')
      ) F_NumIncl
    cross apply 
    --Pull out the Key Columns and the Include Columns from the various Column Groups
      (select EqCols=max(case when Usage='EQUALITY' then ColList end)
             ,InEqCols=max(case when Usage='INEQUALITY' then ColList end)
             ,IncludeCols=max(case when Usage='INCLUDE' then ColList end)
       from IxNode.nodes('(./ColumnGroup)') F_ColGrp(ColGrpNode)
       cross apply 
    --Pull out the Usage of the Group? (EQUALITY of INEQUALITY or INCLUDE)
         (select Usage=ColGrpNode.value('(./@Usage)','varchar(20)')) F_Usage
    cross apply 
    --Get a comma-delimited list of the Column Names in the Group
         (select ColList=stuff((select ','+ColNode.value('(./@Name)','sysname')
                                from ColGrpNode.nodes('(./Column)') F_ColNodes(ColNode)
                                for xml path(''))
                              ,1,1,'')
         ) F_ColList
    ) F_ColGrps
    cross apply
      --Put together the Equality and InEquality Columns
      (select KeyCols=isnull(EqCols,'')
                     +case 
    when EqCols is not null and InEqCols is not null 
    then ',' 
    else '' 
    end
                     +isnull(InEqCols,'')
      ) F_KeyCols
    cross apply 
    --Construct a CREATE INDEX command
      (select IndexCommand='create index <InsertNameHere> on '
                          +IxDB+'.'+IxSchema+'.'+IxTable+' ('
                          +KeyCols+')'
                          +isnull(' include ('+IncludeCols+')','')) F_Cmd
    where qs.cacheobjtype='Compiled Plan'
      and usecounts>=5    --Only interested in those plans used at least 5 times
      and NumKeys<=5      --Limit to the #columns we're willing to have in the index
      and NumIncludes<=5  --Limit to the #columns we're willing to have in the INCLUDE list
      and Impact>=50      --Only indexes that will have a 50% impact
    order by Impact desc
    
    Here is the output (without the hyperlinks) for the AdventureWorks query we were discussing:

    /*
     Impact TableName                                  
    ------- -------------------------------------------
    94.2762 [AdventureWorks].[Sales].[SalesOrderHeader]
    
    KeyCols               IncludeCols                
    --------------------- -------------------------- 
    [PurchaseOrderNumber] [SalesOrderID],[OrderDate]  
    
    IndexCommand                                                                                           
    ----------------------------------------------------------------------------
    create index <InsertNameHere> on [AdventureWorks].[Sales].[SalesOrderHeader] 
    ([PurchaseOrderNumber]) include ([SalesOrderID],[OrderDate])
    
    usecounts size_in_bytes objtype
    --------- ------------- -------
            9         40960 Adhoc  
    */
    



    The Case of the Key Lookup Killer

    Now on to the part of this article I’m really excited about, because it will help you possibly tweak existing indexes to help improve your queries. This is something that you cannot get from any DMV’s.

    Consider the following query:

    select h.SalesOrderID
    ,h.CustomerID 
    ,h.OrderDate 
    ,d.LineTotal 
    from Sales.SalesOrderHeader h
    join Sales.SalesOrderDetail d on h.SalesOrderID=d.SalesOrderID 
    where h.CustomerID in (117,119,126,196,236,435)
      and d.ProductID=942
    
    This is what its query plan looks like (click on the image to see a larger view):

    Query with Key Lookups

    Note that it is able to use the SalesOrderDetail’s index on ProductID to easily find the rows for ProductID 942. However, for each of those rows, it has to do a Key Lookup into the Clustered Index in order to get the columns for the LineTotal, which is a computed column based on the columns OrderQty, UnitPrice, and UnitPriceDiscount. The LineTotal is calculated by the Compute Scalar operator.

    Similarly, the query makes use of SalesOrderHeader’s index on CustomerID to find the rows for the desired CustomerID’s. But again, it has to do Key Lookups to get the OrderDate.

    What if we could eliminate those Key Lookups? If we were to INCLUDE the OrderQty and UnitPrice and UnitPriceDiscount columns in the ProductID index and INCLUDE the OrderDate column in the CustomerID index, then the query would be covered completely by those indexes. The query plan would then look like this:

    Query Covered by Indexes

    The cost of this query is 0.0128915, which is a 92% improvement over the Key Lookup query, which had a cost of 0.163265.

    I put together a query to look in the cache for queries containing Key Lookups and then pull out the columns that we could potentially INCLUDE in an index in order to improve performance.

    Here’s the section of the Execution Plan XML that involves the Sales.SalesOrderDetail Index Seek and its Key Lookup… I abbreviated it so that only the relevant portions are displayed:

    /*
    <?xml version="1.0" encoding="utf-16"?>
    <ShowPlanXML ... xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
    ...
    <RelOp ... PhysicalOp="Nested Loops" ...>
      ...
      <NestedLoops ...>
        ...
        <RelOp ... LogicalOp="Index Seek" ...>
          ...
          <IndexScan ...>
            ...
            <Object Database="[AdventureWorks]" 
                    Schema="[Sales]" 
                    Table="[SalesOrderDetail]" 
                    Index="[IX_SalesOrderDetail_ProductID]" 
                    Alias="[d]" ... />
          </IndexScan>
        </RelOp>
        <RelOp  ... LogicalOp="Compute Scalar" ...>
          <ComputeScalar>
            ...
            <RelOp ... LogicalOp="Clustered Index Seek" ...>
              <OutputList>
                <ColumnReference ... Column="OrderQty" />
                <ColumnReference ... Column="UnitPrice" />
                <ColumnReference ... Column="UnitPriceDiscount" />
              </OutputList>
              <IndexScan Lookup="true" ...>
                ...
                <Object Database="[AdventureWorks]" 
                        Schema="[Sales]" 
                        Table="[SalesOrderDetail]" 
                        Index="[PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]" 
                        Alias="[d]" ... />
              </IndexScan>
            </RelOp>
          </ComputeScalar>
        </RelOp>
      </NestedLoops>
    </RelOp>
    ...
    </ShowPlanXML>
    */
    
    You can see the Nested Loops RelOp Node has a node down its hierarchy for the Index Seek into the IX_SalesOrderDetail_ProductID index, and it also has a node down its hierarchy for the Clustered Index Seek.

    Here’s what my query does, step by step…

    It finds Clustered Index Seek RelOp Nodes with an Index Scan Node with the Lookup attribute equal to True. Those are the Key Lookups. For each one found, it saves the Database, Schema, Table, and (possible) Alias Name. It also gets the list of columns in the OutputList node and also counts how many columns there are.

    Then it goes up the hierarchy, looking for a Nested Loops RelOp Node. It may be the immediate RelOp Node, or it could be two RelOp Nodes up the hierarchy. In this case, because of the Compute Scalar, we have to go up two RelOp Nodes.

    Once it finds the appropriate Nested Loops RelOp Node, it goes down its hierarchy looking for any Index Seek or Index Scan RelOp Nodes.

    For each of those found, it saves the Database, Schema, Table, and (possible) Alias Name. And it gets the name of the Index that was involved in the Seek or Scan.

    So now we have Lookup Data, and we have to see if any of the Index Data we found (there could be more than one) match in terms of Database and Schema and Table and (possible) Alias. Once the match is found, we can output it.

    It’s a bit complicated, but with a little study in reading the above steps and in reading the comments in the code, you can hopefully figure out what’s going on.

    Like the Missing Index query, I only look for cached plans that have been used at least 5 times, and I’m only interested in an INCLUDE list of no more than 5 columns. The output includes the usual hyperlinks to the code and to the plan.

    I order the query’s output by TableName and IndexName so you can see similar suggestions clustered together. Note also that the columns in the suggested INCLUDE list are in alphabetical order.

    Here is what my query suggested for our AdventureWorks Key Lookup query:

    /*
    TableName                                   IndexName                         
    ------------------------------------------- --------------------------------  
    [AdventureWorks].[Sales].[SalesOrderDetail] [IX_SalesOrderDetail_ProductID]   
    [AdventureWorks].[Sales].[SalesOrderHeader] [IX_SalesOrderHeader_CustomerID]  
    
    TableAliasInQuery ColumnsToInclude                     usecounts size_in_bytes objtype
    ----------------- ------------------------------------ --------- ------------- -------
    [d]               OrderQty,UnitPrice,UnitPriceDiscount         6        196608 Adhoc
    [h]               OrderDate                                    6        196608 Adhoc
    */
    
    So it is suggesting that the SalesOrderDetail index called IX_SalesOrderDetail_ProductID should INCLUDE the columns OrderQty, UnitPrice, and UnitPriceDiscount. And it is making a similar suggestion to INCLUDE the OrderDate column in the IX_SalesOrderHeader_CustomerID index.

    Again, as with any kind of tuning exercise, you should not make these changes blindly, but consider the possible implications before going forward.

    The code for this query for INCLUDE Column Suggestions is below. I hope you found it as useful as I have. It actually helped me to find an error in a client’s stored procedure… In suggesting a column to INCLUDE, I could see by looking at the query that it was actually using an incorrect WHERE predicate, and so we were able to fix it before it caused a bug. I’ve also used its suggestions to improve some queries considerably.

    So here’s the query… Please let me know if it’s been useful to you… Enjoy!

    with xmlnamespaces 
    (
      default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
    )
    select TableName=IxDB+'.'+IxSchema+'.'+IxTable
    ,IndexName=IxIndex
    ,TableAliasInQuery=isnull(IxAlias,IxTable)
          ,ColumnsToInclude=ColList
    ,usecounts
    ,size_in_bytes
    ,objtype
    ,BatchCode
    ,QueryPlan=qp.query_plan 
    from sys.dm_exec_cached_plans qs 
    cross apply 
    --Get the Query Text
      sys.dm_exec_sql_text(qs.plan_handle) qt             
    cross apply 
    --Get the Query Plan
      sys.dm_exec_query_plan(qs.plan_handle) qp
    cross apply
      --Get the Code for the Batch in Hyperlink Form
      (select BatchCode
    =(select [processing-instruction(q)]=':'+nchar(13)+qt.text+nchar(13)
                  for xml path(''),type)
      ) F_Code
    cross apply
      --Find the Key Lookups in the Plan
      qp.query_plan.nodes
      (
        '//RelOp[@LogicalOp="Clustered Index Seek"]/IndexScan[@Lookup=1]'
      ) F_Lookup(LookupNode)
    cross apply 
    --Get the Database,Schema,Table of the Lookup
      --Also get the Alias (if it exists) in case the table
      --  is used more than once in the query
      (select LookupDB=LookupNode.value('(./Object[1]/@Database)','sysname')
             ,LookupSchema=LookupNode.value('(./Object[1]/@Schema)','sysname')
             ,LookupTable=LookupNode.value('(./Object[1]/@Table)','sysname')
             ,LookupAlias=isnull(LookupNode.value('(./Object[1]/@Alias)','sysname'),'')
             ,ColumnCount=LookupNode.value('count(../OutputList[1]/ColumnReference)','int')
      ) F_LookupInfo
    cross apply 
    --Get the Output Columns
      (select stuff(
                (select ','+ColName
    from LookupNode.nodes('(../OutputList[1]/ColumnReference)') F_Col(ColNode)
                 cross apply 
                   (select ColName=ColNode.value('(./@Column)','sysname')) F_ColInfo
    order by ColName
    for xml path(''),type).value('(./text())[1]','varchar(max)')
                ,1,1,'')
      ) F_ColList(ColList)
    outer apply
      --Get the Parent RelOp Node, hoping that it is a Nested Loops operator.
      --Use OUTER APPLY because we may not find it
      LookupNode.nodes
      (
        '(./../../..[@PhysicalOp="Nested Loops"])'
      ) F_ParentLoop(ParentLoopNode)
    outer apply
      --Get the GrandParent RelOp Node, hoping that it is a Nested Loops operator.
      --Use OUTER APPLY because we may not find it
      LookupNode.nodes
      (
        '(./../../../../..[@PhysicalOp="Nested Loops"])'
      ) F_GrandParentLoop(GrandParentLoopNode)
    cross apply 
    --Get the Nested Loop Node... Could be the Parent or the GrandParent
      (select LoopNode=isnull(ParentLoopNode.query('.')
                             ,GrandParentLoopNode.query('.'))
      ) F_LoopNode
    cross apply
      --Now that we (hopefully) have a Nested Loops Node, let's find a descendant
      --of that node that is an Index Seek or Index Scan and acquire its Object Information
      LoopNode.nodes
        (
          '//RelOp[@LogicalOp="Index Scan" or @LogicalOp="Index Seek"]
           /IndexScan[1]/Object[1]'
        ) F_SeekNode(SeekObjNode)
    cross apply
      --Get the Database,Schema,Table and Index of the Index Seek/Scan
      --Also get the Alias (if it exists) so we can match it up with 
      --  the Lookup Table
      (select IxDB=SeekObjNode.value('(./@Database)','sysname')
             ,IxSchema=SeekObjNode.value('(./@Schema)','sysname')
             ,IxTable=SeekObjNode.value('(./@Table)','sysname')
             ,IxAlias=isnull(SeekObjNode.value('(./@Alias)','sysname'),'')
             ,IxIndex=SeekObjNode.value('(./@Index)','sysname')
      ) F_SeekInfo
    where qs.cacheobjtype='Compiled Plan'
      and usecounts>=5       --Only interested in those plans used at least 5 times
      and LookupDB=IxDB          --( Lookup and IndexSeek/Scan )
      and LookupSchema=IxSchema  --(   Database,Schema,Table,  )
      and LookupTable=IxTable    --(   and [possible] Alias    )
      and LookupAlias=IxAlias    --(   must match              )
      and ColumnCount<=5     --Limit to the #columns we're willing to INCLUDE
    order by TableName
    ,IndexName
    ,ColumnsToInclude
    
  6. T-SQL Tuesday #017: APPLY: It Slices! It Dices! It Does It All!

    T-SQL TuesdayThis blog entry is participating in T-SQL Tuesday #017, hosted this month by Matt Velic.

    You are invited to visit his blog and join the party and read more blogs participating in this month’s theme: APPLY Knowledge.

    I’ve been a fan of the APPLY operator since the beginning. That’s why it’s incorporated into the name of my blog. And I've blogged about its power about a dozen times.

    I don’t know how anyone lived without it before SQL2005.

    Practically everyone knows that you can use it to invoke table-valued functions. That is its most obvious usage and it’s about the only way you’ll see APPLY demonstrated in 97% of the books on SQL Server (if they even mention it at all).

    APPLY is a cool cat, baby!But, as you can see by the twinkle in APPLY’s eyes at the left (even behind the cool shades), he’s got other things up his sleeve.

    (Okay, the goofy pointed hat and the goatee are a little much, but hey, if you want great artwork, you’re in the wrong place… Michael J. Swart or Kendra Little are the masters of illustration).

    APPLY is capable of soooooo much more than just invoking TVF’s. It is incredibly versatile. It seems like it can do anything!

    Let’s take a look at what you can do!



    Call Table-Valued Functions!

    Yawn. Okay, so this is the most common way people use APPLY. Here’s a quick demo… For each Vista credit card expiring in Jun2008, let’s get the Contact information for that card using a built-in function in AdventureWorks.

    select f.FirstName
    ,f.LastName
    ,f.JobTitle
    ,f.ContactType
    ,cc.CardNumber
    from Sales.CreditCard cc
    join Sales.ContactCreditCard ccc on cc.CreditCardID=ccc.CreditCardID
    cross apply dbo.ufnGetContactInformation(ccc.ContactID) f
    where cc.ExpYear=2008
    and cc.ExpMonth=6
    and cc.CardType='Vista'
    /*
    FirstName LastName JobTitle         ContactType   CardNumber
    --------- -------- ---------------- ------------- --------------
    Peggy     Justice  Owner            Store Contact 11119759315644
    John      McClane  Purchasing Agent Store Contact 11119490672347
    Laura     Cai      NULL             Consumer      11112813884091
    Natalie   Gonzales NULL             Consumer      11114369564985
    Jarrod    Sara     NULL             Consumer      11116045498593
    Katherine Bailey   NULL             Consumer      11119100149656
    Stephanie Gray     NULL             Consumer      11112324154556
    Shawna    Sharma   NULL             Consumer      11116413893110
    Mindy     Rai      NULL             Consumer      11115163407997
    Jackson   Jai      NULL             Consumer      11112011871602
    And so on... (74 rows total)
    */
    
    That’s very convenient, but kind of boring in the grand scheme of things. So let’s move on.

    Execute SubQueries!

    Why even bother to create a function when you can just create a table on the fly via a correlated subquery with APPLY?

    You are only limited by your imagination.

    Here’s an example…

    For each store with a main office in Wisconsin, let’s look at the top 3 products (and their dollar amounts) that they bought in terms of dollars.

    select c.CustomerID
    ,s.Name
    ,f.ProductID
    ,ProductName=p.Name
    ,f.PurchaseAmt
    from Sales.Customer c
    join Sales.Store s on c.CustomerID=s.CustomerID
    join Sales.CustomerAddress ca on c.CustomerID=ca.CustomerID
    join Person.Address a on ca.AddressID=a.AddressID
    join Person.StateProvince sp on a.StateProvinceID=sp.StateProvinceID
    cross apply (select top 3 ProductID
    ,PurchaseAmt=sum(LineTotal)
                 from Sales.SalesOrderHeader soh
    join Sales.SalesOrderDetail sod on soh.SalesOrderID=sod.SalesOrderID
    where CustomerID=c.CustomerID
    group by ProductID
    order by sum(LineTotal) desc) f
    join Production.Product p on f.ProductID=p.ProductID
    where ca.AddressTypeID=3 --MainOffice
      and sp.StateProvinceCode='WI'
    /*
    CustomerID Name                        ProductID ProductName                 PurchaseAmt
    ---------- --------------------------- --------- -------------------------- ------------
           418 Good Bike Shop                    795 Road-250 Black, 52         30367.350000
           418 Good Bike Shop                    794 Road-250 Black, 48         24136.807500
           418 Good Bike Shop                    792 Road-250 Red, 58           23508.517500
           453 Unique Bikes                      773 Mountain-100 Silver, 44    16319.952000
           453 Unique Bikes                      771 Mountain-100 Silver, 38    12239.964000
           453 Unique Bikes                      772 Mountain-100 Silver, 42    12239.964000
           543 Friendly Neighborhood Bikes       782 Mountain-200 Black, 38      9638.958000
           543 Friendly Neighborhood Bikes       868 Women's Mountain Shorts, M   671.904000
           543 Friendly Neighborhood Bikes       869 Women's Mountain Shorts, L   335.952000
           606 Little Bicycle Supply Shop        717 HL Road Frame - Red, 62     1717.800000
           606 Little Bicycle Supply Shop        838 HL Road Frame - Black, 44    858.900000
           606 Little Bicycle Supply Shop        738 LL Road Frame - Black, 52    809.328000
    */
    
    That’s pretty slick, huh?

    Shred XML!

    Using the .nodes() function, coupled with the .value() and .query() functions, we can use APPLY to do some cool tricks with XML.

    For the first 10 JobCandidates, let’s pull information out of the Resume column, which is of type XML. We’ll get their Name and the schools (there might be more than one) that they attended, listing them in order of their graduation date.:

    with xmlnamespaces 
    (
      'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' as ns
    )
    select JobCandidateID
    ,Name
    ,Education=stuff(EduList,1,2,'')
    from HumanResources.JobCandidate
    cross apply
      Resume.nodes('/ns:Resume') F_ResumeNode(ResumeNode)
    cross apply 
      ResumeNode.nodes('(./ns:Name)') F_NameNode(NameNode)
    cross apply 
      (select Name=NameNode.value('(./ns:Name.First[1])','nvarchar(50)')
                  +' '
                  +NameNode.value('(./ns:Name.Last[1])','nvarchar(50)')
      ) F_Name
    cross apply 
      (select EduList=ResumeNode.query('for $p in (./ns:Education)
                                        order by $p/ns:Edu.EndDate
                                        return concat("; ",string($p/ns:Edu.School))'
                                       ).value('.','nvarchar(200)')
      ) F_Edu
    where JobCandidateID<=10
    /*
    JobCandidateID Name                 Education
    -------------- -------------------- ----------------------------------------------------
                 1 Shai Bassli          Midwest State University
                 2 Max Benson           Evergreen High School ; Everglades State College
                 3 Krishna Sunkammurali Western University
                 4 Stephen Jiang        Louisiana Business College of New Orleans
                 5 Thierry D'Hers       Université d'Aix-Marseille
                 6 Christian Kleinerman Lycée technique Émile Zola ; Université de Perpignan
                 7 Lionel Penuchot      Université de Lyon
                 8 Peng Wu              Western University
                 9 Shengda Yang         Evergreen High School ; Soutern State College
                10 Tai Yee              Midwest State University
    */
    
    As Miley Cyrus would say: That’s really cool.

    Introduce New Columns!

    This is probably the best use of APPLY because it makes code so much more clear.

    Consider the following query, which groups the 2002 Sales by Month. That’s done by the DATEADD/DATEDIFF logic, but it has to be repeated in the GROUP BY and the SELECT and the ORDER BY:

    select Mth=datename(month
                       ,dateadd(month
                               ,datediff(month,'19000101',OrderDate)
                               ,'19000101'))
          ,Total=sum(TotalDue)
    from Sales.SalesOrderHeader 
    where OrderDate>='20020101'
      and OrderDate<'20030101'
    group by dateadd(month
                    ,datediff(month,'19000101',OrderDate)
                    ,'19000101')
    order by dateadd(month
                    ,datediff(month,'19000101',OrderDate)
                    ,'19000101')
    /*
    Mth              Total
    --------- ------------
    January   1605782.1915
    February  3130823.0378
    March     2643081.0798
    April     1905833.9088
    May       3758329.2949
    June      2546121.9618
    July      3781879.0708
    August    5433609.3426
    September 4242717.7166
    October   2854206.7518
    November  4427598.0006
    December   3545522.738
    */
    
    I don’t know about you, but I hate all that repetition, and it looks a little busy. So APPLY to the rescue:

    select Mth=datename(month,FirstDayOfMth)
          ,Total=sum(TotalDue)
    from Sales.SalesOrderHeader 
    cross apply 
      (
        select FirstDayOfMth=dateadd(month
                                    ,datediff(month,'19000101',OrderDate)
                                    ,'19000101')
      ) F_Mth
    where OrderDate>='20020101'
      and OrderDate<'20030101'
    group by FirstDayOfMth
    order by FirstDayOfMth
    /*
    Mth              Total
    --------- ------------
    January   1605782.1915
    February  3130823.0378
    March     2643081.0798
    April     1905833.9088
    May       3758329.2949
    June      2546121.9618
    July      3781879.0708
    August    5433609.3426
    September 4242717.7166
    October   2854206.7518
    November  4427598.0006
    December   3545522.738
    */
    
    Now isn’t that much clearer as to what’s going on? And it costs nothing at all! The query plans of both of the queries above are exactly the same!

    Perform Complicated Calculations!

    This is the part of APPLY that I really love. Let’s look at an example.

    Let’s say that you have a table of comma-delimited lists of one or more integers:

    create table #t
    (
       ID int identity(1,1)
      ,ListOfNums varchar(50)
    )
    insert #t
    values ('279,37,972,15,175')
          ,('17,72')
          ,('672,52,19,23')
          ,('153,798,266,52,29')
          ,('77,349,14')
    select * from #t
    /*
    ID ListOfNums
    -- -----------------
     1 279,37,972,15,175
     2 17,72
     3 672,52,19,23
     4 153,798,266,52,29
     5 77,349,14
    */
    
    Your job: Pull out only the rows that have the 4th number in the list less than 50 and sort the output by the 3rd number in the list.

    Easy, right? Ha ha ha ha ha ha ha hee hee hee hee hee hee ho ho ho ho haw haw giggle chuckle guffaw!

    Before the APPLY operator, SQL2000 folks would have to resort to something ludicrous like this in order to accomplish this task:

    select ID
    ,ListOfNums
    from #t
    where substring(ListOfNums+',,,,',charindex(',',ListOfNums+',,,,',
          charindex(',',ListOfNums+',,,,',charindex(',',ListOfNums+',,,,')+1)+1)+1,
          (charindex(',',ListOfNums+',,,,',charindex(',',ListOfNums+',,,,',
          charindex(',',ListOfNums+',,,,',charindex(',',ListOfNums+',,,,')+1)+1)+1)-
          charindex(',',ListOfNums+',,,,',charindex(',',ListOfNums+',,,,',
          charindex(',',ListOfNums+',,,,')+1)+1))-1)
          < 50
    order by substring(ListOfNums+',,,,',charindex(',',ListOfNums+',,,,',
             charindex(',',ListOfNums+',,,,')+1)+1,(charindex(',',ListOfNums+',,,,',
             charindex(',',ListOfNums+',,,,',charindex(',',ListOfNums+',,,,')+1)+1)-
             charindex(',',ListOfNums+',,,,',charindex(',',ListOfNums+',,,,')+1))-1)
    /*
    ID ListOfNums
    -- -------------
     2 17,72
     5 77,349,14
     3 672,52,19,23
     1 279,37,972,15
    */
    
    But now, through the magic of APPLY, you can have a much clearer query:

    select ID
    ,ListOfNums
    from #t
    cross apply (select WorkString=ListOfNums+',,,,') F_Str
    cross apply (select p1=charindex(',',WorkString)) F_P1
    cross apply (select p2=charindex(',',WorkString,p1+1)) F_P2
    cross apply (select p3=charindex(',',WorkString,p2+1)) F_P3
    cross apply (select p4=charindex(',',WorkString,p3+1)) F_P4      
    cross apply (select Num3=convert(int,substring(WorkString,p2+1,p3-p2-1))
                       ,Num4=convert(int,substring(WorkString,p3+1,p4-p3-1))) F_Nums
    where Num4<50
    order by Num3
    /*
    ID ListOfNums
    -- -------------
     2 17,72
     5 77,349,14
     3 672,52,19,23
     1 279,37,972,15
    */
    
    See how I used APPLY to write a little program of a sort? First, I added commas to the end of the column to account for possibly missing numbers in the list. Then I calculated the position of the first comma in that string (p1). Then I calculated the position of the second comma (p2), and that can only be done by using the p1 position I calculated in the previous step. I continue on getting the position of the third and fourth comma. And now that I have those, I can pull out Num3 (from between the second and third comma) and Num4 (from between the third and fourth comma). And I can now use those values in my WHERE and ORDER BY clause.

    And the best part? NO COST! The above two queries are exactly the same as far as the optimizer is concerned. All those CROSS APPLYs are glommed together into a Compute Scalar operator, essentially coming up with really complicated expressions like you see in the first query. Take a look at the query plan yourself and you’ll see.

    Replace the UNPIVOT operator!

    Throw the UNPIVOT operator out the window… The optimizer really translates it into an APPLY operator under the hood anyway… and you can have control over NULLs and differing datatypes.

    Look at the following example, which is a function that accepts a CustomerID and spits out information on the customer in a vertical fashion (note that there can be multiple contacts for a customer… this just spits out the first one… thus the ROW_NUMBER() logic):

    create function VerticalMainOfficeData
    (
      @CustomerID int
    )
    returns table
    as
    return
    with BaseData as
    (
      select SeqNo=row_number() over (order by ContactType)
            ,Name,AddressLine1,AddressLine2,City,StateProvinceName
    ,PostalCode,CountryRegionName
    ,ContactType,ContactName,Phone,EmailAddress
    ,YearOpened,NumberEmployees,Specialty 
    ,SquareFeet,Brands,AnnualSales
    from AdventureWorks.Sales.vStoreWithDemographics
    cross apply (select ContactName=isnull(Title+' ','')
                                     +FirstName+' '
                                     +isnull(MiddleName+' ','')
                                     +LastName
    +isnull(' '+Suffix,'')) F_Name
    where CustomerID=@CustomerID
    and AddressType='Main Office'
    )
    select Property,Value
    from BaseData
    cross apply 
       (values ('NAME AND ADDRESS:','')
              ,('  Name',Name)
              ,('  Address',AddressLine1)
              ,('  ',AddressLine2)
              ,('  City',City)
              ,('  State/Province',StateProvinceName)
              ,('  Postal Code',PostalCode)
              ,('  Country/Region',CountryRegionName)
              ,('','')
              ,('CONTACT:','')
              ,('  Type',ContactType)
              ,('  Name',ContactName)
              ,('  Phone',Phone)
              ,('  EmailAddress',EmailAddress)
              ,('','')
              ,('DEMOGRAPHIC INFO:','')
              ,('  Year Opened',str(YearOpened,4))
              ,('  Number of Employees',convert(varchar(10),NumberEmployees))
              ,('  Specialty',Specialty)
              ,('  Square Feet',convert(varchar(10),SquareFeet))
              ,('  Brands',Brands)
              ,('  Annual Sales','$'+convert(varchar(20),AnnualSales,1))) P(Property,Value)
    where SeqNo=1
    and Value is not null
    
    Watch it in action:

    select * from VerticalMainOfficeData(34)
    /*
    Property              Value
    --------------------- -----------------------------
    NAME AND ADDRESS:     
      Name                Cycles Wholesaler & Mfg.
      Address             Science Park South, Birchwood
                          Stanford House
      City                Warrington
      State/Province      England
      Postal Code         WA3 7BH
      Country/Region      United Kingdom
    
    CONTACT:              
      Type                Owner
      Name                Ms. Barbara J. German
      Phone               1 (11) 500 555-0181
      EmailAddress        barbara4@adventure-works.com
    
    DEMOGRAPHIC INFO:     
      Year Opened         1999
      Number of Employees 15
      Specialty           Touring
      Square Feet         21000
      Brands              4+
      Annual Sales        $800,000.00
    */
    
    Can UNPIVOT do that? Not in a million years. It’s toast!

    Make JOINs Extinct!
    `
    Okay, this is really kind of a joke, but really, think about it… What is a JOIN? For each row in the first table, I want to JOIN it somehow with a row or rows in the second table. That sounds like an APPLY type of thing, doesn’t it? Well it is!

    Look at the following traditional JOIN query, which finds all the Accessories that are Yellow, Blue, White:

    select SubCategoryName=s.Name 
    ,p.ProductID
    ,ProductName=p.Name
    ,p.Color
    from Production.ProductSubCategory s 
    join Production.Product p on s.ProductSubcategoryID=p.ProductSubcategoryID 
    where s.ProductCategoryID=3  --Accessories
      and p.Color in ('Yellow','Blue','White')
    order by SubCategoryName 
    ,p.ProductID   
    /*
    SubCategoryName ProductID ProductName                     Color
    --------------- --------- ------------------------------- ------
    Jerseys               881 Short-Sleeve Classic Jersey, S  Yellow
    Jerseys               882 Short-Sleeve Classic Jersey, M  Yellow
    Jerseys               883 Short-Sleeve Classic Jersey, L  Yellow
    Jerseys               884 Short-Sleeve Classic Jersey, XL Yellow
    Socks                 709 Mountain Bike Socks, M          White
    Socks                 710 Mountain Bike Socks, L          White
    Socks                 874 Racing Socks, M                 White
    Socks                 875 Racing Socks, L                 White
    Vests                 864 Classic Vest, S                 Blue
    Vests                 865 Classic Vest, M                 Blue
    Vests                 866 Classic Vest, L                 Blue
    */
    
    You can replace that JOIN with a CROSS APPLY:

    select SubCategoryName=s.Name 
    ,p.ProductID
    ,ProductName=p.Name
    ,p.Color
    from Production.ProductSubCategory s 
    cross apply (select *
                 from Production.Product 
    where ProductSubcategoryID=s.ProductSubcategoryID) p
    where s.ProductCategoryID=3  --Accessories
      and p.Color in ('Yellow','Blue','White')
    order by SubCategoryName 
    ,p.ProductID   
    /*
    SubCategoryName ProductID ProductName                     Color
    --------------- --------- ------------------------------- ------
    Jerseys               881 Short-Sleeve Classic Jersey, S  Yellow
    Jerseys               882 Short-Sleeve Classic Jersey, M  Yellow
    Jerseys               883 Short-Sleeve Classic Jersey, L  Yellow
    Jerseys               884 Short-Sleeve Classic Jersey, XL Yellow
    Socks                 709 Mountain Bike Socks, M          White
    Socks                 710 Mountain Bike Socks, L          White
    Socks                 874 Racing Socks, M                 White
    Socks                 875 Racing Socks, L                 White
    Vests                 864 Classic Vest, S                 Blue
    Vests                 865 Classic Vest, M                 Blue
    Vests                 866 Classic Vest, L                 Blue
    */
    
    The query plans for both of those are exactly the same!

    And LEFT JOINs can be replaced by OUTER APPLYs!

    Today UNPIVOTs and JOINs… Tomorrow the world! Bwu hu hu ha ha ha ha haaaaaaa (Diabolical laughter).

    Do it ALL!

    For my final example, I’ll do ALL of the above (except for the JOIN replacement, which was just kind of a joke/trick anyway).

    In doing the examples above, my query cache got populated with the text and plans of the queries I executed. We will look in the cache for the CROSS APPLY(TOP 3) query that was in the Execute SubQueries! section above, shred its query plan, looking for the operators, figure out their percentage cost, and list them in descending order of that cost. For Scans and Seeks and Joins, we will show the table, column and/or index used. And it will be presented in a vertical manner.

    Note that the challenge here is finding the cost of each operator… it is not stored in the plan. Each operator has a Total Subtree Cost, but that is the cost of the operator itself PLUS the Subtree Costs of each of its immediate children operators. So for each operator, I had to find its children, total up their Subtree Costs and subtract that from the Subtree Cost of the operator to get the Individual Cost of the operator. This is done in the CROSS APPLY of the OperatorCosts CTE.

    Hopefully the comments are self-explanatory:

    with xmlnamespaces 
    (
      default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
    )
    ,OperatorData as
    (
      select ParentNodeID
    ,NodeID
    ,OperatorDesc 
    ,ScanSchema,ScanTable,ScanIndex
    ,LoopSchema,LoopTable,LoopColumn
    ,HashSchema,HashTable,HashColumn
    ,SubTreeCost 
    from sys.dm_exec_query_stats qs
    cross apply 
    --Get the Query Text
        sys.dm_exec_sql_text(qs.sql_handle) qt             
    cross apply 
    --Get the Query Plan
        sys.dm_exec_query_plan(qs.plan_handle) qp
    cross apply
        --Get the RelOp nodes from the Plan
        qp.query_plan.nodes('//RelOp') F_RelNodes(RelNode)
      cross apply 
    --Pull out the various attributes from the RelOp Node
        --And also extract the ParentNodeID of the operator
        (select ParentNodeID=RelNode.value('(../../@NodeId)','int')
               ,NodeID=RelNode.value('(./@NodeId)','int')
               ,LogicalOp=RelNode.value('(./@LogicalOp)','varchar(50)')
               ,PhysicalOp=RelNode.value('(./@PhysicalOp)','varchar(50)')
               ,SubTreeCost=RelNode.value('(./@EstimatedTotalSubtreeCost)','float')
        ) F_OpInfo
    cross apply 
    --Make a nice description out of the Operator
         (select OperatorDesc=case
                                when LogicalOp=PhysicalOp
    then PhysicalOp
    else PhysicalOp+' ('+LogicalOp+')'
                              end
        ) F_OpDesc
    outer apply 
    --Get child nodes having to do with a Scan/Seek
        --Note that OUTER APPLY is used since there may not
        --be any child nodes of this type
        RelNode.nodes('(./IndexScan[1]/Object[1])') 
        F_ScanNode(ScanNode)
      outer apply 
    --And pull out their Table/Index information
        (select ScanSchema=ScanNode.value('(./@Schema)','varchar(50)')
               ,ScanTable=ScanNode.value('(./@Table)','varchar(50)')
               ,ScanIndex=ScanNode.value('(./@Index)','varchar(100)')
        ) F_ScanInfo
    outer apply 
    --Get child nodes having to do with Nested Loops
        --Note that OUTER APPLY is used since there may not
        --be any child nodes of this type
        RelNode.nodes('(./NestedLoops[1]/OuterReferences[1]/ColumnReference[1])') 
        F_LoopNode(LoopNode)
      outer apply 
    --And pull out their Table/Column information
        (select LoopSchema=LoopNode.value('(./@Schema)','varchar(50)')
               ,LoopTable=LoopNode.value('(./@Table)','varchar(50)')
               ,LoopColumn=LoopNode.value('(./@Column)','varchar(50)')
        ) F_LoopInfo
    outer apply
        --Get child nodes having to do with Hash Joins
        --Note that OUTER APPLY is used since there may not
        --be any child nodes of this type
        RelNode.nodes('(./Hash[1]/HashKeysBuild[1]/ColumnReference[1])') 
        F_HashNode(HashNode)
      outer apply
        --And pull out their Table/Column information
        (select HashSchema=HashNode.value('(./@Schema)','varchar(50)')
               ,HashTable=HashNode.value('(./@Table)','varchar(50)')
               ,HashColumn=HashNode.value('(./@Column)','varchar(50)')
        ) F_HashInfo
    where qt.text like '%select top 3 ProductID%'
        and qt.text not like '%with xmlnamespaces%'  --Exclude this query
    )
    ,OperatorCosts as
    (
      --Calculate the Individual Costs by subtracting each Operator's
      --SubTreeCost minus its immediate children's SubTreeCosts
      select NodeID
    ,OperatorDesc 
    ,ScanSchema,ScanTable,ScanIndex
    ,LoopSchema,LoopTable,LoopColumn
    ,HashSchema,HashTable,HashColumn
    ,OperatorCost=convert(numeric(16,8),SubTreeCost-ChildrenSubTreeCost)
      from OperatorData o
    cross apply 
    --Calculate the sum of the SubTreeCosts of the immediate children
        (select ChildrenSubTreeCost=isnull(sum(SubTreeCost),0)
         from OperatorData
    where ParentNodeID=o.NodeID) F_ChildCost
    )
    ,CostPercents as
    (
      --Calculate the CostPercent using a window function
      select NodeID
    ,OperatorDesc 
    ,ScanSchema,ScanTable,ScanIndex
    ,LoopSchema,LoopTable,LoopColumn
    ,HashSchema,HashTable,HashColumn
    ,CostPercent=convert(numeric(5,1),100*OperatorCost/sum(OperatorCost) over ())
      from OperatorCosts       
    )
    select Information
    from CostPercents
    cross apply
      --UNPIVOT the information into a vertical presentation
      (values ('NodeID '+convert(varchar(5),NodeID)
              +' ('+convert(varchar(10),CostPercent)+'%):')
             ,('  '+OperatorDesc)
             ,('    Table: '+ScanSchema+'.'+ScanTable)
             ,('    Index: '+ScanIndex)
             ,('    Table: '+LoopSchema+'.'+LoopTable)
             ,('    Column: '+LoopColumn)
             ,('    Table: '+HashSchema+'.'+HashTable)
             ,('    Column: '+HashColumn)) P(Information)
    where Information is not null   --Eliminate NULL rows
    order by CostPercent desc
    /*
    Information
    ----------------------------------------------------------------
    NodeID 13 (19.6%):
      Sort (TopN Sort)
    NodeID 15 (19.6%):
      Sort
    NodeID 9 (18.2%):
      Clustered Index Scan
        Table: [Sales].[CustomerAddress]
        Index: [PK_CustomerAddress_CustomerID_AddressID]
    NodeID 11 (14.5%):
      Clustered Index Seek
        Table: [Sales].[Store]
        Index: [PK_Store_CustomerID]
    NodeID 20 (8.7%):
      Clustered Index Seek
        Table: [Sales].[SalesOrderDetail]
        Index: [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]
    NodeID 5 (5.8%):
      Hash Match (Inner Join)
        Table: [Person].[Address]
        Column: AddressID
    NodeID 12 (4.9%):
      Clustered Index Seek
        Table: [Sales].[Customer]
        Index: [PK_Customer_CustomerID]
    NodeID 18 (4.7%):
      Index Seek
        Table: [Sales].[SalesOrderHeader]
        Index: [IX_SalesOrderHeader_CustomerID]
    NodeID 30 (2.3%):
      Clustered Index Seek
        Table: [Production].[Product]
        Index: [PK_Product_ProductID]
    NodeID 7 (0.6%):
      Index Seek
        Table: [Person].[StateProvince]
        Index: [AK_StateProvince_StateProvinceCode_CountryRegionCode]
    NodeID 8 (0.6%):
      Index Seek
        Table: [Person].[Address]
        Index: [IX_Address_StateProvinceID]
    NodeID 6 (0.2%):
      Nested Loops (Inner Join)
        Table: [Person].[StateProvince]
        Column: StateProvinceID
    NodeID 3 (0.2%):
      Nested Loops (Inner Join)
        Table: [Sales].[CustomerAddress]
        Column: CustomerID
    NodeID 0 (0.0%):
      Nested Loops (Inner Join)
        Table: [Sales].[SalesOrderDetail]
        Column: ProductID
    NodeID 1 (0.0%):
      Nested Loops (Inner Join)
        Table: [Sales].[Customer]
        Column: CustomerID
    NodeID 2 (0.0%):
      Nested Loops (Inner Join)
        Table: [Sales].[CustomerAddress]
        Column: CustomerID
    NodeID 19 (0.0%):
      Compute Scalar
    NodeID 16 (0.0%):
      Compute Scalar
    NodeID 17 (0.0%):
      Nested Loops (Inner Join)
        Table: [Sales].[SalesOrderHeader]
        Column: SalesOrderID
    NodeID 14 (0.0%):
      Stream Aggregate (Aggregate)
    */
    Is that, like, waaaaay cool, or what?!

    I hope I’ve convinced you how powerful the APPLY operator can be. I couldn’t live without it.
  7. Documenting Your SQL Agent Jobs

    XKCD RTFMAs a developer, I hate writing documentation.

    And people in general hate reading (or just don’t read) documentation or manuals, as indicated by the XKCD comic.

    So why in the world would I go through the crazy exercise of writing a stored procedure to generate documentation for SQL Agent Jobs?

    Well, for one thing, I guess I must be a masochist.

    Second of all, I enjoy a challenge.

    Third, I’m not really writing documentation… I’m just, er, regurgitating it.

    And fourth, I have to admit that it does come in very handy when I go in to visit a new client and find out what they’ve got running. ”What? You SHRINK your databases every single night? Do you like poking yourself in the head with an icepick too?”

    This documentation I’m spitting out is not just a bunch o’ columns that get plopped into an SSMS grid results window… I went the extra mile and created actual HTML code to generate nice professional content suitable for viewing in your favorite browser (and, when printed, it’s also suitable for wrapping fish, training puppies, and lining birdcages).

    Here’s a sample of the output (click on the picture to see a larger version):

    Documentation Sample in Browser

    If that makes you drool, then keep on reading. If it makes you yawn, then go take a nap.

    The name of the stored procedure is called usp_SQLAgentJobDocumentation, and you can download it from my SkyDrive and install it into whatever database you wish.

    It just generates a (multi-row) single-column NVARCHAR(MAX) result called HTM. You could just run it in SSMS, but the output is not going to do you much good unless you copy/paste it into a text file. (Important note: Direct the output to GRID, not to TEXT, because a TEXT output window can truncate some of the output. Once it has output to GRID, click in the grid and do a Select All (CTRL+A) and then Copy/Paste).

    You can create a (CmdExec) job that uses BCP to output the procedure’s contents to a file, as in the following example:

    bcp "exec YOURDATABASEHERE.YOURSCHEMAHERE.usp_SQLAgentJobDocumentation" 
        queryout "X:\SomeFolder\$(ESCAPE_DQUOTE(MACH))_SQLAgentDocumentation.htm" 
        -S$(ESCAPE_SQUOTE(SRVR)) -T -c -w
    (If you have a local named instance, you may want to add $(ESCAPE_DQUOTE(INST)) to the filename).

    Alternately, you can create an SSIS Package that will execute the stored procedure and output the contents to a Flat File Destination.

    Then, you can browse away in the file to your heart’s content.

    I made my best attempt to FULLY document the jobs, including multiple schedules, alerts, all possible notifications, etc… The only thing that is left out is anything involving Target Servers, mainly because I couldn’t test it out, but I imagine it’s as easy as JOINing in the sysjobservers and systargetservers tables of the msdb database. If someone would like to test it for me, leave a comment for me or send me an email and I’ll send off a revised copy that will address that.

    One extra tidbit is that the documentation shows the average job duration over the last 100 executions, so you can get a good idea of how long the job takes to run. In addition, each individual Job Step shows the duration of its last execution. Both are spelled out in xx Hours xx Mins xx Secs easy-to-read format.

    By the way, I did not include any information as to the last date/time executed or next date/time to be executed, because I figured it would be outdated within a half-hour of producing the documentation.

    The procedure is just one huge single SELECT statement, with liberal use of CTE’s and CROSS APPLY’s (so it will only run in SQL2005 and beyond). Currently it will generate HTML for ALL jobs, but if you want to revise it to accept some kind of parameters to filter only certain jobs (by Job_ID or Name or whatever), it’s easy enough to do. The procedure’s first CTE is called SelectedJobs and you can make your modifications there:

    alter procedure usp_SQLAgentJobDocumentation
    --Optional parameters here to filter jobs you want
    as
    with SelectedJobs as
    (
      --If you want to add parameters to the procedure to filter out
      --certain jobs, you can do it in the WHERE clause here
      select *
      from msdb.dbo.sysjobs j
    --where job_id=@Job_ID
      --where name like '%'+@JobNamePattern+'%'
      --where category_id=@JobCategoryID
    )
    ...
    
    There were a couple of challenges in putting this together. Namely…

    One: In my last blog post, I talked about handling the goofy integer representations of dates and times and durations in the sysjobhistory table.

    Two: Books Online is a little spotty in giving detail about some of the msdb table columns, so I had to search around the web or use trial-and-error to find out how some of the columns worked. One example is the flags column in the sysjobsteps table, which (I found) is used to represent the various outputs of a Job Step. Books Online simply says “Reserved” for the description of this column, which is no help whatsoever.

    Three: I had to unpivot the various data into the various HTML table structures, but that was easily accomplished via a CROSS APPLY, which I’ll talk about a little next week in my T-SQL Tuesday post. And getting the various HTML output to come out in the correct order was something I had to always keep in mind. I also had to worry about various HTML encodings of the ampersand and less-than and greater-than characters.

    Four: Finally, translating all the various sysschedules columns into a readable English phrase was fun, coming up with possible phrases like ”Every 20 Minutes From 6:00am to 10:00pm Every 2 Months on the 3rd Wednesday of the Month”. I realize now that a few others have done this already (like SQLFool Michelle Ufford), but I’m glad I attacked it from scratch anyway. The code from that CTE is below if you’re interested.

    I’m sure there are 3rd-party products that produce stuff like this, but I don’t care. I did it because I can, and it was fun. Yes, I’m a SQL nerd.

    I hope you find this to be useful. I must admit it’s been a great help to me and my clients.

    select schedule_id
    ,name
    ,SchedDesc=TimeOfDay+Frequency+EffDtRange
    from msdb.dbo.sysschedules 
    cross apply 
    --Translate the dates and times into DATETIME values
      --And translate the times into HH:MM:SSam (or HH:MMam) strings
      (select StDate=convert(datetime
                            ,convert(varchar(8),active_start_date))
             ,EnDate=convert(datetime
                            ,convert(varchar(8),active_end_date))
             ,StTime=convert(datetime
                            ,stuff(stuff(right(1000000+active_start_time
    ,6)
                                        ,3,0,N':')
                                  ,6,0,N':'))
             ,EnTime=convert(datetime
                            ,stuff(stuff(right(1000000+active_end_time
    ,6)
                                        ,3,0,N':')
                                  ,6,0,N':'))
      ) F_DtTm
    cross apply
      --Translate the times into appropriate HH:MM:SSam or HH:MMam char formats
      (select replace(replace(replace(substring(lower(convert(varchar(30),StTime,109))
                                               ,13,14)
                                     ,N':000',N'')
                             ,N':00a',N'a')
                     ,N':00p',N'p')
             ,replace(replace(replace(substring(lower(convert(varchar(30),EnTime,109))
                                               ,13,14)
                                     ,N':000',N'')
                             ,N':00a',N'a')
                     ,N':00p',N'p')
      ) F_Tms(StTimeString,EnTimeString)
    cross apply 
    --What Time of Day? Single Time or Range of Times/Intervals
      (select case 
    when freq_subday_type=0
    then N''
                else case 
    when freq_subday_type=1
    then N'At '
                       else N'Every '
                           +convert(nvarchar(10),freq_subday_interval)
                           +' '
                           +case freq_subday_type
    when 2 then N'Second'
                              when 4 then N'Minute'
                              when 8 then N'Hour'
                            end
                           +case 
    when freq_subday_interval=1 then N'' else N's' end
                           +N' From '
                     end
                    +StTimeString
    +case
                       when freq_subday_type=1
    then N''
                       else N' to '+EnTimeString
    end
                    +N' '
              end
      ) F_Tm(TimeOfDay)
    cross apply
      --Translate Frequency  
      (select case freq_type
    when 1
    then N'One Time Only'
                when 4
    then N'Every '
                    +case freq_interval 
    when 1
    then N'Day'
                       else convert(nvarchar(10),freq_interval)+N' Days'
                     end
                when 8
    then N'Every '
                    +case freq_recurrence_factor
    when 1
    then N''
                       else convert(nvarchar(10),freq_recurrence_factor)+N' Weeks on '
                     end
                    +stuff(case when freq_interval& 1<>0 then N', Sunday' else N'' end
                          +case when freq_interval& 2<>0 then N', Monday' else N'' end
                          +case when freq_interval& 4<>0 then N', Tuesday' else N'' end
                          +case when freq_interval& 8<>0 then N', Wednesday' else N'' end
                          +case when freq_interval&16<>0 then N', Thursday' else N'' end
                          +case when freq_interval&32<>0 then N', Friday' else N'' end
                          +case when freq_interval&64<>0 then N', Saturday' else N'' end
                          ,1,2,N'')
                when 16
    then N'Every '
                    +case freq_recurrence_factor 
    when 1
    then N'Month '
                       else convert(nvarchar(10),freq_recurrence_factor)+N' Months '
                     end
                    +N'on the '
                    +convert(nvarchar(10),freq_interval)
                    +case 
    when freq_interval in (1,21,31)
                       then N'st'
                       when freq_interval in (2,22)
                       then N'nd'
                       when freq_interval in (3,23)
                       then N'rd'
                       else N'th'
                     end
                    +N' of the Month'
                when 32
    then N'Every '
                    +case freq_recurrence_factor 
    when 1
    then N'Month '
                       else convert(nvarchar(10),freq_recurrence_factor)+N' Months '
                     end
                    +N'on the '
                    +case freq_relative_interval 
    when  1 then N'1st '
                       when  2 then N'2nd '
                       when  4 then N'3rd '
                       when  8 then N'4th '
                       when 16 then N'Last '
                     end
                    +case freq_interval 
    when  1 then N'Sunday'
                       when  2 then N'Monday'
                       when  3 then N'Tuesday'
                       when  4 then N'Wednesday'
                       when  5 then N'Thursday'
                       when  6 then N'Friday'
                       when  7 then N'Saturday'
                       when  8 then N'Day'
                       when  9 then N'Weekday'
                       when 10 then N'Weekend Day'
                     end
                    +N' of the Month'
                when 64
    then N'When SQL Server Agent Starts'
                when 128
    then N'Whenever the CPUs become Idle'
                else N'Unknown'
              end
      ) F_Frq(Frequency)
    cross apply
      --When is it effective?
      (select N' (Effective '+convert(nvarchar(11),StDate,100)
             +case  
    when EnDate='99991231'
                then N''
                else N' thru '+convert(nvarchar(11),EnDate,100)
              end
             +N')'           
    ) F_Eff(EffDtRange)
    
  8. SysJobHistory Outliers

    Bell Curve with Standard DeviationsIn honor of NCAA March Madness, former basketball star (and current SQL MVP and SQLRockStar) Thomas LaRock (b | t) wrote a series of blog posts about SQL Server System Tables entitled, appropriately enough, March Madness. They ran from Mar17 to Apr04. Check them out… there is sure to be something in there that you didn’t know about before or that you’ll find to be really cool and useful.

    Which brings me to this post. I really enjoyed the concept of Tom’s Apr02 post on the sysjobhistory table in msdb. He had a query that found the SQL Agent Jobs that are outliers… in other words, jobs that are atypical in terms of their run duration… to put it still another way, this query found jobs that ran longer than what would seem to be statistically “normal”.

    Tom’s query looked at the previous 24 hours’ worth of job history and found those job steps that ran longer than the average duration plus two standard deviations (i.e. the mean plus two sigmas). In the picture of the bell curve above, that is in the green and gray zone at the right.

    I thought this was a great concept, but I wanted to expand on it and introduce some additional features.

    First of all, as Tom mentioned, the sysjobhistory table stores dates and times as integers, which is a real pain in the posterior. The dates are stored as YYYYMMDD integers and the times (and durations) are stored as HHMMSS integers. So, for example, if I look at a random entry in my sysjobhistory table…

    select top 1 run_date, run_time, run_duration
    from msdb.dbo.sysjobhistory
    /*
    run_date run_time run_duration
    -------- -------- ------------
    20110302    74856          106
    */
    
    the run_time of 74856 means that the item started at 07:48:56 and the run_duration of 106 means that the item ran for a total of 00:01:06, or 1 minute and 6 seconds. So my first job was to translate these goofy integer representations into something meaningful.

    The run_date is easy, since we are beyond the Middle Ages and all years that we deal with are 4 digits, so I could just directly translate a date by converting the integer to a VARCHAR(8) and then converting that to a DATETIME:

    select top 1 convert(datetime,convert(varchar(8),run_date))
    from msdb.dbo.sysjobhistory
    /* 2011-03-02 00:00:00.000 */
    
    The run_time takes a little more work, because the CONVERT function insists on times being in HH:MM:SS format, with 2 digits for each element, meaning leading zeroes had to be there. So first I would introduce the leading zeroes I need by adding 1000000 to the number and then taking the RIGHT-most 6 characters (T-SQL will implicitly convert the integer to a VARCHAR before doing the RIGHT function):

    select top 1 right(1000000+run_time,6)
    from msdb.dbo.sysjobhistory
    /* 074856 */
    
    Then it’s a matter of introducing the colons to separate the elements:

    select top 1 stuff(stuff(right(1000000+run_time,6),3,0,':'),6,0,':')
    from msdb.dbo.sysjobhistory
    /* 07:48:56 */
    
    Now, when I combine that with my VARCHAR representation of the run_date and put them together in YYYYMMDD HH:MM:SS format, I can CONVERT it into a complete date and time of the item:

    select top 1 convert(datetime,convert(varchar(8),run_date)
                                 +' '
                                 +stuff(stuff(right(1000000+run_time,6),3,0,':'),6,0,':'))
    from msdb.dbo.sysjobhistory
    /* 2011-03-02 07:48:56.000 */
    
    The run_duration requires a little more manipulation to translate the HHMMSS integer representation into just a number of seconds (so the value of 106, representing 00:01:06, is translated into 66 seconds):

    select top 1 run_duration/10000*3600   --Hours*3600 = Seconds
                +run_duration%10000/100*60 --Minutes*60 = Seconds
                +run_duration%100          --Seconds
    from msdb.dbo.sysjobhistory
    /* 66 */            
    
    So now that I have those formulas down, I can put together a query that would give me more meaningful information. This query will give me information for successfully-completed jobs… Entries with a Step_ID of 0 indicate a date/time/duration of the job as a whole:

    select job_id
    ,date_executed=convert(datetime,convert(varchar(8),run_date))
                                         +' '
                                         +stuff(stuff(right(1000000+run_time
    ,6)
                                                      ,3,0,':')
                                                ,6,0,':')
          ,secs_duration=run_duration/10000*3600
    +run_duration%10000/100*60
    +run_duration%100
    from msdb.dbo.sysjobhistory
    where step_id=0     --Job Outcome
      and run_status=1  --Succeeded
    /*
    job_id                               date_executed           secs_duration
    ------------------------------------ ----------------------- -------------
    ECA51518-7144-49DE-9153-EB12D42AE0D9 2011-03-02 07:59:17.000            66
    ECA51518-7144-49DE-9153-EB12D42AE0D9 2011-03-02 08:23:49.000            74
    ECA51518-7144-49DE-9153-EB12D42AE0D9 2011-03-02 13:07:24.000            61
    etc, etc, etc
    */
    
    Now that I have that information, I can put that query into a CTE and make good use of it.

    In order to find the outliers, I wanted the ability to define a certain “history window” or “baseline window” of a range of dates of job runs from which I could calculate the Average and the Standard Deviation. So the query would accept a @HistoryStartDate and @HistoryEndDate date range (I would convert the dates to full days starting/ending at midnight). Also, for a halfway meaningful sample, I wanted an option to calculate the statistics for jobs that had run a minimum number of times (represented by @MinHistExecutions). And, since we may only want to look at outliers of jobs that are kind of “beefy”, I wanted to introduce the option to only pay attention to jobs that had a minimum average duration of some kind (@MinAvgSecsDuration).

    So this query would give us our history/baseline calculations:

    declare @HistoryStartDate datetime = '19000101'
           ,@HistoryEndDate datetime = getdate()
           ,@MinHistExecutions int = 10
    ,@MinAvgSecsDuration int = 30
    ;
    with JobHistData as
    (
      select job_id
    ,date_executed=convert(datetime,convert(varchar(8),run_date))
                                           +' '
                                           +stuff(stuff(right(1000000+run_time
    ,6)
                                                        ,3,0,':')
                                                  ,6,0,':')
            ,secs_duration=run_duration/10000*3600
    +run_duration%10000/100*60
    +run_duration%100
    from msdb.dbo.sysjobhistory
    where step_id=0     --Job Outcome
        and run_status=1  --Succeeded
    )
    select job_id
    ,AvgDuration=avg(secs_duration*1.)
          ,AvgPlus2StDev=avg(secs_duration*1.)+2*stdevp(secs_duration)
    from JobHistData
    where date_executed>=dateadd(day,datediff(day,'19000101',@HistoryStartDate),'19000101')
      and date_executed<dateadd(day,1+datediff(day,'19000101',@HistoryEndDate),'19000101')
    group by job_id
    having count(*)>=@MinHistExecutions
    and avg(secs_duration*1.)>=@MinAvgSecsDuration
    /*
    job_id                               AvgDuration    AvgPlus2StDev
    ------------------------------------ ----------- ----------------
    ECA51518-7144-49DE-9153-EB12D42AE0D9   69.373333 108.739927181991 
    0BC1CB96-602F-417A-AB42-CAF98E1E39A1   47.266666  82.860423255389
    etc, etc, etc
    */
    
    And now that we have that, we can look for jobs in an “analysis window” date range (@AnalysisStartDate and @AnalysisEndDate)… perhaps the last 24 or 48 hours… to see which ones have a duration that exceeds the AvgPlus2StDev of the baseline.

    Here is a stored procedure to do just that.

    create procedure usp_SQLAgentJobOutliers
       @HistoryStartDate datetime  = null
      ,@HistoryEndDate datetime    = null
      ,@AnalysisStartDate datetime = null
      ,@AnalysisEndDate datetime   = null
      ,@MinHistExecutions int      = 10
    ,@MinAvgSecsDuration int     = 30
    as
     
    if @HistoryStartDate is null set @HistoryStartDate='19000101' ;
    if @HistoryEndDate is null set @HistoryEndDate=getdate() ;
    if @AnalysisStartDate is null set @HistoryStartDate='19000101' ;
    if @AnalysisEndDate is null set @HistoryEndDate=getdate() ;
     
    with JobHistData as
    (
      select job_id
    ,date_executed=convert(datetime,convert(varchar(8),run_date))
                                           +' '
                                           +stuff(stuff(right(1000000+run_time
    ,6)
                                                        ,3,0,':')
                                                  ,6,0,':')
            ,secs_duration=run_duration/10000*3600
    +run_duration%10000/100*60
    +run_duration%100
    from msdb.dbo.sysjobhistory
    where step_id=0     --Job Outcome
        and run_status=1  --Succeeded
    )
    ,JobHistStats as
    (
      select job_id
    ,AvgDuration=avg(secs_duration*1.)
            ,AvgPlus2StDev=avg(secs_duration*1.)+2*stdevp(secs_duration)
      from JobHistData
    where date_executed>=dateadd(day,datediff(day,'19000101',@HistoryStartDate),'19000101')
        and date_executed<dateadd(day,1+datediff(day,'19000101',@HistoryEndDate),'19000101')
      group by job_id
    having count(*)>=@MinHistExecutions
    and avg(secs_duration*1.)>=@MinAvgSecsDuration
    )
    select jd.job_id
    ,JobName=j.name
    ,ExecutionDate=jd.date_executed
    ,[Duration (secs)]=jd.secs_duration
    ,[Historical Avg Duration (secs)]=AvgDuration
    ,[Min Threshhold (secs)]=AvgPlus2StDev
    from JobHistData jd
    join JobHistStats jhs on jd.job_id=jhs.job_id
    join msdb.dbo.sysjobs j on jd.job_id=j.job_id
    where date_executed>=dateadd(day,datediff(day,'19000101',@AnalysisStartDate),'19000101')
      and date_executed<dateadd(day,1+datediff(day,'19000101',@AnalysisEndDate),'19000101')
      and secs_duration>AvgPlus2StDev
    
    If you don’t specify any parameters, it will use the ENTIRE sysjobhistory table for both the “history/baseline window” and the “analysis window”. Try calling the procedure with various parameters and see what you come up with. See if any jobs of yours have run with an atypically long duration.

    If you want to get more granular and analyze individual Job Steps as opposed to Jobs as a whole, the following procedure will do that for you.

    create procedure usp_SQLAgentJobStepOutliers
       @HistoryStartDate datetime  = null  
    ,@HistoryEndDate datetime    = null
      ,@AnalysisStartDate datetime = null
      ,@AnalysisEndDate datetime   = null
      ,@MinHistExecutions int      = 10
    ,@MinAvgSecsDuration int     = 30
    as
     
    if @HistoryStartDate is null set @HistoryStartDate='19000101' ;
    if @HistoryEndDate is null set @HistoryEndDate=getdate() ;
    if @AnalysisStartDate is null set @HistoryStartDate='19000101' ;
    if @AnalysisEndDate is null set @HistoryEndDate=getdate() ;
     
    with JobHistData as
    (
      select job_id
    ,step_id
    ,date_executed=convert(datetime,convert(varchar(8),run_date))
                                           +' '
                                           +stuff(stuff(right(1000000+run_time
    ,6)
                                                        ,3,0,':')
                                                  ,6,0,':')
            ,secs_duration=run_duration/10000*3600
    +run_duration%10000/100*60
    +run_duration%100
    from msdb.dbo.sysjobhistory
    where step_id<>0
    and run_status=1  --Succeeded
    )
    ,JobHistStats as
    (
      select job_id
    ,step_id
    ,AvgDuration=avg(secs_duration*1.)
            ,AvgPlus2StDev=avg(secs_duration*1.)+2*stdevp(secs_duration)
      from JobHistData
    where date_executed>=dateadd(day,datediff(day,'19000101',@HistoryStartDate),'19000101')
        and date_executed<dateadd(day,1+datediff(day,'19000101',@HistoryEndDate),'19000101')
      group by job_id
    ,step_id
    having count(*)>=@MinHistExecutions
    and avg(secs_duration*1.)>=@MinAvgSecsDuration
    )
    select jd.job_id
    ,JobName=j.name
    ,jd.step_id
    ,s.step_name
    ,ExecutionDate=jd.date_executed
    ,[Duration (secs)]=jd.secs_duration
    ,[Historical Avg Duration (secs)]=AvgDuration
    ,[Min Threshhold (secs)]=AvgPlus2StDev
    from JobHistData jd
    join JobHistStats jhs on jd.job_id=jhs.job_id
    join msdb.dbo.sysjobs j on jd.job_id=j.job_id
    join msdb.dbo.sysjobsteps s on jd.job_id=s.job_id and jd.step_id=s.step_id 
    where date_executed>=dateadd(day,datediff(day,'19000101',@AnalysisStartDate),'19000101')
      and date_executed<dateadd(day,1+datediff(day,'19000101',@AnalysisEndDate),'19000101')
      and secs_duration>AvgPlus2StDev
    
    However, a word of warning: I was disappointed to find that the sysjobhistory table stores only the Step_ID and not the Step_UID unique identifier. So a Job’s Step_ID #1 today may not have been its Step_ID #1 a month ago. If you move steps up/down in the step list of the SQL Agent Job dialog, it changes their Step_ID value from that point forward, but it doesn’t change anything in history. So you may be comparing apples and oranges.

    Anyway, I hope you find these procedures useful for finding those outliers. Thanks again to Thomas LaRock for the idea.

    My next blog post will also be dealing with SQL Agent Jobs… I think you’ll really like that one… Stay tuned!
  9. Shrink Your Databases Regularly

    Oh, no, no, no… Not that kind of shrinking! (But I did get your attention, didn’t I?)

    shrink \shringk\ n. : (slang) [short for headshrinker] a clinical psychiatrist or psychologist [Tony Soprano sees his shrink, Dr. Melfi, every week.] vt. : (slang) to psychoanalyze [Shrink your databases regularly.] shrank, shrunk


    A note to the reader: This is based on several true stories. Some events were changed or otherwise fictionalized for dramatic purposes.



    Shrink Your Databases RegularlyDr. Ben Adryl: Good afternoon. I’m Dr. Ben Adryl, but many people just refer to me by my initials DBA. And you are Mr. Tabase, I presume?

    Deigh Tabase: Yes, my first name is Deighton. Most people just call me Deigh for short.

    Dr. Ben Adryl (DBA): Welcome. Now how can I help you today?

    Deigh Tabase (Database): Well, I seem to have a lot of problems, and they’ve been getting worse with each passing day. It was suggested that I make an appointment with you. But I have to admit, I’ve never been to a shrink before.

    DBA: I would prefer that you not use the “s” word in this office. I’m not really a traditional psychotherapist anyway… I take more of a holistic approach and work with all aspects of a person’s well-being, like sanity, health, productivity. Think of me as a life coach. I’m kind of like Dr. Phil and Dr. Oz and David Allen all rolled into one.

    Database: That sounds great, doc. How do we start?

    DBA: Tell me a little bit about some of your problems.

    Database: Well, for one thing, I’ve been suffering from claustrophobia as long as I can remember. Plus I feel bloated and tired and overworked and I can’t seem to get organized. I guess I’m kind of a mess.

    DBA: Don’t worry, all is not lost. I’m sure I can help. Let me explain how this works. I’m going to ask you several questions, using a special technique to draw more information out of your subconscience. This technique is called Dynamic Management View Querying.

    Database: Will it hurt?

    DBA: Oh no, not at all. Based on what I find, I may be able to give you recommendations on how you can find relief, and, in some cases, I may be able to administer treatment immediately.

    Database: That would be wonderful, doc.

    DBA: So tell me more about yourself… Your age, your background, etc…

    select Created=convert(varchar(20),d.create_date,100)
          ,LastRead=isnull(convert(varchar(20),s.last_read_date,100),'')
          ,LastWrite=isnull(convert(varchar(20),s.last_write_date,100),'')
          ,CompLevel=d.compatibility_level
          ,RecovModel=d.recovery_model_desc
    ,LogReuseWait=d.log_reuse_wait_desc
    ,Collation=d.collation_name
    ,AutoStats=case 
    when d.is_auto_create_stats_on=1
    then 'Create '
                       else ''
                     end
                    +case
                       when d.is_auto_update_stats_on=1
    then 'Update '
                       else ''
                     end
                    +case
                       when d.is_auto_update_stats_async_on=1
    then 'Asynch '
                       else ''
                     end
          ,Parameterization=case
                              when d.is_parameterization_forced=1
    then 'Forced'
                              else 'Simple'
                            end
          ,UserAccess=d.user_access_desc
    ,[State]=d.state_desc
    from sys.databases d
    outer apply (select last_read_date
    =max(case
                                 when last_user_scan>=isnull(last_user_seek,'19000101')
                                  and last_user_scan>=isnull(last_user_lookup,'19000101')
                                 then last_user_scan
    when last_user_seek>=isnull(last_user_scan,'19000101')
                                  and last_user_seek>=isnull(last_user_lookup,'19000101')
                                 then last_user_seek
    else last_user_lookup
    end)
                       ,last_write_date=max(last_user_update)
                 from sys.dm_db_index_usage_stats
                 where database_id=d.database_id) s
    where database_id=db_id('Deighton')
    /*
    (Reformatted for clarity):
    Created............ Dec 18 2010  8:37PM
    LastRead........... Mar 29 2011 11.24AM
    LastWrite.......... Mar 29 2011 11:23AM
    CompLevel.......... 100
    RecovModel......... FULL
    LogReuseWait....... LOG_BACKUP
    Collation.......... SQL_Latin1_General_CP1_CI_AS
    AutoStats.......... Create Update
    Parameterization... Simple
    UserAccess......... MULTI_USER
    State.............. ONLINE
    */
    
    DBA: Hmmm… Interesting.

    Database: What is it, doc?

    DBA: Oh nothing… Don’t worry… I often mumble to myself as I do some of this analysis. Pay it no mind. Tell me a little more about yourself …

    use Deighton
    go
    select LogicalName=name
    ,[Type]=case when [type]=0 then 'Data' else 'Log' end
          ,SizeAllocated=convert(varchar(20),convert(decimal(12,2),size*8./1024))+'MB'
          ,SpaceUsed=convert(varchar(20),convert(decimal(12,2),UsedPages*8./1024))+'MB'
          ,[%Full]=convert(varchar(20),convert(decimal(12,1),UsedPages*100./size))+'%'
          ,SpaceLeft=convert(varchar(20),convert(decimal(12,2),(size-UsedPages)*8./1024))+'MB'
          ,[%Avail]=convert(varchar(20),convert(decimal(12,1),(size-UsedPages)*100./size))+'%'
          ,Growth=case
                    when is_percent_growth=1
    then convert(varchar(20),growth)+'%'
                    else convert(varchar(20),convert(decimal(12,2),growth*8./1024))+'MB'
                  end
          ,MaxSize=case
                     when max_size=-1
    then 'Unlimited'
                     else convert(varchar(20),convert(decimal(12,0),max_size*8./1024))+'MB'
                   end
          ,PhysicalFile=physical_name
    from sys.database_files 
    cross apply (select UsedPages=fileproperty(name,'SpaceUsed')) F_Used
    where [type] in (0,1)  --Rows,Log
    /*
    (Reformatted for clarity):
    Deighton_Data (Data):
      Size:   1756.00MB
      Used:   1755.00MB (99.9%)
      Avail:     1.00MB (0.1%)
      Growth:    1.00MB
      Max:    Unlimited
      File:   C:\Microsoft SQL Server\MSSQL.1\MSSQL\data\Deighton_Data.MDF
    Deighton_Log (Log):
      Size:  14198.81MB
      Used:  13745.87MB (96.8%)
      Avail:   452.95MB (3.2%)
      Growth:       10%
      Max:    Unlimited
      File:   C:\Microsoft SQL Server\MSSQL.1\MSSQL\data\Deighton_Log.LDF
    */
    
    DBA: Ahhh… I see… You mentioned earlier that you suffer from claustrophobia.

    Database: Yes, all the time. And often I have these anxiety or panic attacks where I kind of explode, and it makes me feel a little better, but not for long.

    DBA: Well, I can cure you of that immediately.

    Database: You’re kidding… Really?

    DBA: Yes. Now just hold still while I…

    /*
    The database felt claustrophobic because it was using up 99.9% of its allocated space.
    And its FILEGROWTH is only 1MB (the default), so every time it reached full capacity and
    had an anxiety attack, which was often, it only experienced "autogrowth" of a piddly 
    little 1MB, which might have made it feel relatively better for a short while, but it 
    would still be only 1MB away from being full capacity again and having another attack.  
    If the database's initial allocated size upon creation was 200MB, then it had over 1500 
    autogrowth anxiety attacks in its lifetime!
    So DBA bumped the allocated size up to 3500MB to relieve the claustrophobia and,
    at the same time, he changed the FILEGROWTH to 100MB. This was just a preventative
    measure, in case the database unexpectedly reached that 3500MB capacity.  But DBA will
    diligently monitor the used space and adjust sizes appropriately long before the
    database uses up that space.
    */
    alter database Deighton
    modify file (name=N'Deighton_Data'
                ,size=3500MB
    ,filegrowth=100MB)
    
    DBA: How does that feel?

    Database: Omigosh, doc! That’s amazing! I feel so free now! Yippee!

    DBA: I think I can take care of that bloating problem you mentioned earlier also. One moment…

    /*
    The database's transaction log was HUGE. It was about 14GB in size, filled up
    to 96.8% capacity.  That's compared to its actual data, which was only 1.7GB.
    Since the database has a Recovery Model of FULL, its transaction log will 
    continue to grow larger and larger forever... unless a transaction log backup
    is performed.  That log backup will truncate the contents of the transaction
    log.  Note that a full backup of the database WILL NOT truncate the transaction
    log automatically... Log backups must be performed in order to do that.
    */
    backup log Deighton
    to disk=N'C:\SomeBackupLocation\DeightonLog-yyyymmdd-hhmmss.trn'
    
    DBA: How’s that?

    Database: Much better, doc!

    DBA: Perhaps at another time I can make a further adjustment, though it is a procedure that I very rarely perform.

    Database: Okay.

    DBA: Now tell me about diet and exercise. Do you exercise regularly?

    Database: Well… I… er…

    /*
    Obtain the 5 most recent backups of each type performed.
    */
    with BackupDetail as
    (
      select BackupType
    ,backup_finish_date
    ,RowNum=row_number() over (partition by BackupType 
    order by backup_finish_date desc)
      from msdb.dbo.backupset 
    cross apply (select BackupType=case [type]
    when 'D' then 'Full'
                                       when 'I' then 'Differential'
                                       when 'L' then 'Log'
                                       when 'F' then 'File'
                                       when 'G' then 'Differential File'
                                       when 'P' then 'Partial'
                                       when 'Q' then 'Differential Partial'
                                       else 'N/A'
                                     end) F_Type
    where database_name='Deighton'
    )
    select BackupType
    ,backup_finish_date
    from BackupDetail
    where RowNum<=5
    order by BackupType
    ,RowNum
    /*
    BackupType backup_finish_date
    ---------- -----------------------
    Full       2011-03-20 23:01:33.000
    Full       2011-03-04 22:19:52.000
    Full       2011-02-24 22:43:39.000
    Full       2011-02-19 22:24:03.000
    Full       2011-02-11 23:11:42.000
    Log        2011-03-29 11:35:31.000  <==This is the log backup just performed
    */
    
    DBA: Hmmm… It seems pretty sporadic.

    Database: Yeah, well…

    DBA: Listen, Deighton, this is very important. I know you’re the kind of guy who wants to live life to the FULLest. In order to really do that, you have to do regular exercise and have a healthy diet. Look at it this way… If something happens to you and you get very sick or are in a bad accident, you want to get better quickly and RESTORE yourself to FULL health, don’t you?

    Database: Yes.

    DBA: You don’t want to be a SIMPLEton and only be restored to a fraction of yourself, do you?

    Database: No, I suppose not.

    DBA: Then we have to get you into an exercise regimen [full and differential backups] and you should be drinking water regularly during the day [log backups] to help flush out your system. I know it seems like a pain, but it can be really easy and doesn’t take all that much time. To get you motivated, I can put you in touch with a personal trainer named C. Quill Agent who will stay on top of these things for you, making sure that you do them regularly.

    Database: Thanks, doc.

    DBA: Don’t mention it. Now, on to other things… You mentioned overwork and disorganization?

    Database: Yeah. I feel like I’m doing unnecessary work at my place of business. I’m trying to remedy this by using a strategy that I saw in a self-help book, but it doesn’t seem to work. My workflow still seems incredibly inefficient, and I just feel kind of scatter-brained, if you know what I mean.

    DBA: Yes, I think I have an idea of what you’re talking about. Tell me about your methods of organization…

    use Deighton
    go
    select TableName=object_name(dm.object_id)
          ,IndexName=i.name
    ,IndexType=dm.index_type_desc
    ,[%Fragmented]=avg_fragmentation_in_percent
    from sys.dm_db_index_physical_stats(db_id(),null,null,null,'sampled') dm
    join sys.indexes i on dm.object_id=i.object_id and dm.index_id=i.index_id 
    order by avg_fragmentation_in_percent desc
    /*
    (Names of Tables and Indexes Disguised):
    TableName            IndexName               IndexType               %Fragmented
    -------------------- ----------------------- ------------------ ----------------
    qufi_efferhmiqfz     rhaqwqamxim             NONCLUSTERED INDEX 99.9925228054434
    qufi_efferhmiqfz     foliqemi                NONCLUSTERED INDEX 99.9788779993241
    qufiz                zuamriwiy               NONCLUSTERED INDEX 99.9643747773424
    qufi_efferhmiqfz     qufioh                  NONCLUSTERED INDEX 99.9625355911884
    qufiz                qufihefifomi            NONCLUSTERED INDEX 99.9515738498789
    qufiz                rmiefihhefi             NONCLUSTERED INDEX 99.9515503875969
    qufiz                lezfaphefi              NONCLUSTERED INDEX 99.9515503875969
    qufiz                erfogofy_zfemfhefi      NONCLUSTERED INDEX 99.9481058640374
    qufiz                erfogofy_rumplifihhefi  NONCLUSTERED INDEX 99.9481058640374
    qufiz                miplerihwofhoh          NONCLUSTERED INDEX 99.9339498018494
    qufiz                zuamrioh                NONCLUSTERED INDEX 99.9339061467284
    qufiz                hilifih                 NONCLUSTERED INDEX 99.9093381686310
    qufiz                umpheq                  NONCLUSTERED INDEX 99.9092558983666
    fuhu                 PK_fuhu                 CLUSTERED INDEX    99.8109640831758
    xmuwimz              PK_xmuwimz              CLUSTERED INDEX    99.7950819672131
    xmuwimz              hxe                     NONCLUSTERED INDEX 97.4358974358974
    qufiz                PK_qufiz                CLUSTERED INDEX    97.2856847758728
    xmuwimz              lezfqemi                NONCLUSTERED INDEX 97.2222222222222
    hogozouqz            PK_hogozouqz            CLUSTERED INDEX    97.0588235294118
    xmuwimz              fomzfqemi               NONCLUSTERED INDEX 96.9696969696970
    xmuwimz              reloriqziqamxim         NONCLUSTERED INDEX 96.9696969696970
    xmuwimz              zzq                     NONCLUSTERED INDEX 95.8333333333333
    xmuwimz              mzmoh                   NONCLUSTERED INDEX 95.4545454545455
    xmuwimz              fizfmirumh              NONCLUSTERED INDEX 95.4545454545455
    ruqferfzlezferrizzih PK_ruqferfzlezferrizzih CLUSTERED INDEX    93.7500000000000
    hogozouqz            hogozouqqemi            NONCLUSTERED INDEX 92.8571428571429
    xmuwimzlezferrizzih  PK_xmuwimzlezferrizzih  CLUSTERED INDEX    90.9090909090909
    ...And so on and so on
    */
    
    DBA: My goodness, I can see why you’re feeling overworked… You are scattered and fragmented. What is the strategy that you said you’re trying to alleviate this?

    Database: I wrote it down. Here it is… I try to do it often:

    use Deighton
    go
     
    /*
    Rebuild all Indexes and Statistics in the database
    */
    declare @Table nvarchar(255)
           ,@Sql nvarchar(255)
     
    declare TableCursor cursor
    for
    select quotename(table_schema)+'.'+quotename(table_name)
    from information_schema.tables
    where table_type='BASE TABLE'
     
    open TableCursor
    while 1=1
    begin
      fetch next from TableCursor into @Table
    if @@fetch_status<>0 break
      set @Sql='alter index all on '+@Table+' rebuild'
      exec (@Sql)
      set @Sql='update statistics '+@Table
    exec (@Sql)
    end
     
    close TableCursor
    deallocate TableCursor
    
    /*
    Reclaim disk space by shrinking
    */
    dbcc shrinkdatabase(N'Deighton',10)
    dbcc shrinkfile(N'Deighton_Data',10)
    dbcc shrinkfile(N'Deighton_Log',10)
    
    DBA: Oh my goodness, I definitely see the problem here. And I’ll bet you feel exhausted after doing this, am I right?

    Database: Yeah, you got that right.

    DBA: Hold on… Let me write down a few notes…

    /*
    The following will rebuild all indexes from scratch, even though that might not
    be necessary.  Indexes should be treated on a case-by-case bses.  An index should
    be REBUILT only when it has high fragmentation, REORGANIZEd if the fragmentation is 
    moderate, and not touched at all if the fragmentation is very low.  
    */
    set @Sql='alter index all on '+@Table+' rebuild'
    exec (@Sql)
    /*
    By default, statistics are auto-updated by the system when a table has been
    changed by a certain amount.  If a table has not changed at all, then there's no
    reason to update its statistics.  What's worse with the command below is that
    it is NOT updating the statistics WITH FULLSCAN.  The Index Rebuild above
    automatically rebuilt statistics for indexed columns "for free" (using FULLSCAN), 
    but the command below will just end up re-updating them based on only a
    SAMPLE, so it's duplicating the work of the Index Rebuild and creating less
    accurate statistics at the same time!  Ideally the command below should be doing
    UPDATE STATISTICS ... WITH FULLSCAN, COLUMNS so that it only updates non-indexed
    column statistics.  
    */
    set @Sql='update statistics '+@Table
    exec (@Sql)
     
    /*
    Oh, horror of horrors!  SHRINKing a database will just end up completely fragmenting
    up all the indexes that had been painstakingly rebuilt above!  This is like shooting
    yourself in the foot.  And it's essentially being done twice!  A SHRINKDATABASE 
    automatically shrinks each of its data and log files, so the two SHRINKFILE commands
    are just duplicating the process.
    */
    dbcc shrinkdatabase(N'Deighton',10)
    dbcc shrinkfile(N'Deighton_Data',10)
    dbcc shrinkfile(N'Deighton_Log',10)
    
    DBA: Okay, listen to me carefully. You must stop doing this immediately and never do it again! The first part is generally okay, except you’re spending way too much time and effort in redoing EVERYTHING from scratch. Contact a colleague of mine named Ola Hallengren… he has a much more intelligent approach to doing all of this. But the second part is a huge giant no-no, because it’s destroying everything you painstakingly did in the first part. It’s like you’ve arranged your papers on your desk in nice neat stacks and then a hurricane comes into the room and blows them all over the place. After our session, please read this article by Paul Randal on why you should not be doing this.

    Database: Okay, whatever you say, doc.

    DBA: It’s also possible that some of the stuff you’re organizing may be doing you more harm than good…

    /*
    Produce a list of all nonclustered, non-primary-key indexes in the database that
    deal with more than 5000 rows.  Compare their Reads vs Writes.  The list is sorted
    by Read/Write ratio.  Focus on the indexes toward the top of the list with a
    Read/Write ratio of under 1.00... They are candidates for DROPping from the database.
    */
    use Deighton
    go
    select TableName=o.Name
    ,IndexName=i.Name
    ,Rows
          ,Reads
    ,Writes
    ,[Reads/Write]
    ,Seeks=User_Seeks
    ,Scans=User_Scans
    ,Lookups=User_Lookups
    ,Definition='('+IndexColumns+')'+coalesce(' include ('+IncludeColumns+')','')
          ,DropIt='drop index '+quotename(i.Name)
                 +' on '+quotename(c.name)+'.'+quotename(object_name(s.object_id))
    from sys.dm_db_index_usage_stats s  
    join sys.indexes i ON s.object_id=i.object_id and s.index_id=i.index_id 
    join sys.objects o on s.object_id=o.object_id
    join sys.schemas c on o.schema_id=c.schema_id
    cross apply (select Rows=sum(p.Rows)
                 from sys.partitions p
    where object_id=s.object_id and index_id=s.index_id) F_Rows
    cross apply (select Reads=User_Seeks+User_Scans+User_Lookups
    ,Writes=User_Updates) F_RW 
    cross apply (select [Reads/Write]=cast(case
                                             when Writes<1
    then 100
    else 1.*Reads/Writes
    end as decimal(12,3))) F_RW2
    cross apply (select IndexColumns
    =stuff(
                             (select ','+c.Name
    +case ic.Is_Descending_Key 
    when 1 
    then ' DESC' 
    else '' 
    end
                              from sys.index_columns ic
    join sys.columns c on ic.Object_ID=c.Object_ID 
    and ic.Column_ID=c.Column_ID
    where ic.Object_ID=i.Object_ID 
    and ic.Index_ID=i.Index_ID
    and ic.Is_Included_Column=0
    order by ic.Index_Column_ID
    for xml path(''))
                           ,1,1,'')
                       ,IncludeColumns
    =stuff(
                             (select ','+c.Name
    from sys.index_columns ic
    join sys.columns c on ic.Object_ID=c.Object_ID 
    and ic.Column_ID=c.Column_ID
    where ic.Object_ID=i.Object_ID 
    and ic.Index_ID=i.Index_ID
    and ic.Is_Included_Column=1
    order by ic.Index_Column_ID
    for xml path(''))
                           ,1,1,'')) F_IC
    where s.database_id=db_id()
      and objectproperty(s.object_id,'IsUserTable')=1
    and i.type_desc='NONCLUSTERED'
      and i.is_primary_key=0
    and i.is_unique_constraint=0
    and Rows>5000
    order by [Reads/Write]
    ,Reads
    ,Writes
    /*
    (Names of Tables and Indexes Disguised):
    (Output abbreviated):
    TableName              IndexName                                Reads Writes Reads/Write
    ---------------------- ---------------------------------------- ----- ------ -----------
    eahoffmeol             eahoffmoel_rheqgilugmezfimoh                 0   1346       0.000
    ehhmizziz              ehhmizziz_ehhmizzfypi                        0   1880       0.000
    rheqgilug              rheqgilug_gluxelazimoh                       0  42235       0.000
    zaxlihgim              zaxlihgim_gloh                               0  62264       0.000
    rheqgilug_hifeolih     rheqgilug_hifeolih_rheqgilug_mezfimoh        1  37584       0.000
    zaxlihgim              zaxlihgim_miruqrolexli_ehjazfmiqfoh         82  62292       0.001
    zaxlihgim              zaxlihgim_rezhmiriopf_hifeolzoh            113  62323       0.002
    zaxlihgim              zaxlihgim_oqguorioh                        323  62641       0.005
    zaxlihgim              zaxlihgim_hozxamzexliofimoh                732  62996       0.012
    phuqiz                 phuqiz_phuqi                                75   2662       0.028
    eahoffmeol             eahoffmoel_zuamriwiy_zuamrioh               44   1346       0.033
    rheqgilug_mezfim       rheqgilugmezfim_zizzouqoh                  491   6759       0.073
    oqguori_xollexliofimz  oqguori_xollexliofimz_pulory_pleq_oh      1907   5440       0.351
    miruqrolexli_loqiofimz miruqrolexli_loqiofimz_fypi              13723  19833       0.692
    oqguori_xollexliofimz  oqguori_xollexliofimz_oqguorioh          11240   5440       2.066
    oqguori_xollexliofimz  oqguori_xollexliofimz_zuamriwiy_zuamrioh 11465   5117       2.241
    phuqiz                 phuqiz_zuamriwiy_zuamrioh                 8303   3476       2.389
    miruqrolexli_loqiofimz fw_miruqrolexli_loqiofimz_fypi           75078  24978       3.006
    rheqgilug_mezfim       rheqgilugmezfim_zuamriwiy_zuamrioh       38364   6759       5.676
    ehhmizziz              ehhmizziz_zuamriwiy_zuamrioh             21305   2959       7.200
    zopruhiz               rofy                                      2572      0     100.000
    zopruhiz               zfefi                                     2572      0     100.000
    zopruhiz               zopruhi                                  27203      0     100.000
    */
    
    DBA: See the items at the top? Those are cases where you’re spending much more time filing items away than you are acquiring them later. For example, in the fourth entry in the list, you’ve spent time and energy filing 62,264 items but you’ve never once accessed one of them after the fact. Organizing your information in that way is most likely more work than it’s worth, since you aren’t really taking advantage of it anyway.

    Database: Hmmm… I see what you mean, doc.

    DBA: Well, it looks like we’re out of time for today. Even though we started to accomplish a lot today, we’ve just scratched the surface, so I’d like to schedule you for regular visits so that we can monitor you and administer some more treatments as necessary.

    Database: Er… How long will we be doing this monitoring, doc?

    DBA: Why, the rest of your life of course. Here’s my bill.

    Database: Let’s see… WHAT?! That’s outrageous! I don’t have that kind of money! And you want to continue seeing me and billing me ridiculous amounts like this for the rest of my life? Are you insane? Just who the hell do you think you are? You high-and-mighty types make me want to vomit. You sit there on your high horse and think you know everything. Well, you know what you can do with this here bill? You can take it and just shove---

    drop database Deighton
    
    DBA: (Sigh). Another one bites the dust. Oh well… Ms. Foreachdb, would you please send in my next patient?
  10. DMV/DMF Info Just A Couple Clicks Away

    If you’re not reading Kendra Little’s blog, then you’re really missing out. It’s full of terrific technical information, but it’s her artwork and sense of humor that really make it special.

    This morning, Kendra wrote a really cool post called Dynamic Management Quickie: Exploring SQL Server’s System Views and Functions As You Work.

    It included a query that will help you find system views and/or functions that have a column name containing a certain search string. For example, to find stuff that has to do with CPU, you can execute the following query (which I rewrote/reformatted in Schulzified style):

    select SysObjName
    ,o.type_desc
    ,ColumnName=c.name
    from sys.system_columns c
    join sys.system_objects o on c.object_id=o.object_id
    cross apply (select SysObjName=schema_name(schema_id)+'.'+o.name) F_SON
    where c.name like '%cpu%'  --Enter Search String Here
    order by SysObjName
    /*
    SysObjName            type_desc                         ColumnName
    --------------------- --------------------------------- ---------------
    sys.dm_exec_requests  VIEW                              cpu_time
    sys.dm_exec_sessions  VIEW                              cpu_time
    sys.dm_os_schedulers  VIEW                              cpu_id
    sys.dm_os_sys_info    VIEW                              cpu_count
    sys.dm_os_sys_info    VIEW                              cpu_ticks
    sys.dm_os_sys_info    VIEW                              cpu_ticks_in_ms
    sys.fn_trace_gettable SQL_INLINE_TABLE_VALUED_FUNCTION  CPU
    sys.syslogins         VIEW                              totcpu
    sys.sysprocesses      VIEW                              cpu
    */
    
    This is really helpful, but Kendra went one step further and posted another query which added a column containing a URL so that you could open a browser window (CTRL+ALT+R) and paste the URL in to get more information on the subject.

    I thought this was a great idea, but, lazy person that I am, I wanted to cut down on the number of clicks and keystrokes that I had to do in order to get to the browser window... and I wanted all the information there so it was just one click away.

    So I came up with the following query:

    with SysObjInfo as
    (
      select SysObjInfo=(select nchar(13)
                               +N'System Object:    '+SysObjName+nchar(13)
                               +N'Type of Object:   '+o.type_desc+nchar(13)
                               +N'Column Name:      '+c.name+nchar(13)
                               +N'URL (CTRL+Click): '
                               +N'http://social.msdn.microsoft.com/'
                               +N'Search/en-US/?Refinement=117&Query='
                               +SysObjName
    +nchar(13)
                         from sys.system_columns c
    join sys.system_objects o on c.object_id=o.object_id
                         cross apply (select SysObjName=schema_name(schema_id)+'.'+o.name) F
    where c.name like '%cpu%'  --Enter Search String Here
                         order by SysObjName
    for xml path(''),type).value('.','nvarchar(max)')
    )
    select LinkToSysObjInfo
    =(select [processing-instruction(q)]=N':'+nchar(13)+SysObjInfo+nchar(13)                                 
    from SysObjInfo
    for xml path(''),type)
    
    You can find the explanation for the processing-instruction directive and all the other XML stuff in an article I wrote last June called Hyperlinks To T-SQL Code.

    When I execute that query in SSMS, it gives me a single hyperlink…

    System Object Query Result in SSMS

    When I click on that hyperlink, it opens an XML window with a list of all the views/functions that have to do with CPU…

    System Object Query Result in XML Window

    If I see one that interests me and I want to learn more, I just CTRL+Click on the URL and voila! There’s a browser window with MSDN Search giving me links to the view/function:

    System Object Information in Browser Window

    So now all the information you want on views/functions on a certain subject are just a couple clicks away.

    Thanks again to Kendra for the original idea.

    Now do yourself a favor and add her blog to your reader right now. Then sit back in your easy chair by the fire and enjoy catching up on her posts… Make yourself comfortable because you won’t be able to stop.
  1. 1
  2. Next ›
  3. Last »