Thursday, March 25, 2010

OVER Exposed

The introduction of the OVER clause in SQL Server 2005 gave us some terrific new capabilities in ranking data (via ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()) and in aggregating data (COUNT(), SUM(), AVG(), etc.). In this blog entry, I specifically want to talk about the OVER clause with respect to aggregates.
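Before we dig in, here’s a quick syntax refresher (just a minimal sketch of my own against SalesOrderDetail): the ranking functions require an ORDER BY inside the OVER clause, while the aggregate flavor in SQL Server 2005/2008 accepts only a PARTITION BY:

select SalesOrderID
,ProductID
,UnitPrice
,PriceRank=rank() over (partition by SalesOrderID order by UnitPrice desc) /* ranking: ORDER BY required */
,OrderAvg=avg(UnitPrice) over (partition by SalesOrderID) /* aggregate: PARTITION BY only */
from AdventureWorks.Sales.SalesOrderDetail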

OVER Clause Overview

Let’s take a look at AdventureWorks Sales Order data in the United States (which is represented by TerritoryID’s 1 thru 5):

select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where TerritoryID between 1 and 5
order by SalesOrderID
,ProductID
/*
SalesOrderID CustomerID ProductID UnitPrice
------------ ---------- --------- ---------
43659 676 709 5.70
43659 676 711 20.1865
43659 676 712 5.1865
43659 676 714 28.8404
43659 676 716 28.8404
43659 676 771 2039.994
43659 676 772 2039.994
...
(60153 Rows Total)
*/
Thanks to the OVER clause, we can effortlessly incorporate a column into that query that shows us each ProductID’s Average UnitPrice in that result:

select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
,AvgPrice=avg(UnitPrice) over (partition by ProductID)
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where TerritoryID between 1 and 5
order by SalesOrderID
,ProductID
/*
SalesOrderID CustomerID ProductID UnitPrice AvgPrice
------------ ---------- --------- --------- ---------
43659 676 709 5.70 5.6525
43659 676 711 20.1865 28.2153
43659 676 712 5.1865 7.0082
43659 676 714 28.8404 34.4369
43659 676 716 28.8404 35.1093
43659 676 771 2039.994 2085.9938
43659 676 772 2039.994 2043.917
...
(60153 Rows Total)
*/
Let’s add one additional column that shows the UnitPrice’s Percent of the Average:

select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
,AvgPrice=avg(UnitPrice) over (partition by ProductID)
,PercentOfAvg=UnitPrice/avg(UnitPrice) over (partition by ProductID)*100
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where TerritoryID between 1 and 5
order by SalesOrderID
,ProductID
/*
SalesOrderID CustomerID ProductID UnitPrice AvgPrice PercentOfAvg
------------ ---------- --------- --------- --------- ------------
43659 676 709 5.70 5.6525 100.84
43659 676 711 20.1865 28.2153 71.54
43659 676 712 5.1865 7.0082 74.00
43659 676 714 28.8404 34.4369 83.74
43659 676 716 28.8404 35.1093 82.14
43659 676 771 2039.994 2085.9938 97.79
43659 676 772 2039.994 2043.917 99.80
...
(60153 Rows Total)
*/
And finally, let’s just look at the 50 Line Items that had the lowest Percentages by changing our ORDER BY and adding a TOP 50 to the query:

select 
top 50 sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
,AvgPrice=avg(UnitPrice) over (partition by ProductID)
,PercentOfAvg=UnitPrice/avg(UnitPrice) over (partition by ProductID)*100
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where TerritoryID between 1 and 5
order by PercentOfAvg
,SalesOrderID
,ProductID
/*
SalesOrderID CustomerID ProductID UnitPrice AvgPrice PercentOfAvg
------------ ---------- --------- --------- -------- ------------
69408 236 987 112.998 337.3797 33.49
69413 43 987 112.998 337.3797 33.49
69418 381 987 112.998 337.3797 33.49
69442 233 987 112.998 337.3797 33.49
...
69418 381 984 112.998 330.3681 34.20
69422 650 984 112.998 330.3681 34.20
69452 9 984 112.998 330.3681 34.20
69468 237 984 112.998 330.3681 34.20
(50 Rows Total)
*/
Well, that was easy. The OVER clause made our code nice and compact. If we didn’t have the OVER clause available to us, we would have had to write the query like the one below, JOINing our source data to a derived table where we computed the aggregate via a GROUP BY clause:

;with SourceData as
(
select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where soh.TerritoryID between 1 and 5
)
select
top 50 SalesOrderID
,CustomerID
,S.ProductID
,UnitPrice
,AvgPrice
,PercentOfAvg=UnitPrice/AvgPrice*100
from SourceData S
join (select ProductID
,AvgPrice=avg(UnitPrice)
from SourceData
group by ProductID) G on S.ProductID=G.ProductID
order by PercentOfAvg
,SalesOrderID
,ProductID
/*
SalesOrderID CustomerID ProductID UnitPrice AvgPrice PercentOfAvg
------------ ---------- --------- --------- -------- ------------
69408 236 987 112.998 337.3797 33.49
69413 43 987 112.998 337.3797 33.49
69418 381 987 112.998 337.3797 33.49
69442 233 987 112.998 337.3797 33.49
...
69418 381 984 112.998 330.3681 34.20
69422 650 984 112.998 330.3681 34.20
69452 9 984 112.998 330.3681 34.20
69468 237 984 112.998 330.3681 34.20
(50 Rows Total)
*/
That JOIN/GROUP BY approach took a lot more code and it looks like a lot of work, mainly because it has to scan the SalesOrderHeader and SalesOrderDetail tables twice, as we can see in the query plan:

JOIN/GROUP BY Query Plan

The query plan of the OVER Clause query, on the other hand, only scans those tables once:

OVER Query Plan

So it must be a lot more efficient, right?

Wrong.

Let’s take a look at the statistics for the two queries, collected by Profiler (with Query Results Discarded so that we are purely looking at processing time of the query and not the time to render the output):

/*
Description Reads Writes CPU Duration
---------------------------------------------------
OVER Clause 125801 2 617 1406
JOIN with GROUP BY 3916 0 308 1176
*/
The OVER clause was a real pig as far as the Number of Reads is concerned. And it had higher CPU and Duration than the JOIN approach.
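By the way, if you don’t have Profiler handy, you can collect comparable numbers yourself with SET STATISTICS (a sketch below; the logical reads and CPU figures reported in the Messages tab won’t exactly match Profiler’s counters, but the relative difference between the two queries shows up just the same):

set statistics io on;
set statistics time on;
/* ...run the OVER query and the JOIN/GROUP BY query here,
   then compare the logical reads and CPU time in the Messages tab... */
set statistics io off;
set statistics time off;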

The OVER Clause Query Plan In Depth

Let’s take a closer look at the OVER clause actual execution plan once more, where I’ve added some annotations (please click on the image below to see a larger version in another browser window):

OVER Query Plan With Annotations

Looking at the outer loop (operators #5 thru #10), the rows that come from the merged table scans (#8, #9, #10) get sorted (#7) by ProductID. The Segment operator (#6) receives those rows and feeds them, groups (or segments) at a time, to the Table Spool operator (#5).

As the Table Spool (#5) receives those rows in a group, it saves a copy of the rows and then sends a single row to the Nested Loops operator (#4), indicating that it has done its job of saving the group’s rows.

Now that the Nested Loops operator (#4) has received its outer input, it calls for the inner input, which will come from operators #11 thru #15. The two other table spools in the plan (#14 and #15) are the same spool as #5, but they are replaying the copies of the rows that #5 had saved. The Stream Aggregate (#13) calculates the SUM(UnitPrice) and COUNT(*) and the Compute Scalar (#12) divides those two values to come up with the AVG(UnitPrice). This single value is JOINed (via the #11 Nested Loops operator) with the rows replayed by the Spool #15.

And then the Spool (#5) truncates its saved data and gets to work on the next group that it receives from the Segment operator (#6). And so on and so on, for each fresh group of ProductID's.

So, in a logical sense, this plan is also doing a GROUP BY and a JOIN, except it’s doing it with a single scan of the source data. However, to accomplish that, it has to do a triple scan of the spooled data, thus producing those hundreds of thousands of reads.
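To make that concrete, here is a rough sketch (my own illustration, not something you’ll find in the plan itself) of the per-group arithmetic that the Stream Aggregate (#13) and Compute Scalar (#12) perform on each batch of spooled rows:

/* For one spooled ProductID group: the Stream Aggregate computes the
   SUM and COUNT, and the Compute Scalar derives AVG as SUM divided by COUNT. */
select ProductID
,SumPrice=sum(UnitPrice)
,RowCnt=count(*)
,AvgPrice=sum(UnitPrice)/count(*)
from AdventureWorks.Sales.SalesOrderDetail
where ProductID=709 /* one group at a time, the way the spool replays them */
group by ProductID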

Memory Usage

If you hover your mouse over the SELECT operator (#1) in the actual query plan of our OVER Clause query, you can see that it requests a Memory Grant of 5288 Kbytes:

OVER Query's Memory Grant

By comparison, our traditional JOIN/GROUP BY query only requests 1544 Kbytes.

In the case of the JOIN/GROUP BY query, the memory grant is for the “Top N” Sort and the two Hashing operators. As Adam Haines demonstrated in a recent blog entry, a TOP value of 100 or less does not require that much memory. It would only request 1024K. The additional memory required for the Hashes brought its request up to 1544K.

The OVER query, though, is a different story. It also does a “Top N” Sort (#2), but, in addition, it does a full Sort (#7) of the source data, which requires a lot more memory than a “Top N” Sort. Thus the much larger memory grant request.

Things get a lot worse as we increase the size of our result set. Let’s add a lot more columns to our output, increasing the row size that the Sort operator must process from 27 bytes to 402 bytes…

select 
top 50 sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
,AvgPrice=avg(UnitPrice) over (partition by ProductID)
,PercentOfAvg=UnitPrice/avg(UnitPrice) over (partition by ProductID)*100
/* Let's add a bunch of other columns */
,soh.AccountNumber,soh.BillToAddressID,soh.Comment,soh.ContactID
,soh.CreditCardApprovalCode,soh.CreditCardID,soh.CurrencyRateID,soh.DueDate
,soh.Freight,soh.ModifiedDate,soh.OnlineOrderFlag,soh.OrderDate
,soh.PurchaseOrderNumber,soh.RevisionNumber,soh.rowguid,soh.SalesOrderNumber
,soh.SalesPersonID,soh.ShipDate,soh.ShipMethodID,soh.ShipToAddressID
,soh.SubTotal,soh.TaxAmt,soh.TerritoryID,soh.TotalDue,sod.CarrierTrackingNumber
,sod.OrderQty,sod.SalesOrderDetailID,sod.SpecialOfferID,sod.UnitPriceDiscount
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where TerritoryID between 1 and 5
order by PercentOfAvg
,SalesOrderID
,ProductID
For this, the Memory Grant request balloons to 25,264K. If we add all those extra columns to our traditional JOIN/GROUP BY query, its Memory Grant request still remains at 1544K.
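If you’d like to watch these grants on your own system without hovering over the plan, one option (a sketch; it requires VIEW SERVER STATE permission) is to peek at the sys.dm_exec_query_memory_grants DMV from a second session while the query is running:

/* Run from another session while the query above is executing. */
select session_id
,requested_memory_kb
,granted_memory_kb
,ideal_memory_kb
from sys.dm_exec_query_memory_grants
order by requested_memory_kb desc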

Collecting Profiler statistics for the two queries with all those extra columns shows more interesting information:

/*
Description Reads Writes CPU Duration
---------------------------------------------------
OVER Clause 144936 790 1696 2125
JOIN with GROUP BY 3916 0 850 1725
*/
Note all the extra writes that were required for the OVER query. The rows we needed to sort were too big to handle completely in memory, so the sort spilled over into TempDB, writing 790 pages worth of data.

The lesson here is that the JOIN/GROUP BY approach is faster and more efficient and less of a memory hog if you’re processing a large number of rows and/or if your rows are wide.

For our JOIN/GROUP BY query, the optimizer has the flexibility to choose from a variety of plans to come up with the most cost-effective approach. In our examples, the optimizer determined that it was more cost-effective to use a Hash Aggregation and a Hash Join as opposed to sorting the data. The Hash operators are extremely efficient in handling large gobs of data… and that data does not have to be sorted. The OVER query, on the other hand, requires the data to be sorted, because it always uses a Segment operator (#6) and a Stream Aggregate operator (#13) which only work on pre-sorted data.
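If you want to experiment with that flexibility yourself, the GROUP BY query hints will nudge the optimizer one way or the other (a sketch below; forcing a strategy like this is for experimentation, not something I’d leave in production code):

/* ORDER GROUP forces a Sort + Stream Aggregate (OVER-style) plan;
   swap in HASH GROUP to force Hash aggregation instead. */
select ProductID
,AvgPrice=avg(UnitPrice)
from AdventureWorks.Sales.SalesOrderDetail
group by ProductID
option (order group)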

Taking the SORT out of the Equation

Okay, so let’s do another experiment. Let’s see how our two approaches work on data that is pre-sorted. Just for kicks, instead of PARTITIONing BY (or GROUPing BY) ProductID, let’s do it by SalesOrderID. Since both the SalesOrderHeader and SalesOrderDetail tables have a clustered index on SalesOrderID, the system can scan both tables in SalesOrderID order and use Merge Joins to join them together, and we won’t require any sorts at all. And let’s also take out our TOP 50 (and the ORDER BY) so that no “Top N” Sort has to be performed on either query. So neither query will require a memory grant at all, or any expensive Sort operations. Here are the revised queries:

/* OVER Approach */
select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
/* Let's add a bunch of other columns */
,soh.AccountNumber,soh.BillToAddressID,soh.Comment,soh.ContactID
,soh.CreditCardApprovalCode,soh.CreditCardID,soh.CurrencyRateID,soh.DueDate
,soh.Freight,soh.ModifiedDate,soh.OnlineOrderFlag,soh.OrderDate
,soh.PurchaseOrderNumber,soh.RevisionNumber,soh.rowguid,soh.SalesOrderNumber
,soh.SalesPersonID,soh.ShipDate,soh.ShipMethodID,soh.ShipToAddressID
,soh.SubTotal,soh.TaxAmt,soh.TerritoryID,soh.TotalDue,sod.CarrierTrackingNumber
,sod.OrderQty,sod.SalesOrderDetailID,sod.SpecialOfferID,sod.UnitPriceDiscount
,AvgPrice=avg(UnitPrice) over (partition by sod.SalesOrderID)
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where TerritoryID between 1 and 5

/* JOIN/GROUP BY Approach */
;with SourceData as
(
select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
/* Let's add a bunch of other columns */
,soh.AccountNumber,soh.BillToAddressID,soh.Comment,soh.ContactID
,soh.CreditCardApprovalCode,soh.CreditCardID,soh.CurrencyRateID,soh.DueDate
,soh.Freight,soh.ModifiedDate,soh.OnlineOrderFlag,soh.OrderDate
,soh.PurchaseOrderNumber,soh.RevisionNumber,soh.rowguid,soh.SalesOrderNumber
,soh.SalesPersonID,soh.ShipDate,soh.ShipMethodID,soh.ShipToAddressID
,soh.SubTotal,soh.TaxAmt,soh.TerritoryID,soh.TotalDue,sod.CarrierTrackingNumber
,sod.OrderQty,sod.SalesOrderDetailID,sod.SpecialOfferID,sod.UnitPriceDiscount
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where soh.TerritoryID between 1 and 5
)
select S.*
,AvgPrice
from SourceData S
join (select SalesOrderID
,AvgPrice=avg(UnitPrice)
from SourceData S
group by SalesOrderID) G on S.SalesOrderID=G.SalesOrderID
Now let’s compare the performance of those two approaches:

/*
Description Reads Writes CPU Duration
---------------------------------------------------
OVER Clause 192017 0 1445 1945
JOIN with GROUP BY 3914 0 683 1184
*/
The JOIN/GROUP BY query still outperforms the OVER query in every respect. The OVER query still had to do truckloads of Reads, because of the Spooling. Note that the reads this time were even higher than they were in our original example, because there are more Groups of SalesOrderID’s (12041) than there are Groups of ProductID’s (265), so the Spools worked overtime, saving and replaying all those extra groups of data.
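You can verify those group counts for yourself with a quick query (a sketch):

/* How many groups do the spools have to save and replay under each PARTITION BY choice? */
select GroupsByProduct=count(distinct sod.ProductID)
,GroupsBySalesOrder=count(distinct sod.SalesOrderID)
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where soh.TerritoryID between 1 and 5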

So it seems that the OVER clause is very convenient as far as coding and clarity are concerned, but unless you have pre-sorted data in a small number of groups with a small number of rows, the JOIN/GROUP BY approach may very well provide a better plan. That’s something to think about.

If you want the most efficient query possible, check out both approaches for performance. There may be some times when the optimizer will look at a JOIN/GROUP BY query and will construct a plan that is exactly like the OVER query anyway. For example, the following JOIN/GROUP BY query will do just that:

;with SourceData as
(
select sod.SalesOrderID
,CustomerID
,ProductID
,UnitPrice
from AdventureWorks.Sales.SalesOrderDetail sod
join AdventureWorks.Sales.SalesOrderHeader soh on sod.SalesOrderID=soh.SalesOrderID
where soh.TerritoryID between 1 and 5
)
select SalesOrderID
,CustomerID
,S.ProductID
,UnitPrice
,AvgPrice
,PercentOfAvg=UnitPrice/AvgPrice*100
from SourceData S
join (select ProductID
,AvgPrice=avg(UnitPrice)
from SourceData
group by ProductID) G on S.ProductID=G.ProductID
order by ProductID
Interestingly enough, it ONLY creates an OVER-type plan for the above because I specified that I wanted the output to be ORDERed BY ProductID, which is our GROUP BY column. Because of that, the optimizer figured it was more cost-effective to SORT the data by ProductID up front and do the spooling stuff than it was to do the Hashing approach and SORT the final data by ProductID at the end. If you take the ORDER BY out of the query above, then the optimizer will revert back to a plan with Hash operators and no spooling.

But I DISTINCTly Asked For…

I know I’ve kind of trashed the OVER clause a bit during the course of this article, and I hate to kick someone when they’re down, but I’m afraid I have to bring something else up.

The OVER clause does not accept DISTINCT in its aggregate functions. In other words, the following query will bring about an error:

select SalesOrderID
,ProductID
,UnitPrice
,NumDistinctPrices=count(distinct UnitPrice) over (partition by ProductID)
,AvgDistinctPrice=avg(distinct UnitPrice) over (partition by ProductID)
from AdventureWorks.Sales.SalesOrderDetail
/*
Msg 102, Level 15, State 1, Line 4
Incorrect syntax near 'distinct'.
*/
If you want DISTINCT aggregates, you are forced to use the JOIN/GROUP BY approach to accomplish this.

However, there may be times when, for whatever reason, you must scan the data only once. To be honest, the whole catalyst for my writing this article was a situation just like that. My source data was a derived table that produced random rows… something like this:

;with RandomRowNumsAdded as
(
select SalesOrderID
,ProductID
,UnitPrice
,RowNum=row_number() over (order by newid())
from AdventureWorks.Sales.SalesOrderDetail
)
select SalesOrderID
,ProductID
,UnitPrice
from RandomRowNumsAdded
where RowNum<=50
/*
SalesOrderID ProductID UnitPrice
------------ --------- ----------
47439 762 469.794
72888 873 2.29
47640 780 2071.4196
51143 883 31.3142
. . .
48393 798 600.2625
63980 711 34.99
64590 780 2319.99
64262 877 7.95
(50 Rows Total)
*/
(If you’re wondering why the heck I’d want something like this, then stay tuned… you’ll see why in a future blog post… my craziest one yet).

Clearly, that query will come out differently every time it is executed. So if I wanted to use a JOIN/GROUP BY to calculate a COUNT(DISTINCT) or AVG(DISTINCT), I couldn’t: scanning the data twice means the two scans would each produce a different set of ProductID's and UnitPrices, and therefore the following JOIN/GROUP BY approach would not work:

;with RandomRowNumsAdded as
(
select SalesOrderID
,ProductID
,UnitPrice
,RowNum=row_number() over (order by newid())
from AdventureWorks.Sales.SalesOrderDetail
)
,
SourceData as
(
select SalesOrderID
,ProductID
,UnitPrice
from RandomRowNumsAdded
where RowNum<=50
)
select S.*
,NumDistinctPrices
,AvgDistinctPrice
from SourceData S
join (select ProductID
,NumDistinctPrices=count(distinct UnitPrice)
,AvgDistinctPrice=avg(distinct UnitPrice)
from SourceData
group by ProductID) G on S.ProductID=G.ProductID
/*
Because of the two SCANs of the data, the JOIN above will
produce different numbers of rows (sometimes perhaps even
ZERO rows!) every time this query is run.
*/
So I had to come up with a different way to do DISTINCT aggregates by only doing a single scan on the source data.

The answer ended up being… ta-dah!... the OVER clause, ironically enough. Even though you can’t do DISTINCT Aggregates directly with OVER, you can accomplish the task by first applying a ranking function (ROW_NUMBER()) and then following that with a traditional non-DISTINCT aggregation. Here’s how, using my random-row source data:

;with RandomRowNumsAdded as
(
select SalesOrderID
,ProductID
,UnitPrice
,RowNum=row_number() over (order by newid())
from AdventureWorks.Sales.SalesOrderDetail
)
,
SourceData as
(
select SalesOrderID
,ProductID
,UnitPrice
,PriceSeq=row_number() over (partition by ProductID,UnitPrice order by UnitPrice)
from RandomRowNumsAdded
where RowNum<=50
)
select SalesOrderID
,ProductID
,UnitPrice
,NumDistinctPrices=count(case when PriceSeq=1 then UnitPrice end)
over (partition by ProductID)
,AvgDistinctPrice=avg(case when PriceSeq=1 then UnitPrice end)
over (partition by ProductID)
from SourceData
/*
SalesOrderID ProductID UnitPrice NumDistinctPrices AvgDistinctPrice
------------ --------- --------- ----------------- ----------------
73078 708 34.99 1 34.99
74332 708 34.99 1 34.99
69526 711 19.2445 2 27.1172
64795 711 34.99 2 27.1172
. . .
51810 940 48.594 1 48.594
68514 955 2384.07 1 2384.07
71835 964 445.41 1 445.41
51823 968 1430.442 1 1430.442
(50 Rows Total)
*/
We use ROW_NUMBER() to introduce a PriceSeq column, which resets to a value of 1 for each ProductID and UnitPrice combination. And then we only aggregate on those UnitPrices that have a PriceSeq of 1, since any rows with a PriceSeq of 2 and above will just be duplicate (non-DISTINCT) repeats of a ProductID and UnitPrice combination.
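The same pattern extends to other DISTINCT aggregates as well. For example, here is how a SUM(DISTINCT UnitPrice) per ProductID could be sketched (my own illustration, run against the plain SalesOrderDetail table rather than my random-row data):

;with Src as
(
select SalesOrderID
,ProductID
,UnitPrice
,PriceSeq=row_number() over (partition by ProductID,UnitPrice order by UnitPrice)
from AdventureWorks.Sales.SalesOrderDetail
)
select SalesOrderID
,ProductID
,UnitPrice
,SumDistinctPrice=sum(case when PriceSeq=1 then UnitPrice end)
over (partition by ProductID)
from Src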

We already saw how an OVER aggregation query requires a Sort. In order to calculate DISTINCT aggregations, we require two sorts… one for the ROW_NUMBER() function to calculate our PriceSeq column and a second to do the aggregation. You can see this in the query plan:

ROW_NUMBER() and Aggregation Query Plan

This is a little odd, because the first sort orders the data by ProductID and UnitPrice, and the second sort orders the data by just ProductID. But that second sort is not necessary, because the first sort already put the data in a ProductID order anyway. It seems that the optimizer is not “smart enough” to know this and take advantage of it.

However, all is not completely lost. If the only DISTINCT aggregation you require in your query is a COUNT(), then you can use yet another clever approach to accomplish this:

;with RandomRowNumsAdded as
(
select SalesOrderID
,ProductID
,UnitPrice
,RowNum=row_number() over (order by newid())
from AdventureWorks.Sales.SalesOrderDetail
)
,
SourceData as
(
select SalesOrderID
,ProductID
,UnitPrice
,PriceDenseRank=dense_rank() over (partition by ProductID order by UnitPrice)
from RandomRowNumsAdded
where RowNum<=50
)
select SalesOrderID
,ProductID
,UnitPrice
,NumDistinctPrices=max(PriceDenseRank) over (partition by ProductID)
from SourceData
/*
SalesOrderID ProductID UnitPrice NumDistinctPrices
------------ --------- --------- -----------------
67354 707 34.99 1
51653 707 34.99 1
58908 708 20.994 2
66632 708 34.99 2
. . .
51140 975 1020.594 1
68191 976 1700.99 1
72706 984 564.99 1
53606 987 338.994 1
(50 Rows Total)
*/
This applies a DENSE_RANK() value to each different UnitPrice found within each ProductID, and we can then calculate the COUNT(DISTINCT) by just getting the MAX() of that PriceDenseRank value.

Since the DENSE_RANK() and the MAX() both do a PARTITION BY ProductID, their sorting requirements are exactly identical, and this query plan only requires a single sort as opposed to two:

DENSE_RANK() and MAX() Query Plan

Final Wrapup

I hope you enjoyed this in-depth look at Aggregations and the OVER Clause. To review what was illustrated in this article…
  • The OVER clause is very convenient for calculating aggregations, but using a JOIN/GROUP BY approach may very well perform more efficiently, especially on source data that has a large number of rows and/or has a wide row size and/or is unsorted.

  • If you want to calculate DISTINCT Aggregates, use the JOIN/GROUP BY approach; however, …

  • If you, for some reason, only want to scan the source data once, assign sequence numbers using ROW_NUMBER() and then aggregate only on those rows that received a sequence number of 1; however, …

  • If you only require COUNT(DISTINCT), then use DENSE_RANK() and MAX(), since that only requires a single sort instead of two.

Tuesday, March 9, 2010

FeedBurner Mystery

For some reason that I can't explain, it seems that one of my blog posts did not get through to anyone who uses Google Reader or perhaps other readers as well.

My feed shows the post is there, but Google Reader just throws it out the window and ignores it completely... I have no clue as to why.

Anyway, for those of you who missed my Spotlight On UNPIVOT Part 2 post from February 25, you can read it here.

This Article On Recursion Is Entitled “This Article On Recursion Is Entitled “This Article… ∞” ”

This blog entry is participating in T-SQL Tuesday #004, hosted this month by Mike Walsh. You are invited to visit his blog to join the party and read more blogs participating in this month’s theme: IO.

So, get a stiff drink (you’ll need it) and perhaps a Motrin or two, and then sit back and read on…




DrosteMy head was spinning as I entered the local Starbucks. I guess it had been a mistake to watch The Matrix and The Thirteenth Floor back-to-back last night. I vowed never to do anything like that again.

My briefcase felt as heavy as an anchor on the Titanic, which seemed appropriate because I also had a feeling of dread in addition to my dizziness. Instead of standing in line for a drink, I decided to find a table first. I saw an open one in the corner, and started heading toward it. But vertigo hit me like a truck and…

Everything went black.

. . . . .   . . .

“Whoa, take it easy there,” a man said. “Good thing I caught you… I happened to be standing right next to you when you collapsed. Are you okay?”

“Jeez, I guess so,” I said, trying to recover, “I don’t know what came over me.”

“Well, you got the partners kind of worried.”

“Partners?” I asked.

“That’s what Starbucks calls their employees,” the man said. “Just like Disneyland calls their employees cast members… like everybody there is in a movie or something… but who’s watching the movie I wonder?”

“Sorry I asked,” I said.

The kind stranger led me to his table and sat me down and he sat opposite me. There was a laptop computer open in front of him, and next to it was a book entitled Gödel, Escher, Bach.

“Sorry about all the trouble,” I said. “I guess I’ve been overworked lately.”

“It’s no trouble. Do you want to talk about it?”

“Er… I don’t even know you.”

“I’m Rick,” he said, extending a hand.

“Hi Rick. I’m Brad”, I said, shaking it.

“Do you want something to drink?” Rick asked. “I’m having a Droste.”

“A what?” I asked.

“Droste. It’s a Dutch cocoa.”

“Is it any good?”

“Well, to be honest, I don’t really care about the taste. What fascinates me is their logo. Take a look at the picture on the box.”

“Where? What box?”

“Up there,” said Rick.

“What? The ceiling?”

“No, you’re not looking at the right place. It’s up there, next to the beginning of our conversation.”

I shook my head. “I don’t know what the hell you’re talking about. I guess I really am out of it today.”

“How about a glass of water?” Rick offered. “I just happen to have one right here.” He produced a really odd-shaped container partly filled with ice water.

“How am I supposed to drink water out of that thing?” I asked. “What the heck is it anyway?”

“It’s a Klein Bottle,” said Rick.

“Er… no thanks,” I said. “Actually, I’ve just decided I’m not thirsty.”

“You mentioned overwork,” said Rick. “Sometimes it helps to talk things out, even to a total stranger.”

I looked at him. He seemed genuinely concerned. “Well, maybe you’re right,” I said. “I’m under a lot of pressure… I have a blog that I have to write by next Tuesday.”

“What’s the blog about?” asked Rick.

“It’s about SQL Server,” I said.

“Oh yeah, sure, I know SQL Server,” said Rick. “And SQL is one of my favorite TLA’s.”

“TLA?” I asked.

“TLA is a three-letter acronym for Three-Letter Acronym,” said Rick.

I digested that comment. “Er… uh… okay.”

“Please continue,” said Rick, “You were talking about SQL.”

“Yeah, well, I want to participate in this T-SQL Tuesday Blog Party, and I have to come up with a blog about a certain subject. This month’s subject is IO.”

“That’s weird,” said Rick. “You have to write a SQL-related blog about one of Jupiter’s moons?”

“No, not the proper noun Io… the acronym IO.”

“Oh, sorry, I didn't notice the capitalized ‘O’,” said Rick. “IO, huh? As in Information Overload? Is that why you collapsed?”

“No,” I said. “IO stands for Input/Output. And that subject seems more hardware or admin related. I’m more on the developer side of the spectrum.”

“Hey, I have an idea,” Rick said. “Since we were talking about TLA’s earlier, why don’t you blog about CTE’s? Specifically, Recursive CTE’s.”

“What do those have to do with IO?” I asked.

“Well, for one thing, their output is the same as their input,” said Rick. “Kind of like the Klein Bottle there, whose outside is the same as its inside.”

Ignoring the Klein Bottle, I said, “That’s true about Recursive CTE’s, but it doesn’t seem like enough to blog about.”

“I’ll show you what I mean,” said Rick. He scooted his chair closer to me and turned his laptop so that we could both see the screen. The topmost window was a browser that was on Microsoft’s search site Bing.

“I’ve never tried Bing,” I said. “How does it compare to Google?”

“Funny you should mention that,” he said. “I don’t have any opinion of Bing one way or the other. I just like what it stands for.”

“What do you mean?”

“Bing is a recursive acronym that stands for Bing Is Not Google,” he said.

“I didn’t know that.”

“Well, it’s a rumor, but it’s interesting anyway,” said Rick. He ALT+TABbed to SSMS 2008, which was already running on his computer. “Take a look at this Recursive CTE query.”

with RecursiveCTE(String)
as
(
/* Anchor Portion */
select cast('AA' as varchar(10))
union all
select cast('BB' as varchar(10))

union all

/* Recursive Portion */
select cast(String+'X' as varchar(10))
from RecursiveCTE
where len(String)<5
)
select String
from RecursiveCTE
/* option (maxrecursion 100) (This is the default) */
/*
String
------
AA
BB
BBX
BBXX
BBXXX
AAX
AAXX
AAXXX
*/
“Okay,” I said. “What about it?”

“Don’t you wonder why the rows come out in the order they did? You’ll notice I have no ORDER BY in the query.”

“Yeah,” I said. “I’ve seen that kind of thing before, but never really investigated it. I would have thought that AA and BB would come out first, followed by AAX and BBX, and then AAXX and BBXX, and so on.”

“Here’s why,” said Rick. “Take a look at the execution plan, which I’ve amended with some number labels next to each operator. And here’s the same query with some comments that indicate what parts relate to which operators in the plan.”

Simple Recursive Query Execution Plan

with RecursiveCTE(String) /* Spool Operator (#2) */
as
(
/* Anchor Portion: Constant Scan Operator (#5) */
select cast('AA' as varchar(10))
union all
select cast('BB' as varchar(10))

union all /* Concatenation Operator (#3) */

/* Recursive Portion */
select cast(String+'X' as varchar(10)) /* Compute Scalar Operator (#10) */
from RecursiveCTE /* Spool Operator (#9) */
where len(String)<5 /* Filter Operator (#11) */
)
select String /* SELECT Operator (#1) */
from RecursiveCTE /* Spool Operator (#2) */
/* option (maxrecursion 100) (This is the default... enforced by Assert (#6) */
“There’s your output/input right there,” said Rick. He pointed to both the Spool operators #2 and #9. “Every row that #2 hands over to the SELECT (#1) for output gets stored into a temporary table (or spool) so that it can act as input over here at #9.”

“Okay,” I said.

“Well, you won’t find this anywhere in the graphical plan, but if you look at the XML or Text plan, you’ll see a new clue. Here’s the Text plan:”

/*
with RecursiveCTE(String)...select String from RecursiveCTE
|--Index Spool(WITH STACK)
|--Concatenation
|--Compute Scalar(DEFINE:([Expr1006]=(0)))
| |--Constant Scan(VALUES:(('AA'),('BB')))
|--Assert(WHERE:(CASE WHEN [Expr1008]>(100) THEN (0) ELSE NULL END))
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Recr1003]))
|--Compute Scalar(DEFINE:([Expr1008]=[Expr1007]+(1)))
| |--Table Spool(WITH STACK)
|--Compute Scalar(DEFINE:([Expr1004]=CONVERT(varchar(10),[Recr1003]+'X')))
|--Filter(WHERE:(STARTUP EXPR(len([Recr1003])<(5))))
|--Constant Scan
*/
“Oh, I see it,” I said. “The Spool is actually a Stack… it’s a LIFO kind of thing… Last In First Out. But in this case it’s more like LOFI… Last Output First Input. So the Anchor part of the query immediately outputs the AA and the BB, which both go into the stack, and then the Recursive part of the query pops the Last-Out value of BB off the stack and processes it as input, which in turn produces and outputs BBX, which gets popped off the stack and processed as input, which produces BBXX, and so on. And when all the BB productions are done, the AA is still in the stack waiting to be processed as input, and it goes through the same cycle… AAX and AAXX and AAXXX.”

“Exactly,” said Rick. “The key here is the Stack Spool… and the important part is that this logic forces the engine to process the rows one at a time, which is not ideal and can sometimes be disastrous. Let me illustrate… Look at this simple aggregate query, which just produces Monthly Sales by Product in AdventureWorks.”

select ProductID
,Yr
,Mth
,Sales=sum(LineTotal)
from Sales.SalesOrderHeader soh
cross apply (select Yr=year(OrderDate)
,Mth=month(OrderDate)) F
join Sales.SalesOrderDetail sod on soh.SalesOrderID=sod.SalesOrderID
group by ProductID
,Yr
,Mth
order by ProductID
,Yr
,Mth
/*
ProductID Yr Mth Sales
--------- ---- --- ------------
707 2001 7 484.476000
707 2001 8 1170.817000
707 2001 9 1110.257500
707 2001 10 827.646500
707 2001 11 1554.360500
. . .
999 2004 2 35639.340000
999 2004 3 44063.184000
999 2004 4 48275.106000
999 2004 5 52703.024000
999 2004 6 50679.357476
(3890 Rows Total)
*/
“As you can see, that produced 3890 rows. Now let’s say that you want to show running totals for each Product as the months progress. The traditional way of doing that is to do a self-join and a SUM() aggregate, but just for the sake of demonstration, we’ll do it via a Recursive CTE. Here’s what I mean.”

with MonthlyStats as
(
select ProductID
,Yr
,Mth
,Sales=sum(LineTotal)
,RowNum=row_number() over (partition by ProductID
order by Yr,Mth)
from Sales.SalesOrderHeader soh
cross apply (select Yr=year(OrderDate)
,Mth=month(OrderDate)) F
join Sales.SalesOrderDetail sod on soh.SalesOrderID=sod.SalesOrderID
group by ProductID
,Yr
,Mth
)
,
RunningTotals as
(
/* Anchor Portion */
select ProductID
,Yr
,Mth
,Sales
,RunningSales=Sales
,RowNum
from MonthlyStats
where RowNum=1
union all
/* Recursive Portion */
select r.ProductID
,m.Yr
,m.Mth
,m.Sales
,r.RunningSales+m.Sales
,m.RowNum
from RunningTotals r
join MonthlyStats m on r.ProductID=m.ProductID
and r.RowNum+1=m.RowNum
)
select ProductID
,Yr
,Mth
,Sales
,RunningSales
from RunningTotals
order by ProductID
,Yr
,Mth
“The query uses a CTE called MonthlyStats that does that aggregation I showed you previously, producing those 3890 rows, and it uses ROW_NUMBER() to sequence the data. Then that data is used as the main driving table for the RunningTotals Recursive CTE. The Anchor Portion gets all the RowNums of 1, and then the Recursive Portion uses that as its input getting RowNum+1, and so on and so on, until there are no more RowNums left.”

“Okay, I can see that,” I said.

“Let’s run the query.” He pushed the F5 key.

We waited… and we waited… and we waited.

“Hey, while we’re waiting for that to finish, can I ask you something?” said Rick.

“Sure.”

“What is a question that mentions the word umbrella for no apparent reason?”

I hesitated, and then said, “Er… um… you just asked one.”

“Oh, so I did! You’re right!”

“You’re really weird, Rick. You take a lot of getting used to.”

“Yeah, a lot of people say that. Hopefully the readers of this conversation will appreciate me.”

“Readers?” I asked.

“Hey, you out there… yes, I’m talking to you,” Rick said to no one in particular. “Pay attention now: When you aren’t looking at it, this entire dialog is in Spanish.” He paused for a second, and then said, “Hah! Made you look away for a second, didn’t I?”

“Uh… Rick… Who are you talking to?”

“Here’s another one, dear reader. Look closely: .siht ekil ti gnidaer eb d`uoy ,werbeH ni erew ecnetnes siht fI”

“Rick, what’s going on? You’re talking absolute nonsense.”

“Aww, I’m just having a little fun. I’m passing time waiting for this query to finish.”

He smiled at me. I gave him a guarded look.

“Anyway… back to business,” he said. “Why do you think this query is taking so long?” Rick asked.

“Well,” I said, happy to be talking shop again, “Since all output of a Recursive CTE acts as its own input, and since we know the MonthlyStats CTE produces 3890 rows, that means the Recursive Portion of the CTE is going to process those 3890 rows as input… one at a time.”

“True, but there’s more to it than that,” said Rick. He cancelled the query. “I don’t think you want to wait for that to finish. It will take over an hour; in fact, it takes 69 minutes.”

“You’re kidding,” I said.

“No, I’m serious. Anyway, you’re right in that the Recursive Portion of the query will process those 3890 rows as input, but in order to do that, it actually has to execute the MonthlyStats CTE 3890 separate times. The JOIN and the aggregation and sort and ROW_NUMBER() executions occurring over and over 3890 times! Here, I’ll show you. Here’s a copy of the Actual Execution Plan. It’s probably too tiny for you to see, so just click on it to see a larger version of it in another browser window.”

Recursive Query Actual Execution Plan

“Huh?” I said. “Click on it? Browser window?”

“Oh, sorry,” said Rick. “I was talking to somebody else.” He traced a rectangle around the Anchor Portion and the Recursive Portion of the query with his finger. “See… they are identical. Both are doing the JOIN and a sort and the SUM aggregate and the ROW_NUMBER() sequencing. Look at the Index Scan of the Sales.SalesOrderHeader table, for example, in the upper right-hand corner. That table has 31,465 rows in it, and the Anchor Portion of the query executed the Scan once, so it spit out a total of 31,465 rows. But in the Recursive Portion, those same 31,465 rows of the table got scanned 3890 times, spitting out a grand cumulative total of 122,398,850 rows.”

“Yuck,” I said.

“The Merge Join is even worse. In the Anchor Portion, you can see that it processes 121,317 rows. In the Recursive Portion, it processes 121,317 rows 3890 times, for a grand total of 471,923,130 rows.”

“Sheesh!”

“All of this work, just to produce 3890 lousy rows in the final result. By the way, here are the Profiler statistics for the query.”

/*
CPU........ 2,833,093ms
Reads...... 9,197,296
Writes..... 315
Duration... 4,124,269ms = 68.74min
*/
“Well,” I suggested, “It’s obviously a problem using a CTE as the driving table of the Recursive CTE. I’ll bet we could improve performance by putting the MonthlyStats information into a temporary table. And not only that… We could also create an index on the RowNum and ProductID columns and then use that indexed table as the driver.”

“Exactly right. I’ve already done that. Here’s the code to create the temp table and create an index.”

if object_id('tempdb..#MonthlyStats','U') is not null drop table #MonthlyStats

select ProductID
,Yr
,Mth
,Sales=sum(LineTotal)
,RowNum=row_number() over (partition by ProductID
order by Yr,Mth)
into #MonthlyStats
from Sales.SalesOrderHeader soh
cross apply (select Yr=year(OrderDate)
,Mth=month(OrderDate)) F
join Sales.SalesOrderDetail sod on soh.SalesOrderID=sod.SalesOrderID
group by ProductID
,Yr
,Mth

create clustered index Ix_RowProd on #MonthlyStats (RowNum,ProductID)
“And here’s the query that uses that table to drive the Recursive CTE.”

with RunningTotals as
(
/* Anchor Portion */
select ProductID
,Yr
,Mth
,Sales
,RunningSales=Sales
,RowNum
from #MonthlyStats
where RowNum=1
union all
/* Recursive Portion */
select r.ProductID
,m.Yr
,m.Mth
,m.Sales
,r.RunningSales+m.Sales
,m.RowNum
from RunningTotals r
join #MonthlyStats m on r.ProductID=m.ProductID
and r.RowNum+1=m.RowNum
)
select ProductID
,Yr
,Mth
,Sales
,RunningSales
from RunningTotals
order by ProductID
,Yr
,Mth
He pushed the F5 key and it finished in less than a second. “Much better than the last query,” said Rick. “This approach ran in just a fractal of the time.”

“Fractal?”

“Pardon?” asked Rick.

“I think you meant to say fraction… the approach ran in just a fraction of the time.”

“Oh… perhaps I did. Anyway, Profiler shows only 37962 reads… way better than 9 million reads, wouldn’t you say? The Recursive Portion of the query still executed 3890 times, but because of our index on RowNum and ProductID, it only had to do a SEEK 3890 times into the #MonthlyStats temp table… much better than 3890 SCANs (on 2 tables) and 3890 JOINs and 3890 Sorts and 3890 Aggregations and 3890 Sequencing Calculations.”

“Excellent,” I said.

“But, as much as I like the concept of Recursive CTE’s, the fact that they process one row at a time off the Stack Spool is not very efficient. Take a closer look at the makeup of our #MonthlyStats table.”

select Level=coalesce(str(RowNum,5),'Total')
,NumRows=count(*)
from #MonthlyStats
group by RowNum with rollup
/*
Level NumRows
----- -------
1 266
2 265
3 261
4 252
5 251
6 251
7 245
8 244
9 240
10 220
11 219
12 219
13 107
14 70
15 67
. . .
35 12
36 12
37 7
Total 3890
*/
“Our Anchor, which gets all the RowNums of 1, will end up outputting 266 rows… immediately. Then the Recursive Portion will do the RowNums of 2 and 3 and so on. But it will not do them in groups… it will do them one agonizing row at a time via the Spool Stack.”

“True,” I said.

“Well, what if we did do it in groups instead?”

“I see,” I said. “We could process the 266 Anchor rows into a temp table, then use it to process the 2nd level of 265 rows, and then use that to process the 3rd level of 261 rows, and so on and so on, until we’ve done all 37 levels. It’s just a loop.”

“Yes,” said Rick. “An intelligent loop that processes sets of rows instead of individual rows. That’s what SQL does best. Here’s the query that does just that.”

if object_id('tempdb..#RunningTotals','U') is not null drop table #RunningTotals

/* Anchor Portion */
select ProductID
,Yr
,Mth
,Sales
,RunningSales=Sales
,RowNum
into #RunningTotals
from #MonthlyStats
where RowNum=1

/* "Pseudo-Recursive" Portion */
declare @r int = 0
while 1=1
begin
set @r=@r+1
insert #RunningTotals
select r.ProductID
,m.Yr
,m.Mth
,m.Sales
,r.RunningSales+m.Sales
,m.RowNum
from #RunningTotals r
join #MonthlyStats m on r.ProductID=m.ProductID
where r.RowNum=@r and m.RowNum=@r+1
if @@rowcount=0 break
end

select ProductID
,Yr
,Mth
,Sales
,RunningSales
from #RunningTotals
order by ProductID
,Yr
,Mth
Rick hit the F5 key. The query finished in half a second. “This query is just as fast as the recursive query, but it has about a third of the reads… 13716 vs 37962.”

“That’s great!” I said.

“We’re doing essentially the same thing as the recursive query, using a table’s output as its own input, but we’re just doing it more efficiently.”

“So it seems our lesson here is that Recursive CTE’s are cool and convenient, but if you use them, then you should only process very very small tables or appropriately-indexed tables. And if you’re a real stickler for maximum efficiency and keeping down the number of Reads, you should throw the Recursive CTE out the window and instead code a loop to intelligently process groups of rows at a time.”

“Sad, but true,” said Rick. “If you abandon Recursion, it will set you free… So to speak.”

I looked at him with admiration. “Hey, Rick, I really appreciate you taking the time to show me this stuff.”

“Oh, it’s no problem. You contributed to this conversation just as much as I did… more than you realize. I just initiated it.”

“Well, you certainly seem to know your stuff.”

“A lot of it is just repetition… I like repeating things over and over again. And I like challenging the brain… I enjoy mind-bending stuff.” He paused and continued. “Oh yeah, and I also read about all of this somewhere.”

“Oh, swell, I guess that means that I can’t use it for my blog then.”

“Don’t worry, you won’t have a problem with that. I have a printed copy of the article where I read about this. Would you like to see it?”

“Sure.”

“Hold on. It’s in my briefcase.”

He leaned over and reached for something on the floor that I hadn’t noticed before.

“That’s the weirdest-looking briefcase I’ve ever seen,” I said.

“It’s a tesseract.”

“What’s a tesseract?”

“Just click on the link in my last sentence and you can learn all about it,” he said.

“Huh? Click on what?”

“Ah, here it is,” he said. He pulled a sheet of paper out of his weird briefcase. But it was no ordinary sheet of paper. Its ends were joined together forming a loop with a half-twist… It was a Möbius strip. He pointed to a location on the strip. “You can start reading here,” he indicated.

I started reading:

“Whoa, take it easy there,” a man said. “Good thing I caught you… I happened to be standing right next to you when you collapsed. Are you okay?”

“Jeez, I guess so,” I said, trying to recover, “I don’t know what came over me.”

“Well, you got the partners kind of---


“Wait a minute,” I said. “This is too weird.”

I stood up…. perhaps too fast… I started to feel dizzy. “What’s going on here? Who the hell are you?”

“I told you, my name is Rick,” he said with a wink and a smile. “Rick Cursion.”

Everything went black.

Mobius Strip, Tesseract, Fractals, Klein Bottle, GEB, Droste, Rewind

“Whoa, take it easy there,” a man said. “Good thing I caught you… I happened to be standing right next to you when you collapsed. Are you okay?”

“Jeez, I guess so,” I said, trying to recover, “I don’t know what came over me.”

“Well, you got the partners kind of worried.”

“Partners?” I asked.

“That’s what Starbucks calls their employees,” the man said. “Just like Disneyland calls their employees cast members…