Friday, 22 November 2013

Ranking Functions

Ranking Functions (Transact-SQL)
Ranking functions return a ranking value for each row in a partition. Depending on the function that is used, some rows might receive the same value as other rows. Ranking functions are nondeterministic.
Transact-SQL provides the following ranking functions:



The following shows the four ranking functions used in the same query.
USE AdventureWorks2012;
GO
SELECT p.FirstName, p.LastName
    ,ROW_NUMBER() OVER (ORDER BY a.PostalCode) AS "Row Number"
    ,RANK() OVER (ORDER BY a.PostalCode) AS Rank
    ,DENSE_RANK() OVER (ORDER BY a.PostalCode) AS "Dense Rank"
    ,NTILE(4) OVER (ORDER BY a.PostalCode) AS Quartile
    ,s.SalesYTD
    ,a.PostalCode
FROM Sales.SalesPerson AS s
    INNER JOIN Person.Person AS p
        ON s.BusinessEntityID = p.BusinessEntityID
    INNER JOIN Person.Address AS a
        ON a.AddressID = p.BusinessEntityID
WHERE TerritoryID IS NOT NULL AND SalesYTD <> 0;
Here is the result set.

FirstName   LastName  Row Number         Rank    Dense Rank       Quartile SalesYTD   PostalCode
Michael       Blythe          1          1          1          1                       4557045.0459          98027
Linda          Mitchell        2          1          1          1                       5200475.2313           98027
Jillian          Carson        3          1          1          1                       3857163.6332           98027
Garrett        Vargas        4          1          1          1                       1764938.9859           98027
Tsvi            Reiter          5          1          1          2                       2811012.7151           98027
Shu            Ito  6           6          2          2          3                       018725.4858             98055
José           Saraiva        7          6          2          2                       3189356.2465           98055
David         Campbell     8          6           2          3                      3587378.4257            98055
Tete           Mensa          9          6          2          3                       1931620.1835            98055
Lynn          Tsoflias        10         6          2          3                       1758385.926              98055
Rachel       Valdez          11         6          2          4                      2241204.0424             98055
Jae           Pak               12         6          2          4                      5015682.3752             98055
Ranjit        Varkey          13         6          2          4                     3827950.238                98055

RANK (Transact-SQL)
Returns the rank of each row within the partition of a result set. The rank of a row is one plus the number of ranks that come before the row in question.

RANK ( ) OVER ( [ partition_by_clause ] order_by_clause )

OVER ( [ partition_by_clause ] order_by_clause)
Partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. Order_by_clause determines the order of the data before the function is applied. Theorder_by_clause is required. The <rows or range clause> of the OVER clause cannot be specified for the RANK function.

bigint

If two or more rows tie for a rank, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number three, because there are two rows that are ranked higher. Therefore, the RANK function does not always return consecutive integers.
The sort order that is used for the whole query determines the order in which the rows appear in a result set.

A. Ranking rows within a partition
The following example ranks the products in inventory the specified inventory locations according to their quantities. The result set is partitioned byLocationID and logically ordered by Quantity. Notice that products 494 and 495 have the same quantity. Because they are tied, they are both ranked one.
USE AdventureWorks2012;
GO
SELECT i.ProductID, p.Name, i.LocationID, i.Quantity
    ,RANK() OVER
    (PARTITION BY i.LocationID ORDER BY i.Quantity DESC) AS Rank
FROM Production.ProductInventory AS i
INNER JOIN Production.Product AS p
    ON i.ProductID = p.ProductID
WHERE i.LocationID BETWEEN 3 AND 4
ORDER BY i.LocationID;
GO
Here is the result set.
ProductID   Name                   LocationID   Quantity Rank
----------- ---------------------- ------------ -------- ----
494         Paint - Silver         3            49       1
495         Paint - Blue           3            49       1
493         Paint - Red            3            41       3
496         Paint - Yellow         3            30       4
492         Paint - Black          3            17       5
495         Paint - Blue           4            35       1
496         Paint - Yellow         4            25       2
493         Paint - Red            4            24       3
492         Paint - Black          4            14       4
494         Paint - Silver         4            12       5
 (10 row(s) affected)
B. Ranking all rows in a result set
The following example returns the top ten employees ranked by their salary. Because a PARTITION BY clause was not specified, the RANK function was applied to all rows in the result set.
USE AdventureWorks2012
SELECT TOP(10) BusinessEntityID, Rate,
       RANK() OVER (ORDER BY Rate DESC) AS RankBySalary
FROM HumanResources.EmployeePayHistory AS eph1
WHERE RateChangeDate = (SELECT MAX(RateChangeDate)
                        FROM HumanResources.EmployeePayHistory AS eph2
                        WHERE eph1.BusinessEntityID = eph2.BusinessEntityID)
ORDER BY BusinessEntityID;
Here is the result set.
BusinessEntityID Rate                  RankBySalary
---------------- --------------------- --------------------
1                125.50                1
2                63.4615               4
3                43.2692               8
4                29.8462               19
5                32.6923               16
6                32.6923               16
7                50.4808               6
8                40.8654               10
9                40.8654               10
10               42.4808               9

DENSE_RANK (Transact-SQL)
Returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.

DENSE_RANK ( ) OVER ( [ <partition_by_clause> ] < order_by_clause > )

<partition_by_clause>
Divides the result set produced by the FROM clause into partitions to which the DENSE_RANK function is applied.
<order_by_clause>
Determines the order in which the DENSE_RANK function is applied to the rows in a partition.

bigint

If two or more rows tie for a rank in the same partition, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number two. This is one more than the number of distinct rows that come before this row. Therefore, the numbers returned by the DENSE_RANK function do not have gaps and always have consecutive ranks.
The sort order used for the whole query determines the order in which the rows appear in a result. This implies that a row ranked number one does not have to be the first row in the partition.

A. Ranking rows within a partition
The following example ranks the products in inventory the specified inventory locations according to their quantities. The result set is partitioned byLocationID and logically ordered by Quantity. Notice that products 494 and 495 have the same quantity. Because they are tied, they are both ranked one.
USE AdventureWorks2012;
GO
SELECT i.ProductID, p.Name, i.LocationID, i.Quantity
    ,DENSE_RANK() OVER
    (PARTITION BY i.LocationID ORDER BY i.Quantity DESC) AS Rank
FROM Production.ProductInventory AS i
INNER JOIN Production.Product AS p
    ON i.ProductID = p.ProductID
WHERE i.LocationID BETWEEN 3 AND 4
ORDER BY i.LocationID;
GO
Here is the result set.
ProductID   Name                               LocationID Quantity Rank
----------- ---------------------------------- ---------- -------- -----
494         Paint - Silver                     3          49       1
495         Paint - Blue                       3          49       1
493         Paint - Red                        3          41       2
496         Paint - Yellow                     3          30       3
492         Paint - Black                      3          17       4
495         Paint - Blue                       4          35       1
496         Paint - Yellow                     4          25       2
493         Paint - Red                        4          24       3
492         Paint - Black                      4          14       4
494         Paint - Silver                     4          12       5

(10 row(s) affected)
B. Ranking all rows in a result set
The following example returns the top ten employees ranked by their salary. Because a PARTITION BY clause was not specified, the DENSE_RANK function was applied to all rows in the result set.
USE AdventureWorks2012;
GO
SELECT TOP(10) BusinessEntityID, Rate,
       DENSE_RANK() OVER (ORDER BY Rate DESC) AS RankBySalary
FROM HumanResources.EmployeePayHistory;
Here is the result set.
BusinessEntityID Rate                  RankBySalary
---------------- --------------------- --------------------
1                125.50                1
25               84.1346               2
273              72.1154               3
2                63.4615               4
234              60.0962               5
263              50.4808               6
7                50.4808               6
234              48.5577               7
285              48.101                8
274              48.101                8

ROW_NUMBER (Transact-SQL)
Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
Topic link icon 

ROW_NUMBER ( )
    OVER ( [ PARTITION BY value_expression , ... [ n ] ] order_by_clause )

PARTITION BY value expression
Divides the result set produced by the FROM clause into partitions to which the ROW_NUMBER function is applied. Value_ expression specifies the column by which the result set is partitioned. If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.
order_by_clause
The ORDER BY clause determines the sequence in which the rows are assigned their unique ROW_NUMBER within a specified partition. It is required. For more information..

bigint

There is no guarantee that the rows returned by a query using ROW_NUMBER() will be ordered exactly the same with each execution unless the following conditions are true.
  1. Values of the partitioned column are unique.
  2. Values of the ORDER BY columns are unique.
  3. Combinations of values of the partition column and ORDER BY columns are unique.

A. Returning the row number for salespeople
The following example calculates a row number for the salespeople in Adventure Works Cycles based on their year-to-date sales ranking.
USE AdventureWorks2012;
GO
SELECT ROW_NUMBER() OVER(ORDER BY SalesYTD DESC) AS Row,
    FirstName, LastName, ROUND(SalesYTD,2,1) AS "Sales YTD"
FROM Sales.vSalesPerson
WHERE TerritoryName IS NOT NULL AND SalesYTD <> 0;
Here is the result set.
Row FirstName    LastName               SalesYTD
--- -----------  ---------------------- -----------------
1   Linda        Mitchell               4251368.54
2   Jae          Pak                    4116871.22
3   Michael      Blythe                 3763178.17
4   Jillian      Carson                 3189418.36
5   Ranjit       Varkey Chudukatil      3121616.32
6   José         Saraiva                2604540.71
7   Shu          Ito                    2458535.61
8   Tsvi         Reiter                 2315185.61
9   Rachel       Valdez                 1827066.71
10  Tete         Mensa-Annan            1576562.19
11  David        Campbell               1573012.93
12  Garrett      Vargas                 1453719.46
13  Lynn         Tsoflias               1421810.92
14  Pamela       Ansman-Wolfe           1352577.13
B. Returning a subset of rows
The following example calculates row numbers for all rows in the SalesOrderHeader table in the order of the OrderDate and returns only rows 50 to 60inclusive.
USE AdventureWorks2012;
GO
WITH OrderedOrders AS
(
    SELECT SalesOrderID, OrderDate,
    ROW_NUMBER() OVER (ORDER BY OrderDate) AS RowNumber
    FROM Sales.SalesOrderHeader
)
SELECT SalesOrderID, OrderDate, RowNumber 
FROM OrderedOrders
WHERE RowNumber BETWEEN 50 AND 60;
C. Using ROW_NUMBER() with PARTITION
The following example uses the PARTITION BY argument to partition the query result set by the column TerritoryName. The ORDER BY clause specified in the OVER clause orders the rows in each partition by the column SalesYTD. The ORDER BY clause in the SELECT statement orders the entire query result set by TerritoryName.
USE AdventureWorks2012;
GO
SELECT FirstName, LastName, TerritoryName, ROUND(SalesYTD,2,1),
ROW_NUMBER() OVER(PARTITION BY TerritoryName ORDER BY SalesYTD DESC) AS Row
FROM Sales.vSalesPerson
WHERE TerritoryName IS NOT NULL AND SalesYTD <> 0
ORDER BY TerritoryName;
Here is the result set.
FirstName  LastName             TerritoryName        SalesYTD      Row
---------  -------------------- ------------------   ------------  ---
Lynn       Tsoflias             Australia            1421810.92    1
José       Saraiva              Canada               2604540.71    1
Garrett    Vargas               Canada               1453719.46    2
Jillian    Carson               Central              3189418.36    1
Ranjit     Varkey Chudukatil    France               3121616.32    1
Rachel     Valdez               Germany              1827066.71    1
Michael    Blythe               Northeast            3763178.17    1
Tete       Mensa-Annan          Northwest            1576562.19    1
David      Campbell             Northwest            1573012.93    2
Pamela     Ansman-Wolfe         Northwest            1352577.13    3
Tsvi       Reiter               Southeast            2315185.61    1
Linda      Mitchell             Southwest            4251368.54    1
Shu        Ito                  Southwest            2458535.61    2
Jae        Pak                  United Kingdom       4116871.22    1

No comments:

Post a Comment