Saturday, 20 July 2024

LAG function in SQL server

LAG provides access to a row at a given physical offset that comes before the current row. Accesses data from a previous row in the same result set. The LAG() function is used to get value from row that precedes the current row.The LAG functions can be helpful when performing various data analysis tasks that require comparing values from different rows. For instance, we can calculate the difference between the current and previous values of a column, such as sales, revenue, or profit. We can also find the percentage change or growth rate of a column over time or identify the first or last value of a column within a group.

See the below example

We have Sales.CurrencyRate table in that we are getting the currency rate

SELECT currencyratedate,

       fromcurrencycode,

       tocurrencycode,

       averagerate,

       endofdayrate

FROM   sales.currencyrate cr

WHERE  cr.fromcurrencycode = 'USD'

       AND cr.tocurrencycode = 'ARS'

  
We have requirement like we need to show the previous day and previous_endofdayrate. For example we need to get the result as below
  

See here currencyRateDate : 2011-05-31 is having EndOfRate : 1.0002 so for the next day is 2011-06-01 and previous day is 2011-05-31 and rate is 1.002.

To get this result we can using sub query as below.

 

SELECT

  currencyratedate,

  fromcurrencycode,

  tocurrencycode,

  averagerate,

  endofdayrate,

  (

    SELECT

      TOP(1) pcr.currencyratedate

    FROM

      sales.currencyrate pcr

    WHERE

      cr.fromcurrencycode = pcr.fromcurrencycode

      AND cr.tocurrencycode = pcr.tocurrencycode

      AND cr.currencyratedate > pcr.currencyratedate

    ORDER BY

      pcr.currencyratedate DESC

  ) AS Previous_day_CurrencyRateDate,

  (

    SELECT

      TOP(1) endofdayrate

    FROM

      sales.currencyrate pcr

    WHERE

      cr.fromcurrencycode = pcr.fromcurrencycode

      AND cr.tocurrencycode = pcr.tocurrencycode

      AND cr.currencyratedate > pcr.currencyratedate

    ORDER BY

      pcr.currencyratedate DESC

  ) AS Previous_day_EndOfDayRate

FROM

  sales.currencyrate cr

WHERE

  cr.fromcurrencycode = 'USD'

  AND cr.tocurrencycode = 'ARS'

Getting the below output. 

Getting the expected result. 

Think about query, it is complex one. We can get the same result with the help of LAG() window function.

SELECT currencyratedate,

       fromcurrencycode,

       tocurrencycode,

       averagerate,

       endofdayrate,

       Lag(currencyratedate)

         OVER(

           ORDER BY currencyratedate) AS Previous_day_CurrencyRateDate,

       Lag(endofdayrate)

         OVER(

           ORDER BY currencyratedate) AS Previous_day_EndOfDayRate

FROM   sales.currencyrate

WHERE  fromcurrencycode = 'USD'

       AND tocurrencycode = 'ARS'

  

Got the same result.

LAG() function is very useful in the analysis related queries.

Syntax 

Use of

LAG (<column you need>) OVER ([PARTITION BY <expression>] ORDER BY <expression>)

See below example

SELECT customerid,

       salesorderid,

       Cast(orderdate AS DATE)    AS OrderDate,

       Lag(Cast(orderdate AS DATE))

         OVER(

           partition BY customerid

           ORDER BY salesorderid) AS PrevOrderDate

FROM   sales.salesorderheader;

  
Here we are seeing several NULL values in the results. Row 1 is the first row of the partition. We cannot find an earlier row, so NULL is returned for PrevOrderDate. In the above example we have not used offset. By default offset value is 1. If we want to access columns from rows farther away than one we can use the offset. See below

LAG(<column> [,<offset>]) OVER ([PARTITION BY <expression>] ORDER BY <expression>)

 See the below example

SELECT customerid,

       salesorderid,

       Cast(orderdate AS DATE)    AS OrderDate,

       Lag(Cast(orderdate AS DATE),2)

         OVER(

           partition BY customerid

           ORDER BY salesorderid) AS PrevOrderDate

FROM   sales.salesorderheader;

 
Here we are skipping 2 row. In the above example we are seeing multiple null. We can replace these null with default value. See the below syntax.

LAG(<column> [,<offset>] [,<default>]) OVER([PARTITION BY <expression>] ORDER BY <expression>)

 

SELECT customerid,

       salesorderid,

       Cast(orderdate AS DATE)    AS OrderDate,

       Lag(Cast(orderdate AS DATE),2,'1900-01-01')

         OVER(

           partition BY customerid

           ORDER BY salesorderid) AS PrevOrderDate

FROM   sales.salesorderheader;


No comments:

Post a Comment

If you have any doubt, please let me know.

Popular Posts