All Forums General
ericsun2 44 posts Joined 06/10
05 Jun 2012
How to write FIRST_VALUE() and LAST_VALUE() window function in Teradata?

ORACLE's FIRST_VALUE() and LAST_VALUE analytic function is pretty handy. In TD 13 and 14, how can we achieve the same result with limited effort?

The old-fashioned way is to achieve FIRST_VALUE(PROMOTION_ID), LAST_VALUE(SHIP_TO_ADDR_ID):

select
  customer_id, order_month, 
  last_order_item_id, last_order_date,
  cast( substr(promotion_id_first, 5) as int ) first_promotion_id,
  cast( substr(ship_to_addr_id_last, 5) as int ) last_ship_to_addr_id
from (
select
  customer_id, order_month,
  max(order_item_id) last_order_item_id, -- assuming it is auto incremental
  max(order_date) last_order_date, 
  min(cast( (sort_id format '9(4)') as char(4) )) || promotion_id_first,
  max(cast( (sort_id format '9(4)') as char(4) )) || ship_to_addr_id_last
from (
  select customer_id, to_char(order_date, 'YYYY-MM') as order_month,
  order_date, order_item_id, product_id, promotion_id, ship_to_addr_id,
  rank() over (partition by customer_id order by order_date, order_item_id) sort_id
  from fact_order_tx
) t1
group by 1,2
) t2
order by 1,2

This code works, but it's not pretty.

Does anybody has a better way? Thanks for sharing...

 

ulrich 816 posts Joined 09/09
06 Jun 2012

also not pretty but would require no derived tables

Simple example from sys_calendar.

Give first and last calendar_week per calendar_date - (can be done differenty in this particular case but should be a good illustartion).

select year_of _calendar,
          month_of _year, 
          calendar_date,
          cast(substr(min(calendar_date!!week_of _year) over (partition by year_of _calendar, month_of _year order by calendar_date asc),11) as smallint) firstweek,
         cast(substr(max(calendar_date!!week_of _year) over (partition by year_of _calendar, month_of _year order by calendar_date asc),11)as smallint) lastweek
from sys_calendar.calendar
where year_of _calendar = 2012;

So for your code I would expect something like - not tested and data types might be wrong, as I don't have your ddls and data:

  select customer_id, 
        to_char(order_date, 'YYYY-MM') as order_month,
        max(order_item_id) over (partition by customer_id, order_month) as last_order_item_id,
        max(order_date) over (partition by customer_id, order_month) as last_order_date,
        cast(substr(min(cast(order_date as char(10) !! cast(order_item_id as char(10)) !! promotion_id) over (partition by customer_id, order_month),21) as integer) as promotion_id_first,
        cast(substr(max(cast(order_date as char(10) !! cast(order_item_id as char(10)) !! ship_to_addr_id) over (partition by customer_id, order_month),21) as integer) as ship_to_addr_id_last
  from fact_order_tx
  qualify row_number() over (partition by order_date, order_item_id order by order_date,order_item_id) = 1
  ;

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

ericsun2 44 posts Joined 06/10
06 Jun 2012

Thanks Ulrich.

Your code is using the same method as mine, the difference is to do the min() / max() and to peel off the prefix directly inside the main query. 

+1 point for you!

The reason I raise this question is - the string prefix used to generate the proper sort order in MIN()/MAX() is not consistent - DATE is different from TIMESTAMP, differenct from NUMERIC, which means we have to come up with various CAST() with various length. And we are using one assumption, if order_item_id is null, then cast(order_date as char(10) || order_item_id) is NULL, hence it will be ignored in MIN()/MAX().

 

 

 

ulrich 816 posts Joined 09/09
06 Jun 2012

as I said, not pretty,

you can cast all fields separatly into fix length chars before concat and nulls can be handled via coalesce(cast(colx as char(10)),'NULLNULLNU')) - which will not make the code more readable ;-(

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

dnoeth 4628 posts Joined 11/04
09 Jun 2012

To avoid typecasts nested OLAP functions will work, too:

SELECT year_of_calendar,
       month_of_year,
       calendar_date,
       MIN(week_of_year_first) OVER (PARTITION BY year_of_calendar, month_of_year) AS firstweek,
       MIN(week_of_year_last) OVER (PARTITION BY year_of_calendar, month_of_year) AS lastweek
FROM
 (
   SELECT year_of_calendar,
          month_of_year,
          calendar_date,
          CASE WHEN ROW_NUMBER() OVER (PARTITION BY year_of_calendar, month_of_year 
                                       ORDER BY calendar_date) 
                    = 1 
               THEN week_of_year END AS week_of_year_first,
          CASE WHEN ROW_NUMBER() OVER (PARTITION BY year_of_calendar, month_of_year 
                                       ORDER BY calendar_date) 
                    = COUNT(*) OVER (PARTITION BY year_of_calendar, month_of_year) 
               THEN week_of_year END AS week_of_year_last
   FROM sys_calendar.CALENDAR
   WHERE year_of_calendar = 2012
 ) dt;

The Derived Table runs with a single STAT step, but it's the RESPECT NULLS version.

I don't know if it's possible to get the IGNORE NULLs without additional step, i never tried it, because i avoid NULLs :-)

Dieter

Dieter

vinaywani 10 posts Joined 11/11
12 Nov 2013

Hi,
I have a query as follows,
SELECT DISTINCT CD_MES, CD_CONTRATO,
                                FIRST_VALUE (ST_TARJETA) OVER (PARTITION BY CD_MES, CD_CONTRATO
                                            ORDER BY FH_ACTIVIACION_PLASTICO ASC) AS ST_TARJETA,
                                FIRST_VALUE (FH_ACTIVIACION_PLASTICO) OVER (PARTITION BY CD_MES, CD_CONTRATO
                                            ORDER BY FH_ACTIVIACION_PLASTICO ASC) AS FECHA_ACTIVIACION_PLASTICO,
                                FIRST_VALUE (NU_BIN_TARJETA) OVER (PARTITION BY CD_MES, CD_CONTRATO
                                            ORDER BY FH_ACTIVIACION_PLASTICO ASC) AS CD_BIN
                          FROM  GORAPR.TBO351_DETALLE_PLASTICO
                          WHERE CD_MES = TO_NUMBER(TO_CHAR(ADD_MONTHS(CURRENT_DATE, 1),'YYYYMM'))
 
and I have converted it as
SELECT  DISTINCT CD_MES, CD_CONTRATO,
                                
                                ST_TARJETA AS ST_TARJETA
                                ,
                                FH_ACTIVIACION_PLASTICO AS FECHA_ACTIVIACION_PLASTICO,
                                
                                NU_BIN_TARJETA AS CD_BIN
                          FROM  KIDSMXPGC.TBO351_DETALLE_PLASTICO
                          WHERE CD_MES = TO_NUMBER(TO_CHAR(ADD_MONTHS(CURRENT_DATE, 1),'YYYYMM'))
                        QUALIFY ROW_NUMBER()  OVER (PARTITION BY CD_MES, CD_CONTRATO
                                            ORDER BY FH_ACTIVIACION_PLASTICO ASC)=1;
Is it correct?

Raja_KT 1246 posts Joined 07/09
13 Nov 2013

Hi,
I am not pretty sure of the way first_value function works with NULL etc. There maybe Null VALUES too.
If it is me to be on the safer side and to meet business requirements, then I would take on each field as modeled by Dieter's in a derived  and then take them one by one out.
like CASE WHEN ROW_NUMBER() OVER (PARTITION BY CD_MES, CD_CONTRATO ORDER BY FH_ACTIVIACION_PLASTICO ASC)=1 then ST_TARJETA end as ST_TARJETA and so on and so forth.
 
I also feel that by the name of the function itself, an OLAP order by asc and desc and filtering =1 is enough to get the first value, last value.
Please let me know.
Cheers,
Raja

Raja K Thaw
My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1
Street Children suffer not by their fault. We can help them if we want.

You must sign in to leave a comment.