All Forums Database
proxyaya 7 posts Joined 04/15
02 Apr 2015
Classification of dates in different columns

Hello,
I have 4 different products (1=having the product, 0 = not having the product) with each, a corresponding start date, in 4 different columns:
For ex.
Customer_id  Prod1     Prod2     Prod 3     Prod4      Start_dt_prod1    Start_dt_prod2     Start_dt_prod3   Start_dt_prod4
100              1            0           1             1           04/02/2002                                         01/02/2015       10/03/2015
200              1             1          1            1             02/03/1978           10/11/1999             05/08/1962         10/10/2004
300              1             1          0             0            10/10/2014            10/10/2014
 
I have to create a new variable, with 4 positions, each corresponding to a product (1st position=prod1, 2nd position=prod2, etc...) and the value from 0 to 4 would be defined according to the start date.
For ex,
for customer '100': it would be '1023'  (because prod 1 has the first start_dt, there is no prod 2 for this customer, prod 3 has the second start_dt and prod 4 was the last to be started)
for customer '200', it would be '2314'
for customer '300', it would be '1100' because the 2 products have the same start_dt
 
I am trying to solve this with an endless number of 'case when' clauses.  However, I am wondering if there would be an easier and more effecient way to tackle this issue.
 
Thank you very much for your help !

dnoeth 4628 posts Joined 11/04
07 Apr 2015

This is a horrible data model, as a result you need horrible SQL :-)
 
Following approach normalizes the columns into rows and the RANKs the data. Maybe you need a DENSE_RANK instead of a RANK, this is supported in TD14.10: 

SELECT dt.*
, CASE WHEN Prod1 = '0' THEN '0' ELSE TRIM(rnk) END ||
  CASE WHEN Prod2 = '0' THEN '0' ELSE TRIM(MIN(rnk) OVER (PARTITION BY customer_id ORDER BY n
                                                            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)) end ||
  CASE WHEN Prod3 = '0' THEN '0' ELSE TRIM(MIN(rnk) OVER (PARTITION BY customer_id ORDER BY n
                                                            ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING)) end ||
  CASE WHEN Prod4 = '0' THEN '0' ELSE TRIM(MIN(rnk) OVER (PARTITION BY customer_id ORDER BY n
                                                            ROWS BETWEEN 3 FOLLOWING AND 3 FOLLOWING)) end
FROM
 (
   SELECT t.*,
     n,
     RANK() OVER (PARTITION BY customer_id ORDER BY COALESCE( CASE n 
        WHEN 1 THEN Start_dt_prod1
        WHEN 2 THEN Start_dt_prod2
        WHEN 3 THEN Start_dt_prod3
        WHEN 4 THEN Start_dt_prod4
     END, DATE '9999-12-31')) AS rnk
   FROM yourtable AS t
   CROSS JOIN nums -- this is a table with a single column and four rows in it: 1,2,3,4
 ) AS dt
QUALIFY n = 1

I don't know if this is actually easier/more efficient, at least it could be easily extended to more than 4 columns...

Dieter

proxyaya 7 posts Joined 04/15
09 Apr 2015

Thank you soooooo much, it works !!!  I am really impressed :-)

proxyaya 7 posts Joined 04/15
09 Apr 2015

oups.... Answered too fast...
I found such results:
Customer_id    Prod_1    Prod_2    Prod_3   Prod_4    Start_dt_prod_1  Start_dt_prod_2   Start_dt_prod_3   Start_dt_prod_4
888                      1           1            1           1          05/11/1985         02/12/2009        15/09/2009             05/11/1985
 
and this results in '1431' instead of '1321'...
 
Any idea ?
 
Thanks a lot !!!!

dnoeth 4628 posts Joined 11/04
09 Apr 2015

That's why I wrote "maybe you need a DENSE_RANK instead of a RANK" :-)
Are you on TD14.10?

Dieter

proxyaya 7 posts Joined 04/15
09 Apr 2015

No, I have version TD13.00 and DENSE_RANK is not recognized...
 

dnoeth 4628 posts Joined 11/04
10 Apr 2015

For DENSE_RANK it's a bit more complicated, see Missing Functions: DENSE_RANK

SELECT dt.*
, CASE WHEN Prod1 = '0' THEN '0' ELSE TRIM(rnk) END ||
  CASE WHEN Prod2 = '0' THEN '0' ELSE TRIM(MIN(rnk) OVER (PARTITION BY customer_id ORDER BY n NULLS LAST
                                                            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)) end ||
  CASE WHEN Prod3 = '0' THEN '0' ELSE TRIM(MIN(rnk) OVER (PARTITION BY customer_id ORDER BY n NULLS LAST
                                                            ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING)) end ||
  CASE WHEN Prod4 = '0' THEN '0' ELSE TRIM(MIN(rnk) OVER (PARTITION BY customer_id ORDER BY n NULLS LAST
                                                            ROWS BETWEEN 3 FOLLOWING AND 3 FOLLOWING)) end
FROM
 (
   SELECT dt.*,
     SUM(flag) OVER (PARTITION BY customer_id ORDER BY dat, n ROWS UNBOUNDED PRECEDING) AS rnk
   FROM 
     ( 
       SELECT t.*, n
         ,COALESCE( CASE n 
                       WHEN 1 THEN Start_dt_prod1
                       WHEN 2 THEN Start_dt_prod2
                       WHEN 3 THEN Start_dt_prod3
                       WHEN 4 THEN Start_dt_prod4
                    END, DATE '9999-12-31') AS dat,
          CASE WHEN ROW_NUMBER() 
                    OVER (PARTITION BY customer_id, dat
                          ORDER BY n) = 1 
               THEN 1 
               ELSE 0 
          END AS flag
       FROM your table as t 
       CROSS JOIN nums -- this is a table with a single column and four rows in it: 1,2,3,4
     ) dt
 ) AS dt
 QUALIFY n = 1

 

Dieter

proxyaya 7 posts Joined 04/15
10 Apr 2015

Thanks, Dieter.
I tried this but get the error message 3706 'expected the word RESET or ')' after ORDER BY clause.
Could sth be missing ? 

dnoeth 4628 posts Joined 11/04
10 Apr 2015

My syntax was ok, maybe you removed a ")"?

Dieter

proxyaya 7 posts Joined 04/15
10 Apr 2015

Triple checked the SQL ...and can not find the issue (the only thing I know is that the issue comes from the top 4 case when lines....because the rest of the sql runs fine... maybe due to my sql editor ???).  Anyway, thanks a lot, you helped me a great deal.

dnoeth 4628 posts Joined 11/04
11 Apr 2015

Now I see, remove the "NULLS LAST", it's TD14.10 syntax and not needed in your case, it was a cut&paste error...

Dieter

proxyaya 7 posts Joined 04/15
13 Apr 2015

Big thank you ! It works !

You must sign in to leave a comment.