All Forums Database
td_admirer 36 posts Joined 07/12
23 Oct 2012
Help with SQL

Hello Experts,
Hope you are doing well.
 
I need some help determining the logic for the below scenario. The sample data is as follows
record    col1        dt             col3  col4      col5
===============================
 1        3202     09/27/2012      2     abc       ?
 2        3202     09/27/2012      4     abc       ?
 3        3202     09/27/2012      5     abc       ?
 4        3202     10/03/2012      5     abc       ?
 5        3202     10/13/2012      2     abc       ?
 6        3202     10/13/2012      3     abc       ?
 7        3202     10/13/2012      4     abc       ?
 8        3202     10/13/2012      5     abc       ?
 9        3202     10/14/2012      5     abc       ?
 10       3202     10/15/2012      4     abc      ?
 11       3202     10/16/2012      5     abc      ?
 
The desired output is as below. The logic to update col5 is as follows
(1) Only the first record (col5) for a combination of col1,dt,col3,col4 is updated and the remaining records for the same combination will always have NULL.
(2) Since there is no previous record for col4 for any day prior to 9/27, col5( record#1 ) is set to NULL.  Record #'s 2 and 3 will always be null.
(3) Looking at records 3 and 4, since the date difference is 6 days and the diff between col3 on those records ( 5 and 5) are same (or lower), it will be 6-1=5. If the record 3 had a col3 value of 4, it would have been 6. Again records 5,6,7,8 will be NULL based on rule 1.
(4) Looking at records 8 and 9, the dt difference is 1, but since col3 is same, the col5 value will be 0. Had col3 on record 8 were a 4 or lower, col5 on record 9 would have been set to 1
(5) Looking at records 9 and 10, the date difference is 1, but since col3 on record 10 is lower than the col3 value on 9, it will have a 0.
(6)  Looking at records 10 and 11, the date difference is 1, but since col3 on record 11 is greater than the col3 value on 10, it will have a the date differece which is 1.
record#   col1        dt               col3  col4      col5
===============================
1            3202     09/27/2012      2     abc       ?
2            3202     09/27/2012      4     abc       ?
3            3202     09/27/2012      5     abc       ?
4            3202     10/03/2012      5     abc       5
5            3202     10/13/2012      2     abc       9
6            3202     10/13/2012      3     abc       ?
7            3202     10/13/2012      4     abc       ?
8            3202     10/13/2012      5     abc       ?
9            3202     10/14/2012      5     abc       0
10          3202     10/15/2012      4     abc       0
11          3202     10/16/2012      5     abc       1
 
 
I am sorry, this is confusing, I wish I could explain it better. Any and all help on this would be greatly appreciated.
 
Thanks much.

WAQ 158 posts Joined 02/10
23 Oct 2012

Its always good to provide the DDL's and DML's in order to get the resolution quickly.

td_admirer 36 posts Joined 07/12
24 Oct 2012

Hello,
Good Morning!!
Thank you very much for looking into this.
Here is the DDL/DML . Kindly let me know if you need any further info.
CREATE volatile TABLE test  ,FALLBACK ,
      NO BEFORE JOURNAL,
      NO AFTER JOURNAL,
      CHECKSUM = DEFAULT,
      DEFAULT MERGEBLOCKRATIO
      (
       COL1 BIGINT NOT NULL,
       DT DATE FORMAT 'YYYY-MM-DD' NOT NULL,
       COL3 SMALLINT NOT NULL,
       COL4 VARCHAR(84) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
       COL5 INTEGER
       )
 PRIMARY INDEX ( COL3 )
 on commit preserve rows;
insert into test values(3202, date '2012-10-03',5,'abc',);
insert into test values(3202, date '2012-10-13',3,'abc',);
insert into test values(3202, date '2012-10-13',2,'abc',);
insert into test values(3202, date '2012-09-27',2,'abc',);
insert into test values(3202, date '2012-09-27',4,'abc',);
insert into test values(3202, date '2012-10-13',4,'abc',);
insert into test values(3202, date '2012-10-15',4,'abc',);
insert into test values(3202, date '2012-09-27',5,'abc',);
insert into test values(3202, date '2012-10-13',5,'abc',);
insert into test values(3202, date '2012-10-14',5,'abc',);
insert into test values(3202, date '2012-10-16',5,'abc',);
sel * from test
order by 1,2,3;

ulrich 816 posts Joined 09/09
24 Oct 2012

I guess rule 1 is not correct
as

sel COL1,dt,col3,col4 from vt_test
group by 1,3,4,2 
having count(*) > 1; 

gives null rows.
-> all rows are first occurance...
So try to explain it in different words...

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

td_admirer 36 posts Joined 07/12
24 Oct 2012

Hi Ulrich,
For a combination of col1,dt,col4 the record with col3 value "2" is the first (which is record# 1) for dt 9/27. For dt 10/13, record # 5 is the first.
In other words, these are the records that should be updated.
sel * from test
qualify row_number() over (partition by col1,dt,col4 order by col3)=1;
Please help.

dnoeth 4628 posts Joined 11/04
24 Oct 2012

Based on your narrative:

SELECT
  test.*,
  CASE
     WHEN ROW_NUMBER() OVER (PARTITION BY col1,dt,col4 ORDER BY col3) = 1 
     THEN (dt-MIN(dt) OVER (PARTITION BY col1 ORDER BY dt,col4,col3 ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING))
          - CASE
              WHEN col3-MIN(col3) 
                        OVER (PARTITION BY col1 
                              ORDER BY dt,col4,col3 
                              ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) <= 0 
              THEN 1 
              ELSE 0 
            END
  END
FROM test

Dieter

Dieter

td_admirer 36 posts Joined 07/12
24 Oct 2012

Thank you very much Dieter. This is exactly what I've wanted. I am running out of spool when when run on a table with a couple million records. Can you think of something that can give me a better performance? I will try it on my side as well.
Very much appreciate your help.
 
 

You must sign in to leave a comment.