All Forums Database
Indyan 4 posts Joined 06/15
30 Jun 2015
Replace NULL Values in Table with Most Recent NOT NULL values

I have a table with the following layout

cust_acct_id (PK)| effect_dt | expiry_dt | curr_rec_ind | field_a | field_b | field_c | field_d

I have to populate any NULL values in field_a, field_b, field_c, or field_d with the most recent value received for that field for the same cust_acct_id.

Sample Input:

cust_acct_id    effect_dt   expiry_dt  curr_rec_ind   field_a  field_b  field_c  field_d
           1   2013-01-01  2013-01-31             N   abc      NULL     a        NULL
           2   2013-01-01  9999-12-31             Y   NULL     xy       a        abc
           3   2013-01-01  2013-02-28             N   a        b        NULL     d
           1   2013-02-01  9999-12-31             Y   NULL     NULL     b        NULL
           4   2013-02-01  9999-12-31             Y   x        NULL     a        NULL
           3   2013-03-01  2013-03-31             N   NULL     c        yy       NULL
           3   2013-04-01  9999-12-31             Y   x        NULL     NULL     NULL

Sample Output

cust_acct_id    effect_dt   expiry_dt  curr_rec_ind   field_a  field_b  field_c  field_d
           1   2013-01-01  2013-01-31             N   abc      NULL     a        NULL
           2   2013-01-01  9999-12-31             Y   NULL     xy       a        abc
           3   2013-01-01  2013-02-28             N   a        b        NULL     d
           1   2013-02-01  9999-12-31             Y   abc      NULL     b        NULL
           4   2013-02-01  9999-12-31             Y   x        NULL     a        NULL
           3   2013-03-01  9999-12-31             Y   a        c        yy       d
           3   2013-04-01  9999-12-31             Y   x        c        yy       d

Please let me know how to construct the update query to achieve this. I am not sure whether this can be done through 1 SQL or it needs a separate update query for each of the fields (field_1/field_2...).
 

Indyan 4 posts Joined 06/15
30 Jun 2015

I'm sorry. I am new to the forum. The sample data above looks all messed up. I'm not sure how to publish it with correct formatting.

dnoeth 4628 posts Joined 11/04
30 Jun 2015

You need to put tabular data as a "code snippet".
Regarding your question, this has been asked several times during the last day :-)
Problem with history table
Compare current row with previous NOT NULL row in Teradata.
 
If you're on TD14.10 you can use LAST_VALUE:

LAST_VALUE(field_a IGNORE NULLS)
OVER (PARTITION BY cust_acct_id
      ORDER BY effect_dt
      ROWS UNBOUNDED PRECEDING)

 Before you need to use RESET WHEN, which is way less performant.

Dieter

Indyan 4 posts Joined 06/15
30 Jun 2015

I had tried using LAST_VALUE in DEV and that was working. Unfortunately, in PROD we still use TD 14.0, and the upgrade to TD 15.0 is still a few months away. So I was stumped. Thanks for the tip about RESET WHEN.

Now, we have a couple of more conditions:

- ZZ in string fields and 0 in numeric fields are equivalent to NULL.

Here's how I was trying to do it with LAST_VALUE
 

SELECT cust_acct_id, effect_dt, exp_dt, field_a
   ,LAST_VALUE (case field_a  when 'ZZ' then NULL ELSE field_a END IGNORE NULLS)  OVER (PARTITION by CUST_ACCT_ID ORDER BY effect_dt) 
   FROM adwdmd_car_work1.last_value;

With RESET WHEN, this is what I am thinking of doing.

     select
        cust_acct_id, effect_dt, exp_dt,field_a, 
        min(field_a)
        over (partition by CUST_ACCT_ID
              order by effect_dt
              reset when field_a is not null and  field_a <> 'ZZ'
              rows unbounded preceding)
     from adwdmd_car_work1.last_value;

 
- The second part is that sometimes a cust_acct_id changes. In such cases, we get the old cust_acct_id in a field called old_cust_acct_id. After we have updated field_a with the above queries, we need to run another update query which will attempt to replace NULL values in field_a with last value obtained where cust_acct_id=old_cust_acct_id.
Example is shown below where cust_acct_id 3 changes to 33.
INPUT

cust_acct_id 	old_cust_acct_id	 effect_dt 	 expiry_dt 	 curr_rec_ind 	 field_a 	 field_b 	 field_c 	 field_d
1	NULL	2013-01-01	2013-01-31	N	abc	NULL	a	NULL
2	NULL	2013-01-01	9999-12-31	Y	NULL	xy	a	abc
3	NULL	2013-01-01	2013-02-28	N	a	b	NULL	d
1	NULL	2013-02-01	9999-12-31	Y	NULL	NULL	b	NULL
4	NULL	2013-02-01	9999-12-31	Y	x	NULL	a	NULL
3	NULL	2013-03-01	2013-03-31	N	NULL	c	yy	NULL
3	NULL	2013-04-01	9999-12-31	Y	x	NULL	NULL	NULL
33	3	2013-05-05	9999-12-31	Y	NULL	NULL	a	NULL

 
 
OUTPUT

cust_acct_id 		 effect_dt 	 expiry_dt 	 curr_rec_ind 	 field_a 	 field_b 	 field_c 	 field_d
1		2013-01-01	2013-01-31	N	abc	NULL	a	NULL
2		2013-01-01	9999-12-31	Y	NULL	xy	a	abc
3		2013-01-01	2013-02-28	N	a	b	NULL	d
1		2013-02-01	9999-12-31	Y	abc	NULL	b	NULL
4		2013-02-01	9999-12-31	Y	x	NULL	a	NULL
3		2013-03-01	9999-12-31	Y	a	c	yy	d
3		2013-04-01	9999-12-31	Y	x	c	yy	d
33	3	2013-05-05	9999-12-31	Y	x	c	a	c

This part has me stumped.
 
 
 
 
 

dbillingslea 1 post Joined 06/15
30 Jun 2015

would the old_cust_acct_id always be 3 for cust_acct_id = 33?  could you just coalesce the value in the partition?
 
i.e.
 

SELECT
cust_acct_id
,effect_dt
,expiry_dt
,LAST_VALUE(CASE WHEN field_a = 'ZZ' THEN NULL ELSE field_a END IGNORE NULLS)
 OVER (PARTITION BY COALESCE(old_cust_acct_id,cust_acct_id)
       ORDER BY effect_dt
       ROWS UNBOUNDED PRECEDING) AS new_field_a
FROM tst_cust_t
ORDER BY 2,1
;
CUST_ACCT_ID	EFFECT_DT	EXPIRY_DT	new_field_a
1	1/1/2013	1/31/2013	abc
2	1/1/2013	12/31/9999	?
3	1/1/2013	2/28/2013	a
1	2/1/2013	12/31/9999	abc
4	2/1/2013	12/31/9999	x
3	3/1/2013	3/31/2013	a
3	4/1/2013	12/31/9999	x
33	5/5/2013	12/31/9999	x

 

Indyan 4 posts Joined 06/15
01 Jul 2015

No.
The source will always send us value in old_cust_acct_id when the cust_acct_id is getting converted. But, henceforth they will send is NULL in the value.
If I use COALESCE, it wont work for the converted records in which old_cust_acct_id is now NULL.
 
However, I can achieve what I want, by running 2 queries - first time with partition on CUST_ACCT_ID and second time with partition on COALESCE(OLD_CUST_ACCT_ID, UST_ACCT_ID).
 
Please let me know if there's any way to do this with a single query, otherwise I'll adopt the above 2-step approach.
 
Thanks for the help.

You must sign in to leave a comment.