All Forums Database
Jakob.E_SuperG 5 posts Joined 01/12
18 Jan 2012
Best way to update single column in a large table.

Hi.

I have a large table with 40+ columns. (1 billion rows)

The table looks something like this: (altered).

CREATE MULTISET TABLE CORE.SALES_TRANS_LINE ,
NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO
   (
   SK_SALES_TRANS_LINE_ID DECIMAL(18,0) NOT NULL,
   SK_TIME_DAY_ID INTEGER NOT NULL,
   VAL_STATUS_CD VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC,
   .
   .
   .
   40 other colomns.
   .
   .
   .
CONSTRAINT PK_SALES_TRANS_LINE PRIMARY KEY ( SK_SALES_TRANS_LINE_ID ))
PRIMARY INDEX ( SK_SALES_TRANS_LINE_ID )
PARTITION BY RANGE_N(SK_TIME_DAY_ID  BETWEEN
20100101  AND 20100131  EACH 1 ,
20100201  AND 20100228  EACH 1 ,
.
.
.
20131201  AND 20131231  EACH 1 , NO RANGE, UNKNOWN)
UNIQUE INDEX SI_ID123 ( BK_STORE_ID ,BK_POS_REG_ID ,BK_SALES_TRANS_ID ,BK_SALES_TRANS_LINE_NUM )
INDEX FI_122 ( SK_TIME_DAY_ID )
INDEX SI_124 ( VAL_STATUS_CD )
.
.
10 other INDEX's
.
.
;

The situation is now that 75% of the rows has to be updated in one of the columns:

UPDATE CORE.SALES_TRANS_LINE
SET VAL_STATUS_CD = 'NY' 
WHERE VAL_STATUS_CD = 'VM'  ;

But as you probably already have figured out, this will take for ever. An attemp by one of the developers took 7 hours, not completing and 12 hours to rollback :-)

What is the best and fastest way to do this update?

Hope you can help.

 

ulrich 816 posts Joined 09/09
18 Jan 2012

Do you have enough space for a copy of the table?

So insert select into an empty table having the same table definition and the Select is applying the update logic via a case statement...

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

ulrich 816 posts Joined 09/09
18 Jan 2012

and afterwards rename tables and drop old table etc...

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

ulrich 816 posts Joined 09/09
18 Jan 2012

P.S. Did you drop the

INDEX SI_124 ( VAL_STATUS_CD )

before the update?

P.S.S. -not advisable for a long term solution but can give you some time for the implementation. Did you consider to create a view where you present the new value via a case statement.

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

Jakob.E_SuperG 5 posts Joined 01/12
19 Jan 2012

Thanks Ulrich for your comments.

By the way we have Teradata 13.10 installed, so it is a quite new system.

Last thing first. As far as I know the INDEX on (VAL_STATUS_CD) where not removed before the update where attempted, we will test if that makes any significant change.

When it comes to the view, it will not be a long term solution, since the VM value will be used later on.

So if we have enough space you suggest;

- create temp table,
- insert correct values into temp table,
- drop target table
- rename temp table.

How about

- create temp table,
- insert only affected rows, with correct values into temp table,
- delete affected rows from target table,
- insert rows from temp table to target table.

Or will that not make any difference.

ulrich 816 posts Joined 09/09
19 Jan 2012

delete and insert will result in journal where insert/select into an empty table will not. 

As you are going to update 75% of the table I would expect that insert / select into empty table will be much faster...

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

Balamurugan B 81 posts Joined 09/07
19 Jan 2012

Hi,

 

If space is not a constraint, you can go for Insert/Select option, as explained by ulrich.

 

But if you have space constraint and looking for some other option, then try teradata MERGE option:

 - Create a work table with PK and column which you would like to update.

 - Insert into work table with new value

 - Perform Merge between work table and Target table

 

This would minimize downtime of Target table and faster than direct update.

 

Regards,

Balamurugan

 

Regards,
Balamurugan

You must sign in to leave a comment.