All Forums Database
Siddesh 1 post Joined 07/12
04 Mar 2013
How to update the duplicate records in Teradata

I have a requirement to update the duplicate records in teradata. For E.G. if col, col, col, col have same values in the table, those records are regarded as duplicates and duplicate flag needs to be updated as '1' for all those records.

Any suggestions on how implement this logic?

05 Mar 2013

Hi Siddesh,
Is this your requirement - if there are 3 rows in a table having the same data in all the columns then you need
to mark only 2 of them as duplicates and one remains unchanged or you want to mark all 3 rows as
duplicates ?
 
if possible please provide a sample example
 
Regards
R.Rajeev

simhadrijk 18 posts Joined 04/10
06 Mar 2013

Hi Siddesh,
As commented by rajeev, sample would provide more clarity on your requirement.
on a brief note:
You can use a self left join to table with left joining table to be a derived table on main table with duplicated PI (if you are checking for complete duplicated row) and you can use case statements to compare the columns to mark the duplicate flag.
cheers 
Jagdish

Jagdish Kumar Simhadri

dnoeth 4628 posts Joined 11/04
06 Mar 2013

Hi Siddesh,
following sets the duplicate flag for all duplicate rows
UPDATE tab
SET dupflag = 1
WHERE (col1,col2,col3) IN
(SELECT col1,col2,col3 
 FROM tab
 GROUP BY 1,2,3
 HAVING COUNT(*) > 1
)

 

Of course those columns should be NOT NULL.

 

Depending on the percentage of duplicate rows it might be more efficient to the the calculation during an insert/select using an OLAP function:

 

CASE WHEN COUNT(*) OVER (PARTITION BY col1,col2,col3) > 1 THEN 1 ELSE 0 END

Dieter

Dieter

14 May 2013

What should be the UPDATE statement in case of below requirement.
Requirement - if there are 2 perfect duplicate rows in a table having the same data in all the columns then i need to update column1 of one row only. 
 

KS42982 137 posts Joined 12/12
14 May 2013

You can use row_number to assign different numbers to each perfectly duplicate rows and then update 1 columns of 1 row by putting filter condition like row_number = 1

You must sign in to leave a comment.