All Forums Database
Moutusi 38 posts Joined 03/13
06 Nov 2015
Minus Vs Exists

Hi,
I have two large multiset tables and I need to check if they are identical. In that case which one will give better performance: MINUS or NOT EXISTS?

Thanks, Moutusi
Tags:
ulrich 816 posts Joined 09/09
07 Nov 2015

Multiset - do you have duplicates?
If yes: both scenarios might not give you what you need.
Check:

create volatile multiset table vt_a (a integer) no primary index;
create volatile multiset table vt_b (a integer) no primary index;
insert into vt_a values (1);
insert into vt_a values (1);
insert into vt_a values (2);
insert into vt_b values (1);
insert into vt_b values (2);
insert into vt_b values (2);
select * from vt_a
minus 
select * from vt_b;
select * from vt_a a
 where not exists 
 ( select * from vt_b b where a.a = b.a)
;

Both queries give you an empty result set. But I think we agree that both tables are not identical!
You can export the ordered tables via tpt and run a SHA256 checksum and compare the checksum.
Check Example Java UDF for Table Hash Calculations
and
Calculation of Table Hash Values to Compare Table Content
The provided Java UDF will not work in your case as the Java UDF has not a good performance.
My impression is that the problem of "how to compare table content" is not so far taken serioisly in the past. Espacially in case of multi systems...
Ulrich

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

Moutusi 38 posts Joined 03/13
09 Nov 2015

Thanks Ulrich for your reply.
Those two tables are multi set but will not have duplicate rows. We are currenlt using hashrow function for CDC but in past eperienced hash collision.

Thanks,
Moutusi

ulrich 816 posts Joined 09/09
09 Nov 2015

In case of multiset tables your assumption should be that duplicates can exists. The DB will not prevent it...
You need real strong hash functions like SHA-256 to avoid the hash collisions. As stated in the presentation I would not use any hashes below md5 bit length...
A Teradata internal table hash function would be a good feature for future releases ;->

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

You must sign in to leave a comment.