All Forums Database
RandomGuy234 9 posts Joined 05/11
09 Jan 2012
Keep Duplicate Rows

Hello,

We are trying to load a table based on a huge flat-file (225 million rows) and there are several rows in there that are duplicates.  However, we need to keep every one of them.  We have tried using a Multiload but it took an hour to get through one million records.  So we either need a method of using FastLoad and keeping the duplicates or some method of greatly increasing the speed of our Multiload. 

I am not the database admin, so I do not have certain info readily available, but I should be able to find answers to any questions you may have that would help me solve this problem.

Any and all suggestions are greatly appreciated.

dnoeth 4628 posts Joined 11/04
09 Jan 2012

If it's a staging table and you're on TD13 you could use a No Primary Index (NoPI) table, FastLoad is capable of loading duplicates in this case.

Otherwise i wouldn't expect MLoad to be that slow, what's the size of the input records and the table DDL?

Dieter

Dieter

RandomGuy234 9 posts Joined 05/11
09 Jan 2012

The size of the input records is 106

The DDL is :

CREATE MULTISET TABLE DMODOR_STAGE.TB_MITS_ADJ ,NO FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

     (

      NO_TAX_ID CHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC,

      DA_PRD_FILE CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC,

      NO_DLN_BAS CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,

      NO_DLN_PAY CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,

      ADR_CITY_CODE_PAY CHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC,

      ADR_COUNTY_PAY CHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,

      CD_SITE_PAY CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC,

      CD_TAX_ITEM_PAY CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC,

      DA_PRD_FILE_ADJ CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC,

      NO_DLN_ADJ CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,

      CD_TYPE_ADJ CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC,

      AMT_FIELD_ADJ DECIMAL(12,2),

      CD_TYPE_TAX_ADJ CHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,

      DA_LAST_UPD_ADJ CHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC)

PRIMARY INDEX ( NO_TAX_ID );

ulrich 816 posts Joined 09/09
09 Jan 2012

Is NO_TAX_ID skewed?

How many vprocs do you have?

And regarding "it took an hour to get through one million records"

I guess this is referring to the acquisition phase - correct? In case yes, do you have enough bandwidth from your ETL server to the TD system?

 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

RandomGuy234 9 posts Joined 05/11
10 Jan 2012

The issue has been resolved using the NoPI option.  Thanks Dieter!

You must sign in to leave a comment.