All Forums Tools
chk32 14 posts Joined 11/13
09 Jul 2015
Why TPT mload is taking long time than bteq for huge volume of data

Hi ,
 
I am deleting 80 millions of records from a table , for this tpt is taking 12 hours to complete where as bteq deleting data in just 2 hours. Actually tpt should be more faster than the bteq.  Please suggest me how can i avoid this long run thrugh tpt.

feinholz 1234 posts Joined 05/08
10 Jul 2015

Are you using the Delete Task to delete the records?
Or the generic DELETE FROM . . . ?

--SteveF

chk32 14 posts Joined 11/13
12 Jul 2015

Hi Steve, I am physically deleting the data from the table.
 

feinholz 1234 posts Joined 05/08
13 Jul 2015

For that many rows, maybe try the DELETE Task.

--SteveF

chk32 14 posts Joined 11/13
13 Jul 2015

Hi Steve, pelase let me know what exactly is delete task.My TPT script includes apply which deletes data, exported by export operator. Is this delete is different to your delte task

feinholz 1234 posts Joined 05/08
13 Jul 2015

Please send me the script you are using currently to delete those 80 million rows.
 

--SteveF

chk32 14 posts Joined 11/13
13 Jul 2015

Hi Steve, I couldnt send the exact script due to some privacy problems. Please find below structure of my tpt script. 

Update operator detials

 

Define Operator Export

Type Export

Schema {tbshort}_Schema

Attributes

(logon details

,varchar SelectStmt = 'Select *

                       From

                         tableA

where conditions

                      ;'

Step Apply_Delta

( DELETE FROM tableA

where key_columns= :key_columns

             ;

            )

   End

 To Operator (Update[1])

 Select * From Operator Export[1]);

 

feinholz 1234 posts Joined 05/08
14 Jul 2015

The only privacy issues probably involve any password and you can always just remove that text.
I need to see all of the other attributes you have set up in the script.
What is this script attempting to do?
You are using the Export operator to read data from a file, in order to provde values for the WHERE clause in the DELETE?
The "Delete Tak" option in the Update operator is probably more what you are looking for. Please look at the "Delete Task Option" in the TPT Reference Manual (Chapter 16) for information.

--SteveF

chk32 14 posts Joined 11/13
14 Jul 2015

Hi Steve,
I am not using using the parameter Deletetask='Y' as I am also inserting few records in to table with the same script. But number of records that are being inserted are minimal compared to the delete records. Pls look below script parameters I am using. 
 

Define Operator Update_Operator

Description 'TPT  Update Operator'

Type Update

Schema Schema

Attributes

(integer maxsessions = 12

,integer minsessions = 2

,integer maxdecimaldigits = 38

,varchar targettable = 'TABLEA'

,varchar tdpid = '{tdp}'

,varchar username = 'username'

,varchar userpassword = 'password'

,varchar errortable1 = 'ETTABLE'

,varchar errortable2 = 'UVTABLE'

,varchar logtable = 'LTTABLE'

,varchar worktable = 'WTTABLE'

,varchar workingdatabase = 'LOGDATABASE'

);

 

Define Operator Export_Operator

Description 'TPT  Export Operator'

Type Export

Schema Schema

Attributes

(integer maxSessions = 12

,integer minSessions = 2

,integer maxdecimaldigits = 38

,varchar tdpid = '{tdp}'

,varchar username = 'username'

,varchar userpassword = 'password' 

,varchar workingdatabase = 'LOGDATABASE'

,varchar SelectStmt = 'Select  *  From  TABLEB Where condtions

                      ;'

);

 

 

Step Apply_Delta

(Apply

   Case

     When (   Delta_Action_Code = 'D'

          )

       Then ('Delete from TABLEA

 where Key=:Key

 and From_Date = :From_Date

 

              ;'

            )

   End

 To Operator (Update_Operator [1])

 Select * From Operator (Export_Operator [1]);

);

 

);

 

feinholz 1234 posts Joined 05/08
15 Jul 2015

I do not see where you specify that rows are being inserted.
The only DML statements I see is a DELETE.
 

--SteveF

chk32 14 posts Joined 11/13
15 Jul 2015

Hi Steve, There are two queries one with only delete and other with both delete and insert. I can add Delete task='Y' for the query with delete statement. But can i add deletetask for TPT script with both Insert and delete?
But here my question is why normal Bteq is taking considerable less time than TPT. Is that because of extra step TPTEXP which exports data before deleting. 
Please help me in understanding this.
Pls find script with both delete and insert statements.

Define Operator Update_Operator
Description 'TPT  Update Operator'
Type Update
Schema Schema
Attributes
(integer maxsessions = 12
,integer minsessions = 2
,integer maxdecimaldigits = 38
,varchar targettable = 'TABLEA'
,varchar tdpid = '{tdp}'
,varchar username = 'username'
,varchar userpassword = 'password'
,varchar errortable1 = 'ETTABLE'
,varchar errortable2 = 'UVTABLE'
,varchar logtable = 'LTTABLE'
,varchar worktable = 'WTTABLE'
,varchar workingdatabase = 'LOGDATABASE'
);
 
Define Operator Export_Operator
Description 'TPT  Export Operator'
Type Export
Schema Schema
Attributes
(integer maxSessions = 12
,integer minSessions = 2
,integer maxdecimaldigits = 38
,varchar tdpid = '{tdp}'
,varchar username = 'username'
,varchar userpassword = 'password' 
,varchar workingdatabase = 'LOGDATABASE'
,varchar SelectStmt = 'Select  *  From  TABLEB Where condtions
                      ;'
);
 
 
Step Apply_Delta
(Apply
   Case
     When (   Delta_Action_Code = 'D'
          )
       Then ('Delete from TABLEA
 where Key=:Key
 and From_Date = :From_Date
 
              ;'
            )
when (   Delta_Action_Code = 'A'
          )
       Then ('Insert into TABLEA
 values (Key=:Key,From_Date = :From_Date and rest...);')
   End
 To Operator (Update_Operator [1])
 Select * From Operator (Export_Operator [1]);
);
 
);

ToddAWalter 316 posts Joined 10/11
15 Jul 2015

You haven't described the BTEQ statement that you are comparing to or provided the explain plan for it but I assume since you have the list of rows to delete in tableb that you are doing a join delete.
The join delete will run completely parallel inside Teradata. If both tables have the same primary index/PPI then a spool file will not even be required.
The TPT job will have to spool tableA (80M), prepare it into an export spool (80M), export 80M rows to a client incurring serialization in the network and client, load 80M back into Teradata into an Update operator work table (80M) also experiencing serialization in the client and network, sort the work table (80M row sort), then use the list to delete from TableB.
If you have the list of rows to delete already prepared in Teradata, it is much more efficient to use a direct set based SQL operation than to unload and load/apply.

feinholz 1234 posts Joined 05/08
16 Jul 2015

Thanks Todd!
For a pure delete, I was going to steer this user into invoking the MultiLoad-protocol Delete Task (an application phase only operation) which is must faster than the approach currently taken, but still not sure how that would compare to a simple BTEQ Delete.
Does Journaling also play a factor?
 

--SteveF

chk32 14 posts Joined 11/13
16 Jul 2015

Thanks Steve and Todd, 

My Bteq query looks like below. 

 

Delete from TableA A

,TableB B

Where

A.Key=B.Key

And A.From_Date = B.From_Date

And  B.Action_Code = 'D'

;

 

.If Errorcode <> 0 Then .Quit 16

 

Insert into TableA

ColumnList

Select

Columnlist

 From TableB  B

 Where

      B.Action_Code = 'A'

;

 

I have verified the teradata view point for my TPT run, 12 sessoins are invoking in that 6 sessons are idle , one session is in responding state which does TPTEXP, other 5 does the TPTUPD which are active. The session with TPTEXP uses partition DBC/SQL. Is this correct? I expected that TPTEXP should use FEXP rather than the DBC/SQl Where TPTUPD uses parition as Mload.
 

Is my teradata configuration is wrong for the TPTEXP? Please comment on it .

THanks in advance.

feinholz 1234 posts Joined 05/08
16 Jul 2015

Based on your job scripts, the Export operator will connect 12 data sessions (and 1 SQL session for executing the SELECT statement).
The Update operator will also connect 12 data sessions (and 2 additional SQL sessions for submitting the DML statements and for managing the restart log table).
You should check your output log to see if TASM changed the number of sessions that were actually connected.
Is it possible that you saw idle sessions for the Export operator while the spool was being built, and prior to the data sessions actually extracting the data from the DBS?
(Also, you can remove the "maxdecimaldigits=38" from the Update operator definition; it serves no purpose.)
 

--SteveF

dnoeth 4628 posts Joined 11/04
16 Jul 2015

Why do you want to use TPT UPDATE if you know it's faster with SQL DELETE/INSERT?
There's only one advantage of TPT UPDATE:
It needs only a single scan of the table vs. two scans for two SQLs.
 
The PIs of source and target must match for TPT UPDATE to run successfully, so this is also best case for SQL.
If you switch to a MERGE instead of the INSERT, it will apply direct inserts without spool (similar to the SQL DELETE Todd explained):

merge into TableA a using TableB b
on A.Key=B.Key -- needed for MERGE
and B.Action_Code = 'A'
and 1=0 -- dummy conditon for unconditional inserts
when not matched then
insert (key, ...)
values (b.key, ...)
;

 
Now put both DELETE/MERGE in a TPT DDL operator and you're still using TPT, just faster :-)

Dieter

chk32 14 posts Joined 11/13
19 Jul 2015

Thanks Dieter, Steve . It helped me a lot.

You must sign in to leave a comment.