All Forums Database
apoorv_jain 3 posts Joined 02/15
23 Feb 2015
Optimization of Insert....Select statement

Hi All,
We are inserting data in an empty table through 2 Insert...Select statements in our Bteq scripts One of which is Consuming a lot of CPU Time (2 hours )though it is inserting 3.6 Million rows.
We reached our DBA's for any possibility of Query Optimization.
DBA advised that 'second insert is inserting into a populated table. If this was supposed to be a multi-statement insert/select, which would make sense and run much faster, then the syntax needs to be very specific to that need'.
And advised us to change our Statements from:
Insert Into Select...... ; 
Insert Into Select......;

Insert Into Select...... ; Insert Into Select......;
The reason given by them is 'Once you remove the spaces optimizer thinks it’s one connected insert statement, not 2 separate inserts.' 
Now I am not sure how is this going to help because as soon as Parser encounters ';' It will send that part to Parser and a Explain will be created.

Can anyone please explain if I am correct or wrong.
Your help is much appreciated.
Thanks and Regards,
Apoorv Jain

Glass 225 posts Joined 04/10
23 Feb 2015

try running explain, you DBA is correct.
From Manual:

BTEQ does not submit any of the statements to Teradata Database until it

encounters a semicolon as the last nonblank character of a line.




apoorv_jain 3 posts Joined 02/15
24 Feb 2015

Hi Glass,
Thanks for your reply.
I generated the explain for the queries and it generated 2 separate explains for 
Insert Into Select...... ; Insert Into Select......;
Please Advise.

ravimans 54 posts Joined 02/14
24 Feb 2015

Can you give some more details?
-- What is the difference between the 2 inserts?
-- 2 inserts are selecting data from different base tables?
-- Usually which insert will have more volume of data?
-- Have you checked for collect stats in PI & JOIN columns?
-- As suggested by DBA is there any improvement in your performance?

apoorv_jain 3 posts Joined 02/15
25 Feb 2015

Please find below the answers to your questions:
1. The difference is in number of source tables, also we have a Union operation being performed in the Select statement of second Insert.
2. Yes 2nd Insert statement is selecting data from more number of tables.
3. the 2nd Insert statement.
4. Yes, the stats are collected before executing the Insert statements.
5. No, I haven't executed the query till now.

eejimkos 73 posts Joined 01/12
11 Mar 2015

Even if you do not provide a lot of details you must take under consideration three basic things.
1) If the target table is SET or not , if it is the result is normal - try to change it to multiset-
2)Before running the insert statement , calculate with hasbucket.....  as a select statement the skew of the result set.
3)Try to check and compare the two phase statements ,
i) which calculate the Cpu on calculating the result set
ii)the step where  you try to insert this result set on the table.
-do this either from viewpoint , or from dbqlog-step-.
Please may you send us at least , CPU , IO , SPOOL , CPU - IO SKEW 
thank you.

You must sign in to leave a comment.