All Forums General
Soumyatg 9 posts Joined 07/08
08 Aug 2016
Improve performance on an analytical query , mulitple columns using same set of partitions
Hi,


I have a large query where there are two window sets, one over year and another over quarter. I think that since teh parttion by and order by clause is same for 1 set of column and another set of columns, this query can be re-written to avoid the partioning on rows being done again and again. Below is the query:

 
Sel yrmthId,
acctId,
CustId,
CurrencyId,
PartyId,
MthId,
---------
---------
---------,
sum(abc1) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C1,
sum(abc2) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C2,
sum(abc3) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C3,
sum(abc4) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C4,

-----------
sum(abc10) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C10,
------------------
sum(abc1) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C11,
sum(abc2) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C21,
sum(abc3) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C31,
sum(abc4) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C41,

-----------
sum(abc10) over (partition by yrId,qtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C10
from <volatile table >
join calendar table
 
How can we rwrite this query to avoid the partition being done again. As I udnertsand that the optimizer must have to repeat this step extensively.any suggestions?
 
Thanks in advance!
 
 
 
 
 

Soumyatg
ToddAWalter 316 posts Joined 10/11
08 Aug 2016

Can you provide explain please?

Johannes Vink 28 posts Joined 08/14
08 Aug 2016

And what would a simple sum/group by do? I sometimes think that an analytical fuction is executed seperately. It is cool stuff, sometimes performs like crazy. But in some case it doesn't.

Soumyatg 9 posts Joined 07/08
08 Aug 2016
We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by
     way of an all-rows scan into Spool 6 (Last Use), which is assumed
     to be redistributed by value to all AMPs.  The result rows are put
     into Spool 4 (all_amps) (compressed columns allowed), which is
     built locally on the AMPs.  The size is estimated with no
     confidence to be 2 rows (562 bytes).
  5) We do an all-AMPs STAT FUNCTION step from Spool 4 (Last Use) by
     way of an all-rows scan into Spool 9 (Last Use), which is assumed
     to be redistributed by value to all AMPs.  The result rows are put
     into Spool 1 (group_amps), which is built locally on the AMPs.
     The size is estimated with no confidence to be 2 rows (690 bytes).

Because of security reasons and lack of access I cannot pull the actual explain but in the DEv env with no data, it seems like the optimizer is using STAT function twice with an all AMP operation. I know in production this step takes close to 1 hour along with insert into target.

Soumyatg

ToddAWalter 316 posts Joined 10/11
09 Aug 2016

As you can see, optimizer combines multiple ordered analytic functions with the same partition and order into a single execution.   Two different ones are required because there are two different partition lists.
 
I can't comment on the time without data on size of the source data and more details on what steps take the time. If the source data is very large, then this query will be expensive because it will create a result as large as the source which has to be sorted and computed upon.

yuvaevergreen 93 posts Joined 07/09
11 Aug 2016

Whats the volume of the data in the table and what is the join type

dnoeth 4628 posts Joined 11/04
12 Aug 2016

This is the best plan you can get, the optimizer combines all OLAP steps with the same partition/order.
 
Standard SQL syntax supports a WINDOW clause (not supported in Teradata) which avoids repeating the window definition, but it's just syntactic sugar.

Dieter

You must sign in to leave a comment.