All Forums > Database
cmedved 24 posts Joined 02/14
22 Aug 2014
DBQL Join Efficiency

Hi all. We use PDCR to archive our DBQL tables. The primary index on the tables is (logdate, procid, queryid) and they are partitioned by logdate. When we join the log table to the SQL table, the join condition is:

ON logtbl.LogDate = sqltbl.LogDate
AND logtbl.procid = sqltbl.procid
AND logtbl.queryid = sqltbl.queryid

We have a lot of single-amp queries at my site that we typically exclude when doing query analysis... we'll add a clause like:

WHERE logtbl.AMPCPUTime > (HASHAMP()+1) AND NumOfActiveAMPs > 1 AND logtbl.LogDate BETWEEN DATE-30 AND DATE
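Putting the join and the filter together, the query shape is roughly this (I'm using the standard PDCR history table names here; substitute your own archive tables if they differ):

```sql
-- Join the DBQL log history to the SQL text history on the full PI,
-- keeping only multi-AMP queries from the last 30 days.
-- PDCRDATA.DBQLogTbl_Hst / DBQLSqlTbl_Hst are the usual PDCR names.
SELECT logtbl.LogDate,
       logtbl.QueryID,
       logtbl.AMPCPUTime,
       sqltbl.SqlTextInfo
FROM   PDCRDATA.DBQLogTbl_Hst  logtbl
JOIN   PDCRDATA.DBQLSqlTbl_Hst sqltbl
  ON   logtbl.LogDate = sqltbl.LogDate
 AND   logtbl.ProcID  = sqltbl.ProcID
 AND   logtbl.QueryID = sqltbl.QueryID
WHERE  logtbl.AMPCPUTime > (HASHAMP()+1)
  AND  logtbl.NumOfActiveAMPs > 1
  AND  logtbl.LogDate BETWEEN DATE-30 AND DATE;
```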

This drastically cuts down the number of rows selected from the log table. However, Teradata scans both tables and joins them, keeping only the rows that match the criteria. The problem is that the SQL table is significantly larger than the log table; it would make more sense to scan the filtered log table and then do a PI join into the SQL table instead of scanning it.
Is it possible to write a query that forces this behavior? The optimizer seems to be "fixing" it for me :( I would like to avoid a multi-step process (such as a volatile table), since those are difficult or impossible to implement with some of the reporting tools we use, even though using a volatile table to look up rows in the SQL table shaves minutes off the query.
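For what it's worth, here is the kind of single-statement rewrite I've been trying: pre-filter the log table in a derived table and repeat the LogDate predicate on the SQL table so partition elimination can kick in on both sides. The optimizer still seems free to flatten the derived table, so I can't say this reliably changes the plan:

```sql
-- Attempt at a one-statement version: filter the log table first in a
-- derived table, then PI-join into the SQL table. Repeating the LogDate
-- range on sqltbl guarantees partition elimination on that side too.
-- (Table names assume standard PDCR history tables.)
SELECT flt.LogDate,
       flt.QueryID,
       sqltbl.SqlTextInfo
FROM   (SELECT LogDate, ProcID, QueryID
        FROM   PDCRDATA.DBQLogTbl_Hst
        WHERE  AMPCPUTime > (HASHAMP()+1)
          AND  NumOfActiveAMPs > 1
          AND  LogDate BETWEEN DATE-30 AND DATE) flt
JOIN   PDCRDATA.DBQLSqlTbl_Hst sqltbl
  ON   flt.LogDate = sqltbl.LogDate
 AND   flt.ProcID  = sqltbl.ProcID
 AND   flt.QueryID = sqltbl.QueryID
WHERE  sqltbl.LogDate BETWEEN DATE-30 AND DATE;
```

Checking the EXPLAIN output before and after is the only way I've found to confirm whether the derived table actually survives as a separate spool step.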
