All Forums Database
Saarang84 16 posts Joined 08/14
06 Sep 2014
Optimization of a Join Query

Hi,
I've a join query statement in Teradata which I'm trying to optimize inorder to create a report. This query runs on a daily basis. There are 2 parts in the query (by two parts, I means two subqueries). Both subqueries are the same except for one date condition which is different in them. During business hours, this query as a whole takes about ~15 mins to run (I arrived at this time by calculating the individual run times of both the subqueries separately) and it is captured & aborted by the admin / SQL team. During off-business hours, this query runs faster within 2 or 3 mins.
Is there a way to optimize this query to run it during business hours ? Can the subqueries be combined to avoid the Left Outer Join? I'm using Teradata 13.11. The structure of the query is as follows :

Select X.field1, X.field2, X.field3, X.field4, …
Case when Z.field1 is null then ‘Y’
Else ‘N’
End as Flag
 
from
(
Select  A.field1, A.field2, B.field3, C.field4, …
from table1 A 
left join 
(select field1, field2, … from Table_B
where date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
) as B
on A.field1 = B.field1 and A.date_field = B.date_field

left join 
(select field1, field2, … from Table_C
where date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
) as C
on A.field1 = C.field1 and A.date_field = C.date_field

left join 
(select field1, field2, … from Table_D
where date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
qualify over ….
) as D
on A.field1 = D.field1 and A.date_field = D.date_field

where a.date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
and <condition_3>
and <condition_4>
and <condition_5>
) as X

left join
(
Select  A.field1
From table1 A 
left join 
(select field1, field2, … from Table_B
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
) as B
on A.field1 = B.field1 and A.date_field = B.date_field

left join 
(select field1, field2, … from Table_C
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
) as C
on A.field1 = C.field1 and A.date_field = C.date_field

left join 
(select field1, field2, … from Table_D
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
qualify over ….
) as D
on A.field1 = D.field1 and A.date_field = D.date_field

where a.date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
and <condition_3>
and <condition_4>
and <condition_5>
) as Z

on X.field1 = Z.field1

 

Sarang

Raja_KT 1246 posts Joined 07/09
06 Sep 2014

It 's difficult to say without explain plan. Is this just a query or a JI exists.... stats collected ....partitioning...PIs info.... and many more to look into.

Raja K Thaw
My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1
Street Children suffer not by their fault. We can help them if we want.

Saarang84 16 posts Joined 08/14
06 Sep 2014

I'll try to post the explain plan information. All I can say now is that the data is fetched from views which fetches huge volumes of data if the date field is not used.
 
How can this query be modified? Can someone help me out?

Sarang

dnoeth 4628 posts Joined 11/04
06 Sep 2014

Hi Sarang,
seems like you only want a flag to detemine if data from the previous week exists.
You could extract rows for both dates in a single query and then use an OLAP function to check if there's a row a week ago.

SELECT
   ...
   -- check if rows from both weeks exist
   CASE WHEN COUNT(*) 
             OVER (PARTITION BY field1) = 1 
        THEN 'N' 
        ELSE 'Y' 
   end
FROM 
  ( SELECT -- returns two rows if both dates exist
        date_field
    ...
    where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end   
       or date_field = case when day_of_week(date) = 2 then date-3 else date -1 end   
  ) AS dt
QUALIFY -- only rows from the current week
   date_field = case when day_of_week(date) = 2 then date-3 else date -1 end

As Raja said DDL plus Explain and/or data from QryLogStepsV might help.

Dieter

Saarang84 16 posts Joined 08/14
06 Sep 2014

Hi Dieter,
 
Thanks for your logic. In the tables that are used in my original query, the fields from the subquery Z are extracted based on the below date condition along with other conditions which are applied before the flag value is calculated.
 

where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end

 
Since the same set of tables are used in both the subqueries X and Z, field1 (which is applied in the join condition) will exist in both subqueries and the data extracted for the previous week may or may not occur based on the other conditions applied.
 
This being the case, how can the flag be calculated without the join being applied ?

Sarang

Saarang84 16 posts Joined 08/14
06 Sep 2014

Here's the original query structure that is use, but for privacy reason, I've masked the real field names...
 

SELECT 
Current.date_field,  Current.field1,  Current. field2, Current. field3, Current. field4, 
Current. field5, Current. field6, Current. field7, Current. field8, Current. field9, Current. field10, 
Current. field11, Current. field12, Current. field13, Current. field14, Current. field15, Current. field16, 
CASE
    WHEN Previous.field1 IS NULL THEN 'Y'
    WHEN Previous.field1 IS NOT NULL THEN 'N'
    ELSE NULL
END AS flag,
 (Current.date_field -Current. date_field2) AS calc_field2, 
CASE
    WHEN flag = 'N' THEN 
CASE WHEN  calc_field2 > 90 THEN ‘Red’
          ELSE calc_field2 between 85 AND 90 THEN  'Yellow'
          ELSE ‘Green’
END
    WHEN flag = 'Y' THEN 
CASE 
       WHEN  calc_field2 >= 85 THEN ‘Yellow’
       ELSE 'Green'
END
END AS calc_field3

FROM
(
SELECT A.date_field, A.field1, A.date_field2, A.field3, A. field4, A. field5, G.field6, H.field7, I.field1, I.field2, I.field3, I.field4, J.field1, F.field2, F.field3, F. field4, F. field5, F. field6, F. field7, D.field2 FROM
A1 AS A
LEFT JOIN (
SELECT field1, field2, field3, field4, field5, field6, field7 FROM B1
QUALIFY ROW_NUMBER() OVER (PARTITION BY  field1, field2, field3, field4, field5, field6 ORDER BY field7) = 1 ) AS F
ON A. field7 = F. field1
LEFT JOIN (
SELECT * FROM C1
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, date_field ORDER BY field3) =1 ) AS D
ON A.field1 = D.field1 AND A.date_field = D.date_field
LEFT JOIN
      G1 AS G
      ON A.field1 = G.field1 AND A.date_field = G.date_field 
LEFT JOIN
      H1 AS H
      ON A.field1 = H.field1 
LEFT JOIN
      I1 AS I		
      ON A.field1 = I.field1
LEFT JOIN
      J1 AS J
      ON A.field1 = J.field1 
WHERE
A.date_field =
CASE 
       WHEN day_of_week(DATE) = 2 THEN DATE - 3
       ELSE DATE - 1
END
AND field4 > 0
AND COALESCE(A.field10,'') NOT IN ('A')
AND F.field2 IN ('Arnold Schwarzenegger')  
AND (A.date_field – A.date_field2) <= 60  --Not over 60 days 
AND A.field1 IN  (
    SELECT field1 FROM (
       SELECT * FROM (  
        SELECT  field1, field2, field3, field4, date_field FROM  X WHERE
        date_field =  ( CASE  WHEN day_of_week(DATE) = 2 THEN DATE - 3 ELSE DATE - 1 END )
        AND field2 IN ('RECRCV', 'RECFCL', 'RECRTN')
        QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, field2 ORDER BY field3 DESC, field4 DESC) =1  
    ) AS BA
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY field3 DESC, field4 DESC) = 1  ) AS BAA
WHERE field 2IN ('RECRCV')  )
) AS Current

LEFT JOIN

(
SELECT A.date_field, A.field1, A.date_field2, A.field3, A. field4, A. field5, G.field6, H.field7, I.field1, I.field2, I.field3, I.field4, J.field1, F.field2, F.field3, F. field4, F. field5, F. field6, F. field7, D.field2 FROM
A1 AS A
LEFT JOIN (
SELECT field1, field2, field3, field4, field5, field6, field7 FROM B1
QUALIFY ROW_NUMBER() OVER (PARTITION BY  field1, field2, field3, field4, field5, field6 ORDER BY field7) = 1 ) AS F
ON A. field7 = F. field1
LEFT JOIN (
SELECT * FROM C1
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, date_field ORDER BY field3) =1 ) AS D
ON A.field1 = D.field1 AND A.date_field = D.date_field
LEFT JOIN
      G1 AS G
      ON A.field1 = G.field1 AND A.date_field = G.date_field 
LEFT JOIN
      H1 AS H
      ON A.field1 = H.field1 
LEFT JOIN
      I1 AS I		
      ON A.field1 = I.field1
LEFT JOIN
      J1 AS J
      ON A.field1 = J.field1 
WHERE
A.date_field =
CASE 
       WHEN day_of_week(DATE) = 2 THEN DATE - 10
       ELSE DATE - 8
END
AND field4 > 0
AND COALESCE(A.field10,'') NOT IN ('A')
AND F.field2 IN ('Arnold Schwarzenegger ')  
AND (A.date_field – A.date_field2) <= 60  --Not over 60 days 
AND A.field1 IN  (
    SELECT field1 FROM (
       SELECT * FROM (  
        SELECT  field1, field2, field3, field4, date_field FROM  X WHERE
        date_field =  ( CASE  WHEN day_of_week(DATE) = 2 THEN DATE - 10 ELSE DATE - 8 END )
        AND field2 IN ('RECRCV', 'RECFCL', 'RECRTN')
        QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, field2 ORDER BY field3 DESC, field4 DESC) =1  
    ) AS BA
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY field3 DESC, field4 DESC) = 1  ) AS BAA
WHERE field 2IN ('RECRCV')  )
) AS Previous

ON Current.field1 = Previous.field1  ;

 

Sarang

dnoeth 4628 posts Joined 11/04
07 Sep 2014

As both Derived Table use exactly the same conditions (besides dat_col) you can use the approach I showed. Fold both Derived Tables into one using an or-ed condition on date_col and add the date_col to your QUALIFY to get both rows:

QUALIFY ROW_NUMBER()
        OVER (PARTITION BY field1, date_col 
              ORDER BY field3 DESC, field4 DESC) = 1  ) AS BAA

 
 

Dieter

You must sign in to leave a comment.