All Forums General
td_user001 16 posts Joined 06/15
04 Jun 2015
Tuning query

select
         c.call_start_dt
        ,c.call_start_tm
        ,c.call_dir_ind
        ,coalesce(c.originating_num,9999999999)
        ,c.originating_prefix
        ,coalesce(c.terminating_num,9999999999)
        ,c.charged_duration
        ,c.used_duration
        ,c.imei_nbr
        ,c.call_completion_cd
        ,c.subscription_id
        ,c.account_id
        ,c.account_category_type
        ,c.product_item_id
        ,c.network_service_id
        ,c.period_cd
        ,c.bill_year
        ,c.bill_month
        ,c.bill_cycle
        ,c.run_dt
        ,c.tariff
        ,c.charge_amt
        ,coalesce(c.postpaid_seq,0)
        ,coalesce(c.prepaid_seq,0)
        ,'xrx' as source
        ,case
        when c.postpaid_seq is not null then 'post'
        when c.prepaid_seq is not null then 'prepaid'
        end as source_system
     ,case when c.prepaid_seq is not null then c.postpaid_seq else c.prepaid_seq end
    from
    VW_XEROX.CALL_DETAILS c                            
where c.call_start_dt between cast('2015-01-28' as date) and (cast('2015-03-29'  as date) + 31)
and c.st_ind <> 'Q';

  1) First, we lock XEROX.CALL_DETAILS in view VW_XEROX.CALL_DETAILS
     for access.
  2) Next, we do an all-AMPs RETRIEVE step from 92 partitions of
     XEROX.CALL_DETAILS in view VW_XEROX.CALL_DETAILS with a
     condition of ("(XEROX.CALL_DETAILS in view
     VW_XEROX.CALL_DETAILS.Call_Start_Dt >= DATE '2015-01-28') AND
     ((XEROX.CALL_DETAILS in view VW_XEROX.CALL_DETAILS.Call_Start_Dt
     <= DATE '2015-04-29') AND (XEROX.CALL_DETAILS in view
     VW_XEROX.CALL_DETAILS.st_Ind <> 'Q'))") into Spool 1
     (group_amps), which is built locally on the AMPs.  The input table
     will not be cached in memory, but it is eligible for synchronized
     scanning.  The result spool file will not be cached in memory.
     The size of Spool 1 is estimated with low confidence to be
     1,246,442,443 rows (479,880,340,555 bytes).  The estimated time
     for this step is 6 minutes and 27 seconds.
  3) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 6 minutes and 27 seconds.

CALL_DETAILS table have ..
PRIMARY INDEX ( Call_Start_Dt ,Call_Start_Tm )
PARTITION BY RANGE_N(Call_Start_Dt  BETWEEN DATE '2014-12-01' AND DATE '2015-06-07' EACH INTERVAL '1' DAY );

sel count(*) from call_details ;  --2,864,035,358
sel count(*) from Call_details  where st_ind <> 'Q' ;  -- 2,863,285,765
sel count(*) from call_details  where st_ind ='Q' ;  --749,594

Problem:
insert into xerox.tmp_call_dtl
(
...
)
select
         c.call_start_dt
        ,c.call_start_tm
        ,c.call_dir_ind
        ,coalesce(c.originating_num,9999999999)
        ,c.originating_prefix
        ,coalesce(c.terminating_num,9999999999)
        ,c.charged_duration
        ,c.used_duration
        ,c.imei_nbr
        ,c.call_completion_cd
        ,c.subscription_id
        ,c.account_id
        ,c.account_category_type
        ,c.product_item_id
        ,c.network_service_id
        ,c.period_cd
        ,c.bill_year
        ,c.bill_month
        ,c.bill_cycle
        ,c.run_dt
        ,c.tariff
        ,c.charge_amt
        ,coalesce(c.postpaid_seq,0)
        ,coalesce(c.prepaid_seq,0)
        ,'xrx' as source
        ,case
        when c.postpaid_seq is not null then 'POST'
        when c.prepaid_seq is not null then 'PREPAID'
        end as source_system
     ,case when c.prepaid_seq is not null then c.postpaid_seq else c.prepaid_seq end
    from
    vw_xerox.call_details c                            
where c.call_start_dt between cast('2015-01-28' as date) and (cast('2015-03-29'  as date) + 31)
and (
c.st_ind <> 'Q'
and (src_system,src_batch_nr)  in
            (
                select src_system,src_batch_nr from xer.xerox_load_status_hist
                where (run_dt <= date '2015-03-29'  and process_dt is not null)
                or ( active_ind = 'Y' and run_dt = cast('2015-03-29' as date) and process_dt is null)
            )
and (src_system,src_batch_nr, ins_id,upd_id) not in
            (
                select src_system,src_batch_nr,  ins_id,upd_id from xer.xerox_loaded_usg
            )
 )    
;

---Insert completed. 13760293 rows added.
---Total elapsed time was 23 minutes and 27 seconds.

The above query is taking too much time. Need your suggestion to tune it.

additional info:
CALL_DETAILS table have ..
PRIMARY INDEX ( Call_Start_Dt ,Call_Start_Tm )
PARTITION BY RANGE_N(Call_Start_Dt  BETWEEN DATE '2014-12-01' AND DATE '2015-06-07' EACH INTERVAL '1' DAY );

sel count(*) from call_details ;  --2,864,035,358
sel count(*) from Call_details  where st_ind <> 'Q' ;  -- 2,863,285,765
sel count(*) from call_details  where st_ind ='Q' ;  --749,594

      

dnoeth 4628 posts Joined 11/04
04 Jun 2015

What's the Explain of the Insert/Select?
Did you check the QueryLog which steps are slow?

Dieter

td_user001 16 posts Joined 06/15
05 Jun 2015

Thanks for your response. Here's the plan. Your suggestion will really help. Thanks once again in advance for giving your valuable time.
Problem:
insert into xerox.tmp_call_dtl
(
...
)
SELECT
         C.CALL_START_DT
        ,C.CALL_START_TM
        ,C.CALL_DIR_IND
        ,COALESCE(C.ORIGINATING_NUM,9999999999)
        ,C.ORIGINATING_PREFIX
        ,COALESCE(C.TERMINATING_NUM,9999999999)
        ,C.CHARGED_DURATION
        ,C.USED_DURATION
        ,C.IMEI_NBR
        ,C.CALL_COMPLETION_CD
        ,C.SUBSCRIPTION_ID
        ,C.ACCOUNT_ID
        ,C.ACCOUNT_CATEGORY_TYPE
        ,C.PRODUCT_ITEM_ID
        ,C.NETWORK_SERVICE_ID
        ,C.PERIOD_CD
        ,C.BILL_YEAR
        ,C.BILL_MONTH
        ,C.BILL_CYCLE
        ,C.RUN_DT
        ,C.TARIFF
        ,C.CHARGE_AMT
        ,COALESCE(C.POSTPAID_SEQ,0)
        ,COALESCE(C.PREPAID_SEQ,0)
        ,'XRX' AS SOURCE
        ,CASE
        WHEN C.POSTPAID_SEQ IS NOT NULL THEN 'POST'   -- INTEGER COLUMN CAN USE >= 0
        WHEN C.PREPAID_SEQ IS NOT NULL THEN 'PREPAID' -- INTEGER COLUMN CAN USE >= 0
        END AS SRC_SYSTEM
     ,CASE WHEN C.PREPAID_SEQ IS NOT NULL THEN C.POSTPAID_SEQ ELSE C.PREPAID_SEQ END -- INTEGER COLUMN CAN USE >= 0
    FROM
    VW_XEROX.CALL_DETAILS C                            
WHERE C.CALL_START_DT BETWEEN CAST('2015-01-28' AS DATE) AND (CAST('2015-03-29'  AS DATE) + 31)
AND (
C.ST_IND <> 'Q' --- SCOPE FOR IMPROVMENT BY USING IN ()
AND (SRC_SYSTEM,SRC_BATCH_NR)  IN
            (
                SELECT SRC_SYSTEM,SRC_BATCH_NR FROM XER.XEROX_LOAD_STATUS_HIST
                WHERE (RUN_DT <= DATE '2015-03-29'  AND PROCESS_DT IS NOT NULL)
                OR ( ACTIVE_IND = 'Y' AND RUN_DT = CAST('2015-03-29' AS DATE) AND PROCESS_DT IS NULL)
            )
AND (SRC_SYSTEM,SRC_BATCH_NR, INS_ID,UPD_ID) NOT IN --REPLACE BY NOT EXISTS
            (
                SELECT SRC_SYSTEM,SRC_BATCH_NR,  INS_ID,UPD_ID FROM XER.XEROX_LOADED_USG
            )
 )    
;

---Insert completed. 13760293 rows added.
---Total elapsed time was 23 minutes and 27 seconds.

PLAN FOR THE SELECT:

1) First, we lock XER.XEROX_LOADED_USG for access,
     we lock XER.XEROX_LOAD_STATUS_HIST for access, and we
     lock XEROX.CALL_DETAILS in view VW_XEROX.CALL_DETAILS for access.
  2) Next, we execute the following steps in parallel.
       1) We do an all-AMPs RETRIEVE step from 92 partitions of
          XEROX.CALL_DETAILS in view VW_XEROX.CALL_DETAILS with a
          condition of ("(XEROX.CALL_DETAILS in view
          VW_XEROX.CALL_DETAILS.CALL_START_DT >= DATE '2015-01-28')
          AND ((XEROX.CALL_DETAILS in view
          VW_XEROX.CALL_DETAILS.CALL_START_DT <= DATE '2015-04-29')
          AND (XEROX.CALL_DETAILS in view
          VW_XEROX.CALL_DETAILS.ST_IND <> 'Q'))") into Spool 2
          (all_amps) (compressed columns allowed), which is built
          locally on the AMPs.  Then we do a SORT to order Spool 2 by
          the hash code of ((CASE WHEN (NOT
          (XEROX.CALL_DETAILS.postpaid_seq IS NULL )) THEN
          (XEROX.CALL_DETAILS.postpaid_seq) ELSE
          (XEROX.CALL_DETAILS.Prepaid_Seq) END) (FLOAT),
          TRANSLATE((( CASE WHEN (NOT
          (XEROX.CALL_DETAILS.postpaid_seq IS NULL )) THEN
          ('POST') WHEN (NOT (XEROX.CALL_DETAILS.Prepaid_Seq
          IS NULL )) THEN ('PREPAID') ELSE (NULL) END ))USING
          UNICODE_TO_LATIN)(VARCHAR(100), CHARACTER SET LATIN, NOT
          CASESPECIFIC)).  The input table will not be cached in memory,
          but it is eligible for synchronized scanning.  The result
          spool file will not be cached in memory.  The size of Spool 2
          is estimated with low confidence to be 1,246,442,443 rows (
          386,397,157,330 bytes).  The estimated time for this step is
          21 minutes and 50 seconds.
       2) We do an all-AMPs RETRIEVE step from
          XER.XEROX_LOAD_STATUS_HIST by way of an all-rows
          scan with a condition of (
          "((XER.XEROX_LOAD_STATUS_HIST.RUN_DT <= DATE
          '2015-03-29') AND (NOT
          (XER.XEROX_LOAD_STATUS_HIST.PROCESS_DT IS NULL
          ))) OR
          ((XER.XEROX_LOAD_STATUS_HIST.ACTIVE_IND =
          'Y') AND ((XER.XEROX_LOAD_STATUS_HIST.RUN_DT =
          DATE '2015-03-29') AND
          (XER.XEROX_LOAD_STATUS_HIST.PROCESS_DT IS NULL
          )))") into Spool 4 (all_amps), which is built locally on the
          AMPs.  Then we do a SORT to order Spool 4 by the sort key in
          spool field1 (
          XER.XEROX_LOAD_STATUS_HIST.SRC_SYSTEM,
          XER.XEROX_LOAD_STATUS_HIST.SRC_BATCH_NR (FLOAT, FORMAT
          '-9.99999999999999E-999')) eliminating duplicate rows.  The
          size of Spool 4 is estimated with low confidence to be 8,847
          rows (1,468,602 bytes).  The estimated time for this step is
          0.17 seconds.
  3) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of
     an all-rows scan into Spool 3 (all_amps) (compressed columns
     allowed), which is duplicated on all AMPs.  Then we do a SORT to
     order Spool 3 by the hash code of (
     XER.XEROX_LOAD_STATUS_HIST.SRC_SYSTEM, SRC_BATCH_NR
     (FLOAT, FORMAT '-9.99999999999999E-999')(FLOAT)).  The result
     spool file will not be cached in memory.  The size of Spool 3 is
     estimated with low confidence to be 1,061,640 rows (177,293,880
     bytes).
  4) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of
          an all-rows scan, which is joined to Spool 3 (Last Use) by
          way of an all-rows scan.  Spool 2 and Spool 3 are joined
          using an inclusion merge join, with a join condition of (
          "((TRANSLATE((( CASE WHEN (NOT (postpaid_seq IS NULL
          )) THEN ('POST') WHEN (NOT (Prepaid_Seq IS NULL ))
          THEN ('PREPAID') ELSE (NULL) END ))USING UNICODE_TO_LATIN))=
          SRC_SYSTEM) AND ((( CASE WHEN (NOT (postpaid_seq
          IS NULL )) THEN (postpaid_seq) ELSE
          (Prepaid_Seq) END) )= Field_3)").  The result goes
          into Spool 9 (all_amps) (compressed columns allowed), which
          is built locally on the AMPs.  The size of Spool 9 is
          estimated with low confidence to be 41,495 rows (12,821,955
          bytes).  The estimated time for this step is 6.34 seconds.
       2) We do an all-AMPs RETRIEVE step from
          xer.xerox_loaded_usg by way of an all-rows
          scan with no residual conditions into Spool 12 (all_amps),
          which is redistributed by the hash code of (
          XER.XEROX_LOADED_USG.UPD_ID,
          XER.XEROX_LOADED_USG.INS_ID,
          XER.XEROX_LOADED_USG.SRC_BATCH_NR,
          XER.XEROX_LOADED_USG.SRC_SYSTEM) to
          all AMPs.  Then we do a SORT to order Spool 12 by row hash
          and the sort key in spool field1 eliminating duplicate rows.
          The size of Spool 12 is estimated with high confidence to be
          9,015 rows (784,305 bytes).  The estimated time for this step
          is 0.05 seconds.
  5) We do an all-AMPs SUM step to aggregate from Spool 9 by way of an
     all-rows scan.  Aggregate Intermediate Results are computed
     globally, then placed in Spool 7.
  6) We do an all-AMPs SUM step to aggregate from Spool 12 by way of an
     all-rows scan.  Aggregate Intermediate Results are computed
     globally, then placed in Spool 13.
  7) We execute the following steps in parallel.
       1) We do an all-AMPs RETRIEVE step from Spool 13 (Last Use) by
          way of an all-rows scan into Spool 5 (all_amps) (compressed
          columns allowed), which is duplicated on all AMPs.
       2) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by
          way of an all-rows scan into Spool 6 (all_amps) (compressed
          columns allowed), which is duplicated on all AMPs.
  8) We do an all-AMPs RETRIEVE step from Spool 9 by way of an all-rows
     scan into Spool 11 (all_amps) (compressed columns allowed), which
     is redistributed by the hash code of (XEROX.CALL_DETAILS.ins_id,
     XEROX.CALL_DETAILS.UPD_ID, (CASE WHEN (NOT
     (XEROX.CALL_DETAILS.POSTPAID_SEQ IS NULL )) THEN
     (XEROX.CALL_DETAILS.POSTPAID_SEQ) ELSE
     (XEROX.CALL_DETAILS.PREPAID_SEQ) END )(INTEGER),
     TRANSLATE((( CASE WHEN (NOT
     (XEROX.CALL_DETAILS.POSTPAID_SEQ IS NULL )) THEN
     ('POST') WHEN (NOT (XEROX.CALL_DETAILS.PREPAID_SEQ IS
     NULL )) THEN ('PREPAID') ELSE (NULL) END ))USING
     UNICODE_TO_LATIN)(CHAR(20), CHARACTER SET LATIN, NOT CASESPECIFIC))
     to all AMPs.  Then we do a SORT to order Spool 11 by row hash, and
     null value information in Spool 6 and Spool 5.  Skip this retrieve
     step if null exists.  The size of Spool 11 is estimated with low
     confidence to be 41,495 rows (13,651,855 bytes).  The estimated
     time for this step is 6.34 seconds.
  9) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from Spool 11 (Last Use) by way
          of an all-rows scan, which is joined to Spool 12 by way of an
          all-rows scan.  Spool 11 and Spool 12 are joined using an
          exclusion merge join, with a join condition of (
          "((TRANSLATE((( CASE WHEN (NOT (POSTPAID_SEQ IS NULL
          )) THEN ('POST') WHEN (NOT (PREPAID_SEQ IS NULL ))
          THEN ('PREPAID') ELSE (NULL) END ))USING UNICODE_TO_LATIN))=
          SRC_SYSTEM) AND (((( CASE WHEN (NOT
          (postpaid_seq IS NULL )) THEN (postpaid_seq)
          ELSE (PREPAID_SEQ) END ))= SRC_BATCH_NR) AND
          ((ins_id = INS_ID) AND (UPD_ID = UPD_ID )))"), and null value
          information in Spool 6 and Spool 5.  Skip this join step if
          null exists.  The result goes into Spool 1 (group_amps),
          which is built locally on the AMPs.  The size of Spool 1 is
          estimated with index join confidence to be 41,495 rows (
          15,975,575 bytes).  The estimated time for this step is 0.04
          seconds.
       2) We do an all-AMPs RETRIEVE step from Spool 9 (Last Use) by
          way of an all-rows scan into Spool 15 (all_amps) (compressed
          columns allowed), which is built locally on the AMPs.  Then
          we do a SORT to order Spool 15 by the hash code of (
          TRANSLATE((( CASE WHEN (NOT
          (XEROX.CALL_DETAILS.POSTPAID_SEQ IS NULL )) THEN
          ('POST') WHEN (NOT (XEROX.CALL_DETAILS.PREPAID_SEQ
          IS NULL )) THEN ('PREPAID') ELSE (NULL) END ))USING
          UNICODE_TO_LATIN)(CHAR(20), CHARACTER SET LATIN, NOT
          CASESPECIFIC)), and null value information in Spool 6 and
          Spool 5.  Skip this retrieve step if there is no null.  The
          size of Spool 15 is estimated with low confidence to be
          41,495 rows (13,651,855 bytes).  The estimated time for this
          step is 0.01 seconds.
 10) We do an all-AMPs RETRIEVE step from Spool 12 (Last Use) by way of
     an all-rows scan into Spool 16 (all_amps) (compressed columns
     allowed), which is duplicated on all AMPs.  Then we do a SORT to
     order Spool 16 by the hash code of (
     XER.XEROX_LOADED_USG.SRC_SYSTEM), and
     null value information in Spool 6 and Spool 5.  Skip this retrieve
     step if there is no null.  The size of Spool 16 is estimated with
     high confidence to be 1,081,800 rows (94,116,600 bytes).  The
     estimated time for this step is 0.10 seconds.
 11) We do an all-AMPs JOIN step from Spool 15 (Last Use) by way of an
     all-rows scan, which is joined to Spool 16 (Last Use) by way of an
     all-rows scan.  Spool 15 and Spool 16 are joined using an
     exclusion merge join, with a join condition of ("((TRANSLATE(((
     CASE WHEN (NOT (POSTPAID_SEQ IS NULL )) THEN ('POST')
     WHEN (NOT (PREPAID_SEQ IS NULL )) THEN ('PREPAID') ELSE
     (NULL) END ))USING UNICODE_TO_LATIN))= SRC_SYSTEM) AND
     (((( CASE WHEN (NOT (POSTPAID_SEQ IS NULL )) THEN
     (postpaid_seq) ELSE (PREPAID_SEQ) END ))=
     SRC_BATCH_NR) AND ((INS_ID = INS_ID) AND (UPD_ID = UPD_ID )))"),
     and null value information in Spool 6 (Last Use) and Spool 5 (Last
     Use).  Skip this join step if there is no null.  The result goes
     into Spool 1 (group_amps), which is built locally on the AMPs.
     The size of Spool 1 is estimated with index join confidence to be
     41,495 rows (15,975,575 bytes).  The estimated time for this step
     is 0.04 seconds.
 12) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.

   

dnoeth 4628 posts Joined 11/04
05 Jun 2015

There're no matching PI, so the optimizer needs to do some preparation steps and the estimated time is quite close to the actual runtime, so maybe you can't speed it up...
 
The estimated rowcounts after the joins are quite wrong (probably be due to the COALESCE), did you check DIAGNOSTIC HELPSTATS if stats are missing?
 
You should change the NOT IN into a NOT EXISTS, which will result in a simplified plan (you might also change IN to EXISTS)
 
Btw, this calculation is probably wrong:
CASE WHEN C.PREPAID_SEQ IS NOT NULL THEN C.POSTPAID_SEQ ELSE C.PREPAID_SEQ END

Dieter

You must sign in to leave a comment.