All Forums Database
lgscheurich 27 posts Joined 09/06
23 Mar 2016
Performance of the We do a MERGE into table HUB_ENCOUNTER_MESSAGE from Spool 18767. step of an insert

Team,
 
We have an insert statement that is inserting around 92 million records into a table based on a select statement from another table.  The select piece runs fine, but the following step takes a long time.
We do a MERGE into table HUB_ENCOUNTER_MESSAGE from Spool 18767.
The table definition is below.  The only things I can think of are the fact that the table is Multiset, and the primary index is an identity column.  Would either of these two things be the problem or should I be looking at something else?
CREATE MULTISET TABLE dw_itg_t.hub_encounter_virtual_visit ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      HUB_ENCOUNTER_VIRTUAL_VISIT_KEY INTEGER TITLE 'HUB Encounter Virtual Visit Key' NOT NULL GENERATED BY DEFAULT AS IDENTITY
           (START WITH 1
            INCREMENT BY 1
            MINVALUE -2147483647
            MAXVALUE 2147483647
            NO CYCLE),
      PATIENT_ENCOUNTER_CSN_ID DECIMAL(18,0) TITLE 'Patient Encounter CSN ID',
      EDW_DATA_SOURCE_CODE VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC TITLE 'EDW Data Source',
      EDW_INSERT_DTM TIMESTAMP(6) TITLE 'EDW INSERT DTM',
      EDW_JOB_SEQUENCE_NBR INTEGER TITLE 'Edw Job Sequence Nbr',
      EDW_INSERT_PROCESS_ID INTEGER TITLE 'Edw Insert Process Id')
UNIQUE PRIMARY INDEX HUB_ENC_VIRTUAL_VISIT_IDX_PR ( HUB_ENCOUNTER_VIRTUAL_VISIT_KEY );

dnoeth 4628 posts Joined 11/04
23 Mar 2016

What exactly is a long time?
How many rows are in that table before the insert?
This will probably read/modify/write all existing datablocks due to the sequence.
 
What's the setting for dbscontrol General Field 27: IdCol Batch Size?
 
MultiSet on a table with an Identity-UPI is useless, but should be no problem.
 

Dieter

lgscheurich 27 posts Joined 09/06
23 Mar 2016

Thanks Dieter!  There are no rows in the table before the insert.  What's interesting is that I can take the SQL from Viewpoint and run it as myself and it runs in a matter of 2 minutes.  When it runs through our scheduler (different user id), it takes about an hour.  That's the part that is confusing to me.
Is there a way for me to see the dbscontrol value without contacting our DBA team?
I did take multiset off of it, and that did seem to help a bit.

lgscheurich 27 posts Joined 09/06
23 Mar 2016

I should also clarify that the insert statement that is taking a long time is running via a stored procedure with parameters as values going into two of the columns

dnoeth 4628 posts Joined 11/04
25 Mar 2016

If the SQL runs fast standalone it can't be based on Batch Size.
Might be a different transaction handling (within the SP or different session mode), using Transient Journal within the SP and FastPath standalone.
Can you share the SP?

Dieter

lgscheurich 27 posts Joined 09/06
25 Mar 2016

This is generated by Wherescape, so anything in [] is replaced with the databasename.  I didn't see anything in it that would cause it to run slower, but this has been running for over 4 hours in the stored procedure and when I run the insert by hand in SQL Assitant, it takes about 15 minutes
 
CREATE PROCEDURE [METABASE].UPD_HUB_ENCOUNTER_BILLING_
  (
  IN  p_sequence         integer,
  IN  p_job_name         varchar(256),
  IN  p_task_name        varchar(256),
  IN  p_job_id           integer,
  IN  p_task_id          integer,
  OUT p_return_msg       varchar(256),
  OUT p_status           integer
  )
BEGIN

  DECLARE v_msgtext           varchar(255);  -- Text for audit_trail
  DECLARE v_sql               varchar(255);  -- Text for SQL statements
  DECLARE v_set               integer;       -- commit set
  DECLARE v_analyze_flag      integer;       -- analyze flag
  DECLARE v_step              integer;       -- return code
  DECLARE v_insert_count      integer;       -- no of records inserted
  DECLARE v_count             integer;       -- General counter
  DECLARE v_sql_code          integer;       -- SQL Error Code for Audit Trail
  DECLARE v_sql_error         varchar(255);  -- SQL Error Code for Audit Trail as varchar

  --============================================================================
  -- General Variables
  --============================================================================
  DECLARE v_return_status     integer;       -- Update result status
  DECLARE v_row_count         integer;       -- General row count
  DECLARE v_status            integer;       -- General status field
  DECLARE v_current_timestamp timestamp;     -- Used for create/update dates
  DECLARE v_current_date      date;          -- Current Date used for start and end dates
  --============================================================================
  -- Exceptions
  --============================================================================
  DECLARE EXIT HANDLER
    FOR SQLEXCEPTION
    BEGIN
      SET v_sql_code = SQLCODE;
      LOCK ROW FOR ACCESS
      SELECT SUBSTR(ErrorText,1,255)
      INTO   v_sql_error
      FROM   dbc.ErrorMsgs
      WHERE  ErrorCode = v_sql_code;
      SET v_msgtext = SUBSTR('Unhandled Exception in HUB_ENCOUNTER_BILLING_SEQ. '||
          ' Step ' || CAST(v_step AS VARCHAR(64)) ||
          '   SQL Error Code: ' || CAST(v_sql_code AS VARCHAR(10)) || ' - ' || v_sql_error,1,255);
      SET p_return_msg = v_msgtext;
      CALL [METABASE].WsWrkAudit('F', p_job_name, p_task_name, p_sequence
          , v_msgtext, v_sql_code, v_sql_error, p_task_id, p_job_id);
      SET p_status = -3;
    END;

  --============================================================================
  -- Main
  --============================================================================
  SET v_step = 100;
  SET v_insert_count      = 0;
  SET v_current_timestamp = CURRENT_TIMESTAMP;
  SET v_current_date      = CURRENT_DATE;

  --============================================================================
  -- Include 0 key row for when lookup to this table is null
  --============================================================================
  SET v_step = 200;
  SET v_count = 0;

  SELECT COUNT(*) INTO v_count
  FROM  [HUB_ENCOUNTER_BILLING_SEQ]
  WHERE HUB_ENCOUNTER_BILLING_KEY = 0;

  IF v_count = 0
  THEN
    SET v_step = 300;

    INSERT INTO [HUB_ENCOUNTER_BILLING_SEQ]
    ( HUB_ENCOUNTER_BILLING_KEY
    , ENCOUNTER_TDL_ID
    , EDW_DATA_SOURCE_CODE
    , EDW_JOB_SEQUENCE_NBR
    , EDW_INSERT_PROCESS_ID
    , EDW_INSERT_DTM
    )
    VALUES
    ( 0
    , 0
    , SUBSTR('UNKNOWN',1,8)
    , 0
    , 0
    , CURRENT_TIMESTAMP
    );

    SET v_row_count = ACTIVITY_COUNT;

  END IF;

  --============================================================================
  -- Insert new records
  --============================================================================
  INSERT INTO [HUB_ENCOUNTER_BILLING_SEQ]
  ( HUB_ENCOUNTER_BILLING_KEY
  , ENCOUNTER_TDL_ID
  , EDW_DATA_SOURCE_CODE
  , EDW_JOB_SEQUENCE_NBR
  , EDW_INSERT_PROCESS_ID
  , EDW_INSERT_DTM
  )
  SELECT
         ROW_NUMBER() OVER (ORDER BY TDL_ID)
       , S_CLY_CLARITY_TDL_TRAN.TDL_ID
       , S_CLY_CLARITY_TDL_TRAN.EDW_DATA_SOURCE_CODE
       , p_sequence
       , p_task_id
       , v_current_timestamp
  FROM [S_CLY_CLARITY_TDL_TRAN] S_CLY_CLARITY_TDL_TRAN
  LEFT OUTER JOIN  [HUB_ENCOUNTER_BILLING_SEQ] HUB_ENCOUNTER_BILLING_SEQ
  ON    HUB_ENCOUNTER_BILLING_SEQ.ENCOUNTER_TDL_ID = S_CLY_CLARITY_TDL_TRAN.TDL_ID
  WHERE HUB_ENCOUNTER_BILLING_SEQ.ENCOUNTER_TDL_ID IS NULL
  GROUP BY S_CLY_CLARITY_TDL_TRAN.TDL_ID, S_CLY_CLARITY_TDL_TRAN.EDW_DATA_SOURCE_CODE
  ;

  SET v_row_count = ACTIVITY_COUNT;

  SET v_insert_count = v_insert_count + v_row_count;

  SET v_step = 400;

  --============================================================================
  -- Final settings
  --============================================================================

  -- WsWrkTask(job,task,seq,insert,update,replace,delete,discard,reject,error)
  CALL [METABASE].WsWrkTask(p_job_id, p_task_id, p_sequence,
    v_insert_count, 0, 0, 0, 0, 0, 0);

  SET v_step = 500;

  -- Work out the return message

  SET p_status = 1;
  SET p_return_msg = 'HUB_ENCOUNTER_BILLING_SEQ updated. '
    || CAST(v_insert_count AS VARCHAR(64)) || ' records added. ';
END;
 

dnoeth 4628 posts Joined 11/04
28 Mar 2016

It's not the same table you mentioned in your first post, but the code looks ok. 
If you got access to DBQL (Log & Steps) you might check the CALL and the rows for the SQL submitted within the SP.

Dieter

ToddAWalter 316 posts Joined 10/11
28 Mar 2016

In the SP above, there is a single row insert into the table before the INSERT SELECT. That makes the IS a non-empty table IS rather than an empty table one. That means teradata has to TJ all the rows being inserted and has to insert row by row per AMP. This makes it take longer than tne empty table form would. I sent a note to the Wherescape folks pointing them to this topic.

jwh_ws 15 posts Joined 04/11
28 Mar 2016

Thank you all for bringing this issue to our attention.  We regard this as a bug and have created a JIRA for its repair.
The excellent performance is seen due to fastpath insert - the database knows "rollback" is really easy, so it turns off transient journalling.  I've also heard transient journalling described as 'implicit commitment control' which is provided when the query runs in Teradata mode.

The solution, as Todd points out, is simple.  We'll move the single row insert after the heaving lifting logic.  It's a good exercise to look at the EXPLAIN plan on a query with and without transient journalling.  The EXPLAIN plan nearly yells at you what is happening when TJ is in effect or not.

dnoeth 4628 posts Joined 11/04
29 Mar 2016

Explain will not tell if TJ is used or not, at Explain time it might be empty, but later populated (it's determined when the actual insert starts). 
Life would be much easier if there was a SP debugger, hint, hint :)

Dieter

ToddAWalter 316 posts Joined 10/11
29 Mar 2016

An SP debugger doesn't help if it can't be seen in Explain (not that an SP debugger isn't a good idea). I haven't looked recently to see if it can be determined from DBQL?

dnoeth 4628 posts Joined 11/04
29 Mar 2016

In DBQL there's no TJ/Fastpath info.
Of course in that special case a debugger wouldn't help much, I was just nagging, waiting for a SP debugger since V2R3 :)

Dieter

lgscheurich 27 posts Joined 09/06
04 May 2016

After working with Wherescape, we discovered the problem was in the priocedure.  The first step inserts one row, and then it runs an insert of close to 500 million rows.  It looks lik e logging/journalling kicks in to slow it down.  When we reverse the order to load the one row after, the performance is good

You must sign in to leave a comment.