kalyan3in 3 posts Joined 07/12
05 Jul 2012
Tuning the query

Hi,

I have a query that is consuming a lot of CPU and showing high skew. Please suggest how to tune this query.

Explain:

 

Explain SELECT
      WATR.BK_AS_PROJECT_TASK_NUM ,
      WATR.BK_AS_PROJECT_CD ,
      WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD ,
      WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY ,
      WATR.FISCAL_YEAR_WEEK_NUM_INT ,
      WATR.AS_TASK_RES_TIMECARD_HOURS ,
      WATR.AS_TASK_RES_LABOR_COST_USD_AMT ,
      WATR.AS_TASK_RES_LABOR_COST_PRJ_AMT ,
      WATR.AS_TASK_RES_LABOR_COST_LOC_AMT ,
      WATR.AS_RSRC_LABOR_RT_USD_AMT ,
      CURRENT_TIMESTAMP(0),
      USER ,
      CURRENT_TIMESTAMP(0),
      USER
FROM WORKDB_dv4.WI_MT_AS_TIMECARD_RESOURCE WATR
WHERE DML_TYPE = 'I'
AND (WATR.BK_AS_PROJECT_TASK_NUM ,
     WATR.BK_AS_PROJECT_CD ,
     WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD ,
     WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY ,
     WATR.FISCAL_YEAR_WEEK_NUM_INT )
NOT IN
( SELECT MATR.BK_AS_PROJECT_TASK_NUM ,
      MATR.BK_AS_PROJECT_CD ,
      MATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD ,
      MATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY ,
      MATR.FISCAL_YEAR_WEEK_NUM_INT
  FROM ASBIVWDB_dv4.MT_AS_TIMECARD_RESOURCE MATR)

 

  1) First, we lock WORKDB_dv4.WATR for access, and we lock
     ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE for access.
  2) Next, we do an all-AMPs SUM step to aggregate from WORKDB_dv4.WATR
     by way of an all-rows scan with a condition of (
     "WORKDB_dv4.WATR.DML_TYPE = 'I'").  Aggregate Intermediate Results
     are computed globally, then placed in Spool 5.
  3) We do an all-AMPs SUM step to aggregate from
     ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE by way of an all-rows scan with
     no residual conditions.  Aggregate Intermediate Results are
     computed globally, then placed in Spool 7.
  4) We execute the following steps in parallel.
       1) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by
          way of an all-rows scan into Spool 3 (all_amps) (compressed
          columns allowed), which is duplicated on all AMPs.
       2) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by
          way of an all-rows scan into Spool 4 (all_amps) (compressed
          columns allowed), which is duplicated on all AMPs.
  5) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from WORKDB_dv4.WATR by way of an
          all-rows scan with a condition of ("WORKDB_dv4.WATR.DML_TYPE
          = 'I'"), which is joined to
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE by way of an all-rows scan
          with no residual conditions.  WORKDB_dv4.WATR and
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE are joined using an
          exclusion merge join, with a join condition of (
          "(WORKDB_dv4.WATR.BK_AS_PROJECT_TASK_NUM =
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_PROJECT_TASK_NUM)
          AND ((WORKDB_dv4.WATR.BK_AS_PROJECT_CD =
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_PROJECT_CD) AND
          ((WORKDB_dv4.WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD =
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD)
          AND ((WORKDB_dv4.WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY =
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY)
          AND (WORKDB_dv4.WATR.FISCAL_YEAR_WEEK_NUM_INT =
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.FISCAL_YEAR_WEEK_NUM_INT
          ))))"), and null value information in Spool 4 and Spool 3.
          Skip this join step if null exists.  The result goes into
          Spool 2 (group_amps), which is built locally on the AMPs.
          The size of Spool 2 is estimated with no confidence to be
          184,396 rows (37,063,596 bytes).  The estimated time for this
          step is 0.99 seconds.
       2) We do an all-AMPs RETRIEVE step from WORKDB_dv4.WATR by way
          of an all-rows scan with a condition of (
          "WORKDB_dv4.WATR.DML_TYPE = 'I'") into Spool 9 (all_amps)
          (compressed columns allowed), which is built locally on the
          AMPs.  Then we do a SORT to order Spool 9 by the hash code of
          (WORKDB_dv4.WATR.BK_AS_PROJECT_TASK_NUM), and null value
          information in Spool 4 and Spool 3.  Skip this retrieve step
          if there is no null.  The size of Spool 9 is estimated with
          no confidence to be 184,396 rows (21,943,124 bytes).  The
          estimated time for this step is 0.33 seconds.
       3) We do an all-AMPs RETRIEVE step from
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE by way of an all-rows scan
          with no residual conditions into Spool 10 (all_amps)
          (compressed columns allowed), which is duplicated on all AMPs.
          Then we do a SORT to order Spool 10 by the hash code of (
          ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_PROJECT_TASK_NUM).
          The result spool file will not be cached in memory, and null
          value information in Spool 4 and Spool 3.  Skip this retrieve
          step if there is no null.  The size of Spool 10 is estimated
          with low confidence to be 146,184,000 rows (6,870,648,000
          bytes).  The estimated time for this step is 39.67 seconds.
  6) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of an
     all-rows scan, which is joined to Spool 10 (Last Use) by way of an
     all-rows scan.  Spool 9 and Spool 10 are joined using an exclusion
     merge join, with a join condition of ("(BK_AS_PROJECT_TASK_NUM =
     BK_AS_PROJECT_TASK_NUM) AND ((BK_AS_PROJECT_CD = BK_AS_PROJECT_CD)
     AND ((BK_AS_TASK_RSRC_EXPNDTR_TYP_CD =
     BK_AS_TASK_RSRC_EXPNDTR_TYP_CD) AND
     ((BK_AS_RSRC_CISCO_WRKR_PRTY_KEY = BK_AS_RSRC_CISCO_WRKR_PRTY_KEY)
     AND (FISCAL_YEAR_WEEK_NUM_INT = FISCAL_YEAR_WEEK_NUM_INT ))))"),
     and null value information in Spool 4 (Last Use) and Spool 3 (Last
     Use).  Skip this join step if there is no null.  The result goes
     into Spool 2 (group_amps), which is built locally on the AMPs.
     The size of Spool 2 is estimated with no confidence to be 184,396
     rows (37,063,596 bytes).  The estimated time for this step is 0.99
     seconds.
  7) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 2 are sent back to the user as the result of
     statement 1.

 

dnoeth 4628 posts Joined 11/04
06 Jul 2012

The first thing you should try is to rewrite the NOT IN on the NULLable columns as a NOT EXISTS:

AND NOT EXISTS
 ( 
   SELECT * FROM ASBIVWDB_dv4.MT_AS_TIMECARD_RESOURCE MATR 
   WHERE WATR.BK_AS_PROJECT_TASK_NUM = MATR.BK_AS_PROJECT_TASK_NUM
   AND WATR.BK_AS_PROJECT_CD = MATR.BK_AS_PROJECT_CD
   AND WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD = MATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD
   AND WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY = MATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY
   AND WATR.FISCAL_YEAR_WEEK_NUM_INT = MATR.FISCAL_YEAR_WEEK_NUM_INT
 )

This should change Explain to a single step doing an Exclusion Join on the PIs.
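To see which of those columns are actually NULLable (and therefore force the extra "Skip this join step if null exists" handling visible in the Explain), a quick check might look like this - a minimal sketch using the tables from the query above; HELP COLUMN shows the Nullable attribute for each column:

HELP COLUMN WORKDB_dv4.WI_MT_AS_TIMECARD_RESOURCE.*;
HELP COLUMN ASBIVWDB_dv4.MT_AS_TIMECARD_RESOURCE.*;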

Dieter

dnoeth 4628 posts Joined 11/04
06 Jul 2012

Of course it's still all-AMP; could you show the new Explain and some info about the number of rows and the Indexes/Statistics?

Dieter

dnoeth 4628 posts Joined 11/04
06 Jul 2012

This is already the most efficient plan you can get, a direct join using the PIs of both tables.
Btw, this Explain is the result of an EXISTS instead of NOT EXISTS.

Regarding skew, how skewed is that table?
Can you provide COUNT/MIN/AVG/MAX(CurrentPerm) for this table from dbc.TableSizeV?
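Something along these lines should return those figures (a minimal sketch; the database name is taken from the Explain above and may differ from the view database in your query):

SELECT COUNT(*)         AS NumAMPs
     , MIN(CurrentPerm) AS MinPerm
     , AVG(CurrentPerm) AS AvgPerm
     , MAX(CurrentPerm) AS MaxPerm
FROM dbc.TableSizeV
WHERE DatabaseName = 'ASBIDB_DV4'
AND TableName = 'MT_AS_TIMECARD_RESOURCE';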

HELP STATS ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE INDEX NUPI_MT_AS_TIMECARD_RESOURCE;
will tell you about the rowcount, distinct values and max rows per value. Usually a max rows per value up to a few hundred is considered acceptable, otherwise you might consider a different PI.

But changing the PI might affect other queries in a positive or negative way, so you need some analysis to find out.

Dieter

dnoeth 4628 posts Joined 11/04
06 Jul 2012

If this is the SkewFactor from Teradata Administrator it's large (the max AMP got approx. 3 times as many rows as the average AMP), but it's a small table. What's the size of the other table?

How many AMPs do you have?

I usually run a query like this to get info about skew:

SELECT
  DatabaseName
  ,TableName
  ,CAST(SUM(currentperm) / 1024**3 AS DEC(9,2)) AS GB
  ,CAST((MAX(CurrentPerm) / AVG(CurrentPerm)) AS DEC(5,3)) AS SkewFactor
  ,CAST((HASHAMP() + 1) * MAX(currentperm) / 1024**3 AS DEC(9,2)) AS GBSkewed
  ,HASHAMP() + 1 AS NumAMPs
FROM dbc.TableSizeV
WHERE DatabaseName = ...
AND TableName = ...
GROUP BY 1,2

Dieter

maheshteradata 14 posts Joined 06/12
06 Jul 2012

Hello Dieter,

Will you please advise? As per my knowledge, I would consider the following steps:

if the table is MULTISET, it allows duplicates,

so to avoid the skewness try "SET" on creation of the table,

and a primary index on multiple columns.

 

dnoeth 4628 posts Joined 11/04
07 Jul 2012

Hi Kalyan,
this table is not highly skewed, but the number of rows per PI value might be high:

select
  hashrow(BK_AS_PROJECT_TASK_NUM , BK_AS_PROJECT_CD ,FISCAL_YEAR_WEEK_NUM_INT),
  count(*) as cnt
from MT_AS_TIMECARD_RESOURCE
group by 1
qualify row_number() over (order by cnt desc) <= 10
 

 

Regarding skew:
SET or MULTISET will not change skew (MULTISET just avoids the overhead of duplicate row checking); only a different PI can help.
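If you do end up testing a different PI, a minimal sketch for building a test copy could look like this (the scratch database testdb and the candidate PI columns are only assumptions):

CREATE TABLE testdb.MT_AS_TIMECARD_RESOURCE_newpi AS
 ( SELECT * FROM ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE )
WITH DATA
PRIMARY INDEX ( BK_AS_PROJECT_TASK_NUM, BK_AS_RSRC_CISCO_WRKR_PRTY_KEY );

Then compare the per-AMP distribution of the copy against the original before changing anything in production.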

Dieter

maheshteradata 14 posts Joined 06/12
08 Jul 2012

 

 

Can anyone send me a query? I will tune that query as per my knowledge.

 

Regards,

Mahesh

Qaisar Kiani 337 posts Joined 11/05
16 Jul 2012

You need to consider a few things, like the number of records in the table, how many rows there are on average per index value, etc. If there are too many rows against a few of the index values then the skew factor will increase, causing performance degradation.

Maybe you need to re-think your index choices.

Regards,

Qaisar Kiani 337 posts Joined 11/05
16 Jul 2012

This SQL should give you an idea:


SELECT HASHAMP(HASHBUCKET(HASHROW(<<NUPI Columns>>))) AS "AMP #", COUNT(*)
FROM <<table>>
GROUP BY 1
ORDER BY 2 DESC;

Cheers!

Qaisar Kiani 337 posts Joined 11/05
18 Jul 2012

Skewness is related to the way the data is distributed across the AMPs, and it's the INDEX selection that defines the data distribution...

PARTITION has nothing to do with the distribution of the data; it only defines the way the data is organized in blocks on the AMPs...
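One way to see this for yourself (a sketch with placeholder names MyPPITable, PI_col1, PI_col2): the AMP a row lands on depends only on the hash of the PI columns, while the PARTITION number only groups rows within each AMP.

SELECT HASHAMP(HASHBUCKET(HASHROW(PI_col1, PI_col2))) AS Amp_No
     , PARTITION AS Partition_No
     , COUNT(*) AS Row_Cnt
FROM MyPPITable
GROUP BY 1, 2
ORDER BY 1, 2;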

Qaisar Kiani 337 posts Joined 11/05
18 Jul 2012

Use the query that I mentioned above

 

SELECT HASHAMP(HASHBUCKET(HASHROW(<<NUPI Columns>>))) AS "AMP #", COUNT(*)
FROM <<table>>
GROUP BY 1
ORDER BY 2 DESC

 

And try to include/exclude columns in the <<NUPI COLUMNS>>, and look at the results of COUNT(*) to judge how evenly the data is distributed... The closer the counts are to each other, the more even the data distribution is across the AMPs....
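For example, against the large table from the Explain in the original post, one trial might look like this (the column subset is only an illustration, not a recommendation):

SELECT HASHAMP(HASHBUCKET(HASHROW(BK_AS_PROJECT_TASK_NUM, BK_AS_PROJECT_CD))) AS "AMP #"
     , COUNT(*)
FROM ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE
GROUP BY 1
ORDER BY 2 DESC;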

 

For further details, maybe you can start digging into the Teradata Physical Design documentation...

 

kalyan3in 3 posts Joined 07/12
18 Jul 2012

Thank you very much for the response. I will try this.

ulrich 816 posts Joined 09/09
09 Aug 2012

No, because you didn't share any valuable input! 

DDLs, Explains etc...

What do you expect????


ulrich 816 posts Joined 09/09
09 Aug 2012

Overall the plan seems reasonable for this SQL, assuming the stats info is correct.

The PI for COMREFDB_DV1.R_FISCAL_MONTH_TO_YEAR is missing.

How many rows does this table have?

How many AMPs does your system have?

Is 24,261,154 rows in step 4 an accurate estimate?

Are stats collected for DV_BOOKINGS_TYPE?

How long does the query run? And what is the expected SLA for this query?


ulrich 816 posts Joined 09/09
09 Aug 2012

The table? Which?

Can you run?

select count(*) from COMREFDB_DV1.R_FISCAL_MONTH_TO_YEAR;

select count(*) from SLSORDDB_DV1.R_BOOKINGS_FMTH where DV_BOOKINGS_TYPE = 'CORE';

 

 


ulrich 816 posts Joined 09/09
09 Aug 2012

I don't think you can do much here.

Stats info seems to be accurate, and with these the plan seems to be a good one.

The majority of the time seems to be consumed by the aggregation, which also seems reasonable given the long grouping list.

If you really want to see the impact of the join you can try the following

submit 

select 'when RBF.FISCAL_YEAR_MONTH_INT = ' !! trim(FISCAL_YEAR_MONTH_INT) !! ' then ' !! trim(FISCAL_YEAR_NUMBER_INT) 
from COMREFDB_DV1.R_FISCAL_MONTH_TO_YEAR 

and paste the result into the <<<>>> section of 

SELECT
Case 
<<<>>>
else null end AS FISCAL_YEAR_INT
,RBF.PRODUCT_KEY 
,RBF.END_CUSTOMER_KEY 
,RBF.BILL_TO_CUSTOMER_KEY 
,RBF.SHIP_TO_CUSTOMER_KEY 
,RBF.SOLD_TO_CUSTOMER_KEY 
,RBF.PARTNER_SITE_PARTY_KEY 
,RBF.SALES_TERRITORY_KEY 
,RBF.SALES_REP_NUMBER 
,RBF.ADJUSTMENT_TYPE_CODE 
,RBF.SALES_CHANNEL_CODE 
,RBF.SALES_CREDIT_TYPE_CODE 
,RBF.IDE_ADJUSTMENT_CODE 
, RBF.ADJUSTMENT_CODE 
,RBF.BKGS_MEASURE_TRANS_TYPE_CODE 
,RBF.CANCELLED_FLG 
,RBF.ACQUISITION_FLG 
,RBF.DISTRIBUTOR_OFFSET_FLG 
,RBF.CORPORATE_BOOKINGS_FLG 
,RBF.CHANNEL_BOOKINGS_FLG 
,RBF.OVERLAY_FLG 
, RBF.IC_REVENUE_FLG 
,RBF.CHARGES_FLG 
,RBF.SALESREP_FLG 
,RBF.MISC_FLG 
,RBF.SERVICE_FLG 
,RBF.INTERNATIONAL_DEMO_FLG 
,RBF.REPLACEMENT_DEMO_FLG 
,RBF.REVENUE_FLG 
,RBF.RMA_FLG 
,RBF.WIPS_ORIGINATOR_ID_INT 
,RBF.DV_DISCOUNT_BAND_CD
,RBF.DV_ORDER_BAND_CD
,SUM( RBF.DD_COMP_US_NET_PRICE_AMOUNT )
,SUM( RBF.DD_COMP_US_LIST_PRICE_AMOUNT )
,SUM( RBF.DD_COMP_US_COST_AMOUNT )
,SUM( RBF.DD_EXTENDED_QUANTITY )
,SUM( RBF.DD_COMP_US_HOLD_NET_PRICE_AMT )
,SUM( RBF.DD_COMP_US_HOLD_LIST_PRICE_AMT)
,SUM( RBF.DD_COMP_US_HOLD_COST_AMOUNT )
,SUM( RBF.DD_EXTENDED_HOLD_QUANTITY )
,SUM( RBF.DD_COMP_US_STANDARD_PRICE_AMT )
,SUM(RBF.TRADE_IN_AMOUNT) ,
RBF.DV_CA_BOOKINGS_TYPE_CD
FROM SLSORDVWDB_DV1.R_BOOKINGS_FMTH RBF 
WHERE RBF.DV_BOOKINGS_TYPE = 'CORE'
GROUP BY 
FISCAL_YEAR_INT
,RBF.PRODUCT_KEY 
,RBF.END_CUSTOMER_KEY 
,RBF.BILL_TO_CUSTOMER_KEY 
,RBF.SHIP_TO_CUSTOMER_KEY 
,RBF.SOLD_TO_CUSTOMER_KEY 
,RBF.PARTNER_SITE_PARTY_KEY 
,RBF.SALES_TERRITORY_KEY 
,RBF.SALES_REP_NUMBER 
,RBF.ADJUSTMENT_TYPE_CODE 
,RBF.SALES_CHANNEL_CODE 
,RBF.SALES_CREDIT_TYPE_CODE 
,RBF.IDE_ADJUSTMENT_CODE 
, RBF.ADJUSTMENT_CODE 
,RBF.BKGS_MEASURE_TRANS_TYPE_CODE 
,RBF.CANCELLED_FLG 
,RBF.ACQUISITION_FLG 
,RBF.DISTRIBUTOR_OFFSET_FLG 
,RBF.CORPORATE_BOOKINGS_FLG 
,RBF.CHANNEL_BOOKINGS_FLG 
,RBF.OVERLAY_FLG 
, RBF.IC_REVENUE_FLG 
,RBF.CHARGES_FLG 
,RBF.SALESREP_FLG 
,RBF.MISC_FLG 
,RBF.SERVICE_FLG 
,RBF.INTERNATIONAL_DEMO_FLG 
,RBF.REPLACEMENT_DEMO_FLG 
,RBF.REVENUE_FLG 
,RBF.RMA_FLG 
,RBF.WIPS_ORIGINATOR_ID_INT
,RBF.DV_DISCOUNT_BAND_CD
,RBF.DV_ORDER_BAND_CD
,RBF.DV_CA_BOOKINGS_TYPE_CD
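
For illustration, once the generated rows are pasted into the <<<>>> section, the CASE would end up looking something like this (the month/year values here are made up):

Case
when RBF.FISCAL_YEAR_MONTH_INT = 201201 then 2012
when RBF.FISCAL_YEAR_MONTH_INT = 201202 then 2012
when RBF.FISCAL_YEAR_MONTH_INT = 201203 then 2012
else null end AS FISCAL_YEAR_INT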

Check the explain and the runtimes...

If the runtime is not drastically better, you won't have a chance to speed this up.

 


ulrich 816 posts Joined 09/09
10 Aug 2012

No - again you didn't provide any valuable input before dumping your questions.

And please open new threads for new questions.

 

 

 


ulrich 816 posts Joined 09/09
10 Aug 2012

If information is confidential you should not share it in this forum. 

Besides this, I still don't know what your issue is.


lavi 1 post Joined 06/11
20 Sep 2012

 
Hi,
I have a query which is specifically a user/BO query.
It is taking a lot of time.
How can I improve its performance?
 
Below is the query and index/stats information:
 
LOCKING tab1 FOR ACCESS
SELECT
  tab2.TODAY_DATE,
  tab3.CIRCLE_NAME,
  COALESCE(tab4.CALL_DIRECTION_ID,'SMS_MO','SMMO','SMS_MT','SMMT','DIAMETER','GPRS',tab4.CALL_DIRECTION_ID),
   COUNT(DISTINCT tab1.CP_ACCOUNT_KEY),
  SUM(tab1.CALL_COUNT),
  SUM(tab1.CALL_DURATION)/60
FROM
  tab2,
  tab3,
  tab4,
  tab1
WHERE
  ( tab4.CALL_DIRECTION_KEY=tab1.CALL_DIRECTION_KEY  )
  AND  ( tab3.GEOGRAPHY_KEY=tab1.CIRCLE_KEY  )
  AND  ( tab2.TIME_KEY=tab1.DAY_KEY  )
  AND  ( tab4.CALL_DIRECTION_ID  NOT IN  ('TRANSIT-O','TRANSIT-I','TRANS','TRANSIT','TRA','TRAN','-901','-902','0','50','1','2','3')  )
  AND
  tab2.TODAY_DATE  BETWEEN  {d '2012-07-01'}  AND  {d '2012-07-07'}
GROUP BY
  1,
  2,
  3
 
tab1
PRIMARY INDEX ( DAY_KEY ,CIRCLE_KEY ,CALL_SCENARIO_KEY ,CALL_DIRECTION_KEY ,CALL_TYPE_KEY ,CP_ACCOUNT_KEY )
PARTITION BY RANGE_N(CAST((SUBSTR(TRIM(BOTH  FROM DAY_KEY ),1 ,8 )) AS DATE FORMAT 'YYYYMMDD') BETWEEN '2012-01-01' AND '2017-12-31' EACH INTERVAL '1' DAY ,
 NO RANGE, UNKNOWN)
INDEX idx1 ( DAY_KEY ,CIRCLE_KEY ,CALL_DIRECTION_KEY );
 
 
COLLECT STATS ON tab1 COLUMN (day_key);
COLLECT STATS ON tab1 COLUMN (cp_account_key);
COLLECT STATS ON tab1 COLUMN (day_key,circle_key,call_direction_key);
Thanks

Lavi

teradatauser2 236 posts Joined 04/12
17 Jul 2013

Hi,
We have some tables with more than 90% skew. Could anyone help me with this? What is the query to find table skew, and how much is an acceptable limit?
I want a list of steps to perform to fix this. As my understanding goes:
1. Changing the PI is the solution to this.
2. I will list down the queries that are run against this table for 3 months.
3. Find out the CPU consumed by them for each run.
4. Index analysis (to check in how many queries the existing PI is used).
5. Change the PI in a test area.
6. Check data distribution for the new table.
7. Check CPU consumption for the same queries listed above.
8. If there is significant improvement, then go for it.
Please let me know if these steps are correct. How can we do index analysis? What query should I use? Any other queries which may be helpful?
 

teradatauser2 236 posts Joined 04/12
18 Jul 2013

Hi everyone,
No help yet - could someone help me here?
 
-Thanks

VBurmist 96 posts Joined 12/09
19 Jul 2013

Hi,
you should definitely start a new thread for the new question.
Table skew can be calculated via dbc.tablesize view.   
The steps look good.  On step 3 you might want to measure both CPU and I/O, not just CPU.
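A sketch of how CPU and I/O per run could be pulled from DBQL (assuming query logging is enabled and you can read DBC.QryLogV; the table-name filter '%MyTable%' is only a placeholder):

SELECT QueryID
     , AMPCPUTime
     , TotalIOCount
     , QueryText
FROM DBC.QryLogV
WHERE CollectTimeStamp > CURRENT_TIMESTAMP - INTERVAL '90' DAY
  AND QueryText LIKE '%MyTable%'
ORDER BY AMPCPUTime DESC;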
Regards,
Vlad.

Shelley 28 posts Joined 09/10
19 Jul 2013

Also the guideline for allowable skew is dependent on the size of the table.

Small tables can have a larger % skew but that is not usually a problem.

For larger tables you might want to look at anything over 20%.

Teradata Administrator will show skew factor in the Space Summary if you don't want to do the calculations.
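For reference, the skew factor shown there is commonly computed as 100 * (1 - AVG/MAX of CurrentPerm); a sketch that flags tables above the 20% guideline mentioned above (the threshold is just that guideline, adjust as needed):

SELECT DatabaseName
     , TableName
     , CAST(100 * (1 - AVG(CurrentPerm) / NULLIFZERO(MAX(CurrentPerm))) AS DEC(5,2)) AS SkewPct
FROM dbc.TableSizeV
GROUP BY 1, 2
HAVING SkewPct > 20
ORDER BY SkewPct DESC;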

 

--Shelley--

teradatauser2 236 posts Joined 04/12
27 Jul 2013

Hi VBurmist,
Could you share the query for calculating skew?
 
-Thanks

03 Aug 2014

Hi teradatauser2,
Please find the query below:

SELECT
  DatabaseName
  ,TableName
  ,CAST(SUM(currentperm) / 1024**3 AS DEC(9,2)) AS GB
  ,CAST((MAX(CurrentPerm) / AVG(CurrentPerm)) AS DEC(5,3)) AS SkewFactor
  ,CAST((HASHAMP() + 1) * MAX(currentperm) / 1024**3 AS DEC(9,2)) AS GBSkewed
  ,HASHAMP() + 1 AS NumAMPs
FROM dbc.TableSizeV
WHERE DatabaseName = 'retail'
AND TableName = 'item'
GROUP BY 1,2

 

Regards,

Abhijeet
