All Forums Database
GianTD 47 posts Joined 11/14
17 Nov 2014
How to decide about Stats for new queries

Hello All !
 
I am bit new to Teradata and started working on it couple of months back.
 
I working on performance tuning suggestions for some new queries.
 
Wondering if anybody can help on:
How we can decide for the stats for new queries. I mean, the queries which are not in production yet.
What one should consider to decide about stats, Tips, best practices etc?
 
Thanks for your help!
Gian

Rupesh_rsb 4 posts Joined 01/09
17 Nov 2014

hi,
For stats ensure they are laways up to date.As per teradata whenever there is 10% data change then we are supposed to update stats.
to find out missing stats for querries use below
EXPLAIN DIAGNOSTIC HELPSTATS ON FOR SESSION;
Mostly the columns which are used in "WHERE" clause should have stats collected.
Other points for performance improvements is avoid casting/dataconversion/string functions in joining.
Ensure tables used in joining have same PI in most cases.
 
 
 

Harpreet Singh 101 posts Joined 10/11
18 Nov 2014

Adding to what Rupesh said, 
From explain daignostinc helpstats result. You should choose which stats to collect, not all are needed. Also more important are one with igh confidence.

GianTD 47 posts Joined 11/14
18 Nov 2014

Thanks for your responses !
 
So, stats should be collected for:
- columns in WHERE clause,
- PI columns,
- What else?
 
I was asking for the new tables & new queries when there is no existing stats.
Is it good to use daignostinc helpstats in this case?

18 Nov 2014

Below is an extended list of candiate columns of stats collection in general -
where clause columns
Index columns (Preferebly NUPI, NUSI, partitions, JI),
columns which are involved in join conditions, case statements
Once apply to above columsn, see the recommondations of "diagnostic helpstats on for session;" and apply them one by one see the change in the plan in terms of number of steps, confidence levels, counts if you find any improvment keep it or delete and go for next recommondation. that way we should be good in many cases (still it is not 100%  )
Refresh the stats always when 10% of data changed.

Satyanarayana

dnoeth 4628 posts Joined 11/04
GianTD 47 posts Joined 11/14
19 Nov 2014

Great !
Thanks Satyanarayana & Dieter!!

GianTD 47 posts Joined 11/14
19 Nov 2014

Could you please elaborate more on:
 "and apply them one by one see the change in the plan in terms of number of steps, confidence levels, counts if you find any improvment keep it or delete and go for next recommondation"
- If improved, it'll reduce the number of steps?
 - Confidence levels anyway will be high (as recently collected). How to decide?
- Counts- Are you refering to row counts?
- What else can help while checking the query performance while testing after making all these changes? 
 
Thanks !

21 Nov 2014

Thanks Dieter for the links. They are informative.
Gian,
What I mean in my above post was  - The number of steps may increase or decrease some times. But need to see if any change and the change is positive or not in terms of confidence. 
It is not high confident always even you collect the recommonded stats.  Observe if there is no change then drop that stas. 
In terms of count see if the counts are coming near to the actual counts. if it coming near by actual count then our stats worked.
I see below link also helpful -
http://www.teradatapro.com/teradata-sql-tuning-top-10/
Best of luck !!
 

Satyanarayana

GianTD 47 posts Joined 11/14
21 Nov 2014

Thanks Satya.
 
Wondering if we need to collect stats on the complete tables also (tables in FROM clause) (i hope not but just want to make sure)
OR Just columns involved in the joins?

22 Nov 2014

"complete table" you mean all the columns of the table ?
No it is not required to collect all the columns of any table.
Only those columns of any table invloved in where , joins..etc
 
or probably you are asking for table level collect stat statment. This statement is just to refresh the already collected stats.
 

Satyanarayana

GianTD 47 posts Joined 11/14
22 Nov 2014

How to decide if (& which) multi column of a table stats need to be collected...Like col1,col2, col3 of a tbl

22 Nov 2014

Through explain plan recommondations after executing "diagnostic helpstats on for session" .
Generally multicolumn stats only used when that combination of columns are used in group by clauses, where and join very frequently.
Many times sinnle colum stats can be used by optimizer in place of multi column but multi column stats can't be used by the optimizer when singl column stats required. 
 

Satyanarayana

GianTD 47 posts Joined 11/14
24 Nov 2014

Thanks Satya!
 
To check existing stats do i have to dig down to base table or is there any other easier way with view itself?

GianTD 47 posts Joined 11/14
24 Nov 2014

As queries use views, can i check from views itself to find for which columns stats are already there or not?

GianTD 47 posts Joined 11/14
25 Nov 2014

To check about the existing stats on the tables...
- Can we check from views itself used in the query?
 
For any new query...
- Wil Running Diagnostic Helpstats give all the stats needed for the query?

Harpreet Singh 101 posts Joined 10/11
26 Nov 2014

Stats can be checked using help stats on base tables or using dbc table like statsV , statsinfo.
running diagnostic helpstats give all stats that MAY improve query plan, but only few need to be selected for collection.

GianTD 47 posts Joined 11/14
26 Nov 2014

Right Harpreet but my question was:
To check about the existing stats on the tables...
- Can we check from views itself used in the query?
(As you need to check definition of each view to find base table & then Help Stats Tablename)

Harpreet Singh 101 posts Joined 10/11
26 Nov 2014

No you cannot check from view directly. 

Kritarth 1 post Joined 11/14
29 Nov 2014

Quite informative!!

GianTD 47 posts Joined 11/14
02 Dec 2014

For any new query on new table when no prior stats collected...
- Wil Running Diagnostic Helpstats give all the stats needed for the query?
 
I mean rather looking in joins/where clause of the query to check for needed stats, can i just use Diagnostic Helpstats  (easy & quick)

dnoeth 4628 posts Joined 11/04
02 Dec 2014

DIAGNOSTIC HELPSTATS will return all stats the optimizer might utilze.
But this doesn't neccessarily mean that all those stats must be collected.

Dieter

GianTD 47 posts Joined 11/14
02 Dec 2014

From those recommendations, Only with High confidence should be collected?

Harpreet Singh 101 posts Joined 10/11
05 Dec 2014

Even high confidence MAY be collected. Its not mandatory. collect once, then check if they are used and improve performance. else remove.

GianTD 47 posts Joined 11/14
05 Dec 2014

Thanks !!
How i can check which stats are being used for the query?
:-) Sorry not expert in Teradata
 
 
What would be the impact of keeping the stats which are not being used?

mail2aks2 5 posts Joined 01/14
25 Dec 2014

Perform an explain on the query. If the stats on one or more cols are not used in your query, drop them since they take up space in the dictionary table(s).

Ashok Sagar

Karam 75 posts Joined 07/09
21 Jan 2015

Stats collection is an ever refining process. Don't get too much overwhelmed about it. Run the queries , do stats , run them again and keep on adding /removing following the basic guidelines and thats preety much you would do.

GianTD 47 posts Joined 11/14
23 Jan 2015

How one can check If the stats on one or more cols were not used in the query ?

ap90792 11 posts Joined 08/14
23 Jan 2015
Check this out : http://developer.teradata.com/database/articles/identifying-used-unused-and-missing-statistics

 

You must sign in to leave a comment.