All Forums Database
member 16 posts Joined 02/08
12 Feb 2008
Collect Stats

Hi all,1: Once a collect stats is done on the table(on index or column) where is this information stored so that the optimizer can refer this?2: How often collect stats has to be made for a table that is frequently updated?3: Once a collect stats has been done on the table how can i be sure that the optimiser is considering this before execution ? i.e; untill the next collect stats has been done will the optimiser refer this?4: How can i know the tables for which the collect stats has been done?5: To what extent will there be performance issues when a collect stats is not done?Can a performance issue be related only due to collect stats? Probably a HOT AMP could be the reason for lack of spool space which is leading to performance degradation !!!6: Also let me know what can lead to lack of spool space apart from HOT AMP !!!Answers awaited!!!Thanks,abc

Someshnr 53 posts Joined 06/07
12 Feb 2008

Here is my answers, hope that helps: 1: Once a collect stats is done on the table(on index or column) where is this information stored so that the optimizer can refer this?Ans: Collected statistics are stored in DBC.TVFields or DBC.Indexes. However, you cannot query these two tables. 2: How often collect stats has to be made for a table that is frequently updated?Answer: You need to refresh stats when 5 to 10% of table's rows have changed. Collect stats could be pretty resource consuming for large tables. So it is always advisable to schedule the job at off peak period and normally after approximately 10% of data changes. 3: Once a collect stats has been done on the table how can i be sure that the optimiser is considering this before execution ? i.e; untill the next collect stats has been done will the optimiser refer this?Ans: Yes, optimizer will use stats data for query execution plan if available. That's why stale stats is dangerous as that may mislead the optimizer. 4: How can i know the tables for which the collect stats has been done?Ans: You run Help Stats command on that table. e.g HELP STATIISTICS TABLE_NAME ; this will give you Date and time when stats were last collected. You will also see stats for the columns ( for which stats were defined) for the table. You can use Teradata Manager too.5: To what extent will there be performance issues when a collect stats is not done?Can a performance issue be related only due to collect stats? Probably a HOT AMP could be the reason for lack of spool space which is leading to performance degradation !!!As: 1stpart: Teradata uses a cost based optimizer and cost estimates are done based on statistics. So if you donot have statistics collected then optimizer will use a Dynamic AMP Sampling method to get the stats. If your table is big and data was unevenly distributed then dynamic sampling may not get right information and your performance will suffer. 2nd Part: No, performance could be related to bad selection of indexes ( most importantly PI) and the access path of a particular query.6: Also let me know what can lead to lack of spool space apart from HOT AMP !!!Ans: One reason comes to my mind, a product join on two big data sets may lead to the lack of spool space.

member 16 posts Joined 02/08
12 Feb 2008

Thanks Somesh,Things were clear.can you please tell how to check the Collect stats using Teradata Manager???Regards,abc

member 16 posts Joined 02/08
12 Feb 2008

Also,Let me know...once a collect stats is done are all those records stored somewhere? i.e If a collect stats on a table is resulted with 10 uinique values...are these unique values stored somwhere?What is stored in DBC.tvfields/dbc.indexes is just a count of the values right ? If all this is stored then its using up lots of space ....?Thanks,abc

member 16 posts Joined 02/08
13 Feb 2008

Hi,Can a collect stats be done at the database level?If so how....? Which table will have an entry of the same?Or atleast can a DBA do that?Any previlage required?Thanks,abc

Balamurugan B 81 posts Joined 09/07
15 Feb 2008

I think, we can’t do a collect statistics on database level, It can only be done on indexs and columns in a table…Regards,Balamurugan

Regards,
Balamurugan

Someshnr 53 posts Joined 06/07
15 Feb 2008

See my answers... hope that helpsQuestion: Also,Let me know...once a collect stats is done are all those records stored somewhere? i.e If a collect stats on a table is resulted with 10 unique values...are these unique values stored somwhere?Answer: No. data is still in the base table. Collect stats just derives the data demographics of the table. It collects the information like: total row counts of the table, how many distinct values are there in the column, how many rows per value, is the column indexed, if so unique or non unique etc.Question: What is stored in DBC.tvfields/dbc.indexes is just a count of the values right ? If all this is stored then its using up lots of space ....?Answer: Right. Space requirement is dependent on how many columns, indexes you have defined for collect stats. I donot think this space is a matter of concern.Question: Can a collect stats be done at the database level?If so how....? Which table will have an entry of the same?Or atleast can a DBA do that?Any previlage required?Ans: No. You cannot collect stats at database level. See Balamurugan's answer. Forget at database level, for bigger tables you got to be careful about number of columns and number of Indexes you really want to collect stats on. Because of time and resources this task takes, people sometime resorts to collect stats on a smaller sample of table rows instead of the all the rows in the table.

member 16 posts Joined 02/08
20 Feb 2008

Hi all,Thanks for your answers.Is there any way by which we can estimate the time taken to do the collect stats ..Is it that we can use the collect stats wizard to do the same?Are both of them (collect stats sql statement and collect stats wizard) generating the same result? and consume the same time?Thanks,abc

gjohnson 1 post Joined 06/10
29 Jun 2010

Is there a way in Teradata to only collect stale statistics or stats that are out of date? In Oracle, using dbms_stats package you can tell the optimizer to only collect stale stats...

--Greg

Adeel Chaudhry 773 posts Joined 04/08
29 Jun 2010

Hi,

The time can be estimated over data-volume, data-type of column and count of columns. But off-course they can't be very accurate.

I believe, there isn't any way to do 'collect stale stats' as yet, perhaps because of Teradata's DWH nature and not OLTP.

Regards,

MAC

-- If you are stuck at something .... consider it an opportunity to think anew.

robpaller 159 posts Joined 05/09
30 Jun 2010

Google "dieter noeth stats view" (dnoeth here on these forums) and you will find some very handy SQL that can be used to create your own stats maintenance routine. With it you can easily identify when statistics were last collected on a column or index and determine whether you want to recollect them again.

Typical guideline are when roughly 10% of the data has changed. (We measure by the delta in perm space since last collected.) We also arbitrarily recollect based on stats that have aged 60-90 days.

vishnoiprem 5 posts Joined 03/11
23 Feb 2012

Why we are use  COLLECT STATISTICS in teradata not other database

Prem Vishnoi

Stefans 38 posts Joined 02/12
23 Feb 2012

The purpose of the COLLECT STATISTICS is to gather and store demographic data for one or more columns or indices of a table or join index. This process computes a statistical profile of the collected data, and stores the synopsis in theData Dictionary (DD) for use during the PE’s optimizing phase of SQL statement parsing. The optimizer uses this synopsis data to generate efficient table access and join plans.

Stalin

You must sign in to leave a comment.