All Forums Database
Marisol 2 posts Joined 05/07
14 May 2007
Collect stats index vs multiple columns: advantages?

Hi to all, I would like to know what is the difference in collecting stats for an index as opposed to collecting stats for multiple columns? ______________________________________________________ _______________________COLLECT STATS ON dbname.tablename COLUMN(col1, col2, etc.);They can also be collected on an index:COLLECT STATS ON dbname.tablename INDEX(col1,col2,etc.);Thanks,

Fred 1096 posts Joined 08/04
14 May 2007

If there is a corresponding index, the action taken by the database is identical regardless of which syntax you use. The only time it matters is if you first collect stats on multiple columns and later define an index on that same list of columns. In that case, you would have to DROP the original multi-COLUMN stats before you could COLLECT stats on the INDEX.

TdMan 91 posts Joined 01/07
24 May 2007

Is there a performance difference btw a column in where clause which has collect stats on column or collect stats on index?

joedsilva 505 posts Joined 07/05
24 May 2007

If your question is whether there's a performance difference where you have a column with an index on it and the stats are collected on either the index or column ... the answer is no, they both have the same effect on performance.If you meant to ask whether there's a performance difference where you have a column (with no index) and you collect stats on it versus you have the column with an index on it and have collected stats for it, the answer is yes.

TdMan 91 posts Joined 01/07
24 May 2007

Hi Joe,Thanks for the quick response. My question of the second case. What could be the reason for the difference in performance when using a index column with using a non index column in where clause having collected stats for either of them?Regards,Sakthi

joedsilva 505 posts Joined 07/05
25 May 2007

That's because when there's an index it can be used to reach the record without doing a full table scan. so it's faster. Now if you don't have an index, you have to do a full table scan to fetch the records even if you have stats collected on it.So the question is why collect stats on columns without indexes ?well it kind of helps optimizer at other places say like you have a query (hypothetical)SELECT *FROM A INNER JOIN BON A.COL1 = B.COL1WHERE A.COL2 BETWEEN 10 AND 100000AND B.COL2 BETWEEN 10 AND 100000now assume COL1 is not a PI for neither A or B and there are no indexes on COL2 for both A and B.(and we don't have any stats)So optimizer if going to do a full table scan on both A and B, with a condition of COL2 between 10 and 100000 and then redistribute it based on row hash of COL1 to spool and then do the joins. (so both table A and B gets redistributed)Now let's say Table B had only 10 records which had COL2 values that fell between 10 and 100000. So a better optimizer plan would have been to do a full table scan of Table B with condition COL2 between 10 and 100000 then duplicate it in all AMPS. Table A could be spooled locally and needn't get redistributed.So we have a plan with minimal record redistribution now, which is faster. now optimizer could have made the second decision if it had stats on COL2 of table B.That's one of those uses... I am sure there are more to it... (I haven't warmed up to my morning coffee yet ... :-) )

TdMan 91 posts Joined 01/07
25 May 2007

Hi Joe,That was a gr8 explanation. Thanks for your inputs:-)

You must sign in to leave a comment.