All Forums Database
nag@rjuna 4 posts Joined 03/11
21 Nov 2013
Database Performance: large Table getting redistributed causing performance problems

Hi ,
Good Morning,
I have this performance problem where in I have to join a big table (> 1million records ) on column(which is a part of PI) to get few columns(4) to the semantic layer. I have implemented the code , but the explain plan says the table getting redistributed for every load running. Redistributing 1 Million records for every load(batchly - 1 hr) is causing bottle necks for other queries).
 
Technical details:(Table names and columns are masked for security reasons)
 Table to be joined with : Financial_Statement -- no of columns 98
                                                     --no of rows > 1million
                                                     --PI(col1,col2,col3,col4) all are varchar(100)
                                                     --NUSI(col60,col70) integer and char datatypes
                                                     --Vosi(col45) date datatype
 
here col2 is used in joining condition.
 
 
Would appreciate any help!!!
 
Thanks in advance.
 
Naga.
 
 
 
 

M.Saeed Khurram 544 posts Joined 09/12
21 Nov 2013

Hi Naga,
You do not have mentioned the 2nd table's information. If the 2nd table is small you can create a hashindex on that to avoid redistribution, 
even if you have proper stats collected, the optimizer will redistribute the smaller table. 
Second thing is you are using a column from PI, if you use a column from PI consisting of multiple column then you are not getting the benifit of PI.
You have to use all the columns of PI to take advantage of PI performance.
 

Khurram

VBurmist 96 posts Joined 12/09
22 Nov 2013

Hi Naga,
first thing is to check the statistics.    And information about 2nd table is, indeed, very needed.
Apart from that, if you select only a few columns(4) from a table, a distribution might be a good choice, actually.   Because redistribution takes place only for the columns that you select.  
How do you know that it is causing a bottleneck?   Is there a skew when the table gets redistributed?
p.s.  also, you might want to introduce IDs instead of varchars for PI column, if possible.  But that does not relate to the redistribution, though.
Regards,
Vlad.

You must sign in to leave a comment.