All Forums Analytics
yunfeizhao 6 posts Joined 02/15
12 Feb 2015
How to join faster

I have four tables, each has 1+ million rows, and I only need to pull one row out (where tb3.id=12345). How should I do to get the result faster? I have two ways, either use on, or use where, but I do not know which one is faster. What comes first in the process, on, or where?  Can you please share the rules/guideline?
Thanks
Yunfei
method 1:
select tb1.*
, tb2.*
, tb3.*
, tb4.*
from tb1 left join tb2  on tb1.id=tb2.id
left join tb3   on tb2.id=tb3.id and tb3=12345
left join tb4   on tb2.id=tb4

method 2:
select tb1.*
, tb2.*
, tb3.*
, tb4.*
from tb1 left join tb2  on tb1.id=tb2.id
left join tb3   on tb2.id=tb3.id
left join tb4   on tb2.id=tb4
where tb3=12345

dnoeth 4628 posts Joined 11/04
12 Feb 2015

Hi Yunfei,
#2 will be faster, but returns a totally different result set :-)
In #1 you get all rows from tb1 (it's outer joins without any WHERE), while #2 returns a single row (the result is in fact an inner join between tb1, tb2 and tb3)
Depending on your needs left joins with tb3 as main table might be correct:

select tb1.*
, tb2.*
, tb3.*
, tb4.*
from tb3 left join tb2 on tb2.id=tb3.id
left join tb1 on tb1.id=tb2.id
left join tb4 on tb2.id=tb4
where tb3=12345

There's a nice chapter in the manuals dealing with ON vs. WHERE for Outer Joins:
http://www.info.teradata.com/HTMLPubs/DB_TTU_14_00/SQL_Reference/B035_1146_111A/ch02.033.077.html
 

Dieter

yunfeizhao 6 posts Joined 02/15
13 Feb 2015

Dieter, Thanks for the insight that I had confusion for a long time. The read also provides a full picture to understand thsi issue.

You must sign in to leave a comment.