All Forums Database
benjaminpwarren 1 post Joined 09/12
19 Sep 2012
Accidental Cartesian Join Bug

I've recently noticed that the Teradata parsing engine doesn't pick up on / given an error for 'accidental cartesians' or, to use another term to make it more clear why I think this is a significant bug, 'unreferenced / unqualified tables'.
For example, Teradata allows me to execute a query selecting customers with addresses that are valid as at now:

SELECT
  Customers.CustomerID
FROM
  Customers
    INNER JOIN Addresses
    ON Customers.CustomerID = Addresses.CustomerID
WHERE
  Addresses.End_Date > CURRENT_TIMESTAMP

All well and good.
If I then write another query, for example, to identify customers that have a valid phone number, I might choose to copy and paste the above code and adjust it accordingly:
 

SELECT
  Customers.CustomerID
FROM
  Customers
    INNER JOIN Phone_Numbers
    ON Customers.CustomerID = Phone_Numbers.CustomerID
WHERE
  Addresses.End_Date > CURRENT_TIMESTAMP

NOT all well and good. I missed the reference to "Addresses" in the WHERE condition and in doing so have created an accidental cartesian join and it's returning every customer that has had a phone number recorded (at some point) repeated for every address entry that is valid regardless of whether that address relates to the CustomerID or not!
I'm amazed that Teradata doesn't notify me of the massive error I've made. I should absolutely be getting a notification that I've used a table that hasn't been referenced in a FROM clause. When would the current behaviour ever be desirable?
This bug isn't limited to WHERE conditions, it will occur if you reference a table in the SELECT portion. It will also occur if you use an alias in the FROM clause but fail to use it elsewhere.
I could only find a couple of other references to it:
http://www.scribd.com/doc/78503320/43/Answer (page 93)
http://itpro420.blogspot.com.au/2010/07/teradata- optimizer-anomaly-to-beware-of.html
 
Has anyone else on these forums encountered this issue? How does one go about requesting that it be fixed?

dnoeth 4628 posts Joined 11/04
19 Sep 2012

This is a well known artifact, it's usually encoutered when you forget to use an alias name.
Teradata was implemented before there was Standard SQL, the initial query language was called (AFAIK) TEQUEL (TEradata QUEry Language), whose syntax didn't require to list tables within FROM.
A simple "RETRIEVE Adresses.EndDate" carried enough information for the Parser to resolve tablename and columnname (RETRIEVE is still allowed in BTEQ, only SQL Assistant complains about "invalid query" and refuses to submit it).
When they switched to Standard SQL this old syntax was not removed, IMHO there should be a switch to disable it.
Dieter

Dieter

gboundy 4 posts Joined 05/10
19 Dec 2013

 
Are there any plans to fix this "feature" or provide a way to disable it as Dieter suggests?

graham boundy
consultant
Project X ltd
gboundy@pxltd.ca

You must sign in to leave a comment.