All Forums Database
prakhar 101 posts Joined 05/08
25 Nov 2008
Use of order by

We have created 1 table with following structure CREATE SET TABLE prakhar2 ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ( c INTEGER, f INTEGER, d DATE FORMAT 'yyyy-mm-dd', e DATE FORMAT 'yyyy-mm-dd')PRIMARY INDEX ( c ); But when we are executing below queries result sets are coming different ,we are bit confused by use of ORDER BY clause.Please help . 1) sel * from prakhar2 c f d e1 2 01/02/2008 01/01/20081 3 01/02/2008 12/31/99991 4 01/02/2008 12/31/20071 5 01/02/2008 01/22/2008 2) sel * from prakhar2 order by d c f d e1 4 01/02/2008 12/31/20071 5 01/02/2008 01/22/20081 3 01/02/2008 12/31/99991 2 01/02/2008 01/01/2008 3) sel * from prakhar2 where f>2 order by d asc. 1 5 01/02/2008 01/22/20081 4 01/02/2008 12/31/20071 3 01/02/2008 12/31/9999 4) sel * from prakhar2 where f>2 1 3 01/02/2008 12/31/99991 4 01/02/2008 12/31/20071 5 01/02/2008 01/22/2008Not able to understand how order by is working in Teradata.The column on which i am ordering is having all equal valuesPlease help as this is blocking many of my operations

dnoeth 4628 posts Joined 11/04
26 Nov 2008

Hi prakhar,accordign to Relational Data Model/Standard SQL an answer set is an unordered set of rows unless you specify ORDER BY.Teradata is a parallel DBMS based on hashing, so there's no built-in sequential ordering of any kind as other DBMSes might provide."The column on which i am ordering is having all equal values"Then why do you order by it?Simply ORDER BY d,e and everything is fine.Dieter

Dieter

prakhar 101 posts Joined 05/08
27 Nov 2008

HiThat is the case with me .......if column wud have been different even i know how it is going to order .Plaease provide some valid explanation as this sequence is affectiong my work

Adeel Chaudhry 773 posts Joined 04/08
27 Nov 2008

Hello,In case of specifying single column in ORDER BY, it orders the column values of a single column, regardless of other column values. If you apply ORDER BY on a column containing 1, it will always be 1 and hence no purpose of using ORDER BY (but only if you expect changed values other than 1).So, as suggested by Dieter, if you need to order by more than 1 column (or all), please specify them in the ORDER BY clause, otherwise, it will return data just the way you asked for it, not by the way you need it.Regards,Adeel

-- If you are stuck at something .... consider it an opportunity to think anew.

prakhar 101 posts Joined 05/08
27 Nov 2008

Hi Adeel,You have correctly explained your point but all the time i am running this query2) sel * from rtwit_work.prakhar2 order by dc f d e1 4 01/02/2008 12/31/20071 5 01/02/2008 01/22/20081 3 01/02/2008 12/31/99991 2 01/02/2008 01/01/2008I am getting this result set only....I was thinking it will go for primary index col if col specified in order by has equal value ...I am not getting the point y this sequence is chosen by teradata when both order col and primary index col are equal...I it something to do with rowid???collateral sequence.....the way i entered rows in table are1)row with f=2....>f=3...f=4...f=5but results just are out of scope....

Adeel Chaudhry 773 posts Joined 04/08
27 Nov 2008

Again....how can you expect the result to be ordered on something, without mentioning it!As per my knowledge, it returns the rows on the AMP basis, it has nothing to do with rowid or the sequence you inserted those rows or anything else, following example may clear my point:Suppose you have a Table1 with columns Col1, Col2, Col3 and the system is configured with 4 AMPs (AMP1, AMP2, AMP3, AMP4), AMP3 is fastest AMP, and AMP2 is slowest, AMP1 and AMP4 are in middle.Now at time T1, you issue following query:SELECT * FROM Table1;this is what will happen:T1 + x: AMP3 will start returning/putting rows in spool firstT1 + x + y: AMP1 and AMP4 will start on secondT1 + x + y + z: and AMP2 will start the latest from all AMPsT2: AMP3 finishes the taskT2 + x: AMP 1 and AMP4 finishesT2 + x2: AMP2 finishesNow you will have this everytime you run a query, because without adding/deleting/changing more rows i.e. changing data demegraphics, you can't slow down AMP3 or make AMP2 fast. Which means same rows every time you execute the query. This also depends on the current load of the system. In this scenario we also supposed that nothing is happening in the world but what we are doing.Now, what happens with a query having ORDER BY? Following may be helpful:Keep all the same assumptions as above, just change the query as follows:SELECT * FROM Table1 ORDER BY Col1;Now if you suppose that Col1 has only 1 distinct value i.e. 1 it will be same as above, logically, but the rows "may or may not" be in same order, but if the column contains more values, it will sort them first in spool.HTH!Regards,Adeel

-- If you are stuck at something .... consider it an opportunity to think anew.

prakhar 101 posts Joined 05/08
01 Dec 2008

Hi Adeel,Gr8 explanation Sir.......But the point that confused me is that here all rows have same nupi that means they will be mapped to same AMP so whether question of which AMP is faster and all still exists!!!!! can you clarify even if it is a stupid question.....Only point i was asking that what factors drive the resulset when all the rows in a column defined in order by clause has same values....

Adeel Chaudhry 773 posts Joined 04/08
02 Dec 2008

This time without going into details, the data read first will be sent first if it satisfies the given condition/order.Regards,Adeel

-- If you are stuck at something .... consider it an opportunity to think anew.

dnoeth 4628 posts Joined 11/04
02 Dec 2008

All rows with the same PI-value will be stored on the same AMP sequentially in the order of inserts.If there's no ORDER BY the rows are returned in that order.An answer set is returned *after* it's fully created (other DBMSes start returning rows as soon as they're found).Each AMP sends blocks to the BYNET, which merges them (using round-robin).You can watch that when you submit:select hashamp(hashbucket(hashrow(pi_col))) from tab;This will result in:123456...123456...If you ORDER BY each AMP sorts it's part of the answer set, followed by a merge sort via BYNET.Apparently the optimizer doesn't skip that sort and the sort algorithm used for the AMP-local sort does not preserve to original order (unstable).If you want the same 'order' for all those 4 queries then don't specify any ORDER BY.But it's still no guaranteed order :-)Dieter

Dieter

rgs 106 posts Joined 02/07
05 Dec 2008

If there is no ORDER by clause all rows with the same PI will end up in the response spool in the same order as they are in the table. There is no implied order in the table, but of course physically it has to be put in there in some manner, which could be considered as some kind of order. But you can’t rely on that in your design.When you add the ORDER by clause it will generate a response spool as in the other case with a sort field added in there separate from the data columns. The sort logic knows nothing of the order of the rows. It just looks at the sort key which in your example is the same. So the sort is free to put rows with identical keys in any order. Unless you know the internals of the sort algorithm you can’t expect it to keep rows having identical key values in some sort of order. There is no ‘keep the rows in the same order as they are read if the keys are the same’ rule in the sort logic, therefore it can output them in any order since there is no implicit order in a set in the first place.

You must sign in to leave a comment.