ppad001 18 posts Joined 06/11
31 Jan 2014
Generating Surrogate Key Using ROW_NUMBER

Hi All,
Greetings!
I have a requirement where I have to generate a surrogate ID for every row in the FACT table. Previously we used ROW_NUMBER() for this, but the data volume in recent requirements has gone up to 100 million+ rows. The queries are taking a very long time, but comparatively much less without the ROW_NUMBER. What would be the best way to do this? We don't want to use identity columns! Please help.
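For reference, the current approach is roughly along these lines (a sketch only; FACT_TABLE, TABLEA and the column names are illustrative, not our actual DDL):

INSERT INTO FACT_TABLE (SURROGATE_ID, PROCESSING_MONTH, CUSTOMER_ID /* ... */)
SELECT
    -- offset by the current MAX surrogate key, then number the incoming rows
    m.MAX_SK + ROW_NUMBER() OVER (ORDER BY t.ACCOUNT_ID) AS SURROGATE_ID,
    t.PROCESSING_MONTH,
    t.CUSTOMER_ID
    /* ... remaining columns ... */
FROM TABLEA t
CROSS JOIN (SELECT COALESCE(MAX(SURROGATE_ID), 0) AS MAX_SK FROM FACT_TABLE) m;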

dnoeth 4628 posts Joined 11/04
31 Jan 2014

A similar thread started a few days ago:
http://forums.teradata.com/forum/database/generate-surrogatekey-with-a-huge-table
 
 
Which part of the process is slow?
Checking for the MAX value or the calculation using ROW_NUMBER? Did you check DBQL?
Can you share how you create the SK?
 

Dieter

Adeel Chaudhry 773 posts Joined 04/08
31 Jan 2014

Please share your DB/table names and current/expected row counts so we can better propose a solution.

-- If you are stuck at something .... consider it an opportunity to think anew.

ppad001 18 posts Joined 06/11
03 Feb 2014

Adeel,
Estimated Row count = 200 million
Dieter,
I guess it's the ROW_NUMBER() calculation part, because I tried without the MAX part but it was of no help.
The query is something like this: no GROUP BY, no statement-level ORDER BY, just a simple move with the ROW_NUMBER calculation:
SELECT
    ROW_NUMBER() OVER (ORDER BY ACCOUNT_ID),
    PROCESSING_MONTH,
    CUSTOMER_ID
    ...
FROM TABLEA
I had a look at DBQL, but could you please help me with what exactly I should be looking for?
 

Adeel Chaudhry 773 posts Joined 04/08
03 Feb 2014

Try the solution given in the last post of the following thread:
http://forums.teradata.com/forum/database/generate-surrogatekey-with-a-huge-table

-- If you are stuck at something .... consider it an opportunity to think anew.

Adeel Chaudhry 773 posts Joined 04/08
03 Feb 2014

Also, how is the fact table populated? Is it populated from a single source, or do multiple sources populate data into it?

-- If you are stuck at something .... consider it an opportunity to think anew.

ppad001 18 posts Joined 06/11
10 Feb 2014

Adeel,
It's loaded from a single source (a TD source table).
I tried the "SUM(1) OVER (ROWS UNBOUNDED PRECEDING)" method mentioned above, but did not find much difference.
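For clarity, that variant looks roughly like this (sketch; column list abbreviated as before):

SELECT
    -- no ORDER BY in the window, so the numbering order is arbitrary
    SUM(1) OVER (ROWS UNBOUNDED PRECEDING) AS SURROGATE_ID,
    PROCESSING_MONTH,
    CUSTOMER_ID
    /* ... remaining columns ... */
FROM TABLEA;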

dnoeth 4628 posts Joined 11/04
10 Feb 2014

OLAP functions redistribute the rows from spool 1 to spool 2, so all rows are kept in spool twice.
If you have large rows, you might try calculating the sequence using the PK columns only and then joining back. This greatly reduces spool sizes, and the join back might be quite fast if it's joining on the PI.
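A sketch of that approach, assuming ACCOUNT_ID is the PI/PK of TABLEA (names taken from the query above and purely illustrative):

SELECT
    t.PROCESSING_MONTH,
    t.CUSTOMER_ID,
    /* ... remaining wide columns ... */
    s.SURROGATE_ID
FROM TABLEA t
JOIN
 (  -- number only the narrow PK column, so the OLAP spools stay small
    SELECT
        ACCOUNT_ID,
        ROW_NUMBER() OVER (ORDER BY ACCOUNT_ID) AS SURROGATE_ID
    FROM TABLEA
 ) s
ON t.ACCOUNT_ID = s.ACCOUNT_ID;  -- joining back on the PI keeps the join fast

Only ACCOUNT_ID plus the computed number goes through the OLAP step; the wide columns are picked up in the join back.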
 

Dieter
