All Forums Data Modeling
Raja_KT 1246 posts Joined 07/09
18 Jul 2013
Modeling for NUPI and USI combination

Hi,
I think that for insert intensive I will use NUPI and USI combination. I am not using UPI. But the columns chosen for the UPI of a table are frequently the same columns identified as the PK during the data modeling process. Can you please suggest me?
 
Thanks and regards,
Raja

Raja K Thaw

My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1

Street Children suffer not by their fault. We can help them if we want.

dnoeth 4628 posts Joined 11/04
19 Jul 2013

Hi Raja,
choosing the PI is mainly based on joins (plus WHERE-conditions, but these can be covered by a secondary index, too)
Only if the PK is actually the most frequent join path you should implement it as UPI.
If you create a NUPI a USI on the PK is only neccessary when you actuall use it in WHERE-conditions. But if it's only because of the logical correctness of you data you don't need it, this should be maintained during load.
 
Dieter

Dieter

vrushang 4 posts Joined 11/10
24 Jul 2013

As UPI are can not be created on the PK column how to maintain the uniqness of the records as per LDM ??
I guess you are trying to create partition  table with PK column not being the partitioning column.
 USI can be defined for partitioned table to maintain the uniquess of the records.
Creating  NUPI and USI on the PK column would help maintain uniqness of the record within a partition.and may also establish the referential integrity
 

HCDM 21 posts Joined 05/10
19 Aug 2013

Using a USI to ensure uniqueness is not neccesarily the way to go.  Suppose your model has a PK specified that will give uniqueness, but your actual table has rows that are end dated so a new version of that row can be inserted.  That requires no unique index on the PK. Also, Soft Referential Integrity (Soft RI) is an alternative.

Lee Arnett
Senior Product Manager – Healthcare Data Modeling
Monument, CO
719-351-0766
Lee.arnett@teradata.com

Jake Kurdsjuk 9 posts Joined 06/13
20 Sep 2013

When choosing your PI, as Dieter has stated, your primary goal is to optimize access moving forward, keeping in mind both its role in data distribution across the platform and how it may impact load processing.  UPI's on PK's are fairly common on other database platforms but are often poor choices on Teradata since any access by PI would require all columns.  NUPI's are usually the best possible choice, using a column that will be used frequently in access and join processing.  When moving from an LDM to a PDM you should look within your subject area to find shared identifiers across your various tables (like an Account ID) that is a column within the PK of most/all of the tables.  Choosing a column like this means that all joins between tables in that SA will be AMP local (no row redistribution required).  And if you are worried about minor skewing on a table or two remember that when you go to join the tables the data will first be redistributed by that column that caused the skewing (i.e. your spool is skewed) every time, so measure the cost in lost table space due to the skewing against the cost of spook skewing every time a join is performed.
Specifying a USI on an "insert intensive table" may be problematic as secondary indexes immediately eliminate your ability to use some load utilities.  They must also be maintained as a part of the insert, which will impact how long that may take.  As Lee suggests, if the only goal of the USI is to enforce the PK then there are likely better alternatives, including soft RI and enforcing it as a part of the load process.  I almost never suggest, let alone specify a secondary index until I have data and access demographics.  Otherwise it's just a best guess as to whether the Optimizer will ever choose to use it.  Often times it's best to see how the table performs with the PI only and then look to optimize performance with Secondary and/or Join indexes as necessary.

Jake Kurdsjuk
Product Manager
Teradata Communications Data Model

You must sign in to leave a comment.