All Forums General
sunnysteve 13 posts Joined 10/13
03 Jun 2014
Disadvantages of denormalization in teradata

Hello ,
 
can some tell me why denormalization may not used or not good in teradata? 

Raja_KT 1246 posts Joined 07/09
04 Jun 2014

If you can visualize the structure of data with no normalization,ist normal form, 2nd normal form... then you can realize it.

 

Having a wider row , may not fit the data block size.It may give rise to performance issue, I/O issue.In turn, it may affect

other dependent applications. Update anomalies can happen.

 

My opinion is 3rd normalization is good. On top of that build  semantic layer(s).

Raja K Thaw
My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1
Street Children suffer not by their fault. We can help them if we want.

andydoorey 35 posts Joined 05/09
08 Jun 2014

Denormalisation in Teradata tables is an engineering decision, i.e. there isn't necessarily a right answer - it depends on what works.  It's best not to denormalise columns that can change, as if you do then you need to have a process in place to keep them updated.  However here are some examples where denoralisation may be good:
Assume we have 3 tables, CUSTOMER, ORDER, ORDER_ITEM.  The ORDER_ID is unique to an order.  In 3NF the ORDER_DATE column should only exist on the ORDER table, and should not be needed on the ORDER_ITEM table.  However this column is likely to be used extensively, and shouldn't ever change, so it would be sensible to denormalise this column.  You may also want to partition the table on that column too, which would improve query efficiency.
If the customer belongs to a particular business_unit, and this is used in your company, then again this may be denormalised to the other 2 tables.  
The CUSTOMER_ID is also not needed on the ORDER_ITEM table in 3NF, but again it would be sensible to denormalise it, and you may want to use it as your primary index on all 3 of the tables to improve join efficiency.
A further example of where denormalisation may be sensible is for slowly changing dimensions.  If we have a CUSTOMER_ADDRESS_HISTORY table with the customer's address, and start and end date columns, then to find a customer's address when they ordered an item you would have to join on the order_date between the address_start_date and address_end_date.  If the customer's location when they ordered is useful for your business then denormalising the address details at the time of the order (or maybe just a region if that is what is needed) onto the ORDER and/or ORDER_ITEM tables would make reporting sales by region far easier.

You must sign in to leave a comment.