Teradata uses a hashing algorithm that somewhat mimics a "random" distribution. It's not completely random, since the same primary index value will always go to the same AMP. The fewer rows you have in your table compared to the number of AMPs in your system, the more uneven your distribution will be. This is no different from a truly random distribution. If you had 200 rows in your table and 200 AMPs, and you randomly distributed the data across the AMPs, you would end up with no data on some of the AMPs and as many as 5 or more rows on others. As the number of rows in a table grows, the distribution of the table across AMPs becomes more even.

You don't really have to worry about the uneven distribution of small tables, as long as your queries are not redistributing your large tables in order to join to your small tables. If that happens, your queries could become skewed to one or a few AMPs. Usually, if you have good statistics on the small table (and on the other tables involved in the query), the optimizer will instead choose to duplicate the small table across all AMPs in order to join to the larger tables. This allows the query to be as evenly distributed as the large table.

Hope that helps.
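To put some numbers on the random-distribution comparison, here is a rough simulation sketch. It does not use Teradata's hashing algorithm; it assumes a plain uniform random assignment of rows to AMPs as a stand-in, and the function names (`simulate_distribution`, `summarize`) and the percent-deviation measure are made up for illustration.

```python
import random
from collections import Counter


def simulate_distribution(num_rows, num_amps, seed=0):
    """Assign each row to a uniformly random AMP and return rows per AMP.

    Assumption: uniform random choice stands in for the real row hash.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_amps) for _ in range(num_rows))
    return [counts.get(amp, 0) for amp in range(num_amps)]


def summarize(rows_per_amp):
    """Report empty AMPs, min/max rows, and deviation from the average in percent."""
    avg = sum(rows_per_amp) / len(rows_per_amp)
    return {
        "empty_amps": sum(1 for n in rows_per_amp if n == 0),
        "min_rows": min(rows_per_amp),
        "max_rows": max(rows_per_amp),
        "max_pct_above_avg": (max(rows_per_amp) - avg) / avg * 100,
        "max_pct_below_avg": (min(rows_per_amp) - avg) / avg * 100,
    }


if __name__ == "__main__":
    # Same 200 AMPs, increasing table sizes.
    for num_rows in (200, 4000, 400000):
        print(num_rows, summarize(simulate_distribution(num_rows, 200)))
```

Running it with different row counts should show the same pattern described above: with about one row per AMP, some AMPs stay empty, and the percentage deviation from the average shrinks as the table grows.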

I have 4000 rows in a table. The table has a unique primary index defined on one column, and its data type is VARCHAR(19). I have a 20-node system with 200 AMPs. There is data distribution skew ranging from 90 to -50... Could anyone explain why this happens? Does Teradata parallelism work only with HUGE data?

Thanks,
Pots.