All Forums UDA
marcmc 112 posts Joined 12/05
31 Aug 2007
Data Skew

I noticed that last night during our data warehouse run that some log entries were not occurring towards the end of the run.The log generated from our controlling linux problem suggested that the load completed to the end but our teradata log table stopped getting entries about 4 stored procedures from the end.I ran a ske indicator piece of sql that showed our two main log tables with very high skew approx 82%.Would this be a reason for the log entries not going in?

marcmc 112 posts Joined 12/05
31 Aug 2007

I think the fix is to backup the current table, redefine it with a more suitable primary index and copy back in the data to the new table.I think this will redistribute the data across the amps again based on the new primary key. Correct?

Deedz 1 post Joined 10/06
31 Aug 2007

Although from what you have mentioned recreating the table with a new PI ( mind it not PK they are not the same in Teradata) should be a viable option, but I would suggest you do the below :a) Do a data demographics analysis. You can use the Help Stats functionality.b) Based on the above see if the skew is not because of lots of NULLs in the current PI. I had a similar experience when a table was very much skewed, The problem was a large number of Nulls in the PI columns. This was resulting in a Hot AMP. I filled the Nulls with Dummy value and that promtly removed the skew.c) Use Index Wizard if you want.Let me know if these worked.

marcmc 112 posts Joined 12/05
31 Aug 2007

I've checked the stats.select hashamp(hashbucket(hashrow(job_id))) ,count(*)from log_tablegroup by 1;11 140 327 413 13122 this is the problem1 305 329 2364 62 1012 123 88310 88 306 28I know exactly why this happened. There's a very valid if very stupid explanation.There is a date field i could use to give a very even partition.There is also one other column I could use(a description column).What I really want to know is:If i drop the table and re-create with a new PI will it automatically rebalance the data across the amps?

novice 38 posts Joined 07/07
22 Sep 2007

Yesdropping the table and recreating with a new PI will result in automatic reallocation of the data across AMPs.cheers,

marcmc 112 posts Joined 12/05
28 Sep 2007

thankyou.

You must sign in to leave a comment.