Hello,Skewness is the statistical term, which refers to the row distribution on AMPs. If the data is highly skewed, it means some AMPs are having more rows and some very less i.e. data is not properly/evenly distributed. This affects the performance/Teradata's parallelism. The data distribution or skewness can be controlled by choosing indexes.HTH.Regards,Adeel

-- If you are stuck at something .... consider it an opportunity to think anew.

thx how 2 find out skew value?

Try something like:SELECT TSIZE.DatabaseName ,TSIZE.TableName ,TDEF.CreateTimeStamp AS Created ,TDEF.LastAlterTimeStamp AS LastAltered ,TDEF.AccessCount ,TDEF.LastAccessTimeStamp AS LastAccess ,SUM(TSIZE.CurrentPerm) AS CurrentPerm ,SUM(TSIZE.PeakPerm) AS PeakPerm, (100 - (AVG(TSIZE.CurrentPerm)/MAX(TSIZE.CurrentPerm)*100)) AS SkewFactorFROM DBC.TableSize TSIZEJOIN DBC.Tables TDEF ON TSIZE.DatabaseName = TDEF.DatabaseName AND TSIZE.TableName = TDEF.TableNameGROUP BY 1,2,3,4,5,6;Lower the skew factor the better - although if you have only a very few rows on a table you are going to have a skewed table no matter what!

In Teradata Administrator you can simply right-click the table and select "Space Summary".Regards,Adeel

-- If you are stuck at something .... consider it an opportunity to think anew.

What is a acceptable Skew Factor ? Say if one of my table is having skew factor of 16. Is it acceptable?--Thanks in advance.

how to check the skew factor?

suppose i use Primary index that situation how to distributes the row?

suppose i use unique Primary index that situation how to distributes the row?

normally if we u PI then data will distribute evenly thats not an issue

if we use UPI then how we have to distributes the records evenly

Hi,

how to distributes the row in mload if we use UPI and how in fload if we use UPI and how if we use NUPI

and finally i have to distributes the row evenly to all AMPs for best skew factor,

any one which best way to distribute the row evenly in mload and fload when we use UPI and NUPI

Skew facter is tell the distribution of rows

if uniformally distribution i.e skew is the zero

if the skew factor is reverse of parllel efficence

Hi,

Below is the query which is usefull to find out skewfactor.

`SELECT `

`TABLENAME,`

`SUM(CURRENTPERM) /(1024*1024) AS CURRENTPERM, `

`(100 - (AVG(CURRENTPERM)/MAX(CURRENTPERM)*100)) AS SKEWFACTOR `

`FROM `

`DBC.TABLESIZE `

`WHERE DATABASENAME= <DATABASENAME> `

`AND `

`TABLENAME =<TABLENAME> `

`GROUP BY 1;`

Hi,

The data distribution of table among AMPs is called Skew Factor.....

Generally For Non-Unique PI we get duplicate values so the

more duplicate vales we get more the data have same rowhash

so all the same data will come to same amp, it makes data

distribution inequality,

One amp will store more data and other amp stores less

amount of data, when we are accessing full table, The amp

which is having more data will take longer time and makes

other amps waiting which leads processing wastage

A skew factor of 0 indicates that the data is perfectly

distributed across all the AMPS

In this situation (unequal distribution of data)we get Skew

Factor High........

For this type of tables we should avoid full table scans

Skew factor is distribution of rows of a table among the

available no.of AMP's.

If your table has a chance of using unique primary index,it

is always better to use UPI which ensures the skew factor

around 0%.

If there is no chance of having unique values column in a

table choose a column as PI(primary index) which has less

duplicate values which inturn results in less skew factor.

That is the data will be distributed almost(not exactly

equal percentage) equally to all AMP's.....

ex:

AMP0 AMP1

10000(10%) 9000000(90%)

in this situation skew factor is very high 90%

could any one please tell how we choose our primary indexes

teja sopalli

Hello,

if the table and views contains the same values in all record in 3 out of 7 columns; does it impact the performance?

I was told that having all values the same does tend to lead to skewing. Is that correct statment? Does it make sense to remove those 3 columns with the same values for performance and space reazon?

thank you.

The table will only be skewed if the Primary Index is based on those columns.

Of course it's strange when there's only a single value in a column, what's the business meaning?

If you drop the column you can add it as a literal to the view-definition. But if there might be other values in future, better apply `COMPRESS 'thisvalue'`

and it's using almost no space.

Dieter

Thank you Dieter for your reply!

the columns are - effective_dt, expiration-dt, load_ts and the values in effective_dt and load_ts are identical in all rows. The expiration_dt has the same value in all rows as well. And there is not much of a business value at all. It was design that way before. The question is will it improve performance if those columns are dropped. The entire table is only 1.8 gb. So maybe just leave it alone?

efct_dt Expr_dt load_ts

5/16/2016 12/21/9999 5/16/2016

The table will be smaller when you drop those columns (or COMPRESS them), at least a Full Table Scan will be faster.

Dieter

Thank you Dieter

A NUPI column with high number of NULL result in high skew factor.

if yes then why ? as NULL cannot be compared with each other.

This is because of the location of all NULLs on a same (and single) AMP, according to the hash process

Pierre

BTW,

i just notice a case where a small table is displayed with a high skew by TD Administartor (15.00):

Space by AMP gives 8 lines with almost the whole of 1.3 M lines on one AMP, although a query shows a good distribution on all AMPs (60).

after copy on a new temp table TD Administrator display is correct, i mean identical to the query result: how to explain that ?

It looks like TD Admin had lost information on the first table.

Pierre

Thenk you Pierre..

It happens some time when our database behave unexpectedly..

Even i checked practically and found that null values create same hash code ..so more nulls can cause high percentage of skew on table.

hi what is skew factor????? what is the use of that?