NOTE:  This content is only relevant to sites on Teradata 12 or earlier.

Congratulations, you’re on Teradata 12! Now it’s time to start redefining your AMP worker task (AWT) monitoring procedures by making use of the new ResUsageSAWT table.


There are a lot of reasons to like this new table: It’s part of the ResUsage group of tables, so no special operating system privileges are needed to access the data. It combines AWT in-use counts with information on message queue length and possible flow control, all in one row. And you can selectively manipulate the data using SQL.


But there’s a special technique you’ll want to use if you’re accessing the ResUsageSAWT table directly. This technique is needed only if your ResUsage logging rate is a different number of seconds from your collection rate. If you don’t embrace this technique when you have differing rates, you can expect to see inflated AMP worker task (AWT) in-use counts. So keep reading, and I’ll tell you why this happens and how to make adjustments in your SQL that will produce in-use counts that match reality.


Different Logging and Collection Rates


The in-use count columns in the ResUsageSAWT table carry a sum of the in-use counts recorded at the end of each collection interval within the logging period. If your logging and collection rates are different, there will be multiple collection intervals in each log period. These summed-up numbers need to be divided by the value in the CollectIntervals column in order to provide an average over the logging period, rather than a sum. If this step is not taken, the numbers reported in the WorkTypeInUse00 through WorkTypeInUse15 columns, as well as in the MailBoxDepth column, will appear higher than they actually are.
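For example, using the rates from the real-world case shown later in this post: a 600-second logging rate with a 60-second collection rate gives a CollectIntervals value of 10, so a WorkTypeInUse01 value of 140 in the table is the sum of ten snapshots, and 140 / 10 = 14 is the average number of AWTs in use for that work type across the logging interval.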


The CollectIntervals Column


It is the CollectIntervals column in the ResUsageSAWT table that tells you if the collect rate and the logging rate were a different number of seconds for that collection. This column indicates how many collect intervals are contained within each log interval.
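If you want a quick way to see whether your rates differ, a minimal check like the following will do it (this is just a sketch, using the same table and column names referenced throughout this post):

SELECT DISTINCT Secs, CollectIntervals
FROM DBC.ResUsageSawt;

Any row with a CollectIntervals value greater than 1 means that logging interval contained multiple collect intervals, and the adjustment described below applies.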


Here is a small excerpt of output from the ResUsageSAWT data illustrating a CollectIntervals value of 1. In this case, the logging rate and the collection rate were both 60 seconds, so no special technique is required to evaluate the data.

 

Real-World Example Before and After the SQL Adjustment


Below is ResUsageSAWT output where the logging rate was 600 seconds and the collection rate was 60 seconds, along with the SQL that was initially used to access the data. Note that this first SQL statement accesses the WorkTypeInUse00 (that’s MsgWorkNew) and WorkTypeInUse01 (MsgWorkOne) values without any adjustment, showing the exact values carried in the SAWT table:

SELECT
   TheDate
  ,TheTime
  ,NodeId
  ,VprId
  ,Secs
  ,CollectIntervals
  ,WorkTypeInUse00 AS WorkNew
  ,WorkTypeInUse01 AS WorkOne
  ,InUseMax
FROM DBC.ResUsageSawt
ORDER BY WorkNew DESC;

Here is a subset of the output from that SQL:

In that output, WorkTypeInUse00 exceeds the physical limit of 50 for the MsgWorkNew work type. In addition, the InUseMax column is less than the WorkTypeInUse00 and WorkTypeInUse01 values combined. Both conditions are physically impossible.
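One way to flag rows where the raw values are impossible in this way is a check along these lines (a sketch only, using the same columns as the queries in this post; the 50-AWT ceiling is the MsgWorkNew limit mentioned above):

SELECT
   TheDate
  ,TheTime
  ,NodeId
  ,VprId
  ,WorkTypeInUse00
  ,WorkTypeInUse01
  ,InUseMax
  ,CollectIntervals
FROM DBC.ResUsageSawt
WHERE WorkTypeInUse00 > 50
   OR WorkTypeInUse00 + WorkTypeInUse01 > InUseMax;

Rows returned by this check are a sign that the values are sums across multiple collect intervals and need the CollectIntervals adjustment shown next.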


Using the first SQL statement again, but dividing the statistics columns by the CollectIntervals value, results in an accurate report of the in-use counts:

SELECT
   TheDate
  ,TheTime
  ,NodeId
  ,VprId
  ,Secs
  ,CollectIntervals
  ,WorkTypeInUse00 / CollectIntervals (FORMAT '99.99') AS WorkNew
  ,WorkTypeInUse01 / CollectIntervals (FORMAT '99.99') AS WorkOne
  ,InUseMax
FROM DBC.ResUsageSawt
ORDER BY WorkNew DESC;

In summary, if your logging rate is larger than your collect rate, make sure to divide the in-use counts, as well as the MailBoxDepth column, by the value in CollectIntervals whenever you access them.
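For reference, here is one way to fold that advice into a single query (a sketch rather than official syntax guidance; it uses the same column names as above, adds MailBoxDepth, and wraps CollectIntervals in NULLIFZERO purely as a precaution against an unexpected zero):

SELECT
   TheDate
  ,TheTime
  ,NodeId
  ,VprId
  ,Secs
  ,CollectIntervals
  ,(WorkTypeInUse00 / NULLIFZERO(CollectIntervals)) (FORMAT '99.99') AS WorkNew
  ,(WorkTypeInUse01 / NULLIFZERO(CollectIntervals)) (FORMAT '99.99') AS WorkOne
  ,(MailBoxDepth / NULLIFZERO(CollectIntervals)) (FORMAT '9999.99') AS AvgMailBoxDepth
  ,InUseMax
FROM DBC.ResUsageSawt
ORDER BY WorkNew DESC;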
 

Discussion
monisiqbal 17 comments Joined 07/09
08 Dec 2009

In your example, the value of workdone represents the work done in 60 seconds. Right?

carrie 595 comments Joined 04/08
09 Dec 2009

There are a couple of different examples of ResUsageSAWT output in the original text. Let’s consider them separately.

The first example, under the heading The CollectIntervals Column, represents the AMP worker task in-use counts at the end of a 60-second logging and collect interval (where the two intervals are equal). It is not really representing the work done in 60 seconds; rather, it is a snapshot of the in-use counts (the number of AMP worker tasks that are being used) at the end of the collect interval.

The other 2 examples, under the heading Real-World Example Before and After the SQL Adjustment, show a case where the logging interval is longer than the collect interval. In those cases, the in-use counts have undergone a division and represent an average of the in-use counts across all collect intervals contained within the logging interval. That is why they carry an odd-looking decimal place.

Thank you for your question. I hope my response is helpful in answering it.

monisiqbal 17 comments Joined 07/09
09 Dec 2009

Carrie, thanks for explaining both the examples.
Just to be clear, in the Real-World example, the value of WorkDone=140 corresponds to 600 seconds.
And in the SQL Adjustment example, WorkDone=14.00 represents the work done for 60 seconds.

Please correct me if I'm wrong.

Much thanks.

carrie 595 comments Joined 04/08
10 Dec 2009

Almost, but not quite, so let me try to clear this up for you. (I think you meant "WorkOne", right?)

"WorkOne=140" represents the sum of all AMP worker tasks in-use counts for each collect interval within the logging interval. The collect interval is 60 seconds, the logging interval is 600 seconds. A snapshot of how many AMP worker tasks are being used by that work type (WorkOne in this case) at the end of each collect interval is taken. With each succeeding 60-second collect interval, that in-use count number grows, and what appears in the SAWT output is that sum at the end of the logging interval of all those different point-in-time snapshots. This is not "work accomplished" or "CPU time," it is a number of in-use count snapshots accumulated over time.

In the adjusted example, "WorkOne = 14.0" represents an average in-use count for that work type across the 600-second logging interval. It does not represent work done. It represents the average number of AMP worker tasks that were being used by that work type over the entire logging interval.

Thanks, -Carrie

monisiqbal 17 comments Joined 07/09
10 Dec 2009

Got it! This was really helpful, cleared up the confusion. (And yes, I meant "WorkOne")

Thanks again.

vincent91 20 comments Joined 02/10
09 Feb 2010

thank you for all your contributions, they really help me.
Can anyone explain to me (in simple words :=)) the meaning of MSGWORKNEW and MSGWORKONE?

The description in Chapter 7 of TUDoc is not really clear for a beginner like me

thanks again

carrie 595 comments Joined 04/08
10 Feb 2010

I can try to clarify that for you...

Messages arriving on the AMP that require an AMP worker task (AWT) to process the work they contain are assigned one of several different “types”. These message types categorize these arriving messages based on the importance of the work they will do, and on how much they contribute to completing work that is already underway in the database.

WorkNew (sometimes known as MSGWORKNEW or WorkType00) is a type of message that is sent from the dispatcher and represents a new piece of user-initiated work, such as a step in a query plan. The message could specify to start a table scan on the Customer table, for example.

WorkOne (sometimes known as MSGWORKONE or WorkType01) is a type of message that is sent from an already executing WorkNew AMP worker task, and often represents what is referred to as spawned work. If the query step that is performing a table scan of the Customer table requires a row redistribution of the rows being accessed by the scan (for example, to prepare for a join in a subsequent step), then a WorkOne message is spawned (not by the dispatcher, but by one of the WorkNew AWTs working on that same step) and is sent to all AMPs with instructions to acquire a second AMP worker task on behalf of the same query step.

Two AMP worker tasks are often required within one step for things such as row redistribution, table duplication, or global aggregation. In the case of row redistribution, the WorkNew AWT does the scanning and sends rows to their new AMPs, while a WorkOne AWT receives the redistributed rows on each AMP and writes them out to spool, possibly sorting them locally first. So in this case each AMP will have both a sender AWT (WorkNew) and a receiver AWT (WorkOne) working on behalf of the query step.

vincent91 20 comments Joined 02/10
11 Feb 2010

Thanks Carrie for your light. I learnt a lot of things with your blog. It's a gold mine for me.

geethareddy 145 comments Joined 10/11
01 Jun 2012

Hi Carrie,
we recently migrated from Sybase (SB) to Teradata (TD); right now we are on V14.
We started loading data into TD using TPT Load/Update operators from SB to TD. I am trying to generate AWT reports on the following, about which I have a few questions and concerns.

By the end of each day, we would like to see how many AWTs were consumed by each load. I heard I need to sum WorkNew & WorkOne, but I don't think this is the right equation to calculate AWTs for each load. Can you confirm which work types I need to take into consideration?

Also, I was wondering about the field CollectIntervals, which is not found in V14. After some research I realized that it was removed in V14. So does that mean we no longer need to divide by CollectIntervals?

Thanks,
Geeta

carrie 595 comments Joined 04/08
04 Jun 2012

The number of AWTs used by load jobs is easy to predict, and you don't need to look at ResUsage data. For each load job, here is what will be used:

FastLoad or TPT Load Operator:
Phase I: 3 AWTs per AMP (1 sender, 2 receivers)
Phase II: 1 AWT/AMP

MultiLoad or TPT Update Operator:
Acquisition Phase: 2 AWTs per AMP (1 sender, 1 receiver)
Apply Phase: 1 AWT/AMP

Load jobs use the WorkNew and WorkOne work types. But summing up WorkNew and WorkOne will give you all the AMP worker tasks used by all user-initiated work. Plus, what ResUsageSAWT reports is a point-in-time value for each logging interval, so if a load job runs longer than your ResUsage logging interval, you would be double-counting (or more), if your goal is to find out how many AWTs were used by a single load job.

You are correct that CollectIntervals no longer appears in the ResUsageSAWT table, so there is no longer a need to divide by that column for fields like MailBoxDepth or in-use counts. That division is automatically done for you. What you get reported is an average across all the collect intervals that make up the logging interval.

Thanks, -Carrie

geethareddy 145 comments Joined 10/11
05 Jun 2012

Thank You Carrie

Thanks,
Geeta

vasudev 35 comments Joined 12/12
31 Dec 2012

Hi Carrie,

I want to know the general usage of the DBC.ResUsageSAWT table. When I looked at the column details, it contains a lot of columns. Is there any way we can identify whether a user is out of AWTs using the DBC.ResUsageSAWT table? By using DBC.ResUsageSAWT can we identify the spool space used by a user? Please clarify.

Thanks in advance
Ganapathy

carrie 595 comments Joined 04/08
03 Jan 2013

Ganapathy,

The ResUsageSAWT table is primarily useful for assessing AMP worker task metrics by AMP. It provides AWT in-use counts and maximum in-use counts across the logging interval by work type for each AMP in the configuration, so you can see how close you are to running out of AWTs. It also provides information on flow control, how many times each AMP entered the state of flow control, for example.
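For example, a minimal sketch along these lines (using only columns that appear earlier in this article; the date filter is just illustrative) shows how close each AMP came to its AWT limit on a given day:

SELECT
   NodeId
  ,VprId
  ,MAX(InUseMax) AS PeakAWTsInUse
FROM DBC.ResUsageSawt
WHERE TheDate = DATE '2013-01-02'
GROUP BY 1, 2
ORDER BY PeakAWTsInUse DESC;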

The user does not run out of AMP worker tasks; it is the AMP that may run out of AMP worker tasks. When that happens, a user's query may have to wait until one or more AWTs are freed up by other queries. There is no mechanism in the SAWT table to see which user was using how many AWTs, or whether a user's queries were delayed by waiting for an AMP worker task. However, a different table, the ResUsageSPS table, can provide AWT in-use count information by workload. See the documentation on the ResUsageSPS table.

The ResUsageSAWT table is only concerned with the flow of work on the AMPs, in terms of AMP worker task usage and flow control metrics. It does not record anything about a user's use of spool space.

There is an orange book on AMP Worker Tasks that goes into more detail on this topic, if you or anyone you work with have access to the orange book repository.

Thanks, -Carrie

suhailmemon84 64 comments Joined 09/10
18 May 2013

Hi Carrie,
why are the AWT requirements for the first phases of a FastLoad vs. an MLOAD different?
FastLoad or TPT Load Operator:
Phase I: 3 AWTs per AMP (1 sender, 2 receivers)
Phase II: 1 AWT/AMP
MultiLoad or TPT Update Operator:
Acquisition Phase: 2 AWTs per AMP (1 sender, 1 receiver)
Apply Phase: 1 AWT/AMP
In other words, why does a FastLoad have 1 sender and 2 receiver AWTs per AMP, while an MLOAD has 1 sender and 1 receiver per AMP?
Regards,
Suhail
 

carrie 595 comments Joined 04/08
21 May 2013

Suhail,
 
FastLoad has two receiver tasks (a WorkOne and a WorkTwo) while MultiLoad has only one (WorkOne) during the acquisition phase. This is because FastLoad uses a pre-sort approach when initially appending the rows to the table, whereas MultiLoad appends the rows directly to the work table without any pre-sorting required. The pre-sort approach used by FastLoad has proven to be very efficient for that utility, but in the past it contributed to the receiver AWTs sometimes falling behind, which is why a second receiver AWT was added. I cannot tell you why MultiLoad did not exhibit this same issue with only one receiver, but I would assume it is related to differences in how the two utilities were architected.
 
Thanks, -Carrie

suhailmemon84 64 comments Joined 09/10
22 May 2013

Thank you for this info. I learned something today.
Regards,
Suhail

brokenp87 2 comments Joined 05/12
04 Mar 2016

One question: when the value of TheTime is '04:10:00.00', does all the data regarding AWTs refer to the interval from 04:00 to 04:10, or to the interval from 04:10 to 04:20?

carrie 595 comments Joined 04/08
07 Mar 2016

TheTime is defined in the Resource Usage Macros and Tables  manual as:  Nominal time of the log entry.
 
Since the log entry is not made until the end of the interval, this means TheTime represents the end of the logging interval, which is your first assumption: when the value of TheTime is '04:10:00.00', all the data regarding AWTs refers to the interval from 04:00 to 04:10.
 
Thanks, -Carrie
