Attachment: CollectStatsQueue.txt (13.25 KB)

When you start designing a statistics collection process, two topics come up immediately:

  • stats on some tables are more important than others.
  • some stats on a single table are more important than others.

Both problems involve some kind of prioritization process, which finally leads to a sorted list of COLLECT STATS statements (see Marcio Moura's blog "When is the right time to refresh statistics?").

Processing this list (using a BTEQ script or a cursor in a Stored Procedure) could lead to another problem:

  • only a limited time frame for collection is available.

Therefore you might have to stop the process before all those stats are collected.

But now you have to take care that the remaining stats will be collected during the next run (and probably with a higher priority).

Introducing the queue table

Queue tables were introduced in V2R6, but they are still one of Teradata's little-known features. Their main purpose is to allow asynchronous event processing in a FIFO manner, and this perfectly matches our requirements:

  • When the process stops, the remaining rows are still in the queue and will be processed first when the next run starts.
  • More statistics might be collected in the same period when multiple processes consume rows from the queue (this is really hard to implement without a queue - how to decide which stats should be collected by which process?).
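Consuming from a queue table retrieves and removes the oldest row in a single atomic operation, which is what makes multiple parallel consumers safe. A minimal sketch (CollectStatsProc presumably does something like this internally; the exact statements are in the attached script):

```sql
-- Blocks until a row is available, then returns and removes
-- the row with the oldest QITS (FIFO consume):
SELECT AND CONSUME TOP 1 * FROM CollectStatsQueue;

-- A plain SELECT browses the queue without removing any rows:
SELECT * FROM CollectStatsQueue ORDER BY QITS;
```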

As an additional benefit, ETL jobs requiring some final COLLECT STATS steps might finish earlier when those statements can run asynchronously via the queue.

The attached script creates this queue table (CollectStatsQueue) plus two log tables (CollectStatsLog and CollectStatsLogDetails), a Stored Procedure (CollectStatsProc) to process the queue, and a macro (StopQueue) to stop the processing. The syntax is based on TD13.0 running in Teradata mode; older releases would require some modifications.


This is the main driver table for the collection process, a COLLECT STATS is triggered for each inserted row.
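The actual DDL is in the attached script; as a rough sketch, a queue table matching the column list used later in this article might look like the following (all data types and the primary index are assumptions; the fixed rule is that the first column of a queue table must be the QITS timestamp defaulting to CURRENT_TIMESTAMP):

```sql
CREATE SET TABLE CollectStatsQueue, QUEUE
 ( QITS         TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
   DatabaseName VARCHAR(30)  NOT NULL,
   TableName    VARCHAR(30)  NOT NULL,
   IndexName    VARCHAR(30),
   ColumnName   VARCHAR(1000),
   CollectLevel CHAR(1)      NOT NULL,
   SampleSize   BYTEINT
 )
PRIMARY INDEX (DatabaseName, TableName);
```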


When your current process results in an ordered list, this order of execution is maintained by setting the queue's "Query Insertion Timestamp" (QITS) column to

 CURRENT_TIMESTAMP + (ROW_NUMBER() OVER (ORDER BY whatever_determines_the_order) * INTERVAL '0.000001' SECOND)

For a single-row insert, a higher or lower priority might be assigned by simply adjusting this value, e.g.
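For example, a stat that should jump ahead of everything currently queued can be inserted with a QITS in the past (the database, table, and column names here are made up):

```sql
-- High priority: a QITS one hour in the past puts this row
-- at the front of the FIFO queue
INSERT INTO CollectStatsQueue
 (QITS, DatabaseName, TableName, IndexName, ColumnName, CollectLevel, SampleSize)
VALUES
 (CURRENT_TIMESTAMP - INTERVAL '1' HOUR,
  'MyDB', 'MyTable', '', 'MyColumn', 'C', NULL);
```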





  • IndexName: optional; when supplied it is used instead of the column list.

  • ColumnName: comma-delimited list of column names (for column and index level stats).

  • CollectLevel: collect on T (Table), C (Column), or I (Index) level.

  • SampleSize: uses SAMPLE when set to any non-NULL value.

Records info about every run of the CollectStatsProc including the number of statistics collected and the number of errors during the collect.


Records info for every failing (optionally also successful) COLLECT STATS.


This Stored Procedure actually submits the COLLECT STATS for each row in the queue


IN MaxTS

The process will stop when a COLLECT finishes after MaxTS or when all rows in the queue are processed.

If MaxTS is set to NULL the process will run until the StopQueue macro is executed.

IN LogDetails

If set to 'Y', info about each processed row is written to CollectStatsLogDetails; otherwise only failed collects are logged.

IN MonitorInterval

Updates the CollectStatsLog table every MonitorInterval minutes with info about the number of statistics successfully collected or failed.


SQL error code: when there's any error outside of the actual COLLECT statement, the process stops.


Result of the execution: Success/Warning/Failed plus additional info.


Macro to stop all SPs processing the queue: invoke EXEC StopQueue; from any session.

Usage examples

CALL CollectStatsProc(CURRENT_TIMESTAMP(2)+INTERVAL '1' HOUR,'Y',5, rc, msg);

Run for up to one hour, log all rows in the CollectStatsLogDetails table and update the CollectStatsLog table every 5 minutes.

Can be stopped earlier by simply running EXEC StopQueue; from any session.

CALL CollectStatsProc(NULL,'N',10, rc, msg);

Run until StopQueue is executed, log only failed collects in the CollectStatsLogDetails table and update the CollectStatsLog table every 10 minutes.

Running in parallel

As the CollectStatsProc will probably be CALLed in a BTEQ script, it's easy to run multiple copies:

CALL CollectStatsProc...

Some remarks on running multiple instances of CollectStatsProc:

  • Collecting multiple stats on a small table using one row per index/column might result in a deadlock (when the stats data is merged into the system tables); it's therefore better to collect those stats on table level.
  • As stats collection is quite resource-intensive, you should check whether additional resources are available; multiple instances are mainly suitable for running in a dedicated batch window.
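So instead of queueing one row per column/index of a small table, a single table-level row refreshes all of its stats in one statement (names made up):

```sql
-- CollectLevel 'T': one COLLECT STATS ON MyDB.SmallTable recollects
-- all existing stats of the table, avoiding per-column deadlocks
INSERT INTO CollectStatsQueue
 (QITS, DatabaseName, TableName, IndexName, ColumnName, CollectLevel, SampleSize)
VALUES
 (CURRENT_TIMESTAMP, 'MyDB', 'SmallTable', '', '', 'T', NULL);
```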

Sample script

The following example to recollect all existing stats within a database is based on my StatsInfo view:

INSERT INTO CollectStatsQueue
 ( QITS, DatabaseName, TableName, IndexName, ColumnName, CollectLevel, SampleSize)
SELECT * FROM
 (
  SELECT
       CURRENT_TIMESTAMP
       + (ROW_NUMBER() OVER (ORDER BY DatabaseName, TableName) * INTERVAL '0.000001' SECOND) AS QITS
      ,DatabaseName
      ,TableName
      ,CASE WHEN CollectLevel = 'T' THEN '' ELSE IndexName END AS IndexName
      ,CASE WHEN CollectLevel = 'T' THEN '' ELSE ColumnName END AS ColumnName
      ,CASE -- collect on table level for small tables
          WHEN COUNT(*) OVER (PARTITION BY TableId) > 1
             AND (NumRows < 100000
                  OR SUM(CollectDuration) OVER (PARTITION BY TableId) < INTERVAL '0:10' MINUTE TO SECOND)
             THEN 'T'
          WHEN StatsType IN ('Col', 'MCol', 'Part') THEN 'C'
          ELSE 'I'
       END AS CollectLevel
      ,SampleSize
    FROM StatsInfo
    WHERE TableType <> 'TempTbl'
      AND MissingStats = 'N'
      AND DatabaseName = ...
 ) AS dt;

CALL CollectStatsProc(CURRENT_TIMESTAMP(2)+INTERVAL '10' MINUTE,'Y',1, rc, msg);

SELECT * FROM CollectStatsLog ORDER BY StartTS;

SELECT * FROM CollectStatsLogDetails ORDER BY StartTS;

Please provide any kind of feedback (especially about performance when running multiple instances).

ulrich 51 comments Joined 09/09
23 Oct 2012

Very nice idea! Thanks for sharing


CHERRY_UZZE 1 comment Joined 07/11
25 Nov 2012

Thanks for sharing, but unfortunately I'm not finding any attachment here.

dnoeth 86 comments Joined 11/04
25 Nov 2012

Strange, there should be an attachment area right next to the headline of this blog entry. You could try to empty the browser cache.

Otherwise this is the direct link:



samp6050 1 comment Joined 06/08
05 Dec 2012

Great post Dieter. Thank you for sharing!


Abhishektd 2 comments Joined 06/11
11 Dec 2012

Thanks for sharing.

Thanks & regards,

18 Dec 2012


I am facing an issue running a Teradata procedure, getting error code 3807.

This procedure is called via an Informatica mapping and it is a monthly process; suddenly it is having an issue inserting into a table through the proc.

Until last month this was running fine.

Please help me in resolving this issue.

Sasikala V

Abhishektd 2 comments Joined 06/11
18 Dec 2012

Verify that the procedure and the table are qualified with the database name; this may be one of the reasons behind the issue.

Thanks & regards,

ahnajeeb 2 comments Joined 03/11
07 Feb 2013

Nice idea - we can also use other parameters. I have been able to get very good results considering other factors like:
- Update priority based on frequency of use (partitions, PI and PPI can get special attention)
- Consider table- vs. column-level stats collection.
- Set a max limit on CPU to be used by the stats process.
- Increment the priority of leftovers to avoid starvation.
- Split the tables using the MOD function to run N parallel threads based on the state of the system.



mikesteeves 3 comments Joined 10/11
14 Nov 2013

Excellent demonstration of a practical use of a Queue table, exactly what I've been looking for.
Thanks for the informative post Dieter.

david clough 4 comments Joined 12/09
23 Dec 2013

Like the idea of the Stop Queue Macro, Dieter.
Think I'll take your concept and use it with my 'mixed workload' application that I'm currently building. 
Probably what I'll do is allow the Stored Procedure - the one which does the Consuming - to run 'forever',  unless we decide (either by choice, or by periodic necessity) to Stop the Queue.
So, thank you.

Raja_KT 10 comments Joined 07/09
30 Jan 2014

Yes, the Stop Queue Macro!!!!! amazing. I was in a different world before this :).

Raja K Thaw
My wiki:
Street Children suffer not by their fault. We can help them if we want.

ktewksbury 5 comments Joined 09/09
15 Dec 2015

Are there any recommended changes to this process due to 14.10 new features? Especially around the area of collecting stats at "table level" with the new syntax, i.e. collect stats column (col1), column (col2), index (column 6) on table...

dnoeth 86 comments Joined 11/04
21 Dec 2015

Hi Kevin,
sorry for the late answer, I overlooked your comment...
This works as-is in 14.10, but I didn't find a way to combine multiple related stats (e.g. (col1), (col2), (col1,col2)) into one statement to get the benefits of the new optimization.


ktewksbury 5 comments Joined 09/09
05 Jan 2016
   CURRENT_TIMESTAMP + (ROW_NUMBER() OVER (ORDER BY databasename, tablename) * INTERVAL '0.000001' SECOND)
      ,CASE WHEN CollectLevel = 'T' THEN '' ELSE IndexName END AS IndexName
      ,CASE WHEN CollectLevel = 'T' THEN '' ELSE ColumnName END AS ColumnName
      ,CASE -- collect on table level for small tables
          WHEN COUNT(*) OVER (PARTITION BY TableId) > 1
             AND (NumRows < 100000
                  OR SUM(CollectDuration) OVER (PARTITION BY TableId) < INTERVAL '0:10' MINUTE TO SECOND)
             THEN 'T'
          WHEN StatsType IN ('Col', 'MCol', 'Part') THEN 'C'
          ELSE 'I'
       END AS CollectLevel
   WHERE TableType <> 'TempTbl'
   AND MissingStats= 'N'
   AND DatabaseName = ....
    ) AS dt

Thanks Dieter!  No Problem! 
I was just testing the above query for the insert into the queue table, but it fails with multiple columns not existing in the StatsInfo view. The view here at this site might have been modified or was given to them by Teradata recently; I am not sure as I just started a few months ago. Anyway, the columns that are missing are NumRows, CollectDuration and MissingStats. Do you have the original StatsInfo view?
Also, they told me that Teradata had told them that it was better to collect at the table level (it was faster) than collecting at column level.  True???

ktewksbury 5 comments Joined 09/09
05 Jan 2016

Sorry, also samplesize column was missing from the statsinfo view as well

ktewksbury 5 comments Joined 09/09
05 Jan 2016

Dieter, I just found your new documents on the StatsInfo view (from the above link), but with these changes how would you implement a change to the insert into the queue table?
Thanks again!

dnoeth 86 comments Joined 11/04
05 Jan 2016

Hi Kevin,
this queue approach was based on TD13/13.10, but my StatsInfo query changed in TD14; I removed some columns, see New StatsInfo query for TD14 for more details.
For TD14 it is actually recommended to collect multiple stats in a single statement. Not necessarily on table level, but those which result in a more efficient collection (that's why I wrote I don't know how to find those which benefit from combining them):
New opportunities for statistics collection in Teradata 14.
And in TD14.10 you should look at AutoStats instead, which really simplifies collection:
Easing Into Using the New AutoStats Feature


ktewksbury 5 comments Joined 09/09
05 Jan 2016

Thanks Dieter!
