Rajesh.Prabhu 16 posts Joined 08/06
22 Sep 2010
Session and checkpoint considerations in TTUs and TPT

I am trying to explore the usage of SESSIONS and CHECKPOINT in Teradata FastExport/FastLoad/MultiLoad and TPT (Teradata Parallel Transporter).
If we don't specify the SESSIONS parameter in the TTUs, it defaults to one session per AMP, so a 100-AMP system gets 100 sessions by default, which can be expensive for a small table.

But my question is: what factors/criteria do we need to consider when using the SESSIONS and CHECKPOINT parameters in TTUs and TPT? Both impact the performance of exports and loads.

If I want to load a table with 1 million records, or with 100 million records, what best practices/considerations should I use to define the number of SESSIONS and CHECKPOINTs?

I appreciate your valuable input, especially from DNoeth and Feinholz.
:-) Danke/Thanks

ratchetandclank 49 posts Joined 01/08
23 Sep 2010

DNoeth and Feinholz... Please respond..

feinholz 1234 posts Joined 05/08
23 Sep 2010

Checkpointing:

The basic rule of thumb for checkpointing is this: how much time are you willing to lose in a batch job if the DBS should encounter a restart?

If you have a job that should run for, say, one hour, is it acceptable to checkpoint every 15 minutes? If the DBS should encounter a restart, that is the amount of data that would need to be resent to Teradata when the job resumes after the restart.

Depending on the row size and the utility, 1 million rows can usually be loaded in under a minute. Therefore, it is probably not necessary to enable checkpointing if loading 1 million rows.

And again, depending on row size and utility, 100 million rows may take 10-20 minutes (could be more, could be less). Even checkpointing every 1 million rows might be too often.

Much of the decision about checkpointing comes down to mapping the time you are willing to lose (for resubmission of the rows since the last checkpoint) against the number of rows per minute that can be loaded, so that you can determine the checkpoint value.

FastLoad's checkpointing is row-based. MultiLoad supports both row-based checkpoints and time-based checkpoints.
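
For reference, the checkpoint is set right in the load script; here is a minimal sketch, where the table names, error tables, and row counts are made-up values for illustration:

    /* FastLoad: CHECKPOINT is a row count on BEGIN LOADING.           */
    /* Example: checkpoint every 5,000,000 rows, i.e. about one minute */
    /* of work at a hypothetical 5M rows/minute load rate.             */
    SESSIONS 30;
    BEGIN LOADING mydb.target_table
        ERRORFILES mydb.target_et, mydb.target_uv
        CHECKPOINT 5000000;

    /* MultiLoad: a CHECKPOINT value of 60 or less is minutes; above 60 it is a row count. */
    .BEGIN IMPORT MLOAD TABLES mydb.target_table
        SESSIONS 20
        CHECKPOINT 15;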

TPT checkpointing is purely time-based (expressed in terms of seconds).
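
With TPT the interval goes on the tbuild command line; for example (the script and job names below are placeholders):

    # Take a checkpoint every 600 seconds (10 minutes)
    tbuild -f load_job.tpt -z 600 my_load_job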

Session Usage:

The number of sessions to use depends on several factors; the network speed between your load server and Teradata and the amount of data to be loaded are a few. I would think that you can get pretty good performance with about 30 sessions. You could try more, but at a certain point some sessions will just not be used that much if the utility can process the rows fast enough and the network speed is not a bottleneck.

Most of our utilities support some type of config file where you can set the "max sessions" so that the jobs do not try to connect 1 per AMP. These config files should be documented in the respective reference manuals.

--SteveF

Rajesh.Prabhu 16 posts Joined 08/06
27 Sep 2010

Thanks Feinholz, that's a really informative note. And thanks to ratchetandclank for the redirection as well.

In one of the forums I read that TPT can be used to load a set of files in a directory. We have a limit of 15 files for FEXP and FLOAD, as these TTUs work well with at most 15 loads at a time.

How can we achieve loading more than 15 files using TPT?
Is there any bottleneck or performance issue if we load a directory containing 30 to 40 files using TPT?
If TPT checkpointing is time-based, what is the rule of thumb for the checkpoint interval for 100M records on an approximately 400-AMP system?
What is the relation between SESSIONS and INSTANCES in TPT? What rule of thumb can we use for setting these attributes in the above situation?

I am currently evaluating TPT; please provide your expert advice and thoughts.

Thanks in advance to everyone reading and writing on this forum!

ratchetandclank 49 posts Joined 01/08
27 Sep 2010

Yes, you can load more than 15 files using TPT. You have to use a wildcard in the FileName attribute of the DataConnector operator, like "VARCHAR FileName = '*.dat'", and the DirectoryPath attribute will point to the directory where the files are stored.
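
Something like this sketch of a DataConnector producer (the directory path, delimiter, and schema name are just placeholders):

    DEFINE OPERATOR FILE_READER
    TYPE DATACONNECTOR PRODUCER
    SCHEMA INPUT_SCHEMA                              /* schema defined elsewhere in the job */
    ATTRIBUTES
    (
        VARCHAR DirectoryPath = '/data/incoming/',   /* directory holding the files */
        VARCHAR FileName      = '*.dat',             /* wildcard picks up every .dat file */
        VARCHAR Format        = 'Delimited',
        VARCHAR TextDelimiter = '|',
        VARCHAR OpenMode      = 'Read'
    );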

SESSIONS is the number of sessions connected from the client to the Teradata Database. INSTANCES is the number of instances of the operator. For example, if you are running against a database with 100 AMPs and specify 4 instances of the LOAD/UPDATE operator, it will spawn 4 LOAD/UPDATE operator instances, and each instance will connect 25 sessions (since it is a 100-AMP system).
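
A rough sketch of how that looks in a script (credentials, table names, and the column list are placeholders; FILE_READER is the DataConnector producer sketched above):

    DEFINE OPERATOR LOAD_OPERATOR
    TYPE LOAD
    SCHEMA *                                         /* consumer takes the producer's schema */
    ATTRIBUTES
    (
        VARCHAR TdpId        = 'mysystem',
        VARCHAR UserName     = 'myuser',
        VARCHAR UserPassword = 'mypassword',
        VARCHAR TargetTable  = 'mydb.target_table',
        VARCHAR LogTable     = 'mydb.target_log',
        INTEGER MaxSessions  = 100,                  /* total sessions shared by all instances */
        INTEGER MinSessions  = 1
    );

    APPLY ('INSERT INTO mydb.target_table (:col1, :col2);')
    TO OPERATOR (LOAD_OPERATOR[4])                   /* 4 consumer instances -> roughly 25 sessions each */
    SELECT * FROM OPERATOR (FILE_READER[2]);         /* 2 producer instances reading the files */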

Rajesh.Prabhu 16 posts Joined 08/06
29 Sep 2010

Ratchetandclank, that means if I have a single instance of the Export operator on a 100-AMP system, it will utilize 100 sessions by default. TPT claims this is the fastest in terms of throughput versus the generic utilities.

When I ran a generic FastExport for 1M records on a 100-AMP system with 4 sessions it took 33 seconds, and with 12 sessions it took roughly a minute. It does not require that many sessions for 1 million records, so it works well with fewer sessions.

But when I ran TPT-EXPORT for the same amount of data/AMPs with the default of ALL sessions, it was executing faster than (MaxSessions 4 to 10) sessions takes more execution time. I don't know whether I am missing some other attribute that should be initialized (set) in the TPT script.

Currently my requirement is to test throughput in terms of performance and minimum CPU/IO utilization.

1M is a small amount of data that I am currently testing with, but we will be testing with well over 100M records. When I was going through the TPT forum and white papers I was impressed, but I am not getting the required results right now. Is there anything I am missing?

Please help!

ratchetandclank 49 posts Joined 01/08
30 Sep 2010

I did not understand this statement:

it was executing faster than (MaxSessions 4 to 10) sessions takes more execution time

Did you find better performance, time-wise?

The CPU utilization will be higher with TPT compared to the legacy FastExport, because there are a lot of other things that TPT needs to do.

There are no attributes in the Export operator that direct TPT to increase performance as such. Increasing the number of instances would help, because there will be two processes sending the records instead of a single process. However, it will have an effect on CPU utilization. We need to find a balance; that will depend on the power of the client system we are using.
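
For example (operator names are placeholders, and the DataConnector file-writer consumer is assumed to be defined elsewhere in the script), going from one Export instance to two is just a change in the brackets:

    APPLY TO OPERATOR (FILE_WRITER[1])               /* one DataConnector consumer writing the file */
    SELECT * FROM OPERATOR (EXPORT_OPERATOR[2]);     /* two Export instances share the sessions */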

Rajesh.Prabhu 16 posts Joined 08/06
30 Sep 2010

I apologize for my statement.

Well, I was not getting better performance using TPT-Export; here are the numbers:

Traditional TTU-FEXP: 1M records / 100-AMP system / 4-10 sessions = execution time 30 sec.

TPT-EXPORT: 1M records / 100-AMP system / 4-10 sessions = execution time more than 3 min. But if I specify more sessions, up to 40-50, the execution time is better.

With the traditional TTU, reducing the number of sessions gives better performance,
whereas in TPT, reducing the number of sessions gives worse performance.
I am not finding a rule of thumb for defining sessions in TPT.

Hope my statement is clearer now!

As you said, increasing the number of INSTANCES will give better performance; can you share some sample code?

ratchetandclank 49 posts Joined 01/08
04 Oct 2010

No need to apologize.

The legacy FastExport writes to a file. With TPT, are you using the DataConnector operator as the consumer? I still find it hard to believe that the Export operator is taking more time than FastExport to export the records.

Might require Feinholz's assistance.

Rajesh.Prabhu 16 posts Joined 08/06
13 Oct 2010

I think there is "NO rule of thumb" for defining sessions or instances in TPT. I came to this conclusion after running a set of TPT scripts, and I would like to share my experience with you all.

TPT-EXPORT
1st Run:
1) Run the EXPORT with all available sessions (1 per AMP), with 1 producer and 1 consumer instance.
2) Note the job execution time.

2nd Run:
1) Run the EXPORT with all available max sessions (no change in MaxSessions), with 2 producer and 1 consumer instances.
2) Check whether the producers exported a balanced number of rows, and compare each producer instance's CPU time. If the work is balanced across both instances, then increase the number of producer instances.
Note: try to keep the number of instances such that it divides evenly into the number of available AMPs.
3) Similarly, keep increasing the producer instances step by step until your job gives the expected execution time.

3rd Run:
1) Once you have identified the number of producer instances required for your export job, try to reduce MaxSessions, because not all sessions are required for every TPT job. So try reducing the number of sessions in steps tied to the number of available AMPs and see if your job runs faster. (A sketch of how to parameterize these runs follows below.)
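
To avoid editing the script for every run, I pass the values as job variables; here is a sketch, where the variable names ExpSessions and ExpInstances are my own choices, not required names:

    /* In the Export operator definition, reference job variables: */
    INTEGER MaxSessions = @ExpSessions

    /* ...and in the APPLY statement: */
    APPLY TO OPERATOR (FILE_WRITER[1])
    SELECT * FROM OPERATOR (EXPORT_OPERATOR[@ExpInstances]);

Then each run only changes the command line:

    tbuild -f export_job.tpt -u "ExpSessions=100, ExpInstances=1" run1
    tbuild -f export_job.tpt -u "ExpSessions=100, ExpInstances=2" run2
    tbuild -f export_job.tpt -u "ExpSessions=32,  ExpInstances=2" run3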

You need to do a similar balancing of instances and sessions when you run TPT-LOAD/UPDATE.

With the above considerations I got better results with TPT-Export than with FastExport.

For the traditional utilities (FEXP/FLOAD) we have rules of thumb for sessions and checkpoints based on the size of the data.

In TPT it is a bit more difficult to come up with a rule of thumb for identifying the number of sessions/instances.

Now the question of which tool is better for extract and load is one each reader can decide.

Thanks to all the viewers and experts who showed their interest in this thread.

feinholz 1234 posts Joined 05/08
14 Oct 2010

There is no Rule Of Thumb, unfortunately. And that is just the nature of the beast when dealing with parallel processing. There are a lot of factors (network speed, CPU bandwidth, I/O bandwidth, etc.).

I am still trying to obtain more information which might help to explain the differences in performance we see between FastExport and the Export operator.

For FastExport, the default of 4 sessions was determined to be the optimal number, and I believe for many scenarios it still holds true. Once the SELECT statement is issued and the rows are spooled, it does not take much to actually process the data. Without going into the inner workings of the Teradata Database with how the rows and blocks are distributed, suffice it to say that 4 sessions seems to be an optimal number for FastExport.

The Export operator, though, can take advantage of parallel processing (as long as CPU bandwidth is not a bottleneck) on the client side for extracting out the rows from within the blocks of data being returned from Teradata through the use of multiple instances. Thus, it is possible for a larger number of sessions to assist here.

However, like all things related to TPT, your mileage may vary. I have a report from someone that was running with TPT 8.2 that tried many combinations of sessions/instances and found that 32/8 was faster than 8/1 or 8/2, and even faster than 4/1, 16/4, 64/16, etc.

But 8.2 was a long time ago. We have made a lot of performance improvements in TPT since then, including the notion of "buffer mode processing" that enables the Export operator to buffer up the rows of data prior to sending the blocks to the Load operator (as opposed to the standard way of sending data from producer operator to consumer operator on a row-by-row basis).

So, without belaboring the point, the main message of this post is to indicate that every site, every job, every scenario, might lead each and every customer to a different set of settings for their job, and some of the fine tuning will involve trial and error.

In the 14.0 timeframe, we are attempting to execute more performance characterizations in the hopes of producing some type of report or list of recommendations. Hopefully, that project will be successful. Time will tell.

--SteveF

Rajesh.Prabhu 16 posts Joined 08/06
14 Oct 2010

Thanks Feinholz.
The reason for evaluating TPT is that I (my organization) am looking for a solution that is robust and gives better performance in terms of throughput, with less development and maintenance cost.

We have been using the traditional utilities (FEXP/FastLoad) for over 10 years, but I have read some promising things about TPT, such as:

1) Export and load can be processed in parallel, as compared to the standalone utilities (FEXP/FastLoad), where they happen serially.
2) A set of files in a directory can be processed (directory scan) with a single load script, with no limit on the number of files.
3) You can export and load using a single script, without landing the data in a file, which saves I/O and improves throughput (see the sketch after this list). Is there any limitation on the size of data that can be processed?
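
For example, as I understand it, point 3 above would look roughly like this single script (system names, credentials, tables, and columns are all placeholders):

    DEFINE JOB move_data_without_landing
    (
        DEFINE SCHEMA SOURCE_SCHEMA                  /* illustrative two-column schema */
        (
            col1 VARCHAR(20),
            col2 VARCHAR(20)
        );

        DEFINE OPERATOR EXPORT_OPERATOR
        TYPE EXPORT
        SCHEMA SOURCE_SCHEMA
        ATTRIBUTES
        (
            VARCHAR TdpId        = 'source_system',
            VARCHAR UserName     = 'src_user',
            VARCHAR UserPassword = 'src_pwd',
            VARCHAR SelectStmt   = 'SELECT col1, col2 FROM srcdb.src_table;',
            INTEGER MaxSessions  = 32
        );

        DEFINE OPERATOR LOAD_OPERATOR
        TYPE LOAD
        SCHEMA *
        ATTRIBUTES
        (
            VARCHAR TdpId        = 'target_system',
            VARCHAR UserName     = 'tgt_user',
            VARCHAR UserPassword = 'tgt_pwd',
            VARCHAR TargetTable  = 'tgtdb.tgt_table',
            VARCHAR LogTable     = 'tgtdb.tgt_log',
            INTEGER MaxSessions  = 32
        );

        APPLY ('INSERT INTO tgtdb.tgt_table (:col1, :col2);')
        TO OPERATOR (LOAD_OPERATOR[2])
        SELECT * FROM OPERATOR (EXPORT_OPERATOR[2]); /* rows flow over data streams, never a file */
    );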

So I thought, why not propose the newer TPT (earlier known as TWB), which can serve all our objectives.

I really like the TPT features, but my challenge has been defining the TPT attributes without any guidelines.
I need to run a single job several times with different MaxSessions/instance settings until I get better results. Without a proper set of guidelines for these attributes or parameters, designing and testing TPT will be a challenge.

I hope TD14 comes up with some good performance tricks and guidelines that satisfy most of these objectives.

One more thought: if a new version could show us the explain/execution plan of a TPT job, that would be an added advantage!

feinholz 1234 posts Joined 05/08
20 Oct 2010

I am not quite sure an official explain of the execution plan would necessarily help you. I believe our documentation has information in it regarding the amount of resources being used based on the number of instances you specify for each operator.

By default, each data stream uses 10MB of memory. There is a data stream that connects each producer instance to each consumer instance (even though they may all not be used).

So, if you have one instance of each operator, there is one data stream. If you use 2 instances of each operator, there are 4 data streams, etc.

Each instance of an operator is a separate process. The infrastructure also has several processes running in the background handling things like checkpoint timers and the log server.

As far as an execution plan goes, any other info would not be of use to you.

The performance characterization is being scoped for 14.0, I am not sure whether it will be committed. As always, it is a resource issue.

I think that once you get a handle on your environment, your network bandwidth, your CPU bandwidth, disk speed, etc., you will be able to figure out which configuration of producer and consumer instances will work best for your environment.

--SteveF

vincent91 14 posts Joined 02/10
31 Oct 2014

How can I manage the number of sessions used by the TPT Stream operator?

Unlike the TPT Load operator, standalone FastLoad, or standalone MultiLoad, we cannot define a session limit for TPT STREAM in Workload Designer (the Viewpoint portlet). Of course I can use the Max sessions/Min sessions attributes in each job, but it is tedious to control every job.
I'd like to manage the number of sessions used by the TPT Stream operator globally, as can be done for the TPT utilities under Workload Designer/Utility Sessions/Session.

Any ideas or suggestions on how to do that?
 
Thanks
 
 
