leo.issac 184 posts Joined 07/06
16 Feb 2009
TPT Vs other ETL Utilities

What is the key factor that makes TPT better than other load utilities? Given that it uses SQL-like language syntax but invokes the various load utility protocols, how is it similar to and different from the other load utilities? Does it really impact performance, and how?

ispaleny 11 posts Joined 12/04
18 Feb 2009

For TPT 12:

Some pros:
1. It is the new flagship of the Teradata ETL tools.
2. It should receive all new Teradata functionality.
3. It can read files in multiple streams (higher speed).
4. It uses an internal pipe mechanism (the load does not wait on extraction).
5. It can easily switch between Teradata loading tools.
6. The main component runs as a local service.
7. Multiple ETL steps can be combined in one job.

Some cons:
1. A compiled proprietary language with no code/error interaction makes debugging very hard.
2. A local, unmanaged checkpoint directory with thousands of files.
3. It needs a static column schema.
4. Text field sizes in the schema need to be multiplied by 2 or 3.
5. Low third-party ETL tool support.
6. Its very complex functionality delays a fully functional new version release by several months.
7. Not all Teradata SQL statements are supported.

An ideal TPT project is a high-volume, script-based load from large ASCII text files in the US. The worst TPT project is one driven by a third-party ETL tool using a local character set or Unicode. TPT 13 has some design changes, but I have not seen it yet.
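The multi-stream reading in pro #3 can be illustrated with a rough Python analogy (this is not TPT internals; the in-memory sources, row counts, and thread pool are illustrative stand-ins for reader instances splitting the work):

```python
# Illustrative analogy only (not TPT internals): several reader threads
# splitting the file-reading workload, as in pro #3 above.
from concurrent.futures import ThreadPoolExecutor
import io

def count_rows(source):
    # Each "reader instance" processes its own source independently.
    return sum(1 for _ in source)

# In-memory stand-ins for large ASCII source files.
sources = [io.StringIO("a\nb\n"), io.StringIO("c\n")]
with ThreadPoolExecutor(max_workers=2) as pool:
    totals = list(pool.map(count_rows, sources))
print(sum(totals))  # 3
```

With real files, each reader instance would be handed its own file (or file slice), so total throughput scales with the number of streams.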

leo.issac 184 posts Joined 07/06
23 Feb 2009

Thank you, your post was quite informative. Here is another question for you: talking about data streams, what exactly are they? Are they system memory, or a portion of the disk drive? How would performance be affected if we ran TPT on a client machine with low memory?

surish711 2 posts Joined 04/11
21 Apr 2011

I am also wondering what a data stream is (I am new to TPT). It should be some kind of memory, right? Could someone please shed some light on it?

feinholz 1234 posts Joined 05/08
21 Apr 2011

TPT "data streams" use internal shared memory.
If your system has low memory, then TPT might have difficulties.
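As a rough analogy only (not TPT's actual implementation), the idea of an in-memory data stream between a producer and a consumer can be sketched in Python with a bounded buffer; the loading side consumes rows as they arrive instead of waiting for extraction to finish:

```python
# Analogy only: a bounded in-memory buffer between producer and consumer,
# standing in for a TPT data stream. Row values are illustrative.
import queue
import threading

buf = queue.Queue(maxsize=4)  # bounded in-memory buffer ("data stream")

def extract():
    for row in ["r1", "r2", "r3"]:
        buf.put(row)   # producer blocks when the buffer is full
    buf.put(None)      # end-of-data marker

rows = []
producer = threading.Thread(target=extract)
producer.start()
while (row := buf.get()) is not None:
    rows.append(row)   # consumer loads rows as they arrive
producer.join()
print(rows)  # ['r1', 'r2', 'r3']
```

Because the buffer is bounded, a small buffer (or a low-memory client) throttles the producer rather than crashing, which is the trade-off low memory imposes.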

Some corrections to the above pro/con list:

TPT is not an ETL tool. It is a loading tool. Because our script language is "SQL-like", we do have the ability to perform some "minor" transformations and filtering, but TPT should not be thought of as an ETL tool.

We do not use pipes. We use shared memory.

I would not classify our checkpoint/restart capabilities as "local unmanaged checkpoint directory with thousands of files". When a job terminates successfully, we delete the checkpoint files. Therefore, there should only be a checkpoint file in the checkpoint directory for each job that is currently running, or for jobs that may have terminated abnormally. And if the user would like to delete these files, we provide commands to do so.
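For anyone who still wants to audit a checkpoint directory by hand, here is a hypothetical housekeeping sketch (the directory path and age threshold are assumptions on my part; TPT's own cleanup commands, mentioned above, are the supported route):

```python
# Hypothetical sketch: find checkpoint files not modified recently,
# which may indicate a job that terminated abnormally.
# The checkpoint directory path and age threshold are assumptions.
import os
import time

def stale_checkpoints(checkpoint_dir, max_age_s=86400):
    """Return names of files not modified within max_age_s seconds."""
    now = time.time()
    return [name for name in os.listdir(checkpoint_dir)
            if now - os.path.getmtime(os.path.join(checkpoint_dir, name)) > max_age_s]
```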

In 13.10 and 14.0 we are making great strides in improving the capabilities of determining the schema dynamically. We have also added enhancements to make the script language a lot less verbose.

When it comes to "3rd party ETL tool support", we actually have a lot of support. We provide an API interface to TPT, called TPTAPI, that is widely used by the likes of Ab Initio, DataStage, Informatica, etc.

For "con #6" I would say that we attempt to deliver patches (efixes) with new functionality as soon as we can get the features into the product. We do not always wait for new releases before providing new features.

Lastly, for "con #7" I would like to know to which SQL statements you are referring when you indicate we do not support all, and why you would want or need all of the SQL statements to be supported. For a loading tool, I am sure there are some SQL statements that just do not need to be supported.


vincent91 14 posts Joined 02/10
17 Nov 2014

Hello Feinholz,
A question about data streams and memory considerations.
We load Teradata by batch processing overnight.
The system is configured for 20 simultaneous loads; we do this through UTILITY LIMITS in Workload Designer.
I understand that the TPT data stream lives in client RAM.
How do I size the RAM on my (Linux) client to handle 20 simultaneous Teradata loads?
I have no idea of the RAM size my workload needs.
Thanks for your help

feinholz 1234 posts Joined 05/08
17 Nov 2014

Each load job is independent of the others, so it depends on your system memory availability and your virtual memory settings.
Each job, by default, will allocate 10MB of shared memory for the data streams.
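Using that default, a back-of-envelope sizing estimate looks like this (the per-job process overhead figure is my own illustrative assumption, not a TPT-documented number):

```python
# Back-of-envelope client RAM sizing for N simultaneous TPT jobs,
# assuming the 10 MB-per-job data stream default mentioned above.
DEFAULT_STREAM_MB = 10   # default data stream shared memory per job
OVERHEAD_MB = 30         # assumed per-job process overhead (illustrative)

def estimate_ram_mb(jobs, stream_mb=DEFAULT_STREAM_MB, overhead_mb=OVERHEAD_MB):
    """Rough total client RAM, in MB, for `jobs` simultaneous loads."""
    return jobs * (stream_mb + overhead_mb)

print(estimate_ram_mb(20))  # 800 (MB) for 20 simultaneous loads
```

The point of the sketch is only that requirements scale linearly with the number of simultaneous jobs; measure a single job on your own system to replace the assumed overhead figure.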


ratchetandclank 49 posts Joined 01/08
28 Nov 2014

vincent91, please check the "-h" option in TPT.
