indra91 16 posts Joined 06/16
18 Jul 2016
TDCH-TPT Interface--For loading into Hadoop

I need to ingest a volume of data from Teradata into Hadoop using TPT. I saw in the TPT documentation that this can be done using the TDCH-TPT interface. I would like to know the following about the process:

  1. Whether it follows the same process and extracts data block by block.
  2. Whether it utilizes all the nodes in the cluster while loading into Hadoop.
  3. Whether, in this case, TPT needs to be installed on all the nodes in the Hadoop cluster.
  4. For a single-table ingestion and export to Hadoop, whether both the read (Teradata) and the write (Hadoop) processes are multithreaded when using the TDCH-TPT interface.
feinholz 1234 posts Joined 05/08
19 Jul 2016

TPT uses TDCH by sending a command to the name node of the Hadoop cluster to execute TDCH, passing the needed information as command-line options.
 
If you are using the Export operator to extract from Teradata, then the data is processed block by block.
 
TDCH will use as many mappers as needed (or as indicated by the user).
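
To make the mapper setting concrete, here is a minimal sketch in Python of how a TDCH import command line might be assembled. It is an illustration only, not what TPT actually sends. The helper name build_tdch_import_command, the jar path, the host, and the table and path names are all hypothetical. The tool class and option names (-url, -username, -jobtype, -sourcetable, -targetpaths, -nummappers) are assumed from the TDCH command-line documentation and should be checked against your installed TDCH version.

```python
import shlex


def build_tdch_import_command(jar_path, jdbc_url, username,
                              source_table, target_path, num_mappers=None):
    """Assemble a TDCH import (Teradata -> HDFS) command as an argument list.

    Hypothetical helper: the class and option names are assumptions taken
    from the TDCH command-line documentation; verify them for your version.
    The password is deliberately left out of this sketch.
    """
    cmd = [
        "hadoop", "jar", jar_path,
        "com.teradata.connector.common.tool.ConnectorImportTool",
        "-url", jdbc_url,
        "-username", username,
        "-jobtype", "hdfs",
        "-sourcetable", source_table,
        "-targetpaths", target_path,
    ]
    # Only pass a mapper count when the user asks for one; otherwise
    # TDCH is left to pick the number of mappers itself.
    if num_mappers is not None:
        cmd += ["-nummappers", str(num_mappers)]
    return cmd


# Example values below are placeholders, not real hosts or tables.
example = build_tdch_import_command(
    "tdch.jar",
    "jdbc:teradata://tdhost/database=sales",
    "loader",
    "orders",
    "/data/orders",
    num_mappers=8,
)
print(shlex.join(example))
```

Leaving num_mappers unset corresponds to "as many mappers as needed"; setting it corresponds to "as indicated by the user".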
 
TPT does not need to be installed on any of the Hadoop nodes.
It is generally installed (with the rest of TTU) on the client server.
The Export operator will run as a multi-process operator if you tell it to use more than 1 instance. TPT is not a multi-threaded application.
 
For information about TDCH and whether it is multi-threaded, you would have to refer to the TDCH documentation.
 

--SteveF
