The objective of this article is to explain step by step how to improve Talend in Teradata environment using Teradata utilities (FastLoad, MultiLoad).

Part 1 : Application setup

Prerequisites: Setup Talend TOS environnement for Teradata

  • Download TOS All available for all environment (Window, Linux, MacOs)
  • Install TOS

Step 1 : Download Talend

The download of Talend Open Stiudio Zip file is available on: http://www.talend.com/download.php#

Step 2 : Unzip on your local directory

Step 3 : launch TOS Application

To launch TOS Application on the Windows environment, execute “TalendOpenStudio-win32-wpf.exe” and accept the License agreement:

Step 4 : Create a new connection

After license agreement aceptance, click on  Then complete the fields on the next screen : "Référentiel" : Repository=Local, "Email" = , "Mot de passe":Password=, "Créer un nouveau projet":Create a new project.

Step 5 : Create a new Project

Select on the list box “Create a New Project” and click on "OK" button, the screen allowing the creation of a new project appears:

"Nom du Projet": Project Name = Step By Step,

"Nom technique": Technical Name = STEP_BY_STEP,

"Description du projet": Project Description = ,

"Langage de génération": Generation Language select  Java



Step 6 : Select a Project

Select now new Project that you have created, "Projet Sélectioner": Select Project = Step By Step.

Step 7 : Start TOS Application

Click on “Start now” to open TOA application

TOS application is now ready to use.

Part 2 : Step by Step Development

Scenario 1: Classical load ETL with the tTeradataFastLoad using JDBC

Generate a dataset sample and load the result into a Teradata table. (the CREATE TABLE script for cust_teradata is at the end of the document)

Components need:

  • a tRowGenerator (category Misc in the component Palette)
  • a tTeradataFastLoad (category Databases/Teradata in the component Palette)

Step 1 : database metadata discovery wizard

  • click right on the “DB connections” item in the Repository and choose a name of the metadata connection in your repository, for example “myTeradata”.

  • then in the following step, specify the right value to connect to your Teradata database. Click finish to save it in your repository.

Step 2 : retrieve metadata schema from the Database

  • click right on “myTeradata” item in the Repository and choose “Retrieve Schema”, then choose your expected filter Table, View, Synonym or write a specific filter of Object return into the wizard.
  • check to select Table structure that you would like retrieve in the metadata repository.

  • then in the last step you can verify the structure returned via the catalog.

Step 3: Create a new JobDesign

  • click right on “Job Designs” in the Repository and specify a name
  • then choose a tRowGenerator component into the component palette (Misc category)

  • drag-n-drop the cust_teradata item in your repository into the JobDesign interface and choose the tTeradataFastLoad component into the dialog box.

  • then create a link ROW between tRowGenerator and tTeradataFastLoad components. Click-right on tRowGenerator, choose “Row” and “Main” and link it to tTeradataFastLoad.

Step 4: tRowGenerator configuration

  • Double-click on tRowGenerator to open configuration GUI.

Step 5: tTeradataFastLoad configuration

  • Double-click on tTeradataFastLoad to open basic settings properties tab.

Step 6: tab RUN to execute the JobDesign

Scenario 2: load ETL with the tTeradataMultiLoad using mload teradata utility

The dataset already exist into a delimited file (comma or semi-colon separated). You can produce this dataset with the previous JobDesign. Just modify the output component and replace by tFileOutputDelimited.

Step 1 : tTeradataMultiload

  • Pick and drop the tTeradataMultiLoad into a new JobDesign

Below, you can see :

  • the mload script code generated by Talend


/**
* [tTeradataMultiLoad_1]
*/
currentComponent="tTeradataMultiLoad_1";

String tableFullName_tTeradataMultiLoad_1 = "talend" + "." + "cust_teradata";
java.io.FileWriter fw_tTeradataMultiLoad_1 = new java.io.FileWriter("D:/scriptfolder/"+"cust_teradata"+".script");

StringBuilder script_tTeradataMultiLoad_1 = new StringBuilder();
fw_tTeradataMultiLoad_1.write(".LOGTABLE "+"talend"+"."+"cust_teradata"+"_lt;\r\n");
fw_tTeradataMultiLoad_1.write(".LOGON "+"talend"+","+"talend"+";\r\n");

fw_tTeradataMultiLoad_1.write(".BEGIN IMPORT MLOAD TABLES "+tableFullName_tTeradataMultiLoad_1+" SESSIONS 8;\r\n");
fw_tTeradataMultiLoad_1.write(".LAYOUT customer_layout;\r\n");
fw_tTeradataMultiLoad_1.write(".FIELD "+"ID"+" * VARCHAR(255);\r\n");
fw_tTeradataMultiLoad_1.write(".FIELD "+"NAME"+" * VARCHAR(255);\r\n");
fw_tTeradataMultiLoad_1.write(".FIELD "+"DATE_BIRTH"+" * VARCHAR(255);\r\n");
fw_tTeradataMultiLoad_1.write(".DML LABEL DML_LABEL;\r\n");

fw_tTeradataMultiLoad_1.write("INSERT INTO "+tableFullName_tTeradataMultiLoad_1+"("
+"ID,NAME,DATE_BIRTH"+") VAlUES("+":ID,:NAME,:DATE_BIRTH"+");\r\n");
fw_tTeradataMultiLoad_1.write(".IMPORT INFILE "+"D:\\scriptfolder\\bulk_file.txt");
fw_tTeradataMultiLoad_1.write(" FORMAT VARText '"+";"+"' LAYOUT customer_layout APPLY DML_LABEL;\r\n");
fw_tTeradataMultiLoad_1.write(".END MLOAD;\r\n");
fw_tTeradataMultiLoad_1.write(".LOGOFF;\r\n");

fw_tTeradataMultiLoad_1.close();

String sb_tTeradataMultiLoad_1= new String("cmd /c mload < "+"D:/scriptfolder/"+"cust_teradata"+".script > "+"D:/scriptfolder/bulk_file_log.txt"+" 2>&1");
/**
* [tTeradataMultiLoad_1]
*/

 

  • Create table script for cust_teradata
CREATE MULTISET TABLE talend.cust_teradata ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
ID INTEGER,
NAME VARCHAR(255) CHARACTER SET LATIN NOT CASESPECIFIC,
DATE_BIRTH DATE FORMAT 'YYYY-MM-DD')
PRIMARY INDEX ( ID );
Discussion
18 Nov 2009

This is a great doc put together. I have a question for you.

What's the diff between trowgenerator and writing to a flat file and loading via Fastload in TD

You must sign in to leave a comment.