All Forums Teradata Studio
Aastha007 6 posts Joined 03/13
07 Aug 2014
Error while trying for data movement between Teradata and Hadoop using TDCH installed on Teradata Studio

Aim: To achieve data movement between Teradata and Hortonworks Hadoop using TDCH with the Teradata Studio client
 
Steps followed to configure and run TDCH connector using Teradata Studio client tool are as described below:
 
Software used along with the versions:
Teradata Studio 15.00.01
Hortonworks Hadoop 2.1
TDCH connector 1.3.1
Java Runtime Environment (JRE) version 1.6 or above

I managed to get the TDCH connectors installed properly using the steps below:
1. After unzipping Teradata Connector for Hadoop tar.gz package, copied the following files into Hadoop's lib directory:
   /usr/lib/hadoop/lib
    a) teradata-connector-1.0.6.jar
    b) tdgssconfig.jar
    c) terajdbc4.jar
 
2. The Teradata Connector should be installed on a node that can submit Hadoop
           MapReduce jobs via the 'hadoop jar' command; in most cases TDCH should be
           installed on a client node in the Hadoop cluster
3. After moving the TDCH rpm onto the target node, install the TDCH rpm using
           the following command. The rpm will install the TDCH jar and all other
           relevant files into the /usr/lib/tdch/<version> directory
          
             rpm -ivh teradata-connector-<version>-<target distro>.noarch.rpm
            
4. Setup defaults via the Teradata Connector configuration files (optional)
           Copy the following files from the /usr/lib/tdch/<version>/conf
           directory into Hadoop's conf directory, set the default
           values in the template, and rename the files to remove ".template"
           suffix. The parameters in the configuration file will be used as
           default settings for Teradata Connector jobs.
             a) teradata-export-properties.xml.template
             b) teradata-import-properties.xml.template
   
Note: I tried both with the required properties modified in the above two XMLs and without any modifications.
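Steps 1 and 4 above can be sketched as shell commands. All paths here are assumptions for a typical HDP layout, and the commands are echoed as a dry run rather than executed; adjust to your cluster and drop the echoes to run them for real.

```shell
# Dry-run sketch of steps 1 and 4; paths are assumptions for a typical
# HDP install -- adjust to your cluster, then remove the echoes.
HADOOP_LIB=/usr/lib/hadoop/lib
HADOOP_CONF=/etc/hadoop/conf
TDCH_CONF=/usr/lib/tdch/1.3.1/conf

# Step 1: TDCH and Teradata JDBC jars into Hadoop's lib directory
for jar in teradata-connector-1.0.6.jar tdgssconfig.jar terajdbc4.jar; do
  echo "cp $jar $HADOOP_LIB/"
done

# Step 4: copy the property templates and strip the .template suffix
for f in teradata-export-properties.xml teradata-import-properties.xml; do
  echo "cp $TDCH_CONF/$f.template $HADOOP_CONF/$f"
done
```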
 
 
5. Modify mapred.child.java.opts property value in mapred-site.xml on
           client node and add the following option (optional)
             -Djava.security.egd=file:/dev/./urandom
           Alternatively, the user can include the following JVM option in the
           Teradata Connector job command:
             -Dmapred.child.java.opts="-Djava.security.egd=file:/dev/./urandom"
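Merged into mapred-site.xml, the property from step 5 would look roughly like this. The -Xmx value is a placeholder for whatever options the property already carries on your cluster; append the egd option to the existing value rather than replacing it.

```xml
<!-- Sketch of mapred-site.xml on the client node; -Xmx200m stands in
     for whatever JVM options the property already holds -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m -Djava.security.egd=file:/dev/./urandom</value>
</property>
```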
6. Modify the following OS networking value on the name node
           (recommended):
            set /proc/sys/net/ipv4/tcp_max_syn_backlog to 2048
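Step 6 can also be applied with sysctl, using the standard Linux key name for that /proc entry. It needs root on the name node, so the command is printed here rather than executed.

```shell
# Step 6 via sysctl (run as root on the name node); printed, not executed
CMD="sysctl -w net.ipv4.tcp_max_syn_backlog=2048"
echo "$CMD"
```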
           
7. Install TDCH into HDFS such that Teradata DataMover and
           Teradata Studio can launch TDCH jobs remotely
       
         Navigate to the /usr/lib/tdch/<version>/scripts directory,
         and run the configureOozie.sh script, providing the hostname
         or IP address of the node running the NameNode process as an
         argument.
        
         ./configureOozie.sh nn=<NameNode hostname or IP address>
        
         The configureOozie.sh script supports many more command line
         arguments; run the script without arguments for details.
 
After having done all this I was able to create a Hadoop Connection Profile using Hive as User. But I am unable to execute the LOAD_TO_HCATALOG and LOAD_FROM_HCATALOG connectors.

Below is the syntax for the queries executed in Teradata Studio 15.00.01:

1)
select * from SYSLIB.LOAD_TO_HCATALOG (
on
(select * from UDA.TEST)
using
server('153.65.161.221')
port('9083')
username('hive')
dbname('default')
tablename('HADOOP_TEST')
) as A;

2)
SELECT *
FROM LOAD_FROM_HCATALOG(
using
server('153.65.161.221')
port('9083')
username('hive')
dbname('default')
tablename('HADOOP_TEST')
columns('*')
) AS A;
 
 
Error:
For the above queries I get an error saying "syntax error: expecting something like a ';' between LOAD_TO_HCATALOG and '('".
I have referred to the Teradata QueryGrid document and other internet sources to ensure the syntax is correct. Please guide me on what is going wrong here. Am I missing some configuration needed to make the data movement possible?
 
Regards,
Aastha

fgrimmer 553 posts Joined 04/09
07 Aug 2014

Aastha, you are confusing two different products. Teradata Studio does not use SQL-H when transferring data between Teradata and Hadoop. If you want to transfer data between Teradata and Hadoop with Studio, you would use the Smart Loader for Hadoop, a feature within Teradata Studio. Please refer to the article (http://developer.teradata.com/tools/articles/smart-loader-for-hadoop) on how to use this feature.
If you want to run SQL-H to transfer data from Teradata and Hadoop, this is a different product. Please post your question to the Extensibility Forum.

Aastha007 6 posts Joined 03/13
07 Aug 2014

Yes, I agree that I am confused. But if I check the post "SQL-H: Teradata to HDP: License Required?" on Developer Exchange, I see the SQL-H queries being run from Teradata Studio itself.
I have installed TDCH for Studio, and what Studio can do is transfer data using the export, import, and plugin command lines. But at the same time I see load_from_hcatalog and load_to_hcatalog being run from Studio, which is what confuses me.
 
 

fgrimmer 553 posts Joined 04/09
07 Aug 2014

Aastha, load_from_hcatalog() and load_to_hcatalog() are SQL-H functions. From Teradata Studio, you can open a SQL Editor window and execute SQL commands that call database functions. These commands are executed by the Teradata Database: Teradata Studio passes the SQL statement, via the JDBC driver, to the Teradata Database to be executed. As I mentioned above, if you want to transfer data between Teradata and Hadoop using Studio functionality, then you would use the Smart Loader for Hadoop.

sj186043 4 posts Joined 07/14
02 Sep 2014

I have extracted the files below on my Windows machine but am unable to copy them to Hadoop's lib directory:
  a) teradata-connector-1.0.6.jar
  b) tdgssconfig.jar
  c) terajdbc4.jar
 
What is the command or procedure I should follow?

fgrimmer 553 posts Joined 04/09
03 Sep 2014

Swati, please follow the instructions in the Readme on our Teradata Studio download page. As for moving files from a Windows machine to Hadoop, you can use PuTTY or WinSCP.
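For the PuTTY route, the copy from Windows could look like the commands below, run from the Windows command prompt (pscp ships with PuTTY). The hostname, user, and local paths are assumptions; the commands are echoed here as a sketch.

```shell
# Sketch of pushing the three jars from Windows to the Hadoop client node.
# Hostname, user, and C:/tdch path are assumptions -- substitute your own.
NODE=hadoop-client.example.com   # hypothetical Hadoop client node
for jar in teradata-connector-1.0.6.jar tdgssconfig.jar terajdbc4.jar; do
  echo "pscp C:/tdch/$jar root@$NODE:/usr/lib/hadoop/lib/"
done
```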

sj186043 4 posts Joined 07/14
05 Sep 2014

Thanks for the reply. I have installed TDCH as mentioned in the Readme.
Now I am trying to import data from Teradata to Hadoop using the command line, but I am getting the following error:
File -url does not exist.
I am passing the following URL: -url jdbc:teradata://dbc/database=students
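One possible cause of that error is that the 'hadoop jar' invocation is missing the TDCH tool class name, so -url gets treated as a file argument instead of an option. A full invocation might look like the sketch below; the tool class name, flags, credentials, and paths are assumptions based on the TDCH 1.3.x README and should be verified against the documentation for your installed version.

```shell
# Sketch of a TDCH import (Teradata -> HDFS); echoed, not executed.
# Class name, flags, and all values are assumptions -- check your
# TDCH version's README before running.
TDCH_JAR=/usr/lib/tdch/1.3.1/lib/teradata-connector-1.3.1.jar
CMD="hadoop jar $TDCH_JAR com.teradata.connector.common.tool.ConnectorImportTool \
  -url jdbc:teradata://dbc/database=students \
  -username dbc -password dbc \
  -jobtype hdfs \
  -sourcetable tbl1 \
  -targetpaths /user/hive/students"
echo "$CMD"
```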

fgrimmer 553 posts Joined 04/09
05 Sep 2014

Swati, For command line usage of TDCH, please post issues to the Extensibility Forum (http://forums.teradata.com/forum/extensibility) or comment on the article (http://developer.teradata.com/connectivity/articles/teradata-connector-for-hadoop-now-available).
