All Forums Connectivity
sguthi0910 5 posts Joined 10/13
10 Jan 2014
Sqoop Import from Teradata with 30 lines of SQL query using –query is failing

Hi,

 

I was trying to import data from teradata into hadoop using sqoop command. This import includes joining of tables on teradata and import the results to hive. I'm using the --query option of sqoop to insert the SQL query. The SQL query which I'm using is more than 30 lines. The import is failing with the following error.

 

14/01/10 11:30:28 INFO manager.SqlManager: Using default fetchSize of 1000

14/01/10 11:30:28 INFO tool.CodeGenTool: Beginning code generation

14/01/10 11:30:29 ERROR manager.SqlManager: Error executing statement: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.00.00.01] [Error 3707] [SQLState 42000] Syntax error, expected something like ';' between an integer and '('.

com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.00.00.01] [Error 3707] [SQLState 42000] Syntax error, expected something like ';' between an integer and '('.

 

 

I tried "sqoop eval" to check whether Sqoop can handle such large query.  It was successful. It returned me the result on the putty console. But when I use the same SQL in the import command it is not working. I'm using the following command.

 

sqoop import -libjars /usr/lib/sqoop/lib/tdgssconfig.jar,/usr/lib/sqoop/lib/ terajdbc4.jar --driver com.teradata.jdbc.TeraDriver --connect "jdbc:teradata://111.111.111.11/DATABASE=vedw" -m 1 --username uname --password pwd --hive-table PRED_CUST --hive-import --query "SELECT query with JOINS and WHERE \$CONDITIONS" --target-dir /user/hdfs/PRED_CUST

 

Following is the sqoop eval command which is working fine.

 

sqoop eval -libjars /usr/lib/sqoop/lib/tdgssconfig.jar,/usr/lib/sqoop/lib/ terajdbc4.jar --driver com.teradata.jdbc.TeraDriver --connect "jdbc:teradata://111.111.111.11/DATABASE=vedw" -m 1 --username uname --password pwd --query "SELECT query with JOINS and WHERE"

 

Please help me with this if there is anything wrong I'm doing. Or please suggest me some workaround. Thank you.

 

Srikanth

td_admirer 36 posts Joined 07/12
10 Jan 2014

use an options file parameter.. "--options-file" and use the "--query" option with the sql in multiple lines with a line concatenation "\"
something like..
 
--query
select \
col1 \
,col2 \
,col3 \
,col4       \
from table1 a \
join \
table2 b \
blah blah blah

sguthi0910 5 posts Joined 10/13
10 Jan 2014

Hi td_admirer,
Thanks for the quick response. I made that 30 lines query into single line query. I was still getting the same error. The query with 30 lines(new line for every condition) did not work at all. Please correct me if I was wrong.
Thanks,
Srikanth

td_admirer 36 posts Joined 07/12
12 Jan 2014

can you paste the entite sqoop command that you were using?

sguthi0910 5 posts Joined 10/13
14 Jan 2014

Hi,
Here is the sqoop command.
sqoop import -libjars /usr/lib/sqoop/lib/tdgssconfig.jar,/usr/lib/sqoop/lib/ terajdbc4.jar --driver com.teradata.jdbc.TeraDriver --connect "jdbc:teradata://111.111.111.11/DATABASE=vedw" -m 1 --username uname --password pwd --hive-table PRED_CUST --hive-import --query "SELECT vedw.wo_header_drvd1_view.wo_period_date, CASE WHEN (vedw.wo_header_drvd1_view.wks_end - 6) = vedw.wo_header_drvd1_view.wo_period_date THEN (vedw.wo_header_drvd1_view.wks_end - 6) ELSE (vedw.wo_header_drvd1_view.wks_end + 1) END,  vedw.wo_header_drvd1_view.cal_year || '/' || (vedw.wo_header_drvd1_view.cal_period (format '99')),  vedw.wo_header_drvd3_view.region_descr,  vedw.wo_header_drvd3_view.area_descr,  vedw.wo_header1_view.branch_code,  count(vedw.wo_header_drvd1_view.wo_nbr),  vedw.wo_equipment_drvd1_view.alt_customer_name,  vedw.wo_equipment_drvd1_view.alt_customer_nbr,  vedw.wo_header1_view.wo_nbr,  vedw.wo_header_drvd3_view.SLM_offering_class,  (SUM(vedw.wo_header_drvd1_view.actv_repair_time_onsite + (vedw.wo_header_drvd1_view.actv_repair_time_remote - vedw.wo_header_drvd1_view.actv_repair_time_sentinel)) / (COUNT(vedw.wo_header1_view.wo_nbr) * 60.0)),  vedw.wo_header1_view.source_country_code,  sum(vedw.wo_header_drvd1_view.parts_used),  (SUM(vedw.wo_header_drvd1_view.travel_time) * 1.000 / (COUNT(vedw.wo_header1_view.wo_nbr) * 60.0)),  (SUM(vedw.wo_header_drvd4_view.best_fit_resolution_time - vedw.wo_header_drvd4_view.best_fit_drop_time_resolve) / (COUNT(vedw.wo_header1_view.wo_nbr) * 60.0)),  vedw.wo_header1_lkup_dn_view.aof_desc_gbl_1,  vedw.wo_header1_view.aof_code_gbl_1,  vedw.wo_equipment_drvd1_view.crnt_parent_cust_ind_code,  SUM(vedw.wo_header_drvd1_view.onsite_activity * vedw.wo_header_repair_success_vw.SN_3_day_opportunity_ind) ,  SUM(vedw.wo_header_drvd1_view.onsite_activity * vedw.wo_header_repair_success_vw.SN_3_day_opportunity_ind * (vedw.wo_header_repair_success_vw.SN_3_day_opportunity_ind - (CASE WHEN (vedw.wo_header_repair_success_vw.SN_3_day_failure_ind + (CASE WHEN (vedw.wo_header_drvd2_view.onsite_count_WCS_visit + vedw.wo_header_repair_success_vw.sn_3_day_failure_ind - vedw.wo_header_repair_success_vw.SN_3_day_opportunity_ind) > 0 THEN 1 ELSE 0 END)) >0 THEN 1 ELSE 0 END))),  vedw.wo_header1_view.wo_type_gbl,  vedw.wo_header1_view.wo_type_lcl,  vedw.wo_header_repair_success_vw.SN_3_day_opportunity_ind,  SUM(vedw.wo_header_drvd2_view.onsite_count_WCS_visit),  vedw.wo_header1_view.territory_code FROM  vedw.wo_header_drvd1_view,  vedw.wo_header_drvd3_view,  vedw.wo_header1_view LEFT JOIN vedw.wo_header_repair_success_vw ON vedw.wo_header1_view.source_country_code=vedw.wo_header_repair_success_vw.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_header_repair_success_vw.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_header_repair_success_vw.wo_nbr and vedw.wo_header1_view.current_record_ind='y',  vedw.wo_equipment_drvd1_view,  vedw.wo_header_drvd4_view,  vedw.wo_header1_lkup_dn_view,  vedw.wo_header_drvd2_view WHERE  ( vedw.wo_header1_view.source_country_code=vedw.wo_header_drvd1_view.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_header_drvd1_view.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_header_drvd1_view.wo_nbr and vedw.wo_header1_view.current_record_ind = 'Y'  )  AND  ( vedw.wo_header1_view.source_country_code=vedw.wo_header_drvd2_view.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_header_drvd2_view.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_header_drvd2_view.wo_nbr and vedw.wo_header1_view.current_record_ind = 'Y'  )  AND  ( vedw.wo_header1_view.source_country_code=vedw.wo_header_drvd3_view.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_header_drvd3_view.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_header_drvd3_view.wo_nbr and vedw.wo_header1_view.current_record_ind = 'Y'  )  AND  ( vedw.wo_header1_view.source_country_code=vedw.wo_header_drvd4_view.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_header_drvd4_view.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_header_drvd4_view.wo_nbr and vedw.wo_header1_view.current_record_ind = 'Y'  )  AND  ( vedw.wo_header1_view.source_country_code=vedw.wo_equipment_drvd1_view.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_equipment_drvd1_view.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_equipment_drvd1_view.wo_nbr and vedw.wo_header1_view.current_record_ind = 'Y'  )  AND  ( vedw.wo_header1_view.source_country_code=vedw.wo_header1_lkup_dn_view.source_country_code and vedw.wo_header1_view.instance_id=vedw.wo_header1_lkup_dn_view.instance_id and vedw.wo_header1_view.wo_nbr=vedw.wo_header1_lkup_dn_view.wo_nbr and vedw.wo_header1_view.current_record_ind = 'Y'  )  AND  (( vedw.wo_header_drvd1_view.wo_reportplus = 1  )   AND   ( vedw.wo_header_drvd1_view.onsite_use = 1  )   AND   (    vedw.wo_header1_view.wo_type_gbl  IN  ( '1','2','3','4'  )    AND    vedw.wo_header_drvd3_view.wo_type_gbl_taken  IN  ( '1','2','3','4'  )   )   AND   vedw.wo_equipment_drvd1_view.alt_customer_nbr  IN  ( '20066','4987303','78983558','021028','53998','00888874','30093','37305','81992380','AE325','SA320','39435','24329'  )   AND   vedw.wo_header1_view.caller  <>  'Kevin Peters'   AND   (    vedw.wo_header_drvd3_view.SLM_offering_class  IN  ( '73543','73346','73550','71351','73458','74559','76086'  )    OR    vedw.wo_header_drvd3_view.SLM_offering_class  LIKE  '58%'    OR    vedw.wo_header_drvd3_view.SLM_offering_class  LIKE  '66%'    OR    vedw.wo_header_drvd3_view.SLM_offering_class  LIKE  '53%'   )   AND   vedw.wo_header1_view.hht_flag  =  0   AND   vedw.wo_header_drvd3_view.cal_offset  BETWEEN  0  AND  1  ) GROUP BY  1,   2,   3,   4,   5,   6,   8,   9,   10,   11,   13,   17,   18,   19,   22,   23,   24,   26 \$CONDITIONS" --target-dir /user/hdfs/PRED_CUST

Fred 1096 posts Joined 08/04
14 Jan 2014

Why do you have \$CONDITIONS after the GROUP BY list?

sguthi0910 5 posts Joined 10/13
14 Jan 2014

without \$CONDITIONS the sqoop command is giving this error " must contain '$CONDITIONS' in WHERE clause."

Fred 1096 posts Joined 08/04
15 Jan 2014

Yes, but it can't be after GROUP BY. Try something like:
... BETWEEN 0 AND 1 ) AND \$CONDITIONS GROUP BY ...

sguthi0910 5 posts Joined 10/13
16 Jan 2014

Thank you Fred. It is working now.

purna88 2 posts Joined 05/14
25 May 2014

Hi
I not able to import data from teradata instance to hdfs. i am getting this error
 
I am using this qurie :sqoop import  --connect jdbc:teradata://10.23.227.22/DATABASE=mydb --username dbc -P  --table mydb.test_table --warehouse-dir /user/sqooptest/teradata/ --num-mappers 1
 
Error: ERROR tool.BaseSqoopTool: Got error creating database manager: java.io.IOException: No manager for connect string: jdbc:teradata://10.23.227.22/DATABASE=mydb
        at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:185)
        at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:202)
        at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:83)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:464)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

thome119 1 post Joined 11/12
18 Jul 2014

chmod 755 /etc/sqoop/conf/managers.d
chmod 644 /etc/sqoop/conf/managers.d/cldra_td_connector.txt
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Connectors/PDF/Cloudera-Connector-for-Teradata.pdf
You can follow this guide, but made sure you have the right permissions using de commands above
 

venkimadesh 1 post Joined 11/15
17 Nov 2015

Hi,
I am trying add 2 more dummy columns in the select list. The same query runnign fine in teradata. But through SQopp it showing the following error. Any suggestons please?

 SELECT a.*,'123' AS hdp_batch_id,'2014/12/23 00:00:00' as hdp_dbms_time FROM CS_int_COR_VW.ACG_COND_SRC_REF_HST a WHERE TRUNC(END(dbms_time)) = '9999-12-31'

 

hdp_batch_id & hdp_dbms_time are dummy column which i am adding into my table.

 

I am getting the following error 

 

15/11/17 05:45:18 INFO processor.TeradataSplitByPartitionProcessor: insert from source table to staget table, the insert select sql INSERT INTO "TDCSPSTAGE_054517143" ("c1","c2","c3","c4","c5","c6","TDIN_PARTID") SELECT "c1","c2","c3","c4","c5","c6", RANDOM(1, 4) FROM "VTDCSPSTAGE_054517143"

15/11/17 05:45:18 INFO processor.TeradataSplitByPartitionProcessor: the insert select sql starts at: 1447757118124

15/11/17 05:45:18 INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByPartitionProcessor starts at:  1447757118634

15/11/17 05:45:39 INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByPartitionProcessor ends at:  1447757118634

15/11/17 05:45:39 INFO processor.TeradataInputProcessor: the total elapsed time of input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByPartitionProcessor is: 20s

15/11/17 05:45:39 ERROR teradata.TeradataSqoopImportHelper: Exception running Teradata import job

com.teradata.connector.common.exception.ConnectorException: java.sql.SQLException: [Teradata Database] [TeraJDBC 15.00.00.20] [Error 3863] [SQLState 42000] Duplicate definition of 'c2' in NAMED phrase.

 

 

 

You must sign in to leave a comment.