Santanu84 122 posts Joined 04/13
26 Feb 2014
Teradata Parallel Transporter - Session Character Set

Hi All
I am creating a TPT script to load some characters into a table. I know that if I set the session character set to UTF8 they will get loaded, but I am unable to set the session character set to UTF8 using my TPT script.
I tried saving the ".txt" file of my TPT script in UTF-8 format and used "USING CHARACTER SET UTF8" before DEFINE JOB, along with the "tbuild -e utf8" option. But it gives me the error RDBMS Error 6705: An illegally formed character string was encountered during translation.
Am I doing anything wrong, or is there anything else I am missing?
Experts, please let me know your answers.
Thanks
Santanu

Santanu84 122 posts Joined 04/13
26 Feb 2014

Hi Folks, 
Anyone, any update? I need help on this one.
Thanking You
Santanu

feinholz 1234 posts Joined 05/08
26 Feb 2014

The USING CHARACTER SET UTF8 clause is only for the data, not the script.
The script does not have to be encoded in UTF8 in order for you to load UTF8 data into Teradata.
If there is nothing special in the script that requires it to be encoded in UTF8, just create the script in ASCII.
The -e command line option tells TPT that the script is encoded in something other than ASCII.
As for the error, the content of the data has one or more characters that are not acceptable to Teradata.
Teradata does not support every single UTF8 character sequence.
Where is the data coming from?
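For illustration, a minimal sketch of how those pieces fit together (the job and file names here are hypothetical): the USING CHARACTER SET clause sits in front of DEFINE JOB and governs the session and data character set, while the script file itself can stay plain ASCII.

USING CHARACTER SET UTF8
DEFINE JOB load_utf8_data                          /* hypothetical job name */
DESCRIPTION 'Load UTF-8 encoded data'
(
   /* DEFINE SCHEMA, DEFINE OPERATOR and APPLY statements go here */
);

Because the script is saved as plain ASCII, it is run with just "tbuild -f load_utf8_data.txt" and no -e option; the UTF8 above applies to the session and the data, not to the script file.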
 

--SteveF

Santanu84 122 posts Joined 04/13
26 Feb 2014

Hi Feinholz
It is actually coming from an Oracle source. The thing is, people on my team tried to load it through the Informatica TPT API. There are 2 types of input data. Type-1 has Chinese characters; when UTF8 is used with INFA-TPT they get loaded. But there are also MS-Latin (Windows-1252) characters that are not getting loaded with UTF8. However, when the MS-Latin (1252) code page is used within INFA-TPT, they get loaded.
So I thought of creating a TPT script (instead of an INFA-TPT job) to use the Translate function and see the result.
Let me try your option. Meanwhile, if you have any other suggestions in this regard, that would be really helpful.
Thanking You
Santanu

Santanu84 122 posts Joined 04/13
27 Feb 2014

Hi Feinholz and others,
I have made a little progress on this. Now I know the problem is with some extended ASCII characters, such as ASCII 10, 13, 188, etc.
I can find each individual character with the ASCII function and change it with the CHR function, but I cannot change the entire string, as those functions do not work that way.
I also found the hexadecimal values of the string with the CHAR2HEXINT function and am able to manually change a hexadecimal code like '000A00BC'XC back to characters.
But I am not sure how, in the job, I can change the hex values back to characters on the fly.
For example, SELECT ''''||CHAR2HEXINT('1/4')||''''XC to change the entire string back to characters did not work.
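For what it is worth, a rough sketch of the kind of diagnostics that can be run here (the table and column names are placeholders, not from the actual job): CHAR2HEXINT shows exactly which code units a value contains, and the TRANSLATE function mentioned earlier supports a WITH ERROR clause that does a best-effort conversion between server character sets, substituting an error character for anything it cannot translate.

-- Inspect the code units of a suspect value (stage_tbl / src_col are hypothetical names).
SELECT src_col, CHAR2HEXINT(src_col)
FROM   stage_tbl;

-- Best-effort conversion between server character sets; untranslatable characters are
-- replaced by the error character. UNICODE_TO_LATIN is one documented translation pair;
-- the useful direction depends on the target column's server character set.
SELECT TRANSLATE(src_col USING UNICODE_TO_LATIN WITH ERROR)
FROM   stage_tbl;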
Hope I am able to explain the problem.
Please help me in this regard.
Thanking You
Santanu

Santanu84 122 posts Joined 04/13
27 Feb 2014

Hi All,
Can anyone please tell me how to load extended ASCII characters into TD with a UTF8 session charset?
Is there any way to insert '000A00BC'XC into Teradata through a function or something where I can use the column name instead of the hexadecimal string?
Any help or guidance is appreciated.
Thanking You
Santanu

david.craig 73 posts Joined 05/13
28 Feb 2014

Which Unicode characters are you referring to? Use U+ code point notation, for example: U+000A (<control>) and
U+00BC (VULGAR FRACTION ONE QUARTER).
These characters should load from UTF8 into the UNICODE, or LATIN, server character sets. Note that Unicode supports various Latin scripts: Basic Latin (ASCII), Latin-1, Latin Extended (see http://www.unicode.org/charts/).
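To illustrate, a minimal hand test along those lines (the table name is hypothetical, and the hex literal is the same form used earlier in this thread): with the column declared UNICODE on the server side and the session character set at UTF8, both code points should be storable.

CREATE TABLE demo_unicode_tbl              -- hypothetical table
( id  INTEGER
, txt VARCHAR(100) CHARACTER SET UNICODE NOT CASESPECIFIC
);

-- With the session character set at UTF8, this is intended to store U+000A followed by U+00BC,
-- using the hex-literal form already shown earlier in the thread.
INSERT INTO demo_unicode_tbl VALUES (1, '000A00BC'XC);

-- Verify what was actually stored.
SELECT id, CHAR2HEXINT(txt) FROM demo_unicode_tbl;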
 

Santanu84 122 posts Joined 04/13
28 Feb 2014

Hi David
Thanks for reply.
 
Well, there is the problem. I too thought that defining the column as Unicode and using UTF8 as the session charset would work.
But the problem is that we are supposed to load the data through the ETL tool Informatica, where we are using the UTF8 code page.
 
Now the problems we are facing:
1. The source is Oracle.
2. The same source table has 2 types of columns:
a) columns with extended ASCII characters (U+000A <control> and U+00BC VULGAR FRACTION ONE QUARTER within a large string), and
b) columns with Asian characters such as Chinese.
3. When we use the MS-Latin code page and load into a Unicode column, the ASCII characters get loaded, but the Asian characters become garbled.
4. On the other hand, when the UTF8 code page is used, the Asian characters get loaded but these 8-bit ASCII characters are rejected with error code 6705.
 
I am looking for a solution which will load both in a single shot.
I tried changing the 8-bit ASCII characters to hexadecimal format using the Char2HexInt function, but then how do I change those hex values back to ASCII characters using a function (not the ''XC literal format)?
 
Hope I am able to explain the actual problem. Any solution or guidance is appreciated.
 
Thanking You
Santanu
 

david.craig 73 posts Joined 05/13
28 Feb 2014

All columns in the source table need to use the same character set. In this case, a Unicode character set is the only one that supports both Chinese and Latin, so UTF8 is a good choice. There should be a way in Informatica to convert Latin to UTF8 before the load.

arun_tim1 4 posts Joined 02/14
02 Mar 2014

Hi ,

I would like to share an issue I am currently facing with a Teradata Parallel Transporter script on z/OS. I have a TPT job for loading file data into a table, but the file volume is high.
I am using the DATA CONNECTOR AS PRODUCER to read the file. I gave the producer instances as 2, but it is using only 1 instance to read the file.
For example: Instance 1 reading file 'DD:PTYIN'. I have used the attribute MultipleReaders='Y', but the TPT job gets disconnected during the acquisition phase.
Kindly help with this!

Thanks in advance .

ratchetandclank 49 posts Joined 01/08
03 Mar 2014

@Santanu84, Try using the attributes ValidUTF8 and ReplaceUTF8Char. Refer to the documentation to see if it helps. 

ratchetandclank 49 posts Joined 01/08
03 Mar 2014

@arun_tim1, The information provided is not enough to dig deep into the issue you are facing. Please provide the logs of the operators to see where the problem might lie.
 
And, I think you should open a new thread for this question. 

arun_tim1 4 posts Joined 02/14
03 Mar 2014

 Hi
I am using the below syntax for the DATA CONNECTOR AS PRODUCER operator in the TPT job.
ATTRIBUTES                                           
 (                                                    
      VARCHAR FILENAME='DD:PTYIN',                    
      VARCHAR MULTIPLEREADERS='Y',                    
      VARCHAR FORMAT='DELIMITED',                     
      VARCHAR OPENMODE='READ',                        
      VARCHAR TEXTDELIMITER ='¬',                     
      VARCHAR PRIVATELOGNAME = 'data_logfile',
      VARCHAR INDICATORMODE='N'                       
) ;    
INSERT INTO TABLENAME
(
)
TO OPERATOR (UPDATE_OPERATOR[3])      
SELECT * FROM OPERATOR(FILE_READER[2]);
If I give 2 instances for the producer operator, I get the following abend in the mainframe job.
FILE_READER: TPT19008 DataConnector Producer operator Instances: 2
FILE_READER: TPT19003 ECI operator ID: FILE_READER-197561         
FILE_READER: TPT19221 Total files processed: 0.                   
UPDATE_OPERATOR: connecting sessions                              
UPDATE_OPERATOR: preparing target table(s)                        
UPDATE_OPERATOR: entering DML Phase                               
UPDATE_OPERATOR: entering Acquisition Phase                       
UPDATE_OPERATOR: disconnecting sessions                           
UPDATE_OPERATOR: Total processor time used = '0.034438 Second(s)' 
UPDATE_OPERATOR: Start : Mon Mar  3 07:05:29 2014                 
UPDATE_OPERATOR: End   : Mon Mar  3 07:05:32 2014                 
Job step LOAD_TABLES terminated (status 8)   
ABEND NAME :
07.05.42 JOB37991 $HASP165 ZKLD76AE ENDED AT IMF9S - ABENDED S000 U3000 CN(INTERNAL)
In the JESMSG spool: the following abend details are from the JCL log.
IDI0044I Current fault is a duplicate of fault ID F08604 in history file SYS3B.FLTANLZR.HIST2 - the duplicate count is 18
IDI0053I Fault history file entry suppressed due to: Duplicate fault or End Processing user exit.
Kindly help with this!
 
                     

Santanu84 122 posts Joined 04/13
03 Mar 2014

Hi ratchetandclank
Are these attributes present somewhere in Informatica? Just wanted to know, as I could not find such attributes in Teradata. Any reference document suggestion for this would be helpful.
Please let me know your response.
 
Thanking You
Santanu

feinholz 1234 posts Joined 05/08
03 Mar 2014

@Santanu84: please do NOT use the attributes ValidUTF8 and ReplaceUTF8Char.
 
We are not happy with the results we are getting from that feature and we are redesigning it.
 
@arun_tim1: what version of TPT are you running on z/OS?

--SteveF

ratchetandclank 49 posts Joined 01/08
03 Mar 2014

@arun_tim1: It does not look like TPT caused the abend, because you can see the return code from the step LOAD_TABLES. Please paste the DC operator log of the execution. If possible, please execute the job with the TraceLevel attribute set to ALL for the DC operator and paste the log.
IDI0044I and IDI0053I should not cause an abend. Look at the documentation for these on the IBM website here:
http://pic.dhe.ibm.com/infocenter/pdthelp/v1r1/index.jsp?topic=%2Fcom.ibm.faultanalyzer.doc_7.1%2Fidiugg05493.htm
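For reference, a sketch of where that attribute would sit in the DataConnector operator definition posted above (everything except the TRACELEVEL line mirrors the earlier attribute list):

ATTRIBUTES
(
     VARCHAR FILENAME = 'DD:PTYIN',
     VARCHAR MULTIPLEREADERS = 'Y',
     VARCHAR FORMAT = 'DELIMITED',
     VARCHAR OPENMODE = 'READ',
     VARCHAR PRIVATELOGNAME = 'data_logfile',
     VARCHAR TRACELEVEL = 'All'                /* detailed tracing, written to the private log */
);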
 

feinholz 1234 posts Joined 05/08
03 Mar 2014

I think we need to see the entire set of output.
We only have a partial story here.
The job terminated with an error (Job step LOAD_TABLES terminated (status 8) ).
But the part of the output you sent does not show why.
I do see that the Update operator was stopped in the middle of the Acquisition Phase for some reason, which means the error is probably with the DataConnector operator.
 

--SteveF

arun_tim1 4 posts Joined 02/14
03 Mar 2014

Hi feinholz
 
I am using Teradata Parallel Transporter Version 13.10.00.04.
 

arun_tim1 4 posts Joined 02/14
03 Mar 2014

Hi
The following error is what I got in the DataConnector log:
!WARNING! file 'DD:PTYIN' not processed (errno 129).
EDC5129I No such file or directory.                
Setting exit code = 12.                            
Method PX_Terminate entry:  Phase 1                

feinholz 1234 posts Joined 05/08
03 Mar 2014

Please send your entire JCL.
What dataset is PTYIN pointed to?

--SteveF

divyagolla 22 posts Joined 02/14
04 Mar 2014

Does Teradata 13 support TPT ??

feinholz 1234 posts Joined 05/08
04 Mar 2014

It is the other way around.  :)
The question should be, "Does TPT support Teradata 13?"
The answer is "yes", but it depends on which versions of TPT.
Our client products will support Teradata release "current and 4 back".
This means that TTU 13.0, 13.10, 14.0, 14.10 and 15.0 will support Teradata 13.

--SteveF

Santanu84 122 posts Joined 04/13
04 Mar 2014

Hi feinholz
Thanks for your reply.
Any further suggestions from you on my case?
 
Thanking You
Santanu

MaximeV 19 posts Joined 11/13
19 Mar 2014

Hi Santanu84,
2 questions:
Can you provide the character set used on your Oracle source, which seems to handle both those extended ASCII characters and the Chinese characters?
On the Informatica side, the only parameter you can modify related to the character set is the one on the relational source used by TPT, am I right?
 
 

ashish089 3 posts Joined 06/15
27 Jul 2015

Hi,
 
I have configured my load process in an SSIS package. The package got triggered and then gradually failed, so I stopped the process.
But the table it was loading is still under load. How can I stop the process?
Please help!
