All Forums Tools
VasiaV 7 posts Joined 07/12
26 Jul 2012
Fastload: fixed field multibyte

Hi

i have fixed field file in utf8 like that

field1   field2
 .я . . . .жжж
 .фф . . .у . .

dots here represent blanks. There are two fields, second field starts from 7-th position. Total record length is 10 characters. I want to load it in table

create multiset table a(a varchar(10), b varchar(10));

 

What fastload job script i need to write to load this?

 

I didn'd get it myself with Teradata Fastload Reference. Main problem as i think, that one specifies field lenghts in bytes in define command, not in characters. And in my case i have
record1 first field 7 bytes, second field 6 bytes; record2 first field 8 bytes, second field 4 bytes. Because blanks are 1 byte and cyrillic symbols are two bytes.

Thanks in advance.
 

ulrich 816 posts Joined 09/09
26 Jul 2012

unicode data can become tricky.

In case you want help you should attach

1. a testfile incl. unicode chars  

2. the ddl of the target table

3. any fastload script you tried so far

 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

VasiaV 7 posts Joined 07/12
26 Jul 2012

I didn't get how to attach here, so
 

1.
 

FIELD1	FIELD2
 я    ЖЖЖ
 фф   у  

3-record text file with CR LF as line terminator

2. job script with DDL included

logon ..;
drop table errora_1;
drop table errora_2;
drop table a;

create multiset table a(a varchar(10), b varchar(10));

record 2;

set record unformatted;

define
FIELD1 (varchar(6)),
FIELD2 (varchar(3))
file=TSTu8.txt;
show;
begin loading a errorfiles errora_1, errora_2 ;
INSERT into a
values (:FIELD1, :FIELD2);
end loading;
logoff;

 

3. i launch fastload -c UTF-8
and got

0012 INSERT into a
     values (:FIELD1, :FIELD2);

**** 18:07:48 The length of: FIELD1 in row: 2 was greater than defined.
              Defined: 6, Received: 48111
 

VasiaV 7 posts Joined 07/12
26 Jul 2012

And some details. Blanks cannot be treated as delimiters, because data can contain inner blanks.

ulrich 816 posts Joined 09/09
26 Jul 2012

below the editor is a FILE ATTACHMENTS link which you can follow and which should be straigt forward to upload files.

 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

sg1 4 posts Joined 05/09
26 Jul 2012

For UTF8; you need to multiply your DEFINE width by 3 times the target characters.

If you column is defined as VARCHAR(10); change your define stmt to VARCHAR(30).

 

 

VasiaV 7 posts Joined 07/12
27 Jul 2012

As you can see
http://i45.tinypic.com/2eatidi.jpg

there are no any attachment links below my post form

Here is my test case
http://www.filehosting.org/file/details/362374/tst.rar

TST.ctl - fastload script
TSTu8.txt - sample file
TST.out - output

 

width by 3 times didn't help me, i've got the same error.

ulrich 816 posts Joined 09/09
27 Jul 2012

strange - as you can see attached I get it a file attachemnt link.

lets see...

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

ulrich 816 posts Joined 09/09
27 Jul 2012

and sorry - but on mac rar files are not known.

Can you tar, zip, gzip?

 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

VasiaV 7 posts Joined 07/12
27 Jul 2012

Here is tar
http://www.filehosting.org/file/details/362386/tst.tar

Here is zip
http://www.filehosting.org/file/details/362388/tst.zip

they are the same

ulrich 816 posts Joined 09/09
27 Jul 2012

Didn't got it working right now.

But some comments:

You need to define the table columns as character set unicode - otherwise you can not load utf-8 data into it.

create multiset table a(a varchar(10), b varchar(10));

If you define column as varchar in the define statement TD expect 2 bytes of column lengths before each varchar field.

But as you have a fixed line file you should be able to define the fields as char(18) and char(9). But this is still not working correctly.

I was able to load the file into ONE column with the following script:

drop table errora_1;
drop table errora_2;
drop table fload_tmp;

create multiset table fload_tmp (a varchar(10) character set unicode);

record 2;

set record vartext "//////";

define
FIELD1 (varchar(27)),
file=TSTu8.txt;
show;
begin loading fload_tmp errorfiles errora_1, errora_2 ;
INSERT into fload_tmp
values (:FIELD1);
end loading;
logoff;

at least this is showing that the data in the file is correct.

I also attached the zip file from you in case someone else like to play arround with this and don't like the fileshare access...

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

VasiaV 7 posts Joined 07/12
27 Jul 2012

Thanks for trying and attach.

I've loaded one column two.

Main problem remains the same. Fields are fixed in terms of chars. First field 6 char, second 3 char. But have varying byte length. Fastload defines fields in bytes, not chars, or i didn't found how to make it think chars.
If fields were delimited with certain delimiters, there weren't be problem to load them. But there are no delimiters.
 

You must sign in to leave a comment.