All Forums Tools
andydoorey 35 posts Joined 05/09
28 May 2014
TPT - Format = text example

Does anyone have an example of a TPT script with format = text?
I'm trying to load a file with variable length records into a table with a single varchar(1000) column.  I tried basing the script on the qstart1 example and just changing 'delimited' to 'text', but I get an error:

$FILE_READER[1]: TPT19108 Data Format 'TEXT' requires all 'CHAR/ANSIDATE' schema

 

The variables script (excluding connection info) is:

---------

,DDLPrivateLogName    = 'ddlprivate.log'

,LoadPrivateLogName   = 'loadprivate.log'

,TargetErrorList      = ['3807']

,WorkingDatabase      = 'dev_sandpit'

,TargetWorkingDatabase = 'dev_sandpit'

,TargetTable          = 'RAW_CODA_LOAD'

,LogTable             = 'RAW_CODA_LOAD_LOG'

,ErrorTable1          = 'RAW_CODA_LOAD_E1'

,ErrorTable2          = 'RAW_CODA_LOAD_E2'

,SourceFileName       = 'coda_mil2.692130.txt'

,SourceFormat         = 'text'

,OpenMode             = 'read'

,DropLogTable         = 'Yes'

------------

The code is:
---------
DEFINE JOB qstartcoda2

(

  APPLY $INSERT TO OPERATOR ($LOAD)

  SELECT * FROM OPERATOR($FILE_READER);

);
---------

 

I've then tried creating my own job where the schema is explicitly defined as CHAR(1000) as follows:
--------------

 

DEFINE JOB qstartcoda1

DESCRIPTION 'LOAD CODA TABLE'

(

DEFINE SCHEMA CODA_SCHEMA

DESCRIPTION 'CODA INFORMATION'

(

CODADATA CHAR(1000)

);

 

DEFINE OPERATOR LOAD_OPERATOR

DESCRIPTION 'TERADATA PARALLEL TRANSPORTER LOAD OPERATOR'

TYPE LOAD

SCHEMA CODA_SCHEMA

ATTRIBUTES

(

VARCHAR TdpId             = @TargetTdpId,

VARCHAR UserName          = @TargetUserName,

VARCHAR UserPassword      = @TargetUserPassword,

VARCHAR TargetTable       = @LoadTargetTable,

VARCHAR LogTable          = @LoadLogTable,

VARCHAR ErrorTable1       = @LoadErrorTable1,

VARCHAR ErrorTable2       = @LoadErrorTable2

   

)

;

 

DEFINE OPERATOR DATACONN

DESCRIPTION 'TERADATA PARALLEL TRANSPORTER DATACONNECTOR OPERATOR'

TYPE DATACONNECTOR PRODUCER

SCHEMA CODA_SCHEMA

ATTRIBUTES

(

VARCHAR FileName = @SourceFileName,

VARCHAR Format = @SourceFormat

);

 

APPLY ('INSERT INTO '|| @LoadTargetTable ||' (:CODADATA);')

TO OPERATOR (LOAD_OPERATOR [1])

SELECT * FROM OPERATOR (DATACONN[1]);

 

);
-----------------------

 

This time I get an error:

DATACONN[1]: TPT19113 Data length implied by Data Schema (1000) is not the same

as record length (226).

 

I then tried changing the code above so CODADATA is defined as VARCHAR(1000) and I get the first error again:

DATACONN[1]: TPT19108 Data Format 'TEXT' requires all 'CHAR/ANSIDATE' schema.

 

I can't find much information in the manuals about the text format.  Am I missing something that allows me to load a variable length string into a table?  The manual basically says that this is what the text format is for:

-----------

'Text' = character data separated by an end-of-record (EOR) marker. The

EOR marker can be either a single-byte linefeed (X'0A') or a double-byte

carriage-return/line-feed pair (X'0D0A'), as defined by the first EOR

marker encountered for the first record.

---------

 

Sorry if this is a basic query, but I'm new to TPT and can't find this information or any examples in the user guide or reference manuals.
Thanks
Andy

Tags:
Fred 1096 posts Joined 08/04
28 May 2014

Use "delimited" (even though there is only one column, so the delimiter does not actually appear in the input records).
The "text" format is only used when all input fields have fixed lengths.

andydoorey 35 posts Joined 05/09
28 May 2014

I did end up using delimited with a delimiter string of 'jkajdnsksklfnsklslkdmjsks'.  I'm hoping that never occurs in the file!
It doesn't seem like the right way to do it though!
 

feinholz 1234 posts Joined 05/08
29 May 2014

If each record has a single column, then you do not need a delimiter.
Curious: why did you pick such a large delimiter?

--SteveF

andydoorey 35 posts Joined 05/09
02 Jun 2014

I don't want to choose a delimiter!  However if I choose an empty delimiter by using:
SourceTextDelimiter  = ''
Then TPT uses the default delimiter of '|' which does exist in the file.
If I leave the SourceTextDelimiter parameter out then, again TPT uses the default demiliter.
The reason I chose such a long delimiter is because I need a delimiter that will never exist in the file. If you can suggect a more sensible way that I can treat the file as a single column of variable length text then that would be very useful.  The way that I'm doing it doesn't seem to be the best way of achieving this.

feinholz 1234 posts Joined 05/08
02 Jun 2014

I am not sure whether a shorter or longer delimiter will affect performance, but I thought if someone did not know what delimiter they should pick, they would choose something with punctuation marks, like "??", "???, "?!?".
 
Nothing wrong with the way you are doing this, I was just curious.
 

--SteveF

feinholz 1234 posts Joined 05/08
05 Jun 2014

What version of TPT are you using?
Can you put the single column data in quotes?
If so, then the processing will ignore anything in the quotes that might be considered the delimiter.

--SteveF

andydoorey 35 posts Joined 05/09
06 Jun 2014

I'm using version 15.0 (I've only recently downloaded it)
I'm not sure if I can put quotes round it.  There could be quotes within the file, and they may be unmatched ones - it's a bit of a strange data format we don't really have any control over.
I also don't really want to have to pre-process the file on the operating system before I load it.  We will be loading files which are used by an existing system from directory every 2 minutes.
What I really want is to always treat each line as a single column.  Is it possible to set the field delimiter to be the unix end of line character, i.e. the same character as the record delimiter?

feinholz 1234 posts Joined 05/08
09 Jun 2014

Will your data have tab characters?
You can specify the delimiter as:
 
VARCHAR TextDelimiter = 'TAB'
 
(Yes, I know; strange)
 
If not, we have an undocumented features (undocumented because it is going to change in the near future) where you can specify the delimtier characters in hex format. This uses a different attribute:
 
VARCHAR TextDelimiterHex = '0a0d'
 
But like I said, we will be changing the feature in the near future, so I do not want you to get tied into this. But you can try it to see if it works.
 
 

--SteveF

You must sign in to leave a comment.