tdurbin 2 posts Joined 02/12
20 Feb 2012
bteq export and import for _ET dataparcel

Hi,  I'm new to Teradata and need to do the following:

I'm on a Unix system, using version 13.0.

We're inserting data into the "Stage" table from a delimited text file (the delimiter is the BEL character, \007).

1.  export the DataParcel field from the _ET table to a file (delimited text if possible)

2.  fix the contents of the file

3.  import the corrected file into the "Stage" table

This is my BTEQ export command:

.SET RECORDMODE ON
.EXPORT DATA FILE=bteq_ET_SELECT.out

LOCKING ROW FOR ACCESS
SELECT DataParcel
FROM DEV_CONTR_STG.R_SLS_ORD_TYP_ET
;

.EXPORT RESET;
.LOGOFF;
.QUIT;

Results:

SQp^F000186^Z2011-05-12-21.20.21.613001^Z2011-05-12- 21.20.21.613001^HDEVETL
SQp^F000192^Z2011-05-12-21.23.56.659506^Z2011-05-12- 21.23.56.659506^HDEVETL

I've tried different combinations of RECORDMODE and INDICDATA with .SET SEPARATOR "|", but I can never export the DataParcel as delimited text.

There are null fields in the DataParcel, but these null fields don't show up as fields at all in the output file. (For example, when I use SQL Assistant to view the DataParcel, I do see the null fields.)

I read in another post that you don't need to specify the first field above on the import, because BTEQ adds the "fast load" format to the beginning of the record and to each field in the record.

----

Then I try to import my "exported file" from above

.IMPORT DATA FILE=bteq_ET_SELECT.out
.REPEAT *

using(fld1 varchar(6),
fld2 varchar(26),
fld3 varchar(26),
fld4 varchar(20)
)
insert into dev_contr_stg.ttd_err_tst
(sls_ord_typ_cd,row_insrt_tms,row_updt_tms,row_usr_id)
values (:fld1,
:fld2,
:fld3,
:fld4);

.LOGOFF;
.QUIT;

 *** Starting Row 0 at Mon Feb 20 17:32:35 2012

 *** Failure 2673 The source parcel length does not match data that was defined.
                Statement# 1, Info =1
 *** Total elapsed time was 1 second.

I've tried different definitions; I've even tried exporting valid data from a table and importing it, but I get the same error.

Thanks for any help!

feinholz 1234 posts Joined 05/08
20 Feb 2012

We are considering writing a tool to do this, so I have been taking a look at the contents of the DataParcel field.

The DBS places the record, just as the client software sends it, into that field in the table.

That field is a VARBYTE. This means there is no translation taking place by the DBS. The data is in its raw form.

When a customer tries to load the data from a delimited data file, the client application converts the data to a series of VARCHAR fields. Each VARCHAR field has a 2-byte length indicator.

The 2-byte field length indicators will be byte-ordered according to the platform architecture from which the job originated.

Also, the entire record has a 2-byte record length. And if the data was specified in indicator mode, the data will also have indicator bytes.

BTEQ is going to extract the raw data from the field and write it out. BTEQ will be providing the record length (and parcel length) indicators and will byte-order them according to the platform architecture on which BTEQ is running.

Thus, you need to know quite a bit of information in order to process that data back into delimited records: how to process the indicator bytes, how to process the 2-byte field lengths, and then how to walk the data and place the delimiter characters.

BTEQ cannot do this. You will have to do a lot of this on your own.
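
To make the layering concrete, here is a rough, untested Python sketch of reading such an export back (assumptions, not verified: the job ran on a little-endian client, the export selected only the DataParcel column, and there is no extra end-of-record byte; check a hex dump of your own file first):

import struct

def read_dataparcels(path):
    # Read a BTEQ record-mode (.EXPORT DATA) file produced by a
    # "SELECT DataParcel FROM <error table>" query.
    # Assumed layout per row: a 2-byte record length, then the record,
    # which here is a single VARBYTE field: a 2-byte value length
    # followed by the raw parcel bytes.
    with open(path, "rb") as f:
        while True:
            hdr = f.read(2)
            if len(hdr) < 2:
                break
            rec_len = struct.unpack("<H", hdr)[0]        # "<H" = little-endian assumption
            record = f.read(rec_len)
            parcel_len = struct.unpack_from("<H", record, 0)[0]
            yield record[2:2 + parcel_len]               # the raw DataParcel bytes

Each value this yields is the raw row exactly as the client sent it; turning it back into fields is the part described above (indicator bytes, then the 2-byte field lengths).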

 

--SteveF

tdurbin 2 posts Joined 02/12
21 Feb 2012

So currently there is no tool which can extract the DataParcel without the indicator bytes?

Fast Export / Import or anything like that?

Does anyone have a process for identifying, then reprocessing the failed records before the next daily incremental load?

 

 

feinholz 1234 posts Joined 05/08
21 Feb 2012

No, there is currently no tool.

Although it is not really that difficult for you to do on your own: you either have to have a hex editor or write a simple C program to process the exported data from BTEQ.

--SteveF

FelixBarbalet 11 posts Joined 06/11
21 Feb 2012

Hi

We've just released an alpha version of a tool which can import/export data from Teradata to CSV - it might be what you're looking for. We have only tested it on our own machines, so YMMV, but you can get more info from https://github.com/xlfe/dwhwrapper

feinholz 1234 posts Joined 05/08
22 Feb 2012

I just read the notes on that tool.

Are you telling me that for any Teradata FastLoad job that resulted in rows ending up in the error table, your tool will be able to issue this command:

$ dwh get output.csv 'select DataParcel from <error table name>'

and the tool would be able to properly extract out the data from that VARBYTE field and write out the data in CSV format?

Please show me an example.

 

--SteveF

FelixBarbalet 11 posts Joined 06/11
22 Feb 2012

Ah sorry - no it won't do that out of the box.

FelixBarbalet 11 posts Joined 06/11
22 Feb 2012

It does decode/encode the FastExport binary format (currently only 'VARCHAR', 'CHAR', 'DECIMAL', 'FLOAT', 'INTEGER', 'SMALLINT', 'DATE', and 'BYTEINT'), including nulls/INDICDATA, but you would need to supply the field types (and do some Python hacking), because it currently assumes that the fields are defined by the SELECT statement (which is not the case in your example).

 

feinholz 1234 posts Joined 05/08
22 Feb 2012

The topic of the thread was about processing data in a FastLoad error table.

As much as I appreciate your desire to promote your conversion tool, it would not work for the problem described in this thread.

It has quite a few limitations.

 

--SteveF

FelixBarbalet 11 posts Joined 06/11
21 Mar 2012

Thanks for the feedback.

ProudCat 13 posts Joined 09/11
28 Mar 2012

So just to confirm: there is no way to decode the DataParcel field in the _ET table, right?

If so, this is very inconvenient, because I have no idea how to find out which 50K records failed out of my million-record load, or why...

For example, I see the error has something to do with a timestamp field, but I cannot understand why, as there is no way to look at the actual erroneous input data.

Seems like this enhancement should get hundreds of votes...

 

feinholz 1234 posts Joined 05/08
28 Mar 2012

Well, there is a way to decode the data (we are working on a tool for that right now). The tool, however, will work with jobs run by TPT.

However, you do need to know how the data was originally sent to the DBS.

If the data was provided to the loading tool in "VARTEXT" (delimited) format, the decoding of the DataParcel data is quite easy.

If the data was sent as binary data, then you would need to convert the data.

Not impossible. But you will need to know the layout of the data as it was sent to Teradata so that you can parse through to the correct column.

The error code tells you the DBS error. The ErrorFieldName tells you which field it was.

--SteveF

ProudCat 13 posts Joined 09/11
30 Mar 2012

Well, the file is a tab-delimited text file, but when I look at the DataParcel column it looks binary to me.

I know which field has the problem, but the data looks binary, so I cannot figure out where my column is.

We use tdload to load the data. So how do I decode it?

 

feinholz 1234 posts Joined 05/08
30 Mar 2012

If the file was TAB delimited, then the data was sent in character format (just a series of VARCHARs).

So, I believe DataParcel will first contain a 2-byte record length, followed by the indicator bytes, followed by a series of VARCHARs. Each VARCHAR is made up of a 2-byte length, followed by the character data.

There will be 1 indicator byte for every 8 columns of data sent to the database.

From this information you should be able to extract out the data.

If you scan past the record length and the indicator bytes, and interpret each 2-byte VARCHAR length, I think you will see the characters pretty easily.
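
For illustration, a rough Python sketch of a decoder for that layout (the byte order, the indicator bit order, and the character set are assumptions to confirm against your own data; ncols is the number of columns the load job actually sent):

import struct

def decode_dataparcel(parcel, ncols, encoding="latin-1", delim="\t"):
    # parcel: the raw bytes of one DataParcel value
    # ncols:  the number of columns the load job sent to the database
    pos = 2                                  # skip the 2-byte record length
    n_ind = (ncols + 7) // 8                 # one indicator byte per 8 columns
    indicators = parcel[pos:pos + n_ind]
    pos += n_ind
    fields = []
    for i in range(ncols):
        # bit set = column is NULL (most-significant-bit-first assumed)
        is_null = (indicators[i // 8] >> (7 - (i % 8))) & 1
        # each column is a VARCHAR: a 2-byte length, then the character data
        flen = struct.unpack_from("<H", parcel, pos)[0]   # little-endian assumed
        pos += 2
        value = parcel[pos:pos + flen].decode(encoding, errors="replace")
        pos += flen
        fields.append("" if is_null else value)
    return delim.join(fields)

Note that this collapses NULL columns to empty strings, so the NULL/empty distinction is lost in the delimited output.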

--SteveF

suhailmemon84 47 posts Joined 09/10
04 May 2013

Hi Feinholz,
You mentioned earlier that there was some consideration towards building a tool that would potentially be able to decode the DataParcel column in the _ET table in a convenient manner.
Is that tool available?
Currently, we import a lot of web data (30 million records a day) into one of our tables using the TPT LOAD operator. Since it is web data, out-of-format records are always a possibility, and as a result roughly 20-30K records make their way to the error table every day due to bad formatting. Though the ErrorCode and ErrorFieldName tell us what and where the error is, it would be great if we could see the actual record too, rather than a binary blob.
Due to this limitation, we're actually considering loading all 30 million records into a VARCHAR-only NOPI stage table and then inserting them into the target table with the proper datatypes using MERGE. The downside of this approach is the additional casting we need to do for all columns before inserting the records into the target table. Casting can become really expensive from a CPU standpoint, especially over a lot of records and a lot of columns, and our target table is both very long and very, very wide.
 

suhailmemon84 47 posts Joined 09/10
04 May 2013

One mistake in my previous post: we're currently using FastLoad, not the TPT LOAD operator.
But we're also open to converting this job's ETL to TPT if it makes handling the error records much easier (easier meaning that our PS team could simply view the error records, rather than pulling their hair out decoding the DataParcel column, which they currently do every single day).
-Suhail

feinholz 1234 posts Joined 05/08
06 May 2013

The tool is still in development. There is no release date yet.
 

--SteveF

suhailmemon84 47 posts Joined 09/10
06 May 2013

Thank you feinholz for your response.
For the problem I'm having, do you have any solution other than the one below:
"loading all 30 million records into a varchar only NOPI stage table, then insert the same into the target table with the proper datatypes using MERGE. The downside with this approach is the additional casting we need to do for all columns before inserting the records into the target table"
Regards,
Suhail

feinholz 1234 posts Joined 05/08
06 May 2013

Even though the data is in a binary format, the only reason it is stored in a VARBYTE is so that the endianness and character set encoding are preserved. The data itself is in the exact format that we send to the DBS. Thus, if you provided the data as delimited data, then the information in the VARBYTE field is a series of VARCHAR fields. It is not that difficult for you to pull out the data and extract it yourself.
 

--SteveF

suhailmemon84 47 posts Joined 09/10
06 May 2013

In order to pull data from a VARBYTE column (containing delimited data with any delimiter, say pipe) and insert it into a VARCHAR column, I need to be able to first "split" the data on the delimiter (in this case, pipe) and then somehow cast it to a VARCHAR.
Two problems here:
1. How do I split a VARBYTE column within Teradata?
2. How do I convert a VARBYTE to a VARCHAR? Type conversions on byte columns aren't allowed.
I think, at this point, exporting the DataParcel column to a file using BTEQ and importing it with a USING modifier seems to be the only solution.
Regards,
Suhail

feinholz 1234 posts Joined 05/08
06 May 2013

The data in the VARBYTE does not have the delimiters. We do not send "delimited" data to Teradata; we process the delimited data and send the row to Teradata as a series of VARCHAR fields.
If you extract the data yourself, you will have access to the VARCHAR data. Our extractor tool will be reconstructing the delimited data and converting VARCHARs back into delimited records.
So, you may want to try to extract it yourself and just process the VARCHAR columns.

--SteveF

suhailmemon84 47 posts Joined 09/10
07 May 2013

ok Thank you Feinholz.
-Suhail

Eitu_teradata 8 posts Joined 09/13
07 Jan 2014

Hi All ,
I have a similar situation where I want to export the data from the error table as a delimited file.
Please suggest if there is anything handy for this.
 

AK251783 1 post Joined 07/14
21 Jul 2014

Hi,
 
I am new to Teradata. Importing the output of a SQL query into a new table in Teradata BTEQ through 'cmd' involves DATA mode, VARTEXT mode, and REPORT mode. Could you please help me by explaining REPORT mode among these three? I am familiar with VARTEXT mode and DATA mode.

Rammurthy17 1 post Joined 02/15
05 Nov 2015

Hi SteveF,
You replied on March 28 saying "Well, there is a way to decode the data (we are working on a tool for that right now). The tool, however, will work with jobs run by TPT."
Is this tool available now? If so, please provide the details on how to use the same.
Thanks !!

feinholz 1234 posts Joined 05/08
05 Nov 2015

The tool is called "twbertbl" (the "twb" prefix was to be consistent with "twbstat", "twbkill", "twbrmcp", etc.).
It is documented in the TPT Reference Manual.
 

--SteveF

sham5 1 post Joined 06/16
29 Jun 2016

Hi Feinholz,
I tried using the TWBERTBL utility to extract the DataParcel column from the error table and noticed special characters in the output file. Below is the command that was used. The source is a pipe-delimited file, and the target table is loaded using TPT FastLoad. Please provide your input on this.
twbertbl -u irmsrb1 -t I1991_MLD_ERR_28111532 -n 53 -a <accountid> -b <Databasename> -c ASCII -d <|> -e LE -h terat -m <max no. of rows to process> -o C:\TPT\errorlog_0628.txt -r FORMATTED -x C:\TPT\TRACEFILE_0628.txt
 
 
