All Forums Database
davidhfeldman 8 posts Joined 04/10
19 Nov 2014
REGEXP_SPLIT_TO_TABLE Error - Unexpected failure while attempting to convert UTF8 to UTF16.

I'm trying to use the REGEXP_SPLIT_TO_TABLE function using SQL Assistant Java Edition on TD 14.10.3.04 and I'm getting the following error:
[Teradata Database] [TeraJDBC 15.00.00.20] [Error 9134] [SQLState HY000] Unexpected failure while attempting to convert UTF8 to UTF16.
I thought it might have something to do with some special characters (that show up as squares in the output to text) in the column I'm running the function on, but when I removed those characters, I'm still getting the error. Has anyone seen this error before and have any idea on what could be causing it?

Raja_KT 1246 posts Joined 07/09
19 Nov 2014

I do not have SQL assistant. Did you try in bteq?
If you can share the script on what you are doing like 
how your table description is....and sample values...
example like how you select...:

SELECT * from table( regexp_split_to_table(src...,pattern.....,match)

returns (res varchar(100) character set latin)) as tablelllllll

Raja K Thaw
My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1
Street Children suffer not by their fault. We can help them if we want.

davidhfeldman 8 posts Joined 04/10
19 Nov 2014

Sure, here is my script:

 

SELECT * from table(regexp_split_to_table(TEST.CHLD_ORDER_NO,TEST.CID, ',', 'c')

returns (res varchar(250) CHARACTER SET UNICODE, tokennum INTEGER, token varchar(250) CHARACTER SET UNICODE)) t1

 

It works for some sample records from my base table, but when I run it against the entire table, I get the error from my original post.

 

Here is the type of sample data from the base table that worked:

CID

CHLD_ORDER_NO

BCBJVDX10001,P1822570,BCBKGY2B0001,BCBK5FM30001,W0Z50624,W0W81370,W0Z50623,

537244

I'm thinking there are records in the base table with characters that the process doesn't like, but I'm not sure how to find them.

davidhfeldman 8 posts Joined 04/10
19 Nov 2014

Looks like the data didn't paste in properly. I'll try again...
CID value - BCBJVDX10001,P1822570,BCBKGY2B0001,BCBK5FM30001,W0Z50624,W0W81370,W0Z50623,
CHLD_ORDER_NO value - 537244

tomnolan 594 posts Joined 01/08
19 Nov 2014

Please provide the CREATE TABLE command for the TEST table.

davidhfeldman 8 posts Joined 04/10
19 Nov 2014

Sure:

CREATE VOLATILE TABLE TEST

(CIRCUITID VARCHAR(250) CHARACTER SET UNICODE

,CHLD_ORDER_NO VARCHAR(250) CHARACTER SET UNICODE

) ON COMMIT PRESERVE ROWS;

 

INSERT INTO TEST VALUES ('BCBJVDX10001,P1822570,BCBKGY2B0001,BCBK5FM30001,W0Z50624,W0W81370,W0Z50623,','537244');

 

My JDBC connection is using CHARSET UTF8.  I've tried with UTF16 and ANSI, but I get the same error.

tomnolan 594 posts Joined 01/08
19 Nov 2014

If your CIRCUITID column is supposed to contain Unicode characters, then it would be challenging to determine which rows contain the problematic characters.
 
But if your CIRCUITID column is only supposed to contain ASCII characters, then you can use the translate_chk function to find the non-ASCII characters:
 

select
   CHLD_ORDER_NO,
   translate_chk(CIRCUITID using unicode_to_latin) as e2
from TEST
where e2 > 0
order by 1

 

davidhfeldman 8 posts Joined 04/10
19 Nov 2014

That query returned no results when run on the base table, so it looks like the records are all clean.
I tried the below TEST table as well and I'm only getting results for one of the "NA" records...should I not expect to get results for all of them, since they have different CHLD_ORDER_NO values?

CREATE VOLATILE TABLE TEST

(CIRCUITID VARCHAR(250) CHARACTER SET UNICODE

,CHLD_ORDER_NO VARCHAR(250) CHARACTER SET UNICODE

) ON COMMIT PRESERVE ROWS;

 

INSERT INTO TEST VALUES ('BCBJVDX10001,P1822570,BCBKGY2B0001,BCBK5FM30001,W0Z50624,W0W81370,W0Z50623,','537244');

 

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768056');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768054');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768055');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768053');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','767660');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768058');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','767658');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768059');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','767659');

INSERT INTO TEST VALUES ('NA,NA,NA,NA,','768057');

 

SELECT * from table(regexp_split_to_table(TEST.CHLD_ORDER_NO,TEST.CIRCUITID, ',', 'c')

returns (res varchar(220) CHARACTER SET UNICODE, tokennum INTEGER, token varchar(220) CHARACTER SET UNICODE)) t1;

 

 

tomnolan 594 posts Joined 01/08
20 Nov 2014

I executed your SQL statements using Teradata Database 15.0 and I got the following result:
 

res     tokennum  token
------  --------  ------------
768056         1  NA
537244         1  BCBJVDX10001
768056         2  NA
537244         2  P1822570
768056         3  NA
537244         3  BCBKGY2B0001
768056         4  NA
537244         4  BCBK5FM30001
537244         5  W0Z50624
537244         6  W0W81370
537244         7  W0Z50623

 

Raja_KT 1246 posts Joined 07/09
20 Nov 2014

 
You can try for Non ascii chars check, just to double check:
substr(CIRCUITID , 1,1) >= '80'xc
substr(CIRCUITID , 2,1) >= '80'xc
substr(CIRCUITID , 3,1) >= '80'xc
substr(CIRCUITID , 4,1) >= '80'xc
substr(CIRCUITID , n,1) >= '80'xc
There are various conversion functions like if any for converting........ 

GRAPHIC_TO_LATIN

 

 

GRAPHIC_TO_UNICODE

 

 

....and many others
Also, I tested and found that I can enter  '0A'xc into a table and can read it too :) 

Raja K Thaw
My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1
Street Children suffer not by their fault. We can help them if we want.

davidhfeldman 8 posts Joined 04/10
21 Nov 2014

Thanks Raja and Tom.
Raja - I think we can exclude special characters as a possible reason for this issue, as I tested both using Tom's SQL and a check in Excel and I found no special characters.
Tom - you can see in the result you ran that only one of the "NA,NA,NA,NA" records is processed, the one tied to CHLD_ORDER_NO 768056, but none of the other CHLD_ORDER_NO records are processed.  Any idea why that might be?
I also did some more testing on my own and found that certain combinations of records will cause the error, but when I run each record individually, they all work:

CREATE VOLATILE TABLE TEST
(CIRCUITID VARCHAR(250) CHARACTER SET UNICODE
,CHLD_ORDER_NO INTEGER
) ON COMMIT PRESERVE ROWS;

INSERT INTO TEST VALUES ('BCBKP5D10001,BCBH4LFM0001,BCBH4LFT0001,BCBH4LGP0001,',403446);
INSERT INTO TEST VALUES ('BCBH4LDY0001,BCBH4LGF0001,BCBJTTKT0001,BCBKKMT70001,',403243);
INSERT INTO TEST VALUES ('BCBJM3HS0001,BCBH4J250001,,,',683644);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768058);
INSERT INTO TEST VALUES ('C1005492/SERVICEINSTANCEID37907965,E4001116/SERVICEINSTANCEID37907964,C1005471/SERVICEINSTANCEID37907921,E4001104/SERVICEINSTANCEID37907920,',1212450);
INSERT INTO TEST VALUES ('BCBJM3HT0001,,,,',640055);
INSERT INTO TEST VALUES ('BCBK0FLK0001,,,,',537451);
INSERT INTO TEST VALUES ('BCBJM3C60001,BCBJM3C70001,,,',659857);
INSERT INTO TEST VALUES ('BCBKP5D10001,BCBH4LFM0001,BCBH4LFT0001,BCBH4LGP0001,',403443);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768057);
INSERT INTO TEST VALUES ('K0507991,SITEID#U299909',809643);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768054);
INSERT INTO TEST VALUES ('K0507991,SITEID#U299909',809644);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768055);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',767658);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768053);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',767659);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768059);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',768056);
INSERT INTO TEST VALUES ('NA,NA,NA,NA,',767660);

SELECT * from table(regexp_split_to_table(TEST.CHLD_ORDER_NO,TEST.CIRCUITID, ',', 'c')
returns (res INTEGER, tokennum INTEGER, token varchar(220) CHARACTER SET UNICODE)) t1;

 

 

tomnolan 594 posts Joined 01/08
21 Nov 2014

I agree that the behavior is unexpected. If you're a customer, then I recommend opening Teradata Customer Service incident.

davidhfeldman 8 posts Joined 04/10
21 Nov 2014

Already underway :)  Thanks for the help.  I'll reply with Teradata's response.

You must sign in to leave a comment.