All Forums Database
td_admirer 36 posts Joined 07/12
23 Jul 2014
regexp_substr in TD14.10

Hi Gurus,
Hope all is well with you.
I am trying to solve an issue with parsing a text field based on a delimiter and with the same delimiter being part of the double-quotes. I am on DB version TD14.10 and I am trying to use a combination of regexp_substr and regexp_replace, but not have been able to parse the fields properly. Please help.
This is a sample record
a,b,c,"abc,def,dfg",m,n,"hi,how,are you",g,k
 
The output should be separate fields for separate columns.
a
b
c
abc,def,dfg
m
n
hi,how,are you
g
k
 
I tried using
SEL regexp_replace(regexp_replace(regexp_substr('a,b,c," abc,def,dfg",m,n,"hi,how,are you",g,k','[^,]+',1,1),'"',''),'"','');
 but I am spliting the data within double-qiotes as well.
 
Any and all help would be appreciated.
 
Thank you.
 
 
 
 

Raja_KT 1246 posts Joined 07/09
23 Jul 2014

Try with a combination of  strtok_split_to_table. 
SELECT * FROM TABLE (strtok_split_to_table(1,'a,b,c,"abc,def,dfg",m,n,"hi,how,are you",g,k', ',/')
RETURNS (outkey integer, tokennum integer, token varchar(15)character set unicode) ) as d
strtok is also there.

Raja K Thaw
My wiki: http://en.wikipedia.org/wiki/User:Kt_raj1
Street Children suffer not by their fault. We can help them if we want.

td_admirer 36 posts Joined 07/12
23 Jul 2014

Thanks Raja. This does not work, as we are vertical pivoting taking the commas inside the double-qutes into consideration. I need the output to be something I can insert into multiple columns. The data within the double-quotes need to go into one column.

Vinodraj 12 posts Joined 01/13
24 Jul 2014

Hi,
 
Check if this would satisfies your needs.
 

SEL oreplace(
    regexp_substr(
    'a,b,c," abc,def,dfg",m,n,"hi,how,are you",g,k',
    '((?:,|^)(?!,)(?:"(.*?(?<!\\\\))"|((?:.(?!,))*(?:.(?=,))?))|,())',1,1),
    '"',''),
    oreplace(
    regexp_substr(
    'a,b,c," abc,def,dfg",m,n,"hi,how,are you",g,k',
    '((?:,|^)(?!,)(?:"(.*?(?<!\\\\))"|((?:.(?!,))*(?:.(?=,))?))|,())',1,2),
    '"',''),
    oreplace(
    regexp_substr(
    'a,b,c," abc,def,dfg",m,n,"hi,how,are you",g,k',
    '((?:,|^)(?!,)(?:"(.*?(?<!\\\\))"|((?:.(?!,))*(?:.(?=,))?))|,())',1,3),
    '"',''),
    oreplace(
    regexp_substr(
    'a,b,c," abc,def,dfg",m,n,"hi,how,are you",g,k',
    '((?:,|^)(?!,)(?:"(.*?(?<!\\\\))"|((?:.(?!,))*(?:.(?=,))?))|,())',1,4),
    '"','');

Above SQL is splitted with four column and not as a reocrds. If this doesn't satisfies you, you could also have writing sp is one of the options.
 
 

-
Vinod

dnoeth 4628 posts Joined 11/04
24 Jul 2014

This works for your example:

select 'a,b,c," abc,def,dfg",m,n,"hi,how,are you",g,k' as x
   ,trim(both '"' from regexp_substr(x ,'("[^"]+"|[^,])',1,1,'c'))
   ,trim(both '"' from regexp_substr(x ,'("[^"]+"|[^,])',1,2,'c'))
   ,trim(both '"' from regexp_substr(x ,'("[^"]+"|[^,])',1,4,'c'))
   ,trim(both '"' from regexp_substr(x ,'("[^"]+"|[^,])',1,7,'c'))

But this looks like simple comma-delimited data, why don't you split the data in TPT before inserting them?
 

Dieter

td_admirer 36 posts Joined 07/12
24 Jul 2014

Thank you Dieter and Vinod. This is exactly what I've wanted. Much appreciated.
 
Dieter,
Can you kindly tell me how you would skip the delimiter within the double-quotes in TPT. I am not aware of any parameter in TPT. Kindly help.
 
Thank you.

dnoeth 4628 posts Joined 11/04
24 Jul 2014

Current rleases of TPT support quoted data, you need to use following attributes for the DataConnector:

SourceFormat   = 'Delimited'
TextDelimiter  = ','
QuotedData     = 'Optional'
OpenQuoteMark  = '"'
CloseQuoteMark = '"'

 

Dieter

You must sign in to leave a comment.