All Forums Database
BLUESTAR 3 posts Joined 06/12
03 Mar 2015
Teradata String Manipulation, Extract String from a text field.

Hi,
 
I have a field P_ID which contains data like
'ICHIP00~WICHNA01~KD06'
'CHIP00~WIA01~KD70'
'WICH0~WIA01~KD0~I89'
'WIP00~ICH~KD05'
 
I need an output as such
 
Take the 3rd component of the  p_id, which is always after the 2nd ~.  There should not be a 3rd ~, but if there is a 3rd ~, then take what's between the 2nd and 3rd ~.
Example,  take the K01 of the following P_ID value: NBSIC0~NVCA~K01.
 
Any help will be appriciated.
 
Thanks

BLUESTAR 3 posts Joined 06/12
03 Mar 2015

SO far I have tried these and no luck
 
select P_ID,SUBSTR(P_ID,1,index(P_ID,'~')) from EDW This is giving me 1st component NBSIC0~
select P_ID,SUBSTR(P_ID,LENGTH(P_ID)-1,index(P_ID,'~'))

VandeBergB 182 posts Joined 09/06
03 Mar 2015

Blue,
What you need to do is consider your initial substring component as a unit of string and repeat what you've done to the raw column again.  Yeah, i know clear as mud...
Consider if you will a table consisting of two columns, string1 and string2, the ddl looks like this:

CREATE MULTISET TABLE EDW.stringtest ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      string1 CHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
      string2 VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX strintest_idx_PR ( string1 );

Pretty simple so far.. into this table we'll run a couple of inserts of data similar to what you've got.  A select * from our table looks like:

string1          string2
---------------- ---------------------
NC325~XAbt~C1    DEA~AKDOIWESDK~C2
NC325~XAbt~CDWX1 DEA~AKDOIWESDK~Clkjf2

Two columns of test data...
The following constructs

SELECT
	string2
	,SUBSTRING(string2 FROM (INDEX(string2,'~')+1) FOR 20)
	,SUBSTRING((SUBSTRING(string2 FROM (INDEX(string2,'~')+1) FOR CHAR_LENGTH(string2))) FROM (INDEX(string2,'~')) FOR CHAR_LENGTH(string2)) AS subbed2
 	,SUBSTRING(subbed2 FROM (INDEX(subbed2,'~')+1) FOR CHAR_LENGTH(string2))
FROM 
	edw.stringtest;

are pretty simple, once you understand what's going on.  The first column in the select statement is the raw string, the second column is a substring of the raw string, starting from the position of the tilde + 1 for the length of the string.  The third column applies the same concept but treats the second column in the select as one unit and takes a substring of that unit from the position of the tilde+1 in the second string.  Teradata lets you use the alias from a previously defined column in a succeeding column so the first 'iteration' of the code in the second column is processed as a whole in the same logic to find the string after the second tilde in the raw string.
Cheers 
 

Some drink from the fountain of knowledge, others just gargle.

MaximeV 19 posts Joined 11/13
04 Mar 2015

SEL REGEXP_SUBSTR('NBSIC0~NVCA~K01','([A-Za-z0-9])+',1,3,'i');
 
if components only contain Alphanumeric characters. take the 3rd component of this kind.

dnoeth 4628 posts Joined 11/04
04 Mar 2015

If you're on TD14 the easiest way will be STRTOK:

strtok(P_ID,'~',3)

 

Dieter

You must sign in to leave a comment.