All Forums Database
farhan_aries 9 posts Joined 10/10
25 Sep 2012
Help in Recursive query required

Hi,
I would be needing your help here. I have data coming into a column as A|B|C|D| which I would be needing to split into columns based on pipe delimiter. Now the number of Pipes is unknown so if D is not present, I have to populate the value of C and if C is unknown then I would have to use B and so on. Unfortunately, the level is not unknown so if the string doesnt have all the values of A, B, C and D present, one column could contain the value of A,B,C,D as a merge. Could you please help me with this? Thanks in advance

 

WAQ 158 posts Joined 02/10
25 Sep 2012

Hope this will help you.
http://forums.teradata.com/forum/database/deconcatenation-of-column-into-multiple-columns-0

farhan_aries 9 posts Joined 10/10
25 Sep 2012

Thanks WAQ for the prompt response. I wrote the same solution for this as well but doesnt solve my problem unfortunately :). My String is 5 PIPES long and I have to manipulate the data as if in ROW 1 I have All A, B, C, D, and E present I have to take each one and define levels over it. If in second Row I have A, B, C and D present, I have to take D and populate the resulting column with D and E from the right and so on. Please let me know if this is unclear. I try to illustrate it now.
A|B|C|D|E|
A|B|C|D|
A|B|C|
But one thing to remember is that we have to start parsing it from the right and the right most Pipe should not be considered :).
 
Thanks,

 

ulrich 816 posts Joined 09/09
25 Sep 2012

I still don't get what you wane achive
But also check http://developer.teradata.com/blog/madmac/2010/03/a-few-basic-scalar-string-udfs
These are quite usefull - for you the list functions might be of interesst...
 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

farhan_aries 9 posts Joined 10/10
25 Sep 2012

Hi Ulrich,
I can understand :) and apologies on being unclear. Let me try to elaborate it. I have a String Dataset coming in a column, PIPE delimited. There are multiple rows forming the data into the set of columns. Now this Data should be parsed into columns based on the pipe delimiter. If I take:
A|B|C|D|E|
A|B|C|D|
then COL1 would be A, COL2 would be B, COL3 would be C and so on. The Last Delimiter should not be considered as it does not contain any value.
In case 2, We only have to Parse until D
I hope this helps understanding my problem
Thanks,
 

WAQ 158 posts Joined 02/10
25 Sep 2012

So your ouput for this string would be something like this?
1   A|B|C|D|E|
2   A|B|C|D|
3   A|B|C|
4   A|B|
5   A|

ulrich 816 posts Joined 09/09
25 Sep 2012

Consider the udfs you could directly use them to extract your infos.
Otherwise - I guess A,B,C etc are examples. So these strings can be of variable lenght?
AAA|BB|CCCC|
?
How long is your source column? Varchar(X)?
 
 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

farhan_aries 9 posts Joined 10/10
25 Sep 2012

Hi WAQ,
Yes, youre right. My output should look like the one you mentioned
 
Hi Ulrich,
Strings are of variable Length. My source column is varchar 255.
 
Can we do it with Recursive?
 
Thanks
Farhan

Qaisar Kiani 337 posts Joined 11/05
25 Sep 2012

Here you go... LEFT_WORD in resultset is youre desired output.

DROP TABLE T1;
 
CREATE TABLE T1
(
  row_id INT NOT NULL,
  text VARCHAR(991) NOT NULL
);
 
INSERT INTO T1 VALUES (1,'A|B|C|D|E');

WITH RECURSIVE recursive_T1
 (row_id,
  text,
  len,
  left_word,
  right_word,
  pos
 ) AS (
  SELECT
    row_id,
    text,
    POSITION('|' IN text || '|') AS len,
    SUBSTRING(text FROM 1 FOR len) AS left_word,
    SUBSTRING(text || '|' FROM len+1) AS right_word,
    1 as pos
  FROM T1
  
  UNION ALL
  
  SELECT
    row_id,
    text,
    POSITION('|' IN right_word) AS len_new,
    left_word || SUBSTRING(right_word FROM 1 FOR len_new) as left_word,
    SUBSTRING(right_word FROM len_new + 1) as right_word,
    pos + 1 as pos
  FROM recursive_T1
  WHERE right_word <> ''
 )
SELECT *
FROM recursive_T1

Qaisar Kiani 337 posts Joined 11/05
25 Sep 2012

You need to test it comprehensively as I can only help you with limited test cases :)

farhan_aries 9 posts Joined 10/10
25 Sep 2012

Hi QAKiani,
Thanks very much, quite near to what I am looking for :). Can we Split the String in columns and start parsing from the Right instead of the left? For instance, If I get A|B|C|D|E| in the column, I split the data in multiple columns like:
A|B|C|D|E|
A|B|C|D|
A|B|C|
COL1 COL2 COL3 COL4 COL5
A          B      C       D       E
A          B      C       D
A          B      C
and so on.
One thing what is in our favor is this string could only get upto E so we can hardcode the columns to five and populate empty column when the value is not present, except the last pipe from the right. There are multiple rows coming into the data like what I have shared above.
Perhaps I am creating confusion instead of clarification, my bad. I am at a Duel with the person who wrote the rule :)
Thanks very much guys the help in advance

Qaisar Kiani 337 posts Joined 11/05
26 Sep 2012

Hi Farhan,
Yes you are creating the confusion just like good old day :)
The solution posted earlier solves the problem that was mentioned initially. Re parsing from the right, that is not possible in simple SQL because the SQL function POSITION finds the position of first instance of the substring in the actual string. Parsing from right could be done through a recursive sql itself, but I don't think you will need that.
Now coming to the new requirement, if you are putting all the substring in the columns, does the column sequence matter or it could be in any order?
--Qaisar

Qaisar Kiani 337 posts Joined 11/05
26 Sep 2012

Farhan,
The below SQL will return the 5 columns in one row. Each column has the content between two pipes. Is this what you need?
Have fun...

WITH RECURSIVE recursive_T1
 (row_id,
  text,
  len,
  left_word,
  right_word,
  pos
 ) AS (
  SELECT
    row_id,
    text,
    POSITION('|' IN text || '|') AS len,
    SUBSTRING(text FROM 1 FOR len-1) AS left_word,
    SUBSTRING(text || '|' FROM len+1) AS right_word,
    1 as pos
  FROM T1
  
  UNION ALL
  
  SELECT
    row_id,
    text,
    POSITION('|' IN right_word) AS len_new,
    SUBSTRING(right_word FROM 1 FOR len_new-1) as left_word,
    SUBSTRING(right_word FROM len_new + 1) as right_word,
    pos + 1 as pos
  FROM recursive_T1
  WHERE right_word <> ''
 )
SELECT text, max(case when pos=1 then left_word else null end )as col1
, max(case when pos=2 then left_word else null end ) as col2 
, max(case when pos=3 then left_word else null end ) as col3
, max(case when pos=4 then left_word else null end ) as col4 
, max(case when pos=5 then left_word else null end ) as col5 
from recursive_t1
group by 1

-- Qaisar

ulrich 816 posts Joined 09/09
26 Sep 2012

alternative  code avoiding recurisve query

CREATE volatile TABLE T1
(
  row_id INT NOT NULL,
  text VARCHAR(991) NOT NULL
) unique primary index (row_id)
on commit preserve rows;

INSERT INTO T1 VALUES (1,'A|B|C|D|E|');
INSERT INTO T1 VALUES (2,'AA|BB|CC|DD|');
INSERT INTO T1 VALUES (3,'AAA|BBB|CCC|');
INSERT INTO T1 VALUES (4,'AAaa|BBbb|');
INSERT INTO T1 VALUES (5,'AAaaa|');


select row_id,
       col1,
       col2,
       col3,
       col4,
       case when index(text,'|')>0 then substr(text,1,index(text,'|')-1) else null end as col5
from 
	(
	select row_id,
	       col1,
	       col2,
	       col3,
	       case when index(text,'|')>0 then substr(text,1,index(text,'|')-1) else null end as col4,
	       case when index(text,'|')>0 then substr(text,index(text,'|')+1) else null end as text
	       
	from 
		(
		
		select row_id,
		       col1,
		       col2,
		       case when index(text,'|')>0 then substr(text,1,index(text,'|')-1) else null end as col3,
		       case when index(text,'|')>0 then substr(text,index(text,'|')+1) else null end as text
		       
		from 
			(
			select row_id,
			       col1,
			       case when index(text,'|')>0 then substr(text,1,index(text,'|')-1) else null end as col2,
			       case when index(text,'|')>0 then substr(text,index(text,'|')+1) else null end as text
			from 
				(
				select row_id, 
				       case when index(text,'|')>0 then substr(text,1,index(text,'|')-1) else null end as col1,
				       case when index(text,'|')>0 then substr(text,index(text,'|')+1) else null end as text
				from t1
				) as tmp1 
			) as tmp2
		) as tmp3
	) as tmp4 
order by 1

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

farhan_aries 9 posts Joined 10/10
26 Sep 2012

Hello Qaiser,
Hehe :). Just reconfirmed that sequence of columns is required. 'A' becomes COL A value, 'B' as COL B value, 'C' as COL 'C' and so on and we have to start parsing from the right and we shall pick the column of interest upon need as this Hierarchy data. I understand that this is only possible thru Recursive query but since I have no experience with it, I need help. Hope this time I am clear :)
Thanks in advance

Qaisar Kiani 337 posts Joined 11/05
26 Sep 2012

Farhan,
Does the SQL I posted few minutes ago meets your requirements? If not then whats missing?
-- Qaisar

farhan_aries 9 posts Joined 10/10
26 Sep 2012

Qaiser,
It worked. Thanks very much for the help :).
 
Thanks very much to all who helped :).
 
Farhan

Qaisar Kiani 337 posts Joined 11/05
26 Sep 2012

Goood... Enjoy!! :P
-- Qaisar

You must sign in to leave a comment.