All Forums Database
Bill Scott 2 posts Joined 04/13
08 Apr 2013
concatenate value of multiple rows into one Single row

Hi,
I searched through the forums and i know a similiar question has been asked and answered, but it doesnt work with my requirements. i tried writing a recursive query something like below, but the recursive query for my data is taking a whole lot of time to execute and i believe is highly skewed, is there any other way that i could achieve the below result?

WITH RECURSIVE rec_test(parent,child, location,mail,LVL)
   AS
   (
    SELECT parent_id,MIN(child_id)(VARCHAR(1000)),MIN(city_nm) (VARCHAR(1000)),email_addr, 1
    FROM temp
    GROUP BY parent_id, email_addr
    UNION ALL
    SELECT  parent_id, TRIM(child_id) || ', ' || child, TRIM(city_nm)  || ', ' || location ,email_addr,LVL+1
    FROM temp INNER JOIN rec_test
    ON parent_id = parent
   AND child_id >child
 
 
   )
   SELECT parent,child, location,mail,LVL
   FROM rec_test
QUALIFY RANK() OVER(PARTITION BY parent ORDER BY LVL DESC) = 1;

My actual data has around 4000 rows with 62 unique Parent ID's and the top 10 counts of the Parent ID being
 
1000
952
485
292
140
77
76
76
76
69

 
Since the counts of the parent ID's are huge, the recursive query takes a whole lot of time to execute on the actual set of data, is there any other way that i can achieve a similiar result?
SOURCE DATA:

Parent ID   Child ID                     Location                    Parent_EMAIL_ADDR
1                 Sales                          Portland                     1ABC@XYZ.COM
1                 Finance                      San Francisco             1ABC@XYZ.COM
1                 CC                              New York                   1ABC@XYZ.COM
2                 Risk management      New Orleans              2DEF@XYZ.COM
2                 Healthcare                 Chicago                      2DEF@XYZ.COM
3                 Finance                      Salem                         3GHI@XYZ.COM
3                 CC                              Los Angeles               3GHI@XYZ.COM
4                 Sales                          Houston                     4JKL@XYZ.COM

Expected Output

Parent ID   Child ID                                       Location                                         Parent_EMAIL_ADDR
1                 Sales , Finance, CC                      Portland,San Francisco, New York   1ABC@XYZ.COM
2                 Risk management, Healthcare     New Orleans, Chicago                     2DEF@XYZ.COM
3                 Finance, CC                                  Salem, Los Angeles                         3GHI@XYZ.COM
4                 Sales                                            Houston                                          4JKL@XYZ.COM

 
Thanks,
Bill

dnoeth 4628 posts Joined 11/04
08 Apr 2013

Hi Bill,
you're creating a huge intermediate spool due to the join on parent_id = parent AND child_id >child which is a kind of cross join.
Better use the following approach:

CREATE VOLATILE TABLE vt_temp AS (
 SELECT 
   Parent_ID                     
   ,Child_ID                      
   ,city_nm                       
   ,EMAIL_ADDR                    
   ,ROW_NUMBER() OVER (PARTITION BY parent_id ORDER BY child_id) AS rn
FROM temp
) WITH DATA PRIMARY INDEX(parent_id) ON COMMIT PRESERVE ROWS;


WITH RECURSIVE rec_test(parent,child, location,mail,LVL)
   AS
   (
    SELECT parent_id,child_id (VARCHAR(1000)),city_nm (VARCHAR(1000)),email_addr, 1
    FROM vt_temp
    WHERE rn = 1
    UNION ALL
    SELECT  parent_id, TRIM(child_id) || ', ' || child, TRIM(city_nm)  || ', ' || location ,email_addr,LVL+1
    FROM vt_temp INNER JOIN rec_test
    ON parent_id = parent
   AND vt_temp.rn = rec_test.lvl+1
   )
   SELECT parent,child, location,mail,LVL
   FROM rec_test
QUALIFY RANK() OVER(PARTITION BY parent ORDER BY LVL DESC) = 1;

The QUALIFY could also be replaced by doing a COUNT in the Create Table and a LVL=COUNT.
Dieter

Dieter

Bill Scott 2 posts Joined 04/13
09 Apr 2013

Thanks Dieter, that works like a charm.
 
--Bill

mensssanvi 4 posts Joined 04/13
24 Apr 2013

 
I am trying to create a group of ranking as shown in below table. I've used the row () over partition function it is just doing a row numbering. In fact, I want the min row number in column "clust_n" to be allocated when rows in column wh_no match. (e.g. when 7923 come 3 times I would like clust_n to show 4,4,4 not 4,5,6
 
 

wh_no

cust_no

clust_n

 

7769

2701288

1

 

7771

224853

2

 

7809

701631

3

 

7923

3185005

4

4

7923

3185042

5

4

7923

968162

6

4

7999

2623738

7

 

12186

3263053

8

 

12905

5491540

9

 

13017

1079323

10

 

13017

1078590

11

10

dnoeth 4628 posts Joined 11/04
24 Apr 2013

You should post new questions as a new topic.
 
Based on your narration you might simply do a RANK instead of ROW_NUMBER?
 
Dieter

Dieter

akyz 5 posts Joined 05/14
02 Jun 2014

Hello,
I have been utilizing the code provided by Dieter (works great), but I have a slight variation on the original question.  How would Dieter's code change if we wanted to concatenate all child_ids todether into one field for a given parent_id, however, we want to keep separate records for each Location?
Using the same source data:
Parent ID   Child ID                     Location                    Parent_EMAIL_ADDR
1                 Sales                          Portland                     1ABC@XYZ.COM
1                 Finance                      San Francisco             1ABC@XYZ.COM
1                 CC                              New York                   1ABC@XYZ.COM
2                 Risk management      New Orleans              2DEF@XYZ.COM
2                 Healthcare                 Chicago                      2DEF@XYZ.COM
3                 Finance                      Salem                         3GHI@XYZ.COM
3                 CC                              Los Angeles               3GHI@XYZ.COM
4                 Sales                          Houston                     4JKL@XYZ.COM
 
This time, we would want this expected output:
Parent ID   Child ID                         Location                    Parent_EMAIL_ADDR
1                 Sales, Finance, CC         Portland                     1ABC@XYZ.COM
1                 Sales, Finance, CC         San Francisco             1ABC@XYZ.COM
1                 Sales, Finance, CC         New York                   1ABC@XYZ.COM
2                 Healthcare, Healthcare  New Orleans              2DEF@XYZ.COM
2                 Healthcare, Healthcare  Chicago                      2DEF@XYZ.COM
3                 Finance, CC                    Salem                         3GHI@XYZ.COM
3                 Finance, CC                    Los Angeles               3GHI@XYZ.COM
4                 Sales                              Houston                     4JKL@XYZ.COM
 
Thanks in advance!

-AK

akyz 5 posts Joined 05/14
03 Jun 2014

I have moved my question above to a new thread:
http://forums.teradata.com/forum/database/concatenate-values-from-one-field-in-multiple-rows-into-one-one-field-on-all-rows
Thanks!

-AK

vminc 1 post Joined 07/15
17 Jul 2015

Hi Dieter
I have similar problem..but I have multiple many to one relations
Hi I have a data set with one too many relations. I want to reduce it to 1-1 by concatenatingthe distinct values for each column
Here is an example:
I have a data set called Customer and Product Affiliation (PA). One customer can have multiple PA for different times. 
Here is the input:

╔════════╦═══════╦══════╦══════╗
║ Cust ║PA1 ║ PA2 ║ PA3 ║
╠════════╬═══════╬══════╬══════╣
║ A ║ H ║ M ║ L ║
║ A ║ H ║ L ║ M ║
║ A ║ H ║ M ║ H ║
╚════════╩═══════╩══════╩══════╝

Desired output:

╔════════╦══════╦══════════╦═══════╗
║ Cust ║ PA1 ║ PA2 ║ PA3 ║
╠════════╬══════╬══════════╬═══════╣
║ A ║ H ║ M&L ║ L&M&H ║
╚════════╩══════╩══════════╩═══════╝

i have multiple PA columns. I want to know if there is a generic sql code I can apply Thank you
ven_sam

dnoeth 4628 posts Joined 11/04
18 Jul 2015

Do you need to order the values for concatenation? 'a&b&c' vs. 'b&c&a' 
How many rows and rows per value exists?
What's the actual datatypes?
What's your Teradata release?

Dieter

anil.doobagunta 11 posts Joined 10/11
21 Mar 2016

Hi Dieter I got Similar requirement like ven_sam, in my case I need to order the values for concatenation. 
rows per value can be huge number.
datatype is CHAR(10)
TD15 release

You must sign in to leave a comment.