All Forums Database
suhailmemon84 47 posts Joined 09/10
20 May 2016
UDFCONCAT

I'm looking into usage of function: tdstats.udfconcat. Is there any way to control the order of the values that this function concatenates?
If I have a table: TABLE1 that looks like this:
Column_A               Column_B
55555                     123
55555                     124
55555                     125
55555                     126
55555                     127
77777                     130
77777                     131
77777                     132

and I would like to convert the values in the second column into a comma separated "ordered" list that should look like this:

Column_A                             Column_X
55555                               123,124,125,126,127
77777                               130,131,132
On using udfconcat I get this instead(values in column_x are not in order):
Column_A                             Column_X
55555                               124,123,126,125,127
77777                               132,130,131
How do I make udfconcat generate an ordered list?
if it is not possible to make udfconcat generate the concatenated values in a particular order, what is the most effecient way to rearrange the values in column_x after I apply udfconcat? (can't do recursion because the input table has about 500 million records)
I tried the aternative: xmlagg function (that did the same thing and generated the values in order too) but it is much much slower than udfconcat, consumes far more cpu/io/spool resources when compared to udfconcat (probably because udfconcat is an aggregate udf?)
-Suhail
 
 

dnoeth 4628 posts Joined 11/04
20 May 2016

Hi Suhail,
There's no way to rearrange the data returned by udfconcat efficiently, you might write your own UDF (or do an Enhancement Request to add a LISTAGG/GROUP_CONCAT function).
tdstats.udfconcat is undocumented (used by AutoStats to get the column/fieldid list) and the order is probably based on the internal storage.
 

You might try to apply an order using ROW_NUMBER before aggregation, seems to work for my 4-AMP VM:

SELECT TABLENAME, tdstats.udfconcat(ColumnName)
FROM 
 (
   SELECT TABLENAME, ColumnName,
      ROW_NUMBER() OVER (PARTITION BY TABLENAME ORDER BY ColumnName) AS rn
   FROM dbc.ColumnsV
   WHERE DatabaseName = 'sys_calendar'
) AS dt
GROUP BY 1;

 
Of course there's no guarantee for correct sorting, it might work today and the next patch/version breaks it.
Similar when you create a NOPI table, the final spool is simply materialized as-is: 

CREATE VOLATILE TABLE vt AS
 (
   SELECT TABLENAME, ColumnName,
      ROW_NUMBER() OVER (PARTITION BY TABLENAME ORDER BY ColumnName) AS rn
   FROM dbc.ColumnsV
   WHERE DatabaseName = 'sys_calendar'
 ) WITH DATA
NO PRIMARY INDEX
ON COMMIT PRESERVE ROWS;

SELECT TABLENAME, tdstats.udfconcat(ColumnName)
FROM VT
GROUP BY 1;

 

 

Dieter

suhailmemon84 47 posts Joined 09/10
20 May 2016

Brilliant!! Thank you sir! 
One additional question regarding this comment:
"Of course there's no guarantee for correct sorting, it might work today and the next patch/version breaks it."
If I do not do row_number and generate the concatenated output in any random order, what is an effecient way to rearrange the comma separated values in the string alphabetically?
-Suhail

suhailmemon84 47 posts Joined 09/10
20 May 2016

The reason I ask for an effecient solution is because the input table has about 500 million records.

dnoeth 4628 posts Joined 11/04
20 May 2016

I just can repeat: There's no way to rearrange the data returned by udfconcat efficiently.
Your options are:

  • If the number of rows per value is known and limited, old style MAX(CASE) is still very efficient
  • XMLAGG, way less efficient than udfconcat
  • Implement your own concat-UDF
  • If this was an export you could write a simple OUTMOD to concat the sorted data

Dieter

suhailmemon84 47 posts Joined 09/10
20 May 2016

Thank you Dieter.
-Suhail

You must sign in to leave a comment.