All Forums Database
r.sagner 6 posts Joined 08/05
05 Dec 2013
MVC generated out of V14.00 statistics

The "SHOW STATISTICS VALUES COLUMN col ON db.tab; " shows the Biased values of the statistics. In my opinion this is exactly a very good list for multi value compression.
The following scripts should give a starting point for very fast multi value compression out of actual statistics. It is not tested for every feature the teradata database provides.
In worst case you loose statistics for one table, but this statistics is saved in the dump files and can submitted again.
The success of compression is connected to the amount of actual field statistics.
To limit the number of values a percentage of 1% is used to get the most used values. The number of null values for compression is also checked.

The advantages are:

  • No costs for getting the values for compression
  • Good compression results with easy algorithmus

This easy solution has some disadvantages:

  • Compress is only performed on columns with statistics
  • Statistics have to be actual
  • Procedure doesn't take care of previous values list

First generate as list of commands to get statistics for regenerating statistics:

SELECT
         'SHOW STATISTICS VALUES COLUMN '||(trim (both from a.columnname))||' on '||(trim(both from a.databasename))||'.'||(trim(both from a.tablename))||';' as stmt
FROM
        dbc.ColumnStatsV a
INNER JOIN
        dbc.columns b
ON
a.databasename=b.databasename
        AND
a.tablename=b.tablename
        AND
a.columnname=b.columnname
LEFT OUTER JOIN
        dbc.PartitioningConstraintsV c
ON
a.databasename=c.databasename
        AND
a.tablename=c.tablename
        AND
c.constrainttext LIKE '%'||a.columnname||'%'

WHERE
        c.constrainttext is null
        AND
a.indexnumber is null
        AND
a.databasename='dbtest'
        AND
b.columntype<>'TS'
        AND
(a.databasename,a.tablename,a.columnname) not in (select databasename,tablename,columnname from dbc.indices)
order by a.databasename,a.tablename,a.columnname;
 
Which produces:
 
SHOW STATISTICS VALUES COLUMN col1 on dbtest.tab1;
SHOW STATISTICS VALUES COLUMN col2 on dbtest.tab1;
SHOW STATISTICS VALUES COLUMN col3 on dbtest.tab1;
SHOW STATISTICS VALUES COLUMN col1 on dbtest.tab2;
SHOW STATISTICS VALUES COLUMN col2 on dbtest.tab2;

 
Putting the bteq output of these command in the following gawk script
BEGIN   { CUTPERCENTAGE=1;
          print ".errorlevel (3582) severity 0";
        }
/            COLUMN \(/ { COL=$3; }
/                ON / { DBTAB=$2; }
/^ \/\*\* / { BIASEDON=0; }
/NumOfRows/ { CUTROWS=$4*CUTPERCENTAGE/100;
              BIASED=="";
              if (0+CUTROWS<0+NULLROWS) BIASED="NULL,";
            }
/\/\* NumOfNulls/ { NULLROWS=$4; }
/^ \/\*\* Biased:/ { BIASEDON=1;}
/^ \/\* / { if (BIASEDON==1)
                {
                if (CUTROWS < 0+gensub(".*,","","",gensub(",? ?$","","g")))
                        BIASED=BIASED gensub("^ */[^/]*/","","g",gensub(",[0-9 ]*,? ?$","","g")) ",";
                }
        }
/^COLLECT STATISTICS/   { COLSTATON=1; }
        {       if (COLSTATON==1) COLSTAT=COLSTAT "\n" $0; }

/^);/   { BIASEDON=0;
        COLSTATON=0;
        if (BIASED=="")
                {
                COLSTAT="";
                next;
                }
        if (DBTAB!=DBTABOLD)
                {
                if (DBTABOLD!="")
                        {
                        print DROPSTATS;
                        print ALTERTABLE ";";
                        print COLSTATALL;
                        COLSTATALL="";
                        }
                ALTERTABLE="ALTER TABLE " DBTAB " ADD " COL " COMPRESS (" gensub(",$","","",BIASED) ")";                                                                                                                                     
                DROPSTATS="DROP STATISTICS COLUMN " COL " ON " DBTAB ";";
                DBTABOLD=DBTAB;
                }
        else
                {
                ALTERTABLE=ALTERTABLE "\n" "        ,ADD " COL " COMPRESS (" gensub(",$","","",BIASED) "                                                                                                                                     )";
                DROPSTATS=DROPSTATS "\n" "DROP STATISTICS COLUMN " COL " ON " DBTAB ";";
                }
        COLSTATALL=COLSTATALL "\n" COLSTAT;
        COLSTAT="";
        BIASED="";
        }

END     {
        print ";";
        }
 
Produces:
 
DROP STATISTICS column col1 on dbtest.tab1;
DROP STATISTICS column col2 on dbtest.tab1;
DROP STATISTICS column col3 on dbtest.tab1;
ALTER TABLE dbtest.tab1 add col1 compress ( ...)
    , add col2 compress ( ...)
    , add col3 compress ( ...)
;
COLLECT STATISTICS COLUMN ( col1 ) ON dbtest.tab1 VALUES (...);
COLLECT STATISTICS COLUMN ( col2 ) ON dbtest.tab1 VALUES (...);
COLLECT STATISTICS COLUMN ( col3 ) ON dbtest.tab1 VALUES (...);

DROP STATISTICS column col1 on dbtest.tab2;
DROP STATISTICS column col2 on dbtest.tab2;
ALTER TABLE dbtest.tab2 add col1 compress ( ...)
    , add col2 compress ( ...)
;
COLLECT STATISTICS COLUMN ( col1 ) ON dbtest.tab2 VALUES (...);
COLLECT STATISTICS COLUMN ( col2 ) ON dbtest.tab2 VALUES (...);
 
Executing this output compresses the fields. Done :-)

Comments, remarks and improvements are very welcomed.

Best Regards,

Roland Sagner
 

M.Saeed Khurram 544 posts Joined 09/12
05 Dec 2013

Hi Roland,
Thank you for sharing your work, It would be great if you share your thaughts in a blog in Developer Exchange.
Good Luck!
 

Khurram

r.sagner 6 posts Joined 08/05
05 Dec 2013

Unfortunately version lacks of two facts:
* Biased values are truncated to 26 Characters in the "SHOW STATISTICS VALUES COLUMNSHOW STATISTICS VALUES COLUMN"
* Script fails when column is part of multi column statistics.
Therefore new awk script
BR
Roland
BEGIN   { CUTPERCENTAGE=1;
          print ".errorlevel (3582) severity 0";
          print ".errorlevel (6956) severity 0";
        }
/            COLUMN \(/ { COL=$3; }
/                ON / { DBTAB=$2; }
/^ \/\*\* / { BIASEDON=0; }
/NumOfRows/ { CUTROWS=$4*CUTPERCENTAGE/100;
              BIASED=="";
              if (0+CUTROWS<0+NULLROWS) BIASED="NULL,";
            }
/\/\* NumOfNulls/ { NULLROWS=$4; }
/^ \/\*\* Biased:/ { BIASEDON=1;}
/^ \/\* / { if (BIASEDON==1)
                {
                if (CUTROWS < 0+gensub(".*,","","",gensub(",? ?$","","g")))
                        if (length (gensub("^ */[^/]*/","","g",gensub(",[0-9 ]*,? ?$","","g")))<28)
                                BIASED=BIASED gensub("^ */[^/]*/","","g",gensub(",[0-9 ]*,? ?$","","g"))                                                                                                                                      ",";
                }
        }
/^COLLECT STATISTICS/   { COLSTATON=1; }
        {       if (COLSTATON==1) COLSTAT=COLSTAT "\n" $0; }

/^);/   { BIASEDON=0;
        COLSTATON=0;
        if (BIASED=="")
                {
                COLSTAT="";
                next;
                }
        if (DBTAB!=DBTABOLD)
                {
                if (DBTABOLD!="")
                        {
                        print DROPSTATS;
                        print ALTERTABLE ";";
                        print COLSTATALL;
                        COLSTATALL="";
                        }
                ALTERTABLE="ALTER TABLE " DBTAB " ADD " COL " COMPRESS (" gensub(",$","","",BIASED) ")";                                                                                                                                    
                DROPSTATS="DROP STATISTICS COLUMN " COL " ON " DBTAB ";";
                DBTABOLD=DBTAB;
                }
        else
                {
                ALTERTABLE=ALTERTABLE "\n" "        ,ADD " COL " COMPRESS (" gensub(",$","","",BIASED) "                                                                                                                                     )";
                DROPSTATS=DROPSTATS "\n" "DROP STATISTICS COLUMN " COL " ON " DBTAB ";";
                }
        COLSTATALL=COLSTATALL "\n" COLSTAT;
        COLSTAT="";
        BIASED="";
        }

END     {
        print ";";
        }
 
 

 

You must sign in to leave a comment.