Everyone is aware of Teradata's continued commitment to reducing the footprint and resource usage of data through its compression capabilities, which translate into permanent space savings and better performance by reducing block size, I/O, and spool usage.

While this new functionality is great, I would like to suggest that you not ignore a basic best-practice step when implementing your data warehouse model: before compressing, look at 'right sizing' your data first.

The purpose of this series was to give you some basic queries that I use to get a quick snapshot of how well tuned an EDW is from a workload-analysis and database-tuning perspective.

The four topics were (direct links are provided at the end of the article):

  • Part 1 - Excessive Use of String Manipulation Verbs
  • Part 2 - Analyze Secondary Index Usage
  • Part 3 - Statistics Analysis
  • Part 4 - Compression Analysis

Have you tried them yet?

OK, so I shouldn't even need to broach this topic; I'm sure you have all heard it before: compress, compress, and compress.

However, for some reason, some sites are still not getting it, or they have not adequately incorporated this pertinent step into their development process, leading to tables still being implemented without adequate compression. Perhaps the tables were compressed a long time ago but are not taking advantage of the advances in compression, like multi-value compression.
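As a starting point for that review, a query like the following can surface columns that carry no compression at all. This is a sketch, assuming access to the DBC.ColumnsV and DBC.TablesV dictionary views; the database name is a hypothetical placeholder.

```sql
-- List columns in base tables of a given database with no compress
-- value list defined, as candidates for a compression review.
SELECT  c.DatabaseName,
        c.TableName,
        c.ColumnName,
        c.ColumnType
FROM    DBC.ColumnsV c
JOIN    DBC.TablesV  t
  ON    t.DatabaseName = c.DatabaseName
 AND    t.TableName    = c.TableName
WHERE   t.TableKind          = 'T'            -- base tables only
  AND   c.DatabaseName       = 'MyDatabase'   -- hypothetical database name
  AND   c.CompressValueList IS NULL           -- no compression defined
ORDER BY c.TableName, c.ColumnName;
```

Columns returned here are not automatically compression candidates, but a table where nothing is compressed usually deserves a closer look.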

In Part 3 of this series, we will take a quick look at how statistics are implemented and maintained at your site.

Statistics collection can be a complicated and very deep-dive topic, with discussions on the frequency of collection, whether to use sampled stats, automation strategies, etc. This analysis is not going to go that deep; it is a high-level look at the statistics on the tables, and I am looking for just two things:

  • Are statistics applied to the tables or missing?
  • For those that are applied, is there consistency in the application and collection process?
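Both questions can be approached from the data dictionary. The following is a sketch, assuming a release where the DBC.StatsV view is available (Teradata 14.0 and later); the database name is a hypothetical placeholder.

```sql
-- Show which statistics exist on a database's tables and how stale
-- they are. Tables that return no rows here have no collected stats.
SELECT  DatabaseName,
        TableName,
        StatsName,
        ColumnName,               -- column (or column list) the stat covers
        LastCollectTimestamp,
        CURRENT_DATE - CAST(LastCollectTimestamp AS DATE) AS DaysSinceCollect
FROM    DBC.StatsV
WHERE   DatabaseName = 'MyDatabase'   -- hypothetical database name
ORDER BY TableName, StatsName;
```

Wildly different DaysSinceCollect values across tables in the same application area are a quick signal that the collection process is inconsistent.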

In Part 1 of this series, we looked at the excessive use of string manipulation verbs in a query. Here is that link in case you missed it:

http://developer.teradata.com/database/articles/workload-toolkit-part-i-excessive-use-of-string-manipulation-verbs

For Part 2 of this series, we will look at the implementation of secondary indexes and what impact (or lack thereof) they may be having.
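The first step in that analysis is simply knowing what secondary indexes exist. A sketch, assuming the DBC.IndicesV view and the Data Dictionary index-type codes (e.g. 'S' for secondary); the database name is a hypothetical placeholder.

```sql
-- Inventory the secondary indexes in a database so each one can later
-- be checked against DBQL object-usage data for actual use.
SELECT  DatabaseName,
        TableName,
        IndexNumber,
        IndexType,
        UniqueFlag,
        ColumnName
FROM    DBC.IndicesV
WHERE   DatabaseName = 'MyDatabase'   -- hypothetical database name
  AND   IndexType    = 'S'            -- secondary indexes
ORDER BY TableName, IndexNumber, ColumnPosition;
```

Each index listed carries a perm-space and maintenance cost, so any index that DBQL shows as unused is a removal candidate.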

This article delves into the potential excessive use of Multi-Column (MC) Statistics and a process for identifying those that are not providing benefit and should be removed. This will reduce complexity and save resources on the re-collection of these redundant statistics.
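To build the candidate list, the multi-column statistics first need to be isolated. This is a sketch, assuming DBC.StatsV is available and that its ColumnName column holds the full comma-separated column list for a multi-column statistic; the database name is a hypothetical placeholder.

```sql
-- List multi-column statistics so each can be reviewed for actual
-- optimizer benefit before spending resources re-collecting it.
SELECT  DatabaseName,
        TableName,
        StatsName,
        ColumnName,               -- comma-separated column list for MC stats
        LastCollectTimestamp
FROM    DBC.StatsV
WHERE   DatabaseName = 'MyDatabase'   -- hypothetical database name
  AND   ColumnName LIKE '%,%'         -- more than one column => multi-column
ORDER BY TableName, StatsName;
```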

We all know the benefits of identifying and fixing badly performing workloads. Many sites use the common method of identifying the top n highest consuming or skewed queries (often the same queries) and fixing those.
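That top-n list typically comes from DBQL. The following is a sketch, assuming query logging is enabled and DBC.QryLogV is populated; the date is a hypothetical placeholder, and the "impact CPU" expression is one common way to fold skew into the ranking.

```sql
-- Top 10 CPU consumers for one day, with a simple skew indicator:
-- ImpactCPU is the CPU the query would have used if every AMP had
-- worked as hard as the hottest AMP.
SELECT  TOP 10
        QueryID,
        UserName,
        AMPCPUTime,
        TotalIOCount,
        MaxAMPCPUTime * NumOfActiveAMPs AS ImpactCPU,
        QueryText
FROM    DBC.QryLogV
WHERE   CAST(StartTime AS DATE) = DATE '2012-01-01'   -- hypothetical date
ORDER BY AMPCPUTime DESC;
```

A large gap between ImpactCPU and AMPCPUTime flags a skewed query, which is often also one of the top consumers.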
