Extensibility covers the mechanisms by which you, as the user or developer, can extend the functionality of the Teradata Database, for example with the use of User Defined Functions, or UDFs.

Block Level Compression evaluation with the BLC utility

One of the new compression features in Teradata 13.10 is Block Level Compression (BLC), which compresses whole data blocks at the file system level before they are written to storage. Like any compression feature, BLC helps save space and reduce I/O.

There is a CPU cost to compress data when it is inserted, and a CPU cost to decompress whole data blocks whenever compressed data blocks are accessed. Even when only one column of a single row is needed, the whole data block must be decompressed. For updates, the compressed data blocks have to be decompressed first and then recompressed. Evaluate BLC carefully before applying it to your production systems.
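The access cost described above can be illustrated with a minimal sketch. It uses Python's `zlib` purely as a stand-in (it is not the algorithm BLC uses), and the row layout and block contents are hypothetical. The point is only that a block compressed as a unit must be fully decompressed before any single row can be read.

```python
import zlib

# Hypothetical data block: 200 fixed-length rows packed into one buffer.
rows = [f"{i:08d},customer_{i % 10},ACTIVE\n".encode() for i in range(200)]
block = b"".join(rows)

# The whole block is compressed as a single unit before it is "written".
compressed = zlib.compress(block)
ratio = len(compressed) / len(block)

# Reading one row still requires decompressing the entire block first.
restored = zlib.decompress(compressed)
row_len = len(rows[0])
row_42 = restored[42 * row_len : 43 * row_len]

print(f"block: {len(block)} bytes, compressed: {len(compressed)} bytes")
print(f"compressed/original ratio: {ratio:.2f}")
```

An update follows the same path with one more step: decompress, modify, then compress the block again.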

Algorithmic Compression Test Package

The ALC (ALgorithmic Compression) test package contains UDFs that simulate the TD 13.10 built-in compression functions, test templates for Latin and Unicode character columns, and step-by-step instructions. It is intended for TD users to run against their own data at the column level to determine the compression rates of the TD 13.10 built-in compression algorithms. The test results provide information for selecting an appropriate algorithm for specific data. These tests use read-only operations and can be executed on any release that supports UDFs (V2R6.2 and later). Run these tests during off-peak hours, because they are CPU bound and use a significant amount of system resources.
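As a rough illustration of what comparing compression rates over a column involves, here is a minimal sketch. The three standard-library codecs are stand-ins, not the TD 13.10 built-in algorithms, and the function name `compression_rates` is hypothetical. Each value is compressed on its own, on the assumption that a column-level function sees one value at a time.

```python
import bz2
import lzma
import zlib

# Stand-in algorithms; these are NOT the Teradata built-in functions.
CANDIDATES = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def compression_rates(values, encoding="utf-8"):
    """Return the fraction of space saved per algorithm over a list of strings.

    A negative rate means the algorithm made the data larger, which is
    common for short values because of per-value header overhead.
    """
    encoded = [v.encode(encoding) for v in values]
    raw_total = sum(len(b) for b in encoded)
    if raw_total == 0:
        raise ValueError("column sample contains no data")
    rates = {}
    for name, compress in CANDIDATES.items():
        compressed_total = sum(len(compress(b)) for b in encoded)
        rates[name] = 1 - compressed_total / raw_total
    return rates

# Hypothetical column sample: long, repetitive character values.
sample = ["aaaa" * 100] * 5
for name, rate in compression_rates(sample).items():
    print(f"{name}: {rate:.1%} saved")
```

Passing `encoding="utf-16"` gives a crude way to see how the same values behave when stored in a wider character encoding.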

Hadoop MapReduce Connector to Teradata EDW

Hadoop MapReduce programmers often find that it is more convenient and productive to have direct access from their MapReduce programs to data stored in an RDBMS such as Teradata Enterprise Data Warehouse (EDW) because:

  1. There is no need to export relational data into a flat file.
  2. There is no need to upload the file into the Hadoop Distributed File System (HDFS).
  3. There is no need to change and rerun the scripts/commands in the first two steps when they need to use different tables/columns in their MapReduce programs.

Geocoding 101

As Teradata customers discover and begin to use the native Teradata database geospatial capabilities, one of the first questions that inevitably comes up is: how do I “Geocode” my data? In fact, Geocoding will often be an important first phase of any Geospatial implementation project, and sometimes even a barrier to starting the project altogether. The purpose of this article is to discuss what Geocoding is, how it works, Geocoding options, precision, and the sources available today for Geocoded information.

Selecting an ALC compression algorithm

Teradata 13.10 provides the Algorithmic Compression (ALC) feature, which allows users to apply compression/decompression functions to a specific column of character or byte type. The compression/decompression functions may be Teradata built-in functions provided along with ALC or user-provided compression/decompression algorithms registered as UDFs.

Adding Geospatial Location Data - 2 Minute Guide

Teradata has added geospatial features to Teradata 13 (and earlier versions with the optional extension package - see my earlier article here).  These features enable powerful location based analytics, but often I'm asked how to get started, especially by customers who already capture Latitude/Longitude location data.  So to help, I've put together this quick 2 minute guide on converting your existing location data to the new ST_Geometry data type in Teradata so that you ca

Quicker Method to Calculate Distances on the Globe

Someone asked a few days ago for an easier and quicker way to calculate distance between two points on a sphere without having to transform to the UTM SRS (Spatial Reference System) from the WGS84 SRS.

First, when using Teradata Geospatial database features, all of the ST_GEOMETRY object calculations are based on a Cartesian coordinate system, except for selected distance methods.
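For reference, one common way to get a distance on a sphere directly from latitude/longitude pairs is the haversine formula. The sketch below is a generic illustration, not necessarily the method the article goes on to describe. It assumes a perfectly spherical Earth with a mean radius of 6371.0088 km, and the function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical model)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

# A quarter of the way around the equator.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

Because the model is a sphere rather than an ellipsoid, results differ slightly from ellipsoidal distances over long ranges.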

Geospatial with Teradata 13

The availability of Teradata's geospatial extension package in 2007 brought these location capabilities to Teradata 12, 6.2 and 6.1.  This package is still available as a free download from Teradata and when installed, adds geospatial functionality as a User Defined Type (UDT) along with a library of User Defined Functions (UDFs).  (See my article on downloading and installing this package).  One of the major highlights of Teradata 13 is the inclusion of these geospatial fe

Hadoop DFS to Teradata

Hadoop systems [1], sometimes called Map Reduce, can coexist with the Teradata Data Warehouse, allowing each subsystem to be used for its core strength when solving business problems. Integrating the Teradata Database with Hadoop turns out to be straightforward using existing Teradata utilities and SQL capabilities. There are a few options for directly integrating data from a Hadoop Distributed File System (HDFS) with a Teradata Enterprise Data Warehouse (EDW), including using SQL and FastLoad. This document focuses on using a Table Function UDF to both access and load HDFS data into the Teradata EDW. In our examples, there is historical data already in Teradata EDW, presumably derived from HDFS for trend analysis. We will show examples where the Table Function UDF approach is used to perform inserts or joins from HDFS with the data warehouse.

Sessionization Map-Reduce Support in Teradata

Map-reduce, or its open source version Hadoop, is a parallel programming framework for running scripts, Java, C, and other external programming languages on hundreds of nodes. It is popular with dot-com companies that have large server farms and need to produce reports on website activity or produce search indexes. In general, Map-reduce applications overlap BI applications and data warehouses. However, Map-reduce applications can coexist with a data warehouse: one is parallel processing, the other a parallel database. Coexistence allows each subsystem’s best capabilities to be used to complement the other. With Teradata’s in-database processing technology, Map-reduce can become an MPP ETL subsystem, or we can run Map-reduce functions inside the EDW, or, using table functions, we can integrate directly with the Map-reduce nodes. This article illustrates a commonly used Map-reduce function running inside the Teradata EDW.
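To make the idea of sessionization concrete, here is a minimal single-process sketch of the logic. It is an illustration only and not the in-database function the article describes. The input shape, the function name `sessionize`, and the 30-minute inactivity timeout are all assumptions.

```python
from itertools import groupby
from operator import itemgetter

def sessionize(clicks, timeout=1800):
    """Assign a session number to each click.

    clicks: iterable of (user_id, timestamp_in_seconds) pairs, in any order.
    A new session starts when the gap since the user's previous click
    exceeds `timeout` seconds (assumed default: 30 minutes).
    Returns a list of (user_id, timestamp, session_number) tuples.
    """
    result = []
    # Bring each user's clicks together in time order (the "shuffle" step).
    ordered = sorted(clicks, key=itemgetter(0, 1))
    for user, group in groupby(ordered, key=itemgetter(0)):
        session, previous = 0, None
        for _, ts in group:
            if previous is not None and ts - previous > timeout:
                session += 1
            result.append((user, ts, session))
            previous = ts
    return result

# Hypothetical click log.
log = [("u1", 5000), ("u2", 50), ("u1", 0), ("u1", 100)]
for row in sessionize(log):
    print(row)
```

In a parallel setting the same per-user loop would run independently on each partition of users, which is what makes the function a natural fit for both Map-reduce and an MPP database.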