Unicode Tool Kit 1.7.2
About this download

Unicode is a core technology for developing and implementing a universal, international language solution. The Unicode Tool Kit has been developed for Teradata customers who are migrating from the Latin server character set to Unicode and building a global data warehouse on a single universal character set.

Introduction

Developing a global data warehouse has become a strategic business direction for success in the international marketplace. Among the many technologies available today for globalization, Unicode is a core technology used to develop and implement a universal language solution. However, many Teradata customers may not be able to implement Unicode on existing systems, because they have already implemented the Teradata Latin server character set (even for non-Latin1 languages such as Chinese and Korean). These customers start to experience gaps between legacy data and Unicode data, as well as gaps between existing ANSI applications and Unicode applications. Those gaps prevent them from using leading-edge Teradata Unicode applications. As of TD 16.0/TTU 16.0, migrating from Teradata Latin to Unicode may not be an easy task. The current Teradata system has the following limitations:

• ALTER TABLE does not support changing the server character set for character data types

• The TRANSLATE() function only works with Japanese
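
The gap between legacy data and Unicode data can be illustrated outside the database. The following Python sketch is not part of the tool kit. It assumes that a legacy client stored Korean text as raw cp949 bytes in a single-byte Latin column, and it models that column with Python's latin-1 codec, which is only an approximation of the Teradata LATIN server character set. The function names are hypothetical.

```python
def store_as_latin(text, code_page):
    """Simulate a legacy client writing code page bytes into a Latin column.

    Assumption: the column keeps each byte as one Latin character.
    """
    return text.encode(code_page).decode("latin-1")


def migrate_to_unicode(latin_value, code_page):
    """Reinterpret the stored bytes with the code page that produced them."""
    return latin_value.encode("latin-1").decode(code_page)


original = "데이터"                           # three Hangul syllables
legacy = store_as_latin(original, "cp949")    # what a Latin session would read

print(len(original), len(legacy))             # the legacy value is longer
print(migrate_to_unicode(legacy, "cp949") == original)
```

Read as Latin, the stored value is a run of unrelated characters. Reinterpreting the same bytes with the correct code page restores the original text, which is conceptually what a code page to Unicode migration has to do for every affected column.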

 

The purpose of this document is to introduce the Unicode Tool Kit for customers who are migrating from the Latin server character set to Unicode and building a global data warehouse on a single universal character set. The Unicode Tool Kit consists of the following components:

1) User Defined Functions (UDFs) for migrating code page data to Unicode without import/export

2) Site-defined session character sets compatible with Windows code pages or other standards

3) Access Modules for translation and validation, used to load code page data or UTF8 data through a UTF8 session with FastLoad/MultiLoad/TPump/TPT

4) Language translation functions that translate text in the database from/to specified languages

5) Unicode test data and a test application in Java/JDBC

6) Others

 

What's New in Recent Releases

 

Date: 2020-3-9

Version 1.7.2

    Access Modules
    * Added 64-bit support for TPT 16.20 on SUSE Linux and Windows
    Translation UDFs
    * Added two UDFs for IBM 1147 (France):
      EBCDICToUnicode()
      UnicodeToEBCDIC()

Date: 2019-1-8 

Version 1.7.1

Language translation functions

You can translate strings from the source language to the target language in SQL.  Here are some examples:

Select udfdb.udf_translate('Happy Halloween','en','ja'); /* English to Japanese */

楽しいハロウィンをお過ごし下さい

Select udfdb.udf_translate('Happy Birthday','en','ar'); /* English to Arabic */

عيد ميلاد سعيد

Select udfdb.udf_translate('Happy Halloween','en','ko'); /* English to Korean */

즐거운 할로윈 보내세요

Select udfdb.udf_translate('Добрый день, мой друг.','ru','zh-Hans'); /* Russian to Chinese */

下午好, 我的朋友。

 

In this release, the following changes have been made.

* Implemented Microsoft Text Translation APIs version 3

* V3 supports 60+ languages

* V3 supports NMT (Neural Machine Translation) for 40+ languages
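
For orientation, the shape of a V3 translation request can be sketched in Python. This is not the tool kit's implementation, and how udf_translate calls the service is not described here. The sketch assumes the public global endpoint of the Microsoft Translator Text API v3 and a subscription key supplied by the caller. The helper names are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public global endpoint of the Translator Text API v3.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text, source_lang, target_lang, subscription_key):
    """Build (but do not send) a V3 translate request for one string."""
    query = urllib.parse.urlencode(
        {"api-version": "3.0", "from": source_lang, "to": target_lang}
    )
    # V3 expects a JSON array of objects, each with a "Text" field.
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json; charset=UTF-8",
    }
    return urllib.request.Request(
        f"{ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


def extract_translation(response_body):
    """Pull the first translated string out of a V3 JSON response."""
    return json.loads(response_body)[0]["translations"][0]["text"]


request = build_translate_request("Happy Birthday", "en", "ar", "YOUR-KEY")
print(request.full_url)
```

The language codes are the same ones passed to udf_translate in the SQL examples, for instance 'en', 'ja', 'ko' and 'zh-Hans'.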

Note: The V2 APIs will retire in April 2019. If you installed the translation functions with UTK 1.7.0.x, please install the new package in this UTK release.

 

Date: 2018-8-16

Version 1.7.0.1

Language translation functions

* InstallationLangCodes.txt has been corrected for Korean. 

Access Module 

* cp2uni_axm.so for Solaris SPARC 64-bit

 

Date: 2017-11-3

Version 1.7.0.0

Language translation functions

* New database functions translate text from/to specified languages

Access Module 

* cp2uni_axm.so for Linux supports a new parameter, Timeout=xx, to avoid hanging while reading data from named pipes
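
The problem a read timeout addresses can be shown with a small Python sketch for Linux. It is not the access module's code, and the unit of Timeout=xx is not stated here; the sketch assumes seconds. A reader that opens a named pipe and waits for a writer that never arrives would otherwise block indefinitely. The function name is hypothetical.

```python
import os
import select


def read_fifo_with_timeout(path, timeout_seconds, chunk_size=65536):
    """Read a named pipe until the writer closes it.

    Raises TimeoutError if no data (and no end-of-file) arrives within
    timeout_seconds. Assumption: Linux FIFO semantics.
    """
    # O_NONBLOCK lets open() return even when no writer is connected yet.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        chunks = []
        while True:
            ready, _, _ = select.select([fd], [], [], timeout_seconds)
            if not ready:
                raise TimeoutError(f"no data on {path} for {timeout_seconds}s")
            data = os.read(fd, chunk_size)
            if not data:          # empty read: the writer closed the pipe
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

With no timeout, the equivalent blocking read would wait for as long as the writing side stays silent, which is the hang the Timeout parameter is meant to avoid.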