All Forums Analytics
TimManns 25 posts Joined 05/06
10 Jul 2006
Is there a 'sounds-like' phonetic function ?

Hi all,I'm currently working on a data cleaning exercise, consolidating customers who have separate ID's, but are actually duplicates. The data are manually entered and therefore minor variations in company name may occur.I can perform the analysis outside of Teradata, but the estimated time to completion is literally several days.Is there an easy 'sounds-like' phonetic type of function in Teradata (that is supported by Teradata SQL) that could, for example, match "phone-shop" and "fone shop". I'd like to use this type of feature in a join.Maybe a pipe dream...ThanksTim

Barry-1604 176 posts Joined 07/05
11 Jul 2006

Yes, Teradata has a function called SOUNDEX does what you're looking for. If two strings have the same soundex value, then they are phonetically equivalent.Good luck!

madeelsabir 2 posts Joined 03/06
11 Jul 2006

Barry,What you say about it ?SELECT case when SOUNDEX('phone') = SOUNDEX('fone') then 'both equal' else 'both not equal :-( ' endRegards,M.Adeel Sabir

RGlass 35 posts Joined 09/04
11 Jul 2006

Could be how 'fone' is perceived to be pronounced by TDSELECT case when SOUNDEX('fawn') = SOUNDEX('fone') then 'both equal' else 'both not equal ' endboth equalTbob

Barry-1604 176 posts Joined 07/05
11 Jul 2006

From the manual, it does look like the intent was to be used on surnames....Soundex is a system that codes surnames having the same or similar sounds, but variantspellings. The Soundex system was first used by the National Archives in 1880 to index theUnited States census.Soundex codes begin with the first letter of the surname followed by a three-digit code. Zerosare added to names that do not have enough letters.So, maybe you wouldn't want to apply it to other words.I believe that it's the only function that Teradata has that does something like the phonetic comparison, though.

TimManns 25 posts Joined 05/06
11 Jul 2006

Thanks everyone, I'll give SOUNDEX a try.Tim

dnoeth 4628 posts Joined 11/04
12 Jul 2006

Hi Tim,if you're on V2R5.1+ this could be solved with a UDF implementing a more advanced phonetic algorithm, e.g. metaphone. You just have to find an implentation in C and wrap it into the UDF...Dieter

Dieter

Claradata 8 posts Joined 05/12
24 May 2012

This is an old post....and we are on TD 14.0 now. Do we have any inbuilt functionality in Teradata that simulates Sybase's difference function? Or the only option is building a custom UDF?

-- Sybase Example 1
select difference("smithers", "smothers")

---------
4

-- Sybase Example 2
select difference("smothers", "brothers")

---------
2 

It is basically a difference between two soundex values and returns a value from 0 to 4. The best match is 4.

-- Teradata Example 1
SELECT
       CASE
              WHEN SOUNDEX('smithers') = SOUNDEX('smothers')
              THEN 4
              ELSE 0
       END
-- Returns 4

-- Teradata Example 2
SELECT
       CASE
              WHEN SOUNDEX('smithers') = SOUNDEX('brothers')
              THEN 4
              ELSE 0
       END
-- Returns 0

 

You must sign in to leave a comment.