Teradata UDFs for Oracle
About this download

A collection of UDFs that emulate the most-commonly used Oracle built-in SQL functions.

README

This zip file contains the C source and the installation and test scripts for
the following User-Defined Functions:

	grepc (not an Oracle function)
	capitalize (not an Oracle function)
	add_months
	ascii
	ceil
	chr
	decode
	floor
	greatest
	grepc
	initcap
	instr
	last_day
	least
	length
	lpad
	ltrim
	months_between
	nvl
	replace
	round
	rpad
	rtrim
	sign
	to_char
	to_date
	to_number
	translate
	trunc

While I have tested these as thoroughly as I can imagine how to do so, there
is no test like the user's code.  Since this is new code, you might expect to
see problems which I have not anticipated.  Test all these as well as you can
before putting them into a production system.


INSTALLATION:
Unzip this file with "Restore Directory Structure" turned on.

The source, compile script and test script for each function is contained in 
the directory for that function, in a file or files called (something).btq.
Edit this file(s) and change $PGMPATH$ to the full path name of the directory
where this code resides; then you can run it with bteq. I recommend you .SET
WIDTH to 254.  I also recommend you review the output carefully for any errors.
It is probably best to install one function at a time in this manner.

If you have perl installed on your PC, you can use ChangeAllPaths.bat to change
all the $PGMPATH$ values in the bteq scripts.  Otherwise, you can put all this
on a UNIX system and change the .bat script to a shell script.

Having done that, you can run CompileAll.btq (from bteq) to compile and test
all the functions.  These scripts are all set up to install the UDFs from a PC
or other client to the Teradta server.  If you are transferring the source
files to the server and installing from there, be sure to change all the
EXTERNAL NAME clauses of REPLACE FUNCTIONs from "CS" and "CI" to "SS" and
"SI," respectively.

Note that many of these UDFs use C ASCII string functions.  If the user ID
that installs the functions has a default character set of UNICODE, then the
system will assume that the UDFs can process Unicode strings, and the UDFs
will always see only the first character of each parameter.  (The second byte
will be 0x00, which looks like end-of-string in ASCII.)  If Kanji support was
installed on the Teradata system, then all users will have the UNICODE default
character set.  To be sure that the UDFs work as expected, make sure the
installer ID is using LATIN:

	MODFIY USER <installer-ID> AS DEFAULT CHARACTER SET LATIN;

The user <installer-ID> may be able to perform this operation itself, but it
then needs to logon again before installing the UDFs.  After they are
installed, you can change the user back to UNICODE if you want:

	MODFIY USER <installer-ID> AS DEFAULT CHARACTER SET UNICODE;

From then on, the UDFs will operate in Latin or ASCII mode no matter what the
user's character set is.

Do not alter the UDFs to non-protect mode until you have run the test scripts
and some other samples that you make up yourself.  You can use Alter.btq to
change all the functions except decode to not-protected.  (You'll have to make
your own script for decode.)

Once you are sure they are working, You may want to re-install the UDFs into the
special SYSLIB database so they can be accessed by all users without having to
qualify the function name by databasename.

Please contact me with any questions, problems, or (especially) fixes.

George Coleman
+1 503 738 9050
george.coleman@teradata.com


UPDATE June 11, 2008:
--------------------
* Replace function rewritten by Nicholas Muller.  This version throws an error
  if the result string will be longer than MAXIMUM_LENGTH.  MAXIMUM_LENGTH is
  set to 16000 in the distribution, but you may want to make it shorter.  You
  should not make it longer than 32000.  It should match the length in the
  RETURNS clause of the REPLACE FUNCTION statement in replace.btq.  If you use
  oreplace() to return a column, and there are other columns in the SELECT,
  and the RETURNS length is a large number like 32000, then you will get a
  "row size exceeded" error from Teradata. In this case, you must either
  reduce the maximum length or cast the result to something shorter.

UPDATE May 14, 2008:
-------------------
* Changed the way many functions handle null input.  Most will now return NULL 
  instead of throwing a SQL error.  This should be closer to Oracle's behavior.
* Added create_test_table.bteq and test.data.  Some of the test scripts use
  the test table ("mytable"); these files will provide that.
* Changed all EXTERNAL NAME clauses from Sx to Cx.
* Updated copyright dates.



Miscellaneous  	                                                                
Functions      	Notes                                                           
---------------	----------------------------------------------------------------
grepc          	Usage: grepc(string1, string2)                                  
               	Returns: integer count of occurrences of string2 in string1.    
               	Satisfies RFC 52803.                                            

capitalize     	Capitalize(string) returns a string                             
               	with the first word capitalized.                                
               	This module is in the initcap directory.                        


Oracle         	                                                                
Functions      	Notes                                                           
---------------	----------------------------------------------------------------
add_months     	This works like the Oracle add_months function, such that if the
               	starting date is at end-of-month, or the resulting date exceeds 
               	the end-of-month, then the resulting date is at end-of-month.   
               	Teradata already has an add_months(), so install this as        
               	oadd_months() or another likely name.                           
               	                                                                
ascii          	This rejects null input, but that is probably a bug.            
               	                                                                
ceil           	Defined for FLOAT and DEC(18,5). If you need more than 5 decimal
               	places, you must change this function.                          
               	                                                                
chr            	This rejects null input, but that is probably a bug.            
               	                                                                
decode         	This one presented some real challenges. The Oracle decode()    
               	accepts up to 256 arguments. The only restrictions on data types
               	are that the 2nd, 4th, 6th, ... arguments must be the same data 
               	type as the first, and that all the others must be the same data
               	type, which need not be the same as the first.  If we had UDFs  
               	that accepted all combinations of seven different data types    
               	with from 3 to 256 parameters, we would need ... well, a LOT of 
               	modules (10,668 I think). So I decided to let you decide how    
               	many parameters your users are going to want to use with      
               	decode, and I wrote a perl script (decode.pl) that generates    
               	combinations of CHAR, DEC(9) and INT, up to as many parameters  
               	as you specify. It will also generate a BTEQ script (decode.btq)
               	to install them.                                                
               	                                                                
               	To use it, move decode.pl to a machine that has Perl on it (e.g.
               	MP-RAS) or install Perl on your PC (it's free).  Then do: perl  
               	decode.pl N, where N is the number of parameters you think      
               	you'll need.  8 might be a good number.  If someone tries to    
               	decode() with more arguments than that, or if they try to use a 
               	different data type, they'll get a "Function does not exist"    
               	error message.  If you want to add more data types to this, feel
               	free.  But please forward your code to me so we can share it    
               	with everyone else.                                             
               	                                                                
floor          	Defined for FLOAT and DEC(18,5). If you need more than 5 decimal
               	places, you must change this function.                          
               	                                                                
greatest       	Defined for all pair combinations of CHAR, DEC(9) and INT. This 
               	Oracle function can actually take more than one argument, but I 
               	have not yet seen anyone using more than two. If we start having
               	to support more data types and more arguments, we'll end up with
               	the same problem we have with decode().  It might be easier to  
               	convince your user to use CASE operations instead.  In fact,
               	most uses of greatest() seem to be in combination with decode or
               	sign, and these expressions can be done much more cheaply with  
               	CASE.  Some users may be surprised to learn that Oracle now 
               	supports CASE too.                                              
               	                                                                
initcap        	The initial version of this had some bugs, which should be fixed
               	now.                                                            
               	                                                                
instr          	This rejects null input, but that is probably a bug.            
               	                                                                
last_day       	(of the month.)  This is defined for date, timestamp and        
               	timstamp-with-zone.                                             
               	                                                                
least          	See the notes for greatest().                                   
               	                                                                
length         	Same as Teradata's char_length.  Uses strlen().                 
               	                                                                
lpad           	Offers functionality not available in Teradata SQL.             
               	                                                                
ltrim          	Offers functionality not available in Teradata SQL.             
               	                                                                
months_between 	Unlike "cast((date1-date2 month) as integer)", this returns a   
               	FLOAT value.  Also, if the days in each date are the same or are
               	both end-of-month, the result is a whole number.                
               	                                                                
nvl            	Like COALESCE().  Defined for CHAR, INT, DEC(9) and FLOAT.      
               	                                                                
replace        	Offers functionality not available in Teradata SQL.             
               	Install this as oreplace(), because we already have a REPLACE.  
               	The MAXIMUM_LENGTH defined in the .c files should match the
               	RETURNS length in the replace.btq file.
               	                                                                
round          	Numbers:                                                        
               	Defined for INTEGER, FLOAT and DEC(18,5). If you need more than 
               	5 decimal places, you must change this function.                
               	Dates:                                                          
               	Defined for DATE, TIMESTAMP and TIMESTAMP WITH TIME ZONE.       
               	                                                                
rpad           	Offers functionality not available in Teradata SQL.             
               	                                                                
rtrim          	Offers functionality not available in Teradata SQL.             
               	                                                                
sign           	Defined for DEC(9), FLOAT and INTEGER. Most uses of this would  
               	be better rewritten as CASE operations.                         
               	                                                                
to_char        	Numbers:                                                        
               	Defined for INTEGER, FLOAT and DEC(18,5). If you need more than 
               	5 decimal places, you must change this function. There may be   
               	bugs in this, because I had a hard time understanding Oracle's  
               	format rules. Let me know if anyone complains, but get a good   
               	explanation of how this should work before you call me.         
               	Another note: the NLS_PARAM argument has not been implemented.
               	Dates:                                                          
               	Defined for DATE, TIMESTAMP and TIMESTAMP WITH TIME ZONE.       
               	In Oracle, you can Spell (using SP and optionally TH) any number
               	that occurs in the date or timestamp, but I have only           
               	implemented this for Day Numbers (D, DD and DDD).  If you want  
               	to add more spelling, feel free. I just got really tired of     
               	doing all that checking in the logic for SP and TH.             
               	                                                                
to_date        	Defined for DATE, TIMESTAMP and TIMESTAMP WITH TIME ZONE.       
               	                                                                
to_number      	This returns FLOAT, so when you use it you might want to recast 
               	the result as something more palatible, like INT or DEC(9,2).   
               	                                                                
translate      	Offers functionality not available in Teradata SQL.             
               	Install this as otranslate(), because we already have a         
               	TRANSLATE.                                                      
               	                                                                
trunc          	Defined for FLOAT and DEC(18,5). If you need more than 5 decimal
               	places, you must change this function.                          
               	                                                                



Directory /    	                       	                
Function       	Program(s)             	Comments
---------------	-----------------------	----------------
add_months     	add_months_d.c         	Need to install this as "oadd_months"
               	add_months_t.c         	or something similar, since we already
               	add_months_tz.c        	have an "add_months" function.

ascii          	ascii.c                	

ceil           	ceilDec8.c             	Decimal
               	ceilFloat.c            	Float
               	ceilInt.c              	Integer

chr            	chr.c                  	

decode         	decode.pl              	Use decode.pl perl script to generate
               	decode.test            	C source modules and Bteq scripts.

floor          	floorDec8.c            	Decimal 
               	floorFloat.c           	Float 
               	floorInt.c             	Integer 

greatest       	greatestcc.c           	character - character
               	greatestcd.c           	character - decimal
               	greatestci.c           	character - integer
               	greatestdc.c           	decimal - character
               	greatestdd.c           	decimal - decimal
               	greatestdi.c           	decimal - integer
               	greatestic.c           	integer - character
               	greatestid.c           	integer - decimal
               	greatestii.c           	integer - integer      

grepc          	grepc.c                	RFC 52803 (not an Oracle function)

initcap        	capitalize.c           	(var. of initcap; not an Oracle fn.)
               	initcap.c              	

instr          	instr_1.c              	
               	instr_2d.c             	decimal arg.
               	instr_2i.c             	integer arg.
               	instr_3dd.c            	decimal, decimal
               	instr_3di.c            	decimal, integer
               	instr_3id.c            	integer, decimal
               	instr_3ii.c            	integer, integer

last_day       	last_day_d.c           	date arg.
               	last_day_t.c           	timestamp arg.
               	last_day_tz.c          	timestamp-with-zone arg.

least          	leastcc.c              	character - character 
               	leastcd.c              	character - decimal 
               	leastci.c              	character - integer 
               	leastdc.c              	decimal - character 
               	leastdd.c              	decimal - decimal 
               	leastdi.c              	decimal - integer 
               	leastic.c              	integer - character 
               	leastid.c              	integer - decimal 
               	leastii.c              	integer - integer     

length         	length.c               	

lpad           	lpad_1.c               	integer length
               	lpad_1d.c              	decimal length
               	lpad_2.c               	integer length
               	lpad_2d.c              	decimal length

ltrim          	ltrim1.c               	
               	ltrim2.c               	

months_between 	months_between_dd.c    	date - date
               	months_between_dt.c    	date - timestamp
               	months_between_td.c    	timestamp - date
               	months_between_tt.c    	timestamp - timestamp

nvl            	nvlc.c                 	char
               	nvld.c                 	decimal
               	nvlf.c                 	float
               	nvli.c                 	integer

replace        	oreplace1.c            	"oreplace()" because Teradata
               	oreplace2.c            	already has a "replace"

round          	rounddate.h            	
               	rounddate1.c           	date
               	rounddate2.c           	
               	RoundDec8.c            	decimal
               	RoundDec81.c           	
               	RoundFloat.c           	float
               	RoundFloat1.c          	
               	RoundInt.c             	integer
               	RoundInt1.c            	
               	roundts1.c             	timestamp
               	roundts2.c             	
               	roundtz1.c             	timestamp-with-zone
               	roundtz2.c             	

rpad           	rpad_1.c               	integer length 
               	rpad_1d.c              	decimal length 
               	rpad_2.c               	integer length 
               	rpad_2d.c              	decimal length 

rtrim          	rtrim1.c               	
               	rtrim2.c               	

sign           	signd.c                	decimal
               	signf.c                	float
               	signi.c                	integer

to_char        	to_char_d1.c           	decimal
               	to_char_d2.c           	
               	to_char_date.h         	(header)
               	to_char_dt1.c          	date
               	to_char_dt2.c          	
               	to_char_f1.c           	float
               	to_char_f2.c           	
               	to_char_i1.c           	integer
               	to_char_i2.c           	
               	to_char_num.h          	(header)
               	to_char_ts1.c          	timestamp
               	to_char_ts2.c          	
               	to_char_tz1.c          	timestamp-with-zone
               	to_char_tz2.c          	

to_date        	to_date.h              	(header)
               	to_date1.c             	1 argument
               	to_date2.c             	2 arguments

to_number      	to_number.h            	(header)
               	to_number_1.c          	1 argument
               	to_number_2.c          	2 arguments
               	to_number_3.c          	3 arguments

translate      	otranslate.c           	"otranslate()" because Teradata
               	                       	already has a "translate"

trunc          	truncdate.h            	(header)
               	truncdate1.c           	date
               	truncdate2.c           	
               	truncDec8.c            	decimal
               	truncDec81.c           	
               	truncFloat.c           	float
               	truncFloat1.c          	
               	truncInt.c             	integer
               	truncInt1.c            	
               	truncts1.c             	timestamp
               	truncts2.c             	
               	trunctz1.c             	timestamp-with-zone
               	trunctz2.c