All Forums Connectivity
ulrich 816 posts Joined 09/09
05 Feb 2014
Load Nulls with R & JDBC Fastload Problem

Hi all,
I try to load some data from R directly to Teradata 14.10 via JDBC Fastload.
The data contains some Nulls for numeric values and I have not been able to load this data so far. 
The code below is an example.
Three columns are generated. One integer, one char and one integer again.
The last column will have about 25% nulls.
If you comment out line 48 no nulls will be generated and you can see that the load per se is working.
From what I read so far I understand that nulls need to be set differently and I tried to do that within the msinsert function but obvously without success.
Is this a bug? I saw some other posts on JDBC and Nulls but I am not sure that the issue is the same.
 
Any support is very apprichiated.

Ulrich

library(RJDBC)
################
#def functions
################
myinsert <- function(arg1,arg2,arg3){
  .jcall(ps,"V","setInt",as.integer(1),as.integer(arg1))
  .jcall(ps,"V","setString",as.integer(2),arg2)
  if (is.na(arg3)==TRUE) {
    .jcall(ps,"V","setNull",as.integer(3),Types.INTEGER)
  } else {
    .jcall(ps,"V","setInt",as.integer(3),as.integer(arg3))
  }
  .jcall(ps,"V","addBatch")
}


MHmakeRandomString <- function(n=1, lenght=12)
{
  randomString <- c(1:n)                  # initialize vector
  for (i in 1:n)
  {
    randomString[i] <- paste(sample(c(0:9, letters, LETTERS),
                                    lenght, replace=TRUE),
                             collapse="")
  }
  return(randomString)
}

################
#DB Connect
################
.jaddClassPath("/MyPath/ TeraJDBC__indep_indep.14.10.00.17/terajdbc4.jar")
.jaddClassPath("/MyPath/ TeraJDBC__indep_indep.14.10.00.17/tdgssconfig.jar")
drv = JDBC("com.teradata.jdbc.TeraDriver","/MyPath/ TeraJDBC__indep_indep.14.10.00.17/tdgssconfig.jar","/ MyPath/TeraJDBC__indep_indep.14.10.00.17/ terajdbc4.jar")
conn = dbConnect(drv,"jdbc:teradata://neo/ CHARSET=UTF8,LOG=ERROR,DBS_PORT=1025,TYPE=FASTLOAD,TMO DE=TERA,SESSIONS=1","uli","m00rhuhn") 

################
#main
################

##gen test data
dim = 10000
i = 1:dim
s = MHmakeRandomString(dim,12)
j = sample(1:10000, dim)
i1 <- j %% 4 == 0
#assign some NA
j[i1] <- NaN



## set up table
dbSendUpdate(conn,"drop table foo;")
dbSendUpdate(conn,"create table foo (a int, b varchar(100),c int);")

#set autocommit false
.jcall(conn@jc,"V","setAutoCommit",FALSE)
##prepare
ps = .jcall(conn@jc,"Ljava/sql/PreparedStatement;"," prepareStatement","insert into foo values(?,?,?)")

#start time
ptm <- proc.time()

## batch insert
for(n in 1:dim){ 
  myinsert(i[[n]],s[[n]],j[[n]])
}
#run time
proc.time() - ptm

#apply & commit
.jcall(ps,"[I","executeBatch")
dbCommit(conn)
.jcall(ps,"V","close")
.jcall(conn@jc,"V","setAutoCommit",TRUE)

#get some sample results
dbGetQuery(conn,"select top 100 * from foo")
dbGetQuery(conn,"select count(*) from foo")

#disconnect
dbDisconnect(conn)

 
 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud
tomnolan 594 posts Joined 01/08
05 Feb 2014

The "myinsert" function tests for NA with this line:
if (is.na(arg3)==TRUE) {
 
The is.na test will only return TRUE for NA, and will return false for NaN.
 
On line 48 you assign NaN (which is floating-point "Not a Number") to the array item:
j[i1] <- NaN
 
But I suspect that you actually want to assign NA (the R equivalent to null) to the array item:
j[i1] <- NA
 

ulrich 816 posts Joined 09/09
05 Feb 2014

Hi, 
no thats not the issue. 
is.na(x) checks for NA and NaN. 
You can change j[i1] <- NA and the error is the same.
The message is
"Fehler in .jcall(ps, "V", "setNull", as.integer(3), Types.INTEGER) : 

  Objekt 'Types.INTEGER' nicht gefunden"

 

which would be "Error in ... Objekt 'Types.INTEGER' no found"

 

So the if had been evaluated as it should...

 

Ulrich

 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

tomnolan 594 posts Joined 01/08
06 Feb 2014

Well, I'm not an R programmer, so sorry that the j[i1] <- NA didn't fix the problem for you.
 
Types.INTEGER is a Java constant value equal to 4. Perhaps R does not provide access to the Types.XXX constants? Try specifying 4 instead of Types.INTEGER.

ulrich 816 posts Joined 09/09
06 Feb 2014

Unfortunatly it does not help.
Error message is now:
Fehler in .jcall(ps, "V", "setNull", as.integer(3), 4) :
  method setNull with signature (ID)V not found

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

ulrich 816 posts Joined 09/09
06 Feb 2014

P.S. I am not a Java programmer.
Does null loads with JDBC Fastload work in a pure Java implementation?
So is this clearly a R problem?
 

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

tomnolan 594 posts Joined 01/08
06 Feb 2014

Yes, JDBC FastLoad supports NULLs.
 
The signature (ID)V means that R is attempting to call a method that has int and double arguments, and returns void. Such a method doesn't exist; hence the error. In other words, the value 4 is treated by R as a double value. Instead, it needs to be an int value.
 
Please try:
.jcall(ps, "V", "setNull", as.integer(3), as.integer(4))

ulrich 816 posts Joined 09/09
06 Feb 2014

Yes,yes,yes! That works!
Thanks a lot!
As I have to deal with different data types. Do you know where I can find the constanst for other data types?
e.g. Types.String etc.?
I didn't find them so far.

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

ulrich 816 posts Joined 09/09
07 Feb 2014

Ok found it myself:
http://www.docjar.com/html/api/java/sql/Types.java.html
But thanks again to tomnolan

feel free to donate bitcoin:12kgAUHFUqvG2sQgaRBXFhCwyf9HXdkGud

tomnolan 594 posts Joined 01/08
07 Feb 2014

I'm glad you were able to find the constant values for java.sql.Types.
For reference, here is a link to the official Java documentation from Oracle:
http://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.ARRAY

You must sign in to leave a comment.