The Teradata JDBC Driver and ODBC Driver allow developers to quickly build applications that interact with the Teradata Database. However, many developers are surprised when their fully functioning application suddenly hits a performance roadblock once it is deployed to production, and in many of these cases the blame is unfairly placed on the JDBC and ODBC drivers. This article highlights the programming techniques available to maximize performance when interacting with the database, and will help developers choose the right implementation.


Quick and Easy but Slowest Performance

Many new database developers are more focused on how to create a database connection and pass a SQL statement than they are on performance. A typical first implementation looks something like:

Connection conn = DriverManager.getConnection(url, username, password);
Statement stmt = conn.createStatement();
String sql = "insert into Transactions(custID, transaction_date, amount, desc) values(" + custID + ", " + tran_date + ", " + amount + ", " + desc + "')";

stmt.executeUpdate(sql);

stmt.close(); // Your real code should use try-finally blocks to manage resources.
conn.close(); // Let's not even get into connection pools! That's another article.

 

Sure, this works for a demo, and the beginning programmer is probably pretty happy with the results. But turn on some production volume and this will quickly become a performance bottleneck, especially when your application is processing many SQL inserts, such as when batch loading. This type of database coding is pretty much like driving your sports car while staying stuck in first gear!


Drivers Prepare Your Statements

A much better approach is to use Prepared Statements. These provide significantly better performance by first sending the database the outline of the SQL statement, with question-mark parameters in place of the actual data values. The database prepares the execution steps of the SQL statement once, and the prepared statement can then be used over and over again. This avoids recalculating the execution steps for each individual request, which is what happens in the first example.

// These are done once …
Connection conn = DriverManager.getConnection(url, username, password);
String sql = "insert into Transactions(custID, transaction_date, amount, desc) values(?,?,?,?)";

PreparedStatement ps = conn.prepareStatement(sql);

// … and these can be repeated many times with different values.
ps.setInt(1, custID);
ps.setDate(2, tran_date);
ps.setBigDecimal(3, amount);
ps.setString(4, desc);

ps.executeUpdate();

  

Batch Ready

Prepared Statement batches take your performance to the next level. In addition to the benefits of reusing the Prepared Statement, batching your input values also reduces the number of round trips to the database. A batch size of roughly 5,000 to 10,000 works well for most applications. Using batches can be 10 to 40 times faster than the previous approach.

// These are done once.
Connection conn = DriverManager.getConnection(url, username, password);
String sql = "insert into Transactions(custID, transaction_date, amount, desc) values(?,?,?,?)";
PreparedStatement ps = conn.prepareStatement(sql);

final int batchSize = 10000; // roughly 5,000 to 10,000 works well for most applications
int rowsInBatch = 0;

for ( /* each row of input values */ )
{
    ps.setInt(1, custID);
    ps.setDate(2, tran_date);
    ps.setBigDecimal(3, amount);
    ps.setString(4, desc);
    ps.addBatch(); // adds the row of input values to the batch

    if (++rowsInBatch == batchSize)
    {
        ps.executeBatch(); // sends all the batched rows to the database in one round trip
        rowsInBatch = 0;
    }
}

if (rowsInBatch > 0)
{
    ps.executeBatch(); // sends the final, partially filled batch
}

 


Full Speed Ahead

For loading truly huge amounts of data, JDBC FastLoad can provide even better performance. There are a couple of caveats, however. JDBC FastLoad can only insert data into an empty table, and JDBC FastLoad is only recommended for loading large amounts of data -- at least 100,000 rows total.

The nice thing is that your Java code doesn't need to change in order to use JDBC FastLoad. Your application uses the exact same Prepared Statement batches as in the previous example. Just add TYPE=FASTLOAD to your connection parameters, and the Teradata JDBC Driver will use JDBC FastLoad for particular SQL requests, if it can.
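Connection parameters are specified in the JDBC URL, so only the connection line changes. For example (the host and database names below are placeholders):

Connection conn = DriverManager.getConnection(
    "jdbc:teradata://dbshost/DATABASE=mydb,TYPE=FASTLOAD", username, password);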

Note that the recommended batch size for JDBC FastLoad is much higher than for a regular SQL Prepared Statement batch, which means you may need to increase your JVM heap size. To get top-notch performance, you need to use a batch size of roughly 50,000 to 100,000. Using JDBC FastLoad can be 3 to 10 times faster than the previous approach.
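If you do need more heap, the standard JVM -Xmx option raises the maximum heap size; the value and class name here are just placeholders:

java -Xmx2g MyBatchLoader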


Conclusion

JDBC and ODBC allow Java and C/C++ programmers to easily build database applications with Teradata. Maximizing throughput should always be on your mind while you're coding. It's much better to avoid these kinds of issues early on than to deal with a fire drill after your new application has been rolled out to production. I hope this quick tutorial has given you a good overview of the different coding choices and their performance implications when interacting with Teradata.
 

Discussion
ulrich 51 comments Joined 09/09
16 Mar 2010

Hi, in "Drivers Prepare Your Statements" you mention "and the prepared statement can then be used over and over again". Will this use the statement caching feature of the PE or will it always be use the same plan, independent on what is going on on the system?


tomnolan 21 comments Joined 01/08
17 Mar 2010

Yes, the Teradata Database's statement cache will typically be used.

monisiqbal 17 comments Joined 07/09
29 Jun 2010

Can a statement like this be useful for Prepared Statement?
sel * ..... WHERE cast(CollectTimeStamp as timestamp) > cast('?');

mtaponen 1 comment Joined 11/10
25 Nov 2010

Thanks for the nice article. I'd just like to clear up some confusion:

As per the Teradata JDBC Driver User Guide:
"Highest: JDBC FastLoad PreparedStatement batch using question-mark parameters, with the recommended batch size. A batch size of roughly 500 to 1000 works well for most applications."
Still you say:
"To get top-notch performance, you need to use a batch size of roughly 50,000 to 100,000. "

Now -- what is the recommended approach, or am I just missing some point here completely?

Should JDBC FastLoad work if I insert smaller batches (10K or so rows each) to reach the full 100K rows, and just wrap it all inside one transaction?

tomnolan 21 comments Joined 01/08
13 Dec 2010

>>> Can a statement like this be useful for Prepared Statement? sel * ..... WHERE cast(CollectTimeStamp as timestamp) > cast('?')

That's not the right syntax.

sel * ..... WHERE cast(CollectTimeStamp as timestamp) > cast(? as timestamp)

Please note that the cast(? as timestamp) is not needed if your Java application binds a java.sql.Timestamp object to the question-mark parameter marker using the PreparedStatement.setTimestamp method, or the PreparedStatement.setObject method.
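For example, here is a minimal sketch of binding a java.sql.Timestamp directly, assuming an open Connection conn; the table name and timestamp value are placeholders:

PreparedStatement ps = conn.prepareStatement("select * from MyTable where CollectTimeStamp > ?");
ps.setTimestamp(1, java.sql.Timestamp.valueOf("2010-06-29 12:00:00")); // no cast needed in the SQL text
ResultSet rs = ps.executeQuery();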

>>> A batch size of roughly 500 to 1000 works well for most applications.

That was a typo in that old version of the Teradata JDBC Driver User Guide. We specifically note that typo in the Teradata JDBC Driver FAQ here on Developer Exchange.

So yes, you should use a much larger batch size, as recommended in the article above.

09 Jun 2012

Since FastLoad can load only into an empty table, it will not work in batch mode. Each batch would need to be loaded into a separate table.

tomnolan 21 comments Joined 01/08
09 Jun 2012

That's incorrect. In fact, JDBC FastLoad *must* be used with a PreparedStatement batch.
Please refer to the Teradata JDBC Driver Reference section about JDBC FastLoad.
http://developer.teradata.com/doc/connectivity/jdbc/reference/current/jdbcug_chapter_2.html#BABFGFAF

mingtotti 1 comment Joined 07/12
14 Jan 2013

Hi, are there similar options to enable FastLoad/FastExport for the Teradata ODBC driver and ADO.NET provider? Thanks

aimam 2 comments Joined 09/12
03 Dec 2013

Hello,

I'm using the FASTLOAD option, and it was working in our old version of TD, but since updating to TD 14, I now get this error unless I keep my batch sizes under 16K (which defeats the purpose):

Caused by: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 13.10.00.31] [Error 5966] [SQLState HY000] Too many data records packed in one USING row.

It would seem then that it really isn't a data issue, and is an issue whenever you load more than ~16K rows.

How can I get around this issue?
tomnolan 21 comments Joined 01/08
04 Dec 2013

mingtotti, the Teradata ODBC Driver and Teradata .NET Data Provider do not support FastLoad or FastExport at the present time. 
 
aimam, Teradata Database error 5966 applies to PreparedStatement batches using a regular SQL operation. Teradata Database error 5966 does not occur with FastLoad. Please remember that even when your application specifies the TYPE=FASTLOAD connection parameter, FastLoad is only used if your operation qualifies for FastLoad.
 
Please refer to the Teradata JDBC Driver Reference section about JDBC FastLoad, to find information about which SQL operations qualify for FastLoad.
http://developer.teradata.com/doc/connectivity/jdbc/reference/current/jdbcug_chapter_2.html#BABFGFAF
 
In this particular case, since your application is not using FastLoad for some reason, you can avoid Teradata Database error 5966 by reducing the number of rows in the PreparedStatement batch to below 16383, which is the limit.

aimam 2 comments Joined 09/12
05 Dec 2013

So I've added TYPE=FASTLOAD to the JDBC connection string, and this used to work, but now it no longer does. Is there a setting that would cause the TYPE property in the connection string to be ignored?
tomnolan 21 comments Joined 01/08
07 Feb 2014

TYPE=FASTLOAD instructs the Teradata JDBC Driver to use the FastLoad protocol for INSERT statements that are compatible with FastLoad. A regular SQL INSERT will be performed for INSERT statements that aren't compatible with FastLoad.
 
If your application specifies TYPE=FASTLOAD and the FastLoad protocol was used in the past, but is no longer used, then something must have changed. For example, your application may now be using a data type that is not supported by the FastLoad protocol.
 
There are two possible ways to obtain the reason why your SQL statement does not qualify for FastLoad.
1. You can specify the LOG=INFO connection parameter, and the Teradata JDBC Driver will log to System.out the reason why the SQL statement does not qualify for FastLoad.
2. After calling the Connection.prepareStatement method, your application can call the Connection.getWarnings method, and then walk the SQLWarning chain. A SQLWarning will list the reason why the SQL statement does not qualify for FastLoad.
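For example, here is a minimal sketch of the second approach, reusing the insert statement from the article and assuming an open Connection conn:

PreparedStatement ps = conn.prepareStatement("insert into Transactions(custID, transaction_date, amount, desc) values(?,?,?,?)");
for (SQLWarning w = conn.getWarnings(); w != null; w = w.getNextWarning())
{
    System.out.println(w.getMessage()); // the reason the statement does not qualify for FastLoad
}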

jhh19 3 comments Joined 04/12
29 Sep 2014

Hello,
This looks like it's probably what I need but I can't figure out how to use it!
How can I adapt the 'batch ready' script to run from Excel VBA?
What I'm trying to do is write some VBA that will load whatever is in a datasheet to a specified Teradata table.
I can use ODBC or ADO but don't know anything about JDBC or Java.
Thanks.
 

jhh19 3 comments Joined 04/12
30 Sep 2014

I'm making some progress! I have the code below, which works. I would appreciate it if someone could tell me how I can execute the statements in batches (or any other way to make the multi-row insert efficient).
Sub test_param_this_works()

    
Dim cn As New ADODB.Connection
Dim cmdPrep1 As New ADODB.Command
Dim prm1 As New ADODB.Parameter
Dim prm2 As New ADODB.Parameter
Dim strCn As String
t = Timer
pwd = InputBox("password?")
strCn = "DSN=GDWPROD1;UID=COJNH;DATABASE=NONPERS;Password=" & pwd & "; Session Mode=ANSI;"

cn.Open strCn
Set cmdPrep1.ActiveConnection = cn

cmdPrep1.CommandText = "INSERT INTO jhtest (col1,col2) VALUES (?,?)"
cmdPrep1.CommandType = adCmdText
cmdPrep1.Prepared = True
 
Set prm1 = cmdPrep1.CreateParameter("param_nm1", adInteger, adParamInput, 1, 1)
cmdPrep1.Parameters.Append prm1
 
Set prm2 = cmdPrep1.CreateParameter("param_nm2", adInteger, adParamInput, 1, 2)
cmdPrep1.Parameters.Append prm2
For i = 1 To 1000
cmdPrep1("param_nm1") = i
cmdPrep1("param_nm2") = i + 1
cmdPrep1.Execute
Next i
cn.Close
MsgBox "This took: " & Timer - t

End Sub

tomnolan 21 comments Joined 01/08
30 Sep 2014

This article is primarily about Java and JDBC. You are unlikely to receive Visual Basic help by posting here. I recommend that you repost your question in the Connectivity forum.

jhh19 3 comments Joined 04/12
01 Oct 2014

Will do. Thanks Tom.

01 Jul 2015

How do I do this in .NET?

Sherif_Banna 1 comment Joined 05/15
18 Jul 2015

I am using batch insert to insert 250,000 records (50,000 records/batch), and I am getting the following error:
java.sql.BatchUpdateException: [Teradata Database] [TeraJDBC 15.10.00.05] [Error 9128] [SQLState HY000] :The transaction exceeded the maximum number of rowhash locks allowed.
Any recommendations? I am thinking about moving to FastLoad; will that solve the issue? Or is it related to the Teradata server hardware?
Thanks

tomnolan 21 comments Joined 01/08
20 Jul 2015

Error 9128 and its remedies were discussed in another thread:
https://forums.teradata.com/forum/database/failure-9128-the-transaction-exceeded-the-maximum-number-of-rowhash-locks-allowed
 
Every row of data that you attempt to insert may require its own row hash lock. The Teradata Database imposes a limit on the number of row hash locks that can be held by a transaction. When your transaction exceeds the maximum number of row hash locks, then the Teradata Database returns error 9128.
 
There are at least 3 ways to avoid error 9128, listed in order from easiest to hardest:

  • Use a smaller batch size and commit the transaction after fewer inserted rows. If you attempt to insert fewer rows per transaction, then you will avoid hitting the Teradata Database limit on row hash locks (see the sketch after this list).
  • Lock the table for writing before doing any inserts, within the same transaction as the inserts. You can lock a table for writing by executing the SQL command "locking table foobar for write". If your transaction has a table locked for writing, then no row hash locks are needed for inserted rows, and you will avoid hitting the Teradata Database limit on row hash locks. Keep in mind, however, that nobody else can use the table while you have it locked for writing.
  • Use JDBC FastLoad, because the FastLoad protocol does not use row hash locks, so JDBC FastLoad is not subject to the Teradata Database limit on row hash locks. The FastLoad protocol has a number of limitations and caveats, so be sure that your application will work with FastLoad before trying to use it. The table must be empty, and nobody else can use the table, while you are loading data into the table with FastLoad.
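Here is a minimal sketch of the first remedy, reusing the Transactions insert from the article; the 5,000-row commit interval is just an illustrative value, not a recommendation from this thread.

conn.setAutoCommit(false);
final int rowsPerTransaction = 5000; // illustrative; small enough to stay under the row hash lock limit
int rowsInBatch = 0;

for ( /* each row of input values */ )
{
    ps.setInt(1, custID);
    ps.setDate(2, tran_date);
    ps.setBigDecimal(3, amount);
    ps.setString(4, desc);
    ps.addBatch();

    if (++rowsInBatch == rowsPerTransaction)
    {
        ps.executeBatch();
        conn.commit(); // ends the transaction and releases its row hash locks
        rowsInBatch = 0;
    }
}

if (rowsInBatch > 0)
{
    ps.executeBatch();
    conn.commit(); // commit the final, partially filled batch
}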
     