ykazarov 4 posts Joined 08/14
25 Feb 2015
Slow Record Loading into Teradata - with or without PK

I am trying to use the bulk load method via TdCommandBuilder in C# and to compare it to individual insert statement performance using the code below. I move data back and forth between SQL Server and Teradata. I see speeds of several million rows per minute in SQL Server using SqlCommandBuilder versus about 50,000 rows per minute in Teradata, with or without TdCommandBuilder. What am I doing wrong here? I am using Teradata driver version 14.0.0.0. Thank you.

    using (TdConnection connection = connect())        // initialize connection
    {
        // set all required batch elements and fill the schema
        DataTable dtInsertBatch = new DataTable();
        TdDataAdapter adapter = new TdDataAdapter();
        adapter.SelectCommand = connection.CreateCommand();
        adapter.SelectCommand.CommandText = "SELECT " + String.Join(", ", getColumnNames(loadData)) + " FROM " + tableName;
        adapter.FillSchema(dtInsertBatch, SchemaType.Source);

        if (null != _monitor) _monitor.onLogMessage(dtInsertBatch.Columns.Count + " columns populated into source schema using " + Environment.NewLine + adapter.SelectCommand.CommandText, "TeradataFastloadAdapter.insert");

        if (0 == dtInsertBatch.PrimaryKey.Length)   // if no PK then set own insert command
        {
            if (null != _monitor) _monitor.onLogMessage("No primary key detected. Will insert using individual insert statements", "TeradataFastloadAdapter.insert");

            TdCommand insertCommand = connection.CreateCommand();
            insertCommand.CommandType = CommandType.Text;
            insertCommand.CommandText = "INSERT INTO " + tableName + " (" +
                                        String.Join(", ", getColumnNames(loadData)) + ") VALUES (" +
                                        String.Join(", ", getColumnNamesAsParameters(loadData)) + ")";

            // add one parameter per source column
            foreach (DataColumn colParm in loadData.Columns)
            {
                insertCommand.Parameters.Add(colParm.ColumnName, colParm.DataType);
            }

            if (null != _monitor) _monitor.onLogMessage("Starting insert into " + tableName, "TeradataFastloadAdapter.insert");
            updateCount = 0;

            // insert rows one by one inside a single transaction
            insertCommand.Transaction = connection.BeginTransaction();
            foreach (DataRow row in loadData.Rows)
            {
                foreach (TdParameter parm in insertCommand.Parameters)
                {
                    parm.Value = row[parm.ParameterName];
                }
                if (1 != insertCommand.ExecuteNonQuery())
                {
                    insertCommand.Transaction.Rollback();
                    if (null != _monitor) _monitor.onLogMessage("Error: Individual statement number " + updateCount + "/" + loadData.Rows.Count + " failed to complete", "TeradataFastloadAdapter.insert");
                    throw new Exception("Unable to load all records using individual statement calls");
                }
                updateCount++;
            }
            insertCommand.Transaction.Commit();
        }
        else      // insert using builder when PK is present
        {
            if (null != _monitor) _monitor.onLogMessage("Primary key detected. Will use batch load to insert records", "TeradataFastloadAdapter.insert");

            TdCommandBuilder builder = new TdCommandBuilder(adapter);
            adapter.SelectCommand.Transaction = connection.BeginTransaction();

            // add rows to insert and mark them Added so Update() generates INSERTs
            dtInsertBatch.Merge(loadData);
            foreach (DataRow dataRow in dtInsertBatch.Rows)
                if (dataRow.RowState == DataRowState.Unchanged) dataRow.SetAdded();

            if (null != _monitor) _monitor.onLogMessage("Starting batch insert into " + tableName, "TeradataFastloadAdapter.insert");

            updateCount = adapter.Update(dtInsertBatch);

            if (updateCount != dtInsertBatch.Rows.Count)
            {
                if (null != _monitor) _monitor.onLogMessage("Error: Expected insert count is " + loadData.Rows.Count + " but actual is " + updateCount, "TeradataFastloadAdapter.insert");
                throw new Exception("Unable to load all records using batch load");
            }
            else
            {
                if (null != _monitor) _monitor.onLogMessage("Batch insert successfully inserted " + updateCount + " rows", "TeradataFastloadAdapter.insert");
            }

            adapter.SelectCommand.Transaction.Commit();
        }
    }


NetFx 346 posts Joined 09/06
25 Feb 2015

Two suggestions:
1- Set the TdDataAdapter.UpdateBatchSize property to a number greater than 1. It defaults to 1; therefore the DataAdapter is sending one row at a time to the Teradata Database.
2- Set the TdParameter properties (Size, Precision, Scale) to match the target column when not using the TdCommandBuilder. A rough sketch of both follows.
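A minimal, untested sketch of both suggestions, reusing the adapter and insertCommand objects from your code (the column name, TdType, and precision/scale here are illustrative assumptions):

    // Suggestion 1: send rows to the database in batches instead of one per round trip
    adapter.UpdateBatchSize = 100;

    // Suggestion 2: size hand-built parameters to match the target column.
    // "AmountCol" and DECIMAL(18,2) are assumed for illustration only.
    TdParameter amount = new TdParameter("AmountCol", TdType.Decimal);
    amount.Precision = 18;
    amount.Scale = 2;
    amount.SourceColumn = "AmountCol";
    insertCommand.Parameters.Add(amount);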

ykazarov 4 posts Joined 08/14
02 Mar 2015

NetFx,
Thank you for the response.
The first suggestion cut the load time in half. Thank you!
One thing I found about the second is that precision and scale are not exposed on DataColumn objects, so I will have to find another way to speed things up.
https://social.msdn.microsoft.com/Forums/en-US/6b7f82db-abdd-484e-b49f-65475ef0ade9/adonet-datatable-numeric-precision-and-scale?forum=adodotnetdataset
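One workaround I may try (untested sketch, assuming parameter names match column names as in my code above): read NumericPrecision and NumericScale from the provider's schema table instead of the DataColumn:

    // DataColumn lacks precision/scale, but GetSchemaTable() exposes them
    // as NumericPrecision / NumericScale.
    using (TdCommand cmd = connection.CreateCommand())
    {
        cmd.CommandText = "SELECT * FROM " + tableName + " WHERE 1=0";  // metadata only
        using (TdDataReader reader = cmd.ExecuteReader(CommandBehavior.SchemaOnly))
        {
            DataTable schema = reader.GetSchemaTable();
            foreach (DataRow col in schema.Rows)
            {
                if (col["NumericPrecision"] == DBNull.Value) continue;  // non-numeric column
                string name = (string)col["ColumnName"];
                TdParameter parm = (TdParameter)insertCommand.Parameters[name];
                parm.Precision = Convert.ToByte(col["NumericPrecision"]);
                parm.Scale = Convert.ToByte(col["NumericScale"]);
            }
        }
    }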

ykazarov 4 posts Joined 08/14
02 Mar 2015

Another update:
I am getting an error when trying to insert 1,000,000 rows into a table with UpdateBatchSize = 10:

    Teradata.Client.Provider.TdException was caught
      HResult=-2147467259
      Message=[Teradata Database] [9128] The transaction exceeded the maximum number of rowhash locks allowed.
      Source=Teradata Database
      ErrorCode=-2147467259
      TxRolledBack=true
      StackTrace:
           at System.Data.Common.DbDataAdapter.UpdatedRowStatusErrors(RowUpdatedEventArgs rowUpdatedEvent, BatchCommandInfo[] batchCommands, Int32 commandCount)
           at System.Data.Common.DbDataAdapter.UpdatedRowStatus(RowUpdatedEventArgs rowUpdatedEvent, BatchCommandInfo[] batchCommands, Int32 commandCount)
           at System.Data.Common.DbDataAdapter.Update(DataRow[] dataRows, DataTableMapping tableMapping)
           at System.Data.Common.DbDataAdapter.UpdateFromDataTable(DataTable dataTable, DataTableMapping tableMapping)
           at System.Data.Common.DbDataAdapter.Update(DataTable dataTable)
    ...

NetFx 346 posts Joined 09/06
02 Mar 2015

This is a Teradata Database error/limit. The application starts a transaction and then attempts to load 1,000,000 rows; each row inserted takes a rowhash lock (the table has a primary key, i.e. a Unique Primary Index, Not Null), so the transaction exceeds the maximum number of rowhash locks.
Therefore I suggest you either 1) use auto-commit, 2) commit more often, or 3) find a way to lock the table for write before starting the batch update. A rough sketch of options 2 and 3 follows.
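A rough, untested sketch of option 2, reusing the names from your individual-insert path (chunkSize is an illustrative value):

    // Commit every chunkSize rows so rowhash locks are released periodically
    const int chunkSize = 10000;
    int pending = 0;
    insertCommand.Transaction = connection.BeginTransaction();
    foreach (DataRow row in loadData.Rows)
    {
        foreach (TdParameter parm in insertCommand.Parameters)
            parm.Value = row[parm.ParameterName];
        insertCommand.ExecuteNonQuery();
        if (++pending % chunkSize == 0)
        {
            insertCommand.Transaction.Commit();
            insertCommand.Transaction = connection.BeginTransaction();
        }
    }
    insertCommand.Transaction.Commit();

For option 3, Teradata SQL accepts a locking modifier ahead of the request, e.g. prepending "LOCKING TABLE " + tableName + " FOR WRITE " to the INSERT text, so the load takes a single table-level write lock instead of one rowhash lock per row.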
