All Forums › Connectivity
helmut.karg 4 posts Joined 01/10
28 Jul 2010
.Net Provider - Column Character Set


Is there a way to tell the character set from a char/varchar column in a TdDataReader?

The DBC tables contain a field "CharType" which is 1 for "character set latin" and 2 for "character set unicode" columns. The "ColumnLength" is also doubled for unicode.

Using the GetSchemaTable command produces identical rows for both character sets.

There is a workaround: querying the DBC tables directly (which does not work in every case for generic queries). But is there a more elegant way?


NetFx 346 posts Joined 09/06
28 Jul 2010

The TdDataReader.GetSchemaTable method does not return CharType (the column Character Set, not to be confused with the Session Character Set). However, the Columns schema collection (the TdConnection.GetSchema("Columns", ...) method) does return CharType.

That said, there is another way; for example, given the following table:


The TYPE function returns the data type, including the character set: Select Type(GetCharType.C02), Type(GetCharType.C03)

Type(C02) Type(C03)
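A minimal sketch of such a table, assuming C02 is a LATIN column and C03 a UNICODE column (the column names follow the query above; the lengths and the INTEGER column are illustrative guesses):

```sql
-- Hypothetical reconstruction of the GetCharType table
CREATE TABLE GetCharType (
    C01 INTEGER,
    C02 VARCHAR(10) CHARACTER SET LATIN,
    C03 VARCHAR(10) CHARACTER SET UNICODE
);

-- TYPE reports the full data type, character set included
SELECT TYPE(GetCharType.C02), TYPE(GetCharType.C03);
```

The returned type strings spell out the character set, so a client can distinguish LATIN from UNICODE columns without touching the DBC views.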

helmut.karg 4 posts Joined 01/10
29 Jul 2010


This requires that the columns exist in a physical table.
I'd like to let a user enter a statement, which is then executed and the result exported to a SQL Server database. The problem is that I don't know whether to use char or nchar for the destination table.

My current workaround consists of:
- parsing the statement to remove an "order by" clause
- creating a (volatile) table with no data
- retrieving the metadata of the table
- submitting the original query
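
The first three steps of that workaround can be sketched as follows (the user query and table names are hypothetical, and reading the metadata back via HELP COLUMN is one possible variant):

```sql
-- Hypothetical user query, with its trailing ORDER BY already stripped
CREATE VOLATILE TABLE meta_probe AS
  ( SELECT cust_name, cust_note FROM customers )
WITH NO DATA
ON COMMIT PRESERVE ROWS;

-- Char Type in the output: 1 = LATIN, 2 = UNICODE
HELP COLUMN meta_probe.*;
```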

What gives me a headache is parsing the statement, since that is something I'd always try to avoid (and an "order by" may also be part of an analytic function).

Is there no better way?


NetFx 346 posts Joined 09/06
03 Aug 2010

I agree that CharType can help in some scenarios. But Teradata Database has the concept of a Session Character Set. All Char/VarChar data is converted to the Session Character Set before it is sent to the Data Provider. For example, assume table FOO has two columns:
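A plausible definition, assuming Col1 is LATIN and Col2 is UNICODE (the lengths are illustrative):

```sql
CREATE TABLE FOO (
    Col1 VARCHAR(100) CHARACTER SET LATIN,
    Col2 VARCHAR(100) CHARACTER SET UNICODE
);
```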


If the application sets the Session Character Set to KANJISJIS_0S, then Unicode data (Col2) is always converted to KanjiSJIS. Therefore, in some scenarios the Session Character Set is very important and it overshadows the underlying Column Character Set. I recommend using the SQL Server NCHAR data type and setting the Session Character Set to UTF8 or UTF16.

helmut.karg 4 posts Joined 01/10
04 Aug 2010


In some (or most?) cases, this is a complete waste of space: a typical "flag" column (char(1) character set latin not null compress ('Y', 'N')) takes about 2 bits on Teradata but 2 bytes as nchar(1) on SQL Server.

SSIS also maps everything to nchar/nvarchar by default :(

- Helmut

omng392 8 posts Joined 06/12
23 Oct 2013

It is dbc.Columns, not dbc.Tables.
If you look in the dbc.Columns view you will find what you are looking for: if CharType = 1 the column is character set latin, and if CharType = 2 it is unicode.
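A minimal lookup along those lines, assuming hypothetical database and table names:

```sql
SELECT ColumnName, CharType, ColumnLength
FROM DBC.Columns
WHERE DatabaseName = 'MyDb'      -- hypothetical
  AND TableName    = 'MyTable'   -- hypothetical
  AND CharType IN (1, 2)         -- 1 = LATIN, 2 = UNICODE
ORDER BY ColumnId;
```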

RickWassum 3 posts Joined 04/11
04 Jun 2014

It is now almost 4 years since Helmut's post, and I have exactly the same issue. I need to know whether a column uses double-byte characters so I know whether to declare char/varchar vs. nchar/nvarchar in SQL Server. It would be extremely useful for GetSchemaTable() to return a column indicating whether the column uses double-byte characters. Even a ByteSize column would do: you could compare ColumnSize to ByteSize and determine whether double-byte characters are in play. The fact that ColumnSize returns different values (maximum number of bytes or of characters, depending on character set) means the provider must know, so why not give us a column in the SchemaTable that lets us in on the secret? Otherwise there is no way to tell from the SchemaTable.
From the Teradata.Net documentation for ColumnSize:
The maximum possible length of the column in bytes or characters.

  • Numeric Data Types: This is the size of the data type in bytes (for example, 4 for INTEGER).
  • Character and CLOB Data Types: The maximum length of the column in characters for LATIN, UNICODE and GRAPHIC. The maximum length of the column in bytes (single-byte character) for KANJI1 and KANJISJIS.
  • Byte and BLOB Data Types: The maximum length of the column in bytes.
  • DateTime and Interval Data Types: The maximum length of the column in characters.
  • JSON Data Type: The maximum length of the column in characters.

As Helmut said, currently the only way to find out whether a column is Unicode is to query the dbc.columns data and check for CharType = 2. But that is inefficient and problematic when processing the schema of ad hoc queries, where determining the database, table and column that were the source of a particular result column would require a full-blown query parser to do correctly. The alternative, as Helmut also suggested, is creating an empty table based on the query and then getting its schema info, but that has challenges as well.
Bottom line: all the alternatives are ugly and error-prone, requiring way too much effort for something that should be easy to determine from the SchemaTable returned by TdDataReader.GetSchemaTable().
If anyone is listening, can you throw us a bone? :)

NetFx 346 posts Joined 09/06
04 Jun 2014

A workaround that does not require parsing: TdDataReader.GetSchemaTable() returns "BaseSchemaName", "BaseTableName" and "BaseColumnName"; from these you can compose a "HELP COLUMN BaseSchemaName.BaseTableName.BaseColumnName" command, which will return the CharType without any parsing.
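For example, with hypothetical Base* values read from the schema table:

```sql
-- Suppose the schema table reported BaseSchemaName = 'SalesDb',
-- BaseTableName = 'Orders', BaseColumnName = 'CustomerName'
-- (all three names are hypothetical)
HELP COLUMN SalesDb.Orders.CustomerName;
-- Inspect the "Char Type" field of the result: 1 = LATIN, 2 = UNICODE
```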
We will try to add it to the next release.

RickWassum 3 posts Joined 04/11
06 Jun 2014

Thank you for the quick response; it is very much appreciated. Using the Base* values from the schema table does seem to work nicely for retrieving the CharType, though if you can add it to the SchemaTable that would be awesome, and far more efficient than running additional queries.
Currently I'm using the Base* values to query dbc.columns instead of HELP COLUMN so I can limit the information returned to just what I need. Is there any reason why I should prefer HELP COLUMN over querying dbc.columns?
Querying dbc.columns also lets me retrieve the CharType and any other info I need for all the tables and columns in one call, limited to just character columns. HELP TABLE BaseSchemaName.BaseTableName would not let me limit the results to character columns.
Again, I don't know if there is an advantage to using the HELP statements vs. querying dbc.columns that I'm unaware of. If so, let me know.
Thanks again...

NetFx 346 posts Joined 09/06
07 Jun 2014

The DBC.Columns[x] views are obsolete; the replacement system views are DBC.ColumnsV[x]. However, you should use HELP COLUMN, because DBC.ColumnsV[x] does not return CharType (or any other data type metadata) for View-Columns (as opposed to Table-Columns). Note that HELP TABLE does not return CharType for View-Columns either. You can use either "HELP COLUMN BaseSchema.BaseTable.*" or "HELP COLUMN BaseSchema.BaseTable.BaseColumnName". You can also compose a multi-statement request, for example "HELP COLUMN X.Y.C01; HELP COLUMN X.Y.C222".

RickWassum 3 posts Joined 04/11
15 Aug 2014

Thank you again for your comment. I got pulled off onto another project and I'm just getting back to the project relating to this issue. The problem I have with the HELP functions is that they apparently require different permissions than a SELECT on the dbc.Tables* or dbc.Columns* views: if I call them for an object in a database/table that is only exposed to me via a view in another database, I get an error that I have no permissions for the object, so that is a problem. I can, however, query dbc.ColumnsV for the BaseSchema, BaseTable and BaseColumnName, and that works OK.
Also, in a previous post you considered the possibility of adding the CharType info to the data returned by TdDataReader.GetSchemaTable(). I don't know where that stands, but I have another field that I need. The UpperCaseFlag is required to determine whether the field is forced to uppercase, but more importantly it indicates whether the column is Case Specific or not. Whether a column is case sensitive is important to know, and like CharType it is not available in the current SchemaTable info. It could be returned as a char, as in UpperCaseFlag, or as two separate boolean properties (IsUpperCase, IsCaseSpecific), or whatever convention you use for boolean properties. In any event, if you add CharType, it would be very helpful to add the UpperCaseFlag info as well.

scott.currie 1 post Joined 02/13
11 Feb 2015

We are hitting this same issue with our product, which supports Teradata as a data provider. While some of the workarounds above can be used, they all add substantially to the processing time for column metadata discovery. When our customers ask why "builds that use Teradata are so slow," we will have to explain that Teradata does not provide a mechanism to obtain character type information for the columns of a query without hitting the server multiple times.
I also really don't like the extra complexity we must add to our code to group together the HELP COLUMN calls so that we don't have to do them on every single column of every query our shared customers might use.
You've known about this issue for at least 5 years now.  You should really consider fixing it.
