
Teradata Package for Python - teradataml

Teradata Python Package Product Overview

Note: Teradata recommends installing teradataml with pip from https://pypi.org/project/teradataml/.
Download it from downloads.teradata.com only if your organization does not allow you to install directly from https://pypi.org/project/teradataml/.

The Teradata Python Package teradataml combines the benefits of the open source Python language environment with the massively parallel processing capabilities of Teradata Vantage, which includes the Machine Learning Engine analytic functions and the Advanced SQL Engine in-database analytic functions. The Teradata Python package allows users to develop and run Python programs that take advantage of the Big Data and Machine Learning analytics capabilities of Vantage.

The Teradata Python package teradataml is a Python library package like other open source Python packages. The package interface makes available to Python users a collection of analytic functions that reside on Vantage, so that Python users can perform analytics with no SQL coding required. Specifically, the teradataml package provides functions for data manipulation and transformation, data filtering and subsetting, and can be used in conjunction with open source Python libraries. The teradataml package uses SQLAlchemy and provides an interface similar to the Pandas Python library.
The Teradata Python Package works over connections to:
    -  Teradata Vantage with Advanced SQL Engine and ML Engine
    -  Teradata Vantage with Advanced SQL Engine only
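
As a rough illustration of the Pandas-like interface described above, the following sketch connects to Vantage, filters a table without writing SQL, and pulls a few rows into pandas. The helper name `preview_rows`, the table and column names, and the connection values are hypothetical. `create_context`, `DataFrame`, and `remove_context` are teradataml functions, and the sketch assumes a reachable Vantage system.

```python
def preview_rows(host, username, password, table_name, column, minimum, n=5):
    """Return up to n rows of table_name where column > minimum, as a pandas DataFrame.

    Hypothetical helper: every name passed in is a placeholder.
    """
    # Imported lazily so this module still loads where teradataml is not installed.
    from teradataml import DataFrame, create_context, remove_context

    create_context(host=host, username=username, password=password)
    try:
        df = DataFrame(table_name)                     # references the table in Vantage
        filtered = df[getattr(df, column) > minimum]   # Pandas-style filter, no SQL written
        return filtered.head(n).to_pandas()            # bring the previewed rows to the client
    finally:
        remove_context()                               # always close the connection
```

A call might look like `preview_rows("vantage.example.com", "alice", "<password>", "sales", "amount", 100)`, where the host, user, table, and column are made up for the example.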

teradataml is now compatible with SQLAlchemy 2.0.X

 * **Important notes** when the installed SQLAlchemy version is >= 2.0:
      * Users cannot run the `execute()` method on the SQLAlchemy engine object returned by
        the `get_context()` and `create_context()` teradataml functions, because SQLAlchemy has
        removed support for the `execute()` method on the engine object.
        In user scripts that call `get_context().execute()` or `create_context().execute()`,
        Teradata recommends replacing those calls with either the `execute_sql()` function exposed by teradataml
        or the `exec_driver_sql()` method on the `Connection` object returned by the `get_connection()` function
        in teradataml.

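     from teradataml import execute_sql, get_connection

     # Run a SQL statement through teradataml directly.
     execute_sql("DROP TABLE test_select")

     # Or run a raw SQL string on the Connection object.
     get_connection().exec_driver_sql("select sessionno from DBC.SessionInfoV where UserName = 'alice';")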

      * `get_connection().execute()` now accepts only an executable SQLAlchemy object. Refer to
        `sqlalchemy.engine.base.Connection.execute()` for more details.
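
One way to apply both notes in a script is a small dispatcher that sends raw SQL strings to `exec_driver_sql()` and everything else to `execute()`. The helper name `run_statement` is hypothetical and is not part of teradataml or SQLAlchemy; the sketch only assumes that the connection object exposes those two methods, as the `Connection` returned by `get_connection()` does.

```python
def run_statement(connection, statement, parameters=None):
    """Run a statement on a SQLAlchemy-style Connection.

    Hypothetical helper, not part of teradataml or SQLAlchemy.
    """
    if isinstance(statement, str):
        # A raw SQL string is not an executable object under SQLAlchemy 2.0,
        # so hand it straight to the driver.
        return connection.exec_driver_sql(statement, parameters)
    # Anything else is assumed to be an executable object,
    # for example sqlalchemy.text("...") or a select() construct.
    return connection.execute(statement, parameters)
```

With teradataml this could be called as `run_statement(get_connection(), "select sessionno from DBC.SessionInfoV")`, or with a `sqlalchemy.text(...)` object in place of the string.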

Download teradatasqlalchemy from:

https://downloads.teradata.com/download/package/202545

OR

https://pypi.org/project/teradatasqlalchemy/

tdapiclient: Integration of Teradata Vantage with AWS SageMaker and Azure-ML

The tdapiclient Python library lets AWS SageMaker and Teradata users train and predict with a teradataml DataFrame through the AWS SageMaker Python library's interface. tdapiclient transparently converts the teradataml DataFrame to an S3 address to be used for training, and it also lets users pass a teradataml DataFrame as input for inference.

tdapiclient also gives Azure-ML and Teradata users an easier interface to train and predict with a teradataml DataFrame. tdapiclient transparently converts the teradataml DataFrame to an azure-ml dataset or blob store to be used for training, and it lets users pass a teradataml DataFrame as input for inference. Additionally, tdapiclient can deploy azure-ml trained models in a Teradata Vantage system for in-database scoring using BYOM functionality.

Teradata recommends downloading the tdapiclient library from PyPI: https://pypi.org/project/tdapiclient/.

Download it from https://downloads.teradata.com/download/connectivity/tdapiclient-teradata-third-party-analytics-integration-python-library if your organization does not allow you to install directly from PyPI.

 

teradatamlspk - Teradata Python package for running Spark workloads on Vantage

teradatamlspk is a Python package built as an extension of teradataml, the Teradata Python package. The syntax and usage of the teradatamlspk APIs are kept similar to the PySpark APIs, so that existing PySpark workloads that run on the Spark engine can be migrated to Teradata Vantage and run there with minimal changes.

teradatamlspk also offers a function, pyspark2teradataml, that converts a PySpark script to a teradatamlspk Python script. It also generates an HTML report of the conversion, which helps the user understand the changes made and carry out any remaining manual changes in the generated script so that the script can be run on Vantage.

Teradata recommends downloading the teradatamlspk library from PyPI: https://pypi.org/project/teradatamlspk/.

Download it from https://downloads.teradata.com/download/connectivity/teradatamlspk-teradata-python-package-running-spark-workloads-vantage if your organization does not allow you to install directly from PyPI.

General product information is available in the Teradata Documentation Website.

Teradata Python Package User Guide – B700-4006

Teradata Python Package Function Reference – B700-4008

For community support, please visit the Connectivity Forum.

For Teradata customer support, please visit Teradata Access.
