teradatamlspk - Teradata Python package for running Spark workloads on Vantage

  • python

Details

Overview

teradatamlspk is a Python package built as an extension of teradataml, the Teradata Python package. The syntax and usability of teradatamlspk APIs are kept similar to those of PySpark, so existing PySpark workloads that run on a Spark engine can be migrated to Teradata Vantage and run there with minimal changes.

teradatamlspk also provides the function pyspark2teradataml, which converts a PySpark script to a teradatamlspk Python script. It also generates an HTML report of the conversion, which helps the user understand the changes made and identify any manual changes still needed in the generated script before it can run on Vantage.
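
As a rough illustration of the conversion step, the sketch below wraps the call in a small helper. The helper name convert_pyspark_script is hypothetical, and calling pyspark2teradataml with a single script path is an assumption about its signature; check the teradatamlspk documentation for the exact arguments. The import is guarded so the snippet still runs where the package is not installed.

```python
from pathlib import Path


def convert_pyspark_script(script_path):
    """Hypothetical helper: convert one PySpark script if teradatamlspk is available.

    Returns True when the conversion was attempted, False when the
    teradatamlspk package is not installed on this machine.
    """
    path = Path(script_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such PySpark script: {path}")

    try:
        # Imported lazily so the helper can be loaded without the package.
        from teradatamlspk import pyspark2teradataml
    except ImportError:
        return False

    # Assumption: the function takes the path of the script to convert.
    pyspark2teradataml(str(path))
    return True
```

The generated script and its HTML report should then be reviewed by hand before the script is run on Vantage.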

Dependent Python Packages:

  • teradataml >= 20.00.00.03
  • PrettyTable
  • Nbformat
  • pytz

Prerequisite: Python >= 3.9.0 on the client machine
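
The version requirements above can be checked on the client machine before installing. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and it assumes the installed teradataml version string is purely numeric and dot-separated, like the one listed above.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9, 0)
MIN_TERADATAML = "20.00.00.03"


def version_tuple(version):
    """Turn a dotted version such as '20.00.00.03' into (20, 0, 0, 3)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Compare two dotted numeric versions component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:3] < MIN_PYTHON:
        problems.append("Python >= 3.9.0 is required")
    try:
        installed = metadata.version("teradataml")
    except metadata.PackageNotFoundError:
        problems.append("teradataml is not installed")
    else:
        try:
            if not meets_minimum(installed, MIN_TERADATAML):
                problems.append(f"teradataml {installed} is older than {MIN_TERADATAML}")
        except ValueError:
            # Non-numeric version component (for example a pre-release tag).
            problems.append(f"could not parse teradataml version {installed!r}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Comparing integer tuples rather than raw strings avoids mistakes with zero-padded components such as "00" and "03".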

OS version: Not Applicable
