02 Apr 2024
01 Apr 2024

teradatamlspk - Teradata Python package for running Spark workloads on Vantage

Details


Overview

teradatamlspk is a Python package built as an extension of teradataml, the Teradata Python package. The syntax and usability of the teradatamlspk APIs are kept similar to the PySpark APIs, so existing PySpark workloads that run on the Spark engine can be migrated to and run on Teradata Vantage with minimal changes.

teradatamlspk also offers a function, pyspark2teradataml, that converts a PySpark script into a teradatamlspk Python script. It also generates an HTML report of the conversion, which helps the user understand the changes that were made and carry out any remaining manual changes in the generated script so that it can be run on Vantage.
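As a rough illustration of the conversion step, the sketch below wraps the call in a small helper. It assumes that pyspark2teradataml can be imported from the top-level teradatamlspk package and that it accepts the path of the PySpark script as its argument; check both against the package documentation. The helper name convert_script and the example file name are hypothetical.

```python
from pathlib import Path


def convert_script(script_path):
    """Convert one PySpark script with pyspark2teradataml (hypothetical helper)."""
    path = Path(script_path)
    if not path.is_file():
        # Fail early with a clear message before touching teradatamlspk.
        raise FileNotFoundError(f"No PySpark script found at {path}")

    # Assumption: the function lives in the top-level package and takes
    # the script path. The import is deferred so this module can be loaded
    # on machines where teradatamlspk is not installed.
    from teradatamlspk import pyspark2teradataml

    pyspark2teradataml(str(path))


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    convert_script("my_pyspark_job.py")
```

After the conversion, the HTML report is the place to look for the parts of the script that still need manual changes.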

Dependent Python Packages: 

  • teradataml 20.0.0.0 or later
  • PrettyTable
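
The dependencies listed above can be checked before installing teradatamlspk. The following is a minimal sketch that uses only the Python standard library; it assumes the packages are installed under the distribution names teradataml and prettytable, and the helper names parse_version and check_dependencies are illustrative rather than part of any Teradata package.

```python
from importlib import metadata

# Minimum teradataml version stated in the dependency list.
MIN_TERADATAML = (20, 0, 0, 0)


def parse_version(text):
    """Turn a dotted version such as '20.0.0.0' into a 4-part integer tuple."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    while len(parts) < 4:
        parts.append(0)  # pad short versions so '20.0' compares as 20.0.0.0
    return tuple(parts)


def check_dependencies():
    """Return a list of problems; an empty list means both packages look fine."""
    problems = []
    try:
        installed = metadata.version("teradataml")
        if parse_version(installed) < MIN_TERADATAML:
            problems.append(f"teradataml {installed} is older than 20.0.0.0")
    except metadata.PackageNotFoundError:
        problems.append("teradataml is not installed")
    try:
        # Assumed distribution name for PrettyTable.
        metadata.version("prettytable")
    except metadata.PackageNotFoundError:
        problems.append("prettytable is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_dependencies():
        print(problem)
```

Running the script prints one line per missing or outdated dependency and prints nothing when both requirements are met.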

 

Technical Details

  • Version
  • Released
  • TTU
  • OS
  • Teradata


28 Mar 2024
18 Mar 2024
18 Mar 2024