Python Teradata Connection
-
the
Teradata
Database in Python -
Use
PyODBC
Library to Connect toTeradata
in Python -
Use
Teradata
SQL to Connect toTeradata
in Python -
Use
Teradata
Module to Connect toTeradata
in Python
This Python article will show how to connect to Teradata
using Python. The options to connect to Teradata
and export tables to Pandas are numerous.
This article will discuss some essential proper guidelines and look at how the Teradata
module works before we discuss ways of connecting.
the Teradata
Database in Python
You can script powerful interactions with the Teradata
Database using Python with the Teradata
module.
Adopting the udaSQL
philosophy offers a DevOps-oriented SQL engine in which developers can concentrate on their SQL logic rather than performing external configuration, query banding, and logging.
We can download the Teradata
module under the MIT license. The next step is to install and download the package from PyPI
.
Due to its open-source nature, this module is supported by the community. However, customers and engineers are not provided support for the interoperability of this module with third-party applications, such as Teradata
’s ODBC driver
and sqlalchemy-Teradata
.
Use PyODBC
Library to Connect to Teradata
in Python
The Pandas data frame can be created with SQL and uploaded to Teradata
with a Teradata
data frame.
- An environment with Pandas installed in Python.
- The
Teradata
database’s hostname/IP address and connection method are known. - An
ODBC
driver must be installed on the machine you are attempting to connect toTeradata
.
Alternatively, you can contact your company’s DBA if you are unsure of Teradata
’s database details. Without the Teradata
ODBC
driver, you may need to use JDBC
and PySpark
.
In this tutorial, we will also discuss other methods.
Use the PyODBC
Library
With PyODBC
, you can easily connect to ODBC
databases using Python. Furthermore, this library implements the DB API 2.0 specification with even more Pythonic features.
In your Python environment, you should install the PyODBC
library. Then, you can install the library with either Conda
or Pip
.
pip install pyodbc
OR
conda install pyodbc
A Python package called pip
is used to install packages, libraries, and modules. For example, after pip
has installed the PyODBC
module and Teradata
’s ODBC
driver, run the following Python code to list the existing drivers.
Verify that Teradata
appears in this list by paying attention to the Teradata
driver’s name.
pyodbc.drivers()
Extract Data From Teradata
Into Pandas Data Frame
The following code fence shows how to extract data from SQL into Pandas data frames. But, first, check the next output data frame.
teradata_df.head()
Python’s df.head()
always returns the first above 5
rows. So here, it will show the first five rows from the default table of Teradata
from 0-4
.
teradata_df.info()
There is much pertinent information in the connection string, such as the hostname
, driver
, username
, password
, and authentication protocol
.
You may need to pass optional parameters depending on your Teradata
settings. Parameters that are ODBC-compatible can be passed to PyODBC
.
Upload Data Frame to Teradata
From Pandas
The process of pulling data is simple, but the process of uploading data is more complicated.
- The Pandas data frame needs to be transformed into a schema.
- We should break up the data frame into chunks. The
ODBC
database has a maximum of1MB
per insert, so it will fail if your data frame is large. - We should insert the records sequentially.
You can upload the data frame by following the code below.
cnxn.commit()
print("Query complete. Running time is %s sec/s." % (round(end_time - start_time)))
The cnxn.commit()
command will commit the changes and make them permanent.
Use Teradata
SQL to Connect to Teradata
in Python
To use this package, you do not need to install Teradata
drivers (other than this one).
import teradatasql
with teradatasql.connect(host="name", user="name", password="*****") as connect:
df = pd.read_sql(query, connect)
Once the terasql
imports, the Teradata
will connect with the following parameters host
, username
& password
. Then after a successful connection, the query will be read and executed.
Another way is to use the Giraffez
module. This module has many useful features, such as MLOAD
, FASTLOAD
, BULKEXPORT
, etc. However, there are only a few requirements for beginners (e.g., C/C++ compiler, Teradata CLIv2
, and TPT API headers/lib
files).
Please note that the context manager has been updated to ensure that sessions are closed as of 13-07-2018
. DF
can be used to send data to Teradata
.
Using the rest
method, we can eliminate the 1MB
limit imposed by odbc
and the dependence on the odbc
driver. We should use the host IP address in place of the driver argument.
import teradata
import pandas as pd
udaExec = teradata.UdaExec(appName="webApp", version="1.0", logConsole=False)
with udaExec.connect(
method="rest_one",
system="DB_Name",
username="user_name",
password="*******",
host="HOST_IP_ADDRESS",
) as connect:
data = [tuple(x) for x in df.to_records(index=False)]
connect.executemany(
"INSERT INTO DATABASE.TABLEWITH5COL values(?,?,?,?,?)", data, batch=True
)
To avoid the HY001[ODBC Teradata Driver] Memory allocation error
, chunk your data into less than 1MB
chunks when using the ODBC
Teradata
Driver. For example:
import teradata
import pandas as pd
import numpy as np
udaExec = teradata.UdaExec(appName="test", version="1.0", logConsole=False)
with udaExec.connect(
method="odbc",
system="DBName",
username="User_Name",
password="*******",
driver="Driver_Name",
) as connect:
chunks_df = np.array_split(huge_df, 100)
for i, _ in enumerate(chunks_df):
data = [tuple(x) for x in chuncks_df[i].to_records(index=False)]
connect.executemany(
"INSERT INTO DATABASE.TABLEWITH5COL values(?,?,?,?,?)", data, batch=True
)
Following is another easy way to connect Teradata
with Python.
Use Teradata
Module to Connect to Teradata
in Python
By executing the following command if pip is already installed, you can install this module directly:
pip install Teradata
You can download the package at the following URL if you don’t already have it:https://pypi.org/pypi/teradata.
Once you download the teradata
package, unzip it and then use the command prompt to navigate to the directory that contains setup.py, plus execute the following command to install:
python setup.py install
Example code:
import teradata
import sys
udaExec = teradata.UdaExec(appName="HelloPeople", version="1.0", logConsole=False)
session = udaExec.connect(
method="odbc",
dsn="td16vm",
username="",
password="",
autocommit=True,
transactionMode="Teradata",
)
for row in session.execute("select getqueryband();"):
print(row)
for row in session.execute("select top 20 tablename, tablekind from dbc.tables;"):
print(row)
session.close()
input("Type <Enter> to exit...")
To connect to Teradata
, we must configure these parameters: transaction mode
is Teradata
; ODBC
is the connecting method
(the other option is REST
), DSN
is td16vm
, configured with the following parameters in the computer.
In the next step, you will need to create a virtual machine for Teradata
. Here’s the running result for the above sample code:
As discussed, several ways of connecting Teradata
to Python exist. With step-by-step directions, all the possible ways to connect the Teradata
module in Python have been revealed here.
My name is Abid Ullah, and I am a software engineer. I love writing articles on programming, and my favorite topics are Python, PHP, JavaScript, and Linux. I tend to provide solutions to people in programming problems through my articles. I believe that I can bring a lot to you with my skills, experience, and qualification in technical writing.
LinkedIn