How to Load a TSV File Into a Pandas DataFrame
Pandas DataFrames are among the most widely used data structures in data science. Using the Pandas library, we can load and read data from different types of files, such as csv, tsv, and xls.
Many users store their data in the tsv file format, so we should know how to load a tsv file and read data from it.
TSV stands for Tab-separated values. It is a simple text file format used to store data in a tabular structure.
For example, we can store a spreadsheet or database tables in the tsv format to exchange information between different databases.
The TSV file format is similar to the CSV file format, but in a .tsv file, the values in each row are separated by tabs instead of commas.
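As a small illustration of the difference, the sketch below builds the same table as tab-separated and as comma-separated text and splits each line on its delimiter. The sample rows are made up for this example and do not come from any file used in this tutorial.

```python
# Hypothetical sample data, invented for illustration only.
tsv_text = "name\tcity\tscore\nAlice\tParis\t90\nBob\tRome\t85\n"
csv_text = "name,city,score\nAlice,Paris,90\nBob,Rome,85\n"

# Splitting each line on its delimiter recovers the same table from both formats.
tsv_rows = [line.split("\t") for line in tsv_text.splitlines()]
csv_rows = [line.split(",") for line in csv_text.splitlines()]

print(tsv_rows[0])           # ['name', 'city', 'score']
print(tsv_rows == csv_rows)  # True
```

Only the delimiter differs between the two texts; the rows and fields are identical.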
In this tutorial, we will demonstrate how to load a tsv file into a Pandas DataFrame, with several examples of reading tsv file data.
Basic Syntax for Reading a TSV File Using Pandas
The syntax pd.read_csv(file_path, sep='\t') is used to read a tsv file into a pandas DataFrame.
Loading tsv file data into a Pandas DataFrame is a simple process. First, we import the required module, and then we load the tsv file using the above syntax.
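A minimal, self-contained sketch of this syntax follows. It assumes made-up sample data held in memory with io.StringIO instead of a file on disk, so it can be run anywhere; with a real file, the path would be passed in its place.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of a .tsv file.
tsv_text = "name\tcity\tscore\nAlice\tParis\t90\nBob\tRome\t85\n"

# read_csv accepts a file path or a file-like object;
# sep="\t" tells it to split each line on tab characters.
dataframe = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(dataframe.shape)          # (2, 3)
print(list(dataframe.columns))  # ['name', 'city', 'score']
```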
Load a TSV File Using Pandas DataFrame
To load a tsv file into a pandas DataFrame, use the read_csv() method.
Pass \t as the separator so that the tsv file is split on tabs when it is loaded into the pandas DataFrame.
In the following example, we load a tsv file into a pandas DataFrame by passing the file path and the separator \t as arguments to read_csv(file_path, sep='\t').
import pandas as pd
# testdata.tsv is stored on the local PC
dataframe = pd.read_csv("C:\\Users\\DELL\\OneDrive\\Desktop\\testdata.tsv", sep="\t")
dataframe
Output:
If we do not pass the separator \t as an argument along with the file path, read_csv() uses its default comma separator, and we will receive the following output on the terminal.
import pandas as pd
# testdata.tsv is stored on the local PC
dataframe = pd.read_csv("C:\\Users\\DELL\\OneDrive\\Desktop\\testdata.tsv")
dataframe
Output:
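The exact result depends on the contents of testdata.tsv. As a hedged sketch with made-up in-memory data (the sample rows and the io.StringIO stand-in for the file are assumptions for illustration), the effect of omitting sep can be checked as follows:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of a .tsv file.
tsv_text = "name\tcity\tscore\nAlice\tParis\t90\nBob\tRome\t85\n"

# With sep="\t", each tab-separated field becomes its own column.
with_tab = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Without sep, read_csv splits on commas. This data contains no commas,
# so every line is read as a single field.
default_sep = pd.read_csv(io.StringIO(tsv_text))

print(with_tab.shape)     # (2, 3)
print(default_sep.shape)  # (2, 1)
```

In this sketch, the whole header line ends up as one column name and each row as one string, which is why the separator should be passed explicitly for tsv files.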
Load a TSV File Into a Pandas DataFrame Using the header Argument
We can pass header as an argument to the read_csv() method. If the first row of the dataset contains the column names, use header=0 as an argument.
import pandas as pd
# testdata.tsv is stored on the local PC
dataframe = pd.read_csv(
    "C:\\Users\\DELL\\OneDrive\\Desktop\\testdata.tsv", sep="\t", header=0
)
dataframe
Output:
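To see what header=0 does without relying on testdata.tsv, the sketch below uses made-up in-memory data (an assumption for illustration) and compares it with header=None, the read_csv() option for files that have no header row.

```python
import io

import pandas as pd

# Hypothetical sample data; the first line holds the column names.
tsv_text = "name\tcity\tscore\nAlice\tParis\t90\nBob\tRome\t85\n"

# header=0: the first line supplies the column names.
with_header = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=0)

# header=None: no line is treated as a header, so the first line
# becomes a data row and the columns are numbered 0, 1, 2.
no_header = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=None)

print(list(with_header.columns))  # ['name', 'city', 'score']
print(list(no_header.columns))    # [0, 1, 2]
print(no_header.shape)            # (3, 3)
```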
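Similarly, we can use multiple rows as a header by passing a list of row numbers, which gives the DataFrame multi-level (MultiIndex) columns. Row numbers are zero-based, so header=[1, 2, 3] uses the second, third, and fourth rows of the file as the header and skips the first row. To use the first three rows instead, pass header=[0, 1, 2].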
The following example implements this approach:
import pandas as pd
# testdata.tsv is stored on the local PC
dataframe = pd.read_csv(
    "C:\\Users\\DELL\\OneDrive\\Desktop\\testdata.tsv", sep="\t", header=[1, 2, 3]
)
dataframe
Output:
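The result again depends on the contents of testdata.tsv. As a hedged sketch, the made-up data below (an assumption for illustration) has one leading line followed by three header lines, so header=[1, 2, 3] can be checked in isolation:

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only.
tsv_text = (
    "ignored\tignored\tignored\n"  # row 0: comes before the header rows, skipped
    "sales\tsales\tcosts\n"        # row 1: first header level
    "q1\tq2\tq1\n"                 # row 2: second header level
    "usd\tusd\tusd\n"              # row 3: third header level
    "10\t20\t5\n"                  # data
    "30\t40\t15\n"                 # data
)

# Rows 1, 2, and 3 (zero-based) are combined into three-level column labels.
dataframe = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=[1, 2, 3])

print(dataframe.columns.nlevels)   # 3
print(dataframe.shape)             # (2, 3)
print(list(dataframe.columns)[0])  # ('sales', 'q1', 'usd')
```

Each column label is a tuple with one entry per header row, so a single column can be selected with, for example, dataframe[("sales", "q1", "usd")].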
Conclusion
This tutorial showed how to load a tsv file into a Pandas DataFrame, using several examples.
Try the examples above in your own Python notebook to understand them better.