How to Merge CSV Files in Python
-
Combine Multiple CSV Files in a Single Pandas
DataFrame
Using Merging by Names -
Merge Multiple CSV Files in a Single Pandas
DataFrame
by Merging All Fields - Conclusion
While working with a large dataset in the form of .csv files in Pandas DataFrame
, it might be possible that a single file does not contain the complete information for data analysis. In this case, we need to merge multiple files in a single pandas DataFrame
. Python pandas library provides various methods to solve this problem, such as concat
, merge
, and join
.
In this guide, we will learn two different methods for merging the multiple .csv files into a single Pandas DataFrame
with the help of different examples.
Combine Multiple CSV Files in a Single Pandas DataFrame
Using Merging by Names
To merge multiple .csv files, first, we import the pandas library and set the file paths. Then, using the pd.read_csv()
method reads all the CSV files. The pd.concat()
method takes the mapped CSV files as an argument and then merges them by default along the row axis. The ignore_index=True
argument is used to set the continuous index values for the newly merged DataFrame
.
See the following example we have implemented the approach as mentioned above using pandas python:
Example Code:
import pandas as pd
# set files path
sales1 = "C:\\Users\\DELL\\OneDrive\\Desktop\\salesdata1.csv"
sales2 = "C:\\Users\DELL\\OneDrive\\Desktop\\salesdata2.csv"
print("*** Merging multiple csv files into a single pandas dataframe ***")
# merge files
dataFrame = pd.concat(map(pd.read_csv, [sales1, sales2]), ignore_index=True)
print(dataFrame)
Output:
*** Merging multiple csv files into a single pandas dataframe ***
Product_Name Quantity Sale_Price
0 Acer laptop 3 500$
1 Dell Laptop 6 700$
2 Hp laptop 8 800$
3 Lenavo laptop 2 600$
4 Acer laptop 3 500$
5 Dell Laptop 6 700$
6 Hp laptop 8 800$
7 Lenavo laptop 2 600$
Merge Multiple CSV Files in a Single Pandas DataFrame
by Merging All Fields
To merge all .csv files in a pandas DataFrame
, we used the glob module in this approach. First, we had to import all libraries. After that, we set the path for all files that we need to merge.
In the following example, the os.path.join()
takes the file path as the first argument and the path components or .csv files to be joined as the second argument. Here, the salesdata*.csv
will match and return every file that starts with salesdata
in the specified home directory and ends with the .csv extension. The glob.glob(files_joined)
takes an argument of the merged file names and returns a list of all merge files.
See the following example to merge all .csv files using the glob module:
Example Code:
import pandas as pd
import glob
import os
# merging the files
files_joined = os.path.join(
"C:\\Users\\DELL\\OneDrive\\Desktop\\CSV_files", "salesdata*.csv"
)
# Return a list of all joined files
list_files = glob.glob(files_joined)
print("** Merging multiple csv files into a single pandas dataframe **")
# Merge files by joining all files
dataframe = pd.concat(map(pd.read_csv, list_files), ignore_index=True)
print(dataframe)
Output:
** Merging multiple csv files into a single pandas dataframe **
Product_Name Quantity Sale_Price
0 Acer laptop 3 500$
1 Dell Laptop 6 700$
2 Hp laptop 8 800$
3 Lenavo laptop 2 600$
4 Acer laptop 3 500$
5 Dell Laptop 6 700$
6 Hp laptop 8 800$
7 Lenavo laptop 2 600$
Conclusion
We introduced two approaches to merge multiple CSV files in pandas python in this tutorial. We have seen how we can read .csv files and merge them into a single Pandas DataFrame
using the pd.concat()
method. Moreover, we now know how to use the glob
module in Pandas python code.
Related Article - Python CSV
- How to Import Multiple CSV Files Into Pandas and Concatenate Into One DataFrame
- How to Split CSV Into Multiple Files in Python
- How to Compare Two CSV Files and Print Differences Using Python
- How to Convert XLSX to CSV File in Python
- How to Write List to CSV Columns in Python
- How to Write to CSV Line by Line in Python