How to Concatenate Multiple Files Into a Single File in Python
- Concatenate Multiple Files Into a Single File in Python
- Alternative Methods to Concatenate Multiple Files Into a Single File in Python
- Conclusion
Python is a robust and general-purpose programming language heavily used in many domains these days.
Python’s simple syntax and built-in features, such as support for object-oriented programming, automatic memory management, and file handling, make many common tasks straightforward.
We can easily create files, read files, append data, or overwrite data in existing files using Python. It can handle almost all the available file types with the help of some third-party and open-source libraries.
This article teaches how to concatenate multiple files into a single file using Python.
Concatenate Multiple Files Into a Single File in Python
To concatenate multiple files into a single file, we have to iterate over all the required files, read their data, and write it to a new file. The following Python code takes this approach, using three small text files as input.
1.txt:
This is one.
2.txt:
This is two.
3.txt:
This is three.
Example code:
filenames = ["1.txt", "2.txt", "3.txt"]

with open("new-file.txt", "w") as new_file:
    for name in filenames:
        with open(name) as f:
            # Copy the current file line by line
            for line in f:
                new_file.write(line)
        # Separate the contents of consecutive files with a newline
        new_file.write("\n")

print("Concatenation complete. Check 'new-file.txt' for the result.")
The Python code above starts with a list of filenames or file paths to the required text files. Next, it opens new-file.txt in write mode, which creates the file if it does not exist and overwrites it if it does.
Then, it iterates over the list of filenames or file paths. For each one, it opens the file, reads its content line by line, and writes each line to new-file.txt.
After the last line of each file, it writes a newline character (\n) to the new file so that the contents of consecutive files do not run together.
Output:
Concatenation complete. Check 'new-file.txt' for the result.
Note: Because new-file.txt is a relative path, the file is created in the current working directory. This is the script's own directory when the script is run from there.
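The loop above always writes a newline after each file, so an input that already ends with a newline produces a blank line in the result. The following is a minimal sketch of one way to avoid that, assuming text files that fit the default encoding. The function name concatenate_text_files and the temporary demo files are hypothetical and not part of the original example.

```python
import os
import tempfile


def concatenate_text_files(filenames, output_name):
    """Join text files, adding a newline only where a file lacks one."""
    with open(output_name, "w") as new_file:
        for name in filenames:
            last_line = ""
            with open(name) as f:
                for line in f:
                    new_file.write(line)
                    last_line = line
            # Add a separator only if the file did not already end with one.
            if last_line and not last_line.endswith("\n"):
                new_file.write("\n")


# Demonstration in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    contents = ["This is one.", "This is two.\n", "This is three."]
    paths = []
    for number, text in enumerate(contents, start=1):
        path = os.path.join(tmp, f"{number}.txt")
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    result_path = os.path.join(tmp, "new-file.txt")
    concatenate_text_files(paths, result_path)
    with open(result_path) as f:
        print(f.read(), end="")
```

Tracking the last line written keeps the memory use constant, since no file is ever loaded in full.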
Alternative Methods to Concatenate Multiple Files Into a Single File in Python
The following methods have a similar code structure and produce the same kind of output as the method above.
Let’s explore these four methods: using the shutil module, basic file handling, the os module, and the cat command (for Linux/Unix systems).
Using the shutil Module
The shutil module in Python provides a high-level interface for file operations. This method simplifies the process of copying the contents of multiple files into a single file.
1.txt:
This is one.
2.txt:
This is two.
3.txt:
This is three.
Example code:
import shutil


def concatenate_files(file_list, output_file):
    with open(output_file, "wb") as outfile:
        for file in file_list:
            with open(file, "rb") as infile:
                shutil.copyfileobj(infile, outfile)
            # Insert a newline after each file
            outfile.write(b"\n")


# Example usage
input_files = ["1.txt", "2.txt", "3.txt"]
output_file = "concatenated_output.txt"

# Call the function to concatenate files
concatenate_files(input_files, output_file)

print(f"Files {input_files} have been concatenated into {output_file}.")
We import the shutil module to leverage its copyfileobj function. The concatenate_files function takes a list of input file names (file_list) and the desired output file name (output_file).
Next, it opens the output file in binary write mode ('wb') so that the content is copied byte for byte, regardless of file type. Then, it iterates through each input file, opens it in binary read mode ('rb'), and copies its content to the output file, followed by a newline byte.
Note that the binary write mode ('wb') and binary read mode ('rb') will also be used in the next two methods.
Output:
Files ['1.txt', '2.txt', '3.txt'] have been concatenated into concatenated_output.txt.
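shutil.copyfileobj also accepts an optional third argument, the buffer length in bytes, which controls how much data is copied per read. The sketch below passes it explicitly. The function name concatenate_files_buffered and the 1 MiB value are illustrative assumptions rather than recommendations from the original article.

```python
import shutil

# Hypothetical buffer size of 1 MiB; tune it for your own workload.
CHUNK_SIZE = 1024 * 1024


def concatenate_files_buffered(file_list, output_file, chunk_size=CHUNK_SIZE):
    """Copy each input file into output_file using a fixed-size buffer."""
    with open(output_file, "wb") as outfile:
        for file in file_list:
            with open(file, "rb") as infile:
                # The third argument is the number of bytes read per chunk.
                shutil.copyfileobj(infile, outfile, chunk_size)
            # Insert a newline after each file, as in the example above.
            outfile.write(b"\n")
```

A call such as concatenate_files_buffered(["1.txt", "2.txt", "3.txt"], "concatenated_output.txt") would behave like the example above, with only the chunk size made explicit.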
Using File Handling
Another approach is to manually handle file operations using basic file handling in Python. This method gives you more control over the process:
1.txt:
This is one.
2.txt:
This is two.
3.txt:
This is three.
Example code:
def concatenate_files(file_list, output_file):
    with open(output_file, "wb") as outfile:
        for file in file_list:
            with open(file, "rb") as infile:
                # Read lines from the input file
                lines = infile.readlines()
            # Write lines to the output file
            outfile.write(b"".join(lines))
            # Insert a newline after each file
            outfile.write(b"\n")


# Example usage
input_files = ["1.txt", "2.txt", "3.txt"]
output_file = "concatenated_output2.txt"

# Call the function to concatenate files
concatenate_files(input_files, output_file)

print(f"Files {input_files} have been concatenated into {output_file}.")
The concatenate_files function is structured like the one in the previous method.
It opens the output file in binary write mode ('wb'), iterates through each input file, and opens it in binary read mode ('rb'). It then reads all of the file's lines with readlines(), writes them to the output file, and adds a newline byte after each file.
Output:
Files ['1.txt', '2.txt', '3.txt'] have been concatenated into concatenated_output2.txt.
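Because readlines() loads a whole input file into memory, very large inputs may be better handled in fixed-size chunks. The following is a minimal sketch of that variation. The function name concatenate_files_chunked and the 64 KiB default are hypothetical choices made for illustration.

```python
def concatenate_files_chunked(file_list, output_file, chunk_size=64 * 1024):
    """Concatenate files while holding at most chunk_size bytes in memory."""
    with open(output_file, "wb") as outfile:
        for file in file_list:
            with open(file, "rb") as infile:
                # iter() with a sentinel keeps calling read() until it
                # returns b"", which signals the end of the file.
                for chunk in iter(lambda: infile.read(chunk_size), b""):
                    outfile.write(chunk)
            # Insert a newline after each file
            outfile.write(b"\n")
```

The output is the same as with readlines(); only the amount of data held in memory at one time changes.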
Using the os Module
The os module provides a platform-independent way to interact with the operating system. This method uses os.path.exists to check that each input file exists before reading it, and reports any file that is missing instead of raising an error.
1.txt:
This is one.
2.txt:
This is two.
3.txt:
This is three.
Example code:
import os


def concatenate_files(file_list, output_file):
    with open(output_file, "wb") as outfile:
        for file in file_list:
            if os.path.exists(file):
                with open(file, "rb") as infile:
                    outfile.write(infile.read())
                outfile.write(b"\n")  # Insert a newline after each file
            else:
                print(f"File not found: {file}")


# Example usage
input_files = ["1.txt", "2.txt", "3.txt"]
output_file = "concatenated_output3.txt"

# Call the function to concatenate files
concatenate_files(input_files, output_file)

print(f"Files {input_files} have been concatenated into {output_file}.")
The concatenate_files function is similar to the one in the previous method, with one addition: it skips any input file that does not exist and prints a message for it.
This method also opens the output file in binary write mode ('wb'), iterates through each input file, opens it in binary read mode ('rb'), and writes its content to the output file.
Output:
Files ['1.txt', '2.txt', '3.txt'] have been concatenated into concatenated_output3.txt.
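One limitation of the example above is that the final message lists every input file, even those that were skipped. A possible refinement, sketched here under the assumption that the caller wants to know what was skipped, is to return the missing names. The function name concatenate_existing_files is hypothetical, and os.path.isfile is used in place of os.path.exists so that directories are skipped as well.

```python
import os


def concatenate_existing_files(file_list, output_file):
    """Concatenate the files that exist and return the names that were skipped."""
    missing = []
    with open(output_file, "wb") as outfile:
        for file in file_list:
            # isfile() is False for missing paths and for directories.
            if not os.path.isfile(file):
                missing.append(file)
                continue
            with open(file, "rb") as infile:
                outfile.write(infile.read())
            outfile.write(b"\n")  # Insert a newline after each file
    return missing
```

The caller can then report only the files that were actually written, for example by comparing the returned list with the input list.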
Using the cat Command (Linux/Unix)
For Linux/Unix systems, the subprocess module can be employed to execute the cat command.
The subprocess module in Python allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
The cat command is commonly used on Linux/Unix systems to concatenate and display the content of files.
1.txt:
This is one.
2.txt:
This is two.
3.txt:
This is three.
Example code:
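import shlex
import subprocess


def concatenate_files(file_list, output_file):
    # Quote each name so spaces or shell metacharacters are treated literally
    files = " ".join(shlex.quote(name) for name in file_list)
    # check=True raises CalledProcessError if the command fails
    subprocess.run(f"cat {files} > {shlex.quote(output_file)}", shell=True, check=True)


# Example usage
input_files = ["1.txt", "2.txt", "3.txt"]
output_file = "concatenated_output4.txt"

# Call the function to concatenate files
concatenate_files(input_files, output_file)

print(f"Files {input_files} have been concatenated into {output_file}.")
The concatenate_files function constructs a space-separated string of input file names (files), quoting each name with shlex.quote so that spaces or special characters in a name are not interpreted by the shell.
Then, it uses the subprocess.run function to execute the cat command through the shell, concatenating the files and redirecting the output to the specified file. Unlike the previous methods, cat does not insert an extra newline between files.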
Output:
Files ['1.txt', '2.txt', '3.txt'] have been concatenated into concatenated_output4.txt.
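The shell can also be avoided altogether by passing the command as a list of arguments and letting Python handle the redirection. The sketch below assumes a cat executable is available on the PATH. The function name concatenate_files_no_shell is hypothetical.

```python
import subprocess


def concatenate_files_no_shell(file_list, output_file):
    """Run cat without a shell and send its output to output_file."""
    with open(output_file, "wb") as outfile:
        # "--" marks the end of options, so a file name that starts with
        # "-" is not mistaken for a flag. check=True raises
        # CalledProcessError if cat exits with a non-zero status.
        subprocess.run(["cat", "--", *file_list], stdout=outfile, check=True)
```

Since no shell parses the command, file names containing spaces or special characters need no quoting.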
Conclusion
This article has provided a detailed guide on concatenating multiple files into a single file, demonstrating alternative methods using the shutil module, file handling, the os module, and even incorporating Linux/Unix-specific commands like cat through the subprocess module.
Python proves to be a versatile language with extensive capabilities across various domains. Its simplicity, coupled with robust features like object-oriented programming and automated memory management, makes tasks like file handling seamless.
The ability to easily create, read, append, or overwrite data in files, along with support for numerous file types through third-party libraries, enhances Python’s file manipulation capabilities.