Python Rsync
Technologies keep changing, but we still regularly need to transfer or exchange files. Rsync is a command-line tool, widely used on Linux, that lets us control how such a transfer is carried out.
This article explores rsync and how we can use it from a Python script.
Python Rsync
As mentioned above, rsync is a powerful tool that lets us specify the details of a transfer. For example, we can decide which files to exclude from a transfer and which remote shell should be used.
Rsync is typically used for complex transfers or for transferring files in bulk. Backups made with rsync can also be automated with the help of cron.
The rsync Command in Linux
This is what a generic rsync command looks like.
rsync [option] [origin] [destination]
This is a straightforward command for anyone familiar with Linux, but we will break it down anyway. Every command starts with the keyword rsync.
It is followed by one or more options, and there is a wide range to choose from. Each option specifies the nature of the transfer we want to execute.
The origin is where the files are transferred from, and the destination is where they are transferred to. We have to be careful about what we are syncing, and about whether we are syncing from a local or a remote machine, because rsync overwrites files at the destination without much warning.
Here is a list of basic and common options for rsync.
-a - This option (archive mode) copies directories recursively and preserves file attributes such as permissions, modification times, and, where permitted, ownership.
--dry-run - This option runs a trial of the command so we can observe the changes that would be made if the command were executed. It does not make any actual changes.
--delete - This option deletes extraneous files from the destination directory, that is, files that do not exist in the origin.
-e - This option tells rsync which remote shell to use.
--exclude="*.filetype" - This option excludes all files of a specific type from a transfer. We replace filetype with the actual file type, for example --exclude="*.docx".
-h - This option prints numbers in a human-readable format. Used on its own, rsync -h prints the help text, as does --help.
--progress - This option shows the progress of the transfer as the command runs.
-q - This option runs the command quietly, suppressing non-error messages.
-v - This option increases verbosity so the user can read what rsync is doing during the transfer.
-z - This option compresses the data during the transfer.
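To make the option list more concrete, here is a minimal sketch of assembling an rsync command as a Python list. The function name build_rsync_command and the paths docs/ and backup/docs/ are hypothetical, chosen only for illustration. The sketch only builds and prints the command; it does not run rsync. It defaults to --dry-run, on the assumption that previewing a sync is the safer starting point.

```python
import shlex


def build_rsync_command(origin, destination, *, dry_run=True, delete=False, excludes=()):
    """Assemble an rsync argument list; nothing is executed here."""
    command = ["rsync", "-a", "-v"]
    if dry_run:
        command.append("--dry-run")  # preview only, no changes are made
    if delete:
        command.append("--delete")  # remove extraneous files at the destination
    for pattern in excludes:
        command.append(f"--exclude={pattern}")  # one --exclude per pattern
    command += [origin, destination]
    return command


# Hypothetical paths, used only to show the resulting command line.
command = build_rsync_command("docs/", "backup/docs/", excludes=["*.docx"])
print(shlex.join(command))
```

Keeping the command as a list, rather than one long string, means each argument reaches rsync exactly as written, without a shell expanding patterns such as *.docx first.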
Use Rsync From a Python Script
There are two ways to make use of rsync in Python.
- Make a call to subprocess and specify the rsync command.
import subprocess
subprocess.call(["rsync", "[option]", "[origin]", "[destination]"])
- Use the pyrsync library. This is a third-party Python library; it is not a wrapper around the rsync command but implements rsync-style functionality itself.
We can install this library via pip.
pip install pyrsync
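The first approach, calling rsync through subprocess, can be sketched a little more defensively. This sketch uses subprocess.run with check=True rather than subprocess.call, so that a failed transfer raises an error instead of passing silently. The function name run_rsync and its runner parameter are hypothetical; runner exists only so the call can be swapped out for testing. It assumes an rsync executable is available on PATH.

```python
import subprocess


def run_rsync(origin, destination, options=("-a",), runner=subprocess.run):
    """Run rsync and return its exit code, or None if rsync is missing.

    `runner` is a hypothetical hook so the real call can be replaced in tests.
    """
    command = ["rsync", *options, origin, destination]
    try:
        # check=True raises CalledProcessError on a non-zero exit status.
        completed = runner(command, check=True, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised when no rsync executable can be found on PATH.
        print("rsync is not installed or not on PATH")
        return None
    except subprocess.CalledProcessError as error:
        print(f"rsync failed with exit code {error.returncode}: {error.stderr}")
        return error.returncode
    return completed.returncode
```

A call such as run_rsync("src/", "dst/", options=("-a", "--dry-run")) would then preview a sync between two hypothetical directories without changing anything.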
Rsync originally relies on MD5 hashes, which developers often consider outdated compared to SHA256, the hash used by the modernized pyrsync. SHA256 meets the standard security requirements for verification processes.
While pyrsync has had no major releases since its launch, the library is not currently known to have any bugs or vulnerabilities.
If the library is not available through pip, it must be built and installed from its source code.
Pyrsync has the potential to save us hours of development time and resources, since we do not have to build the functionality it provides from scratch.
Its easy-to-read code and the straightforward installation instructions on PyPI make it easy to incorporate into our scripts.
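To install from source, we run this command from the source directory if the system already has setuptools installed.
$ sudo python setup.py install
Even if the system does not have setuptools, the setup.py script will detect its absence and fall back to Python's distutils instead. Note that distutils was removed from the standard library in Python 3.12, so this fallback only works on older Python versions.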
An example flow of commands for this module, imported here as pyrsync2, is as follows:
# In the system with the file that needs patching
>>> import pyrsync2
>>> unpatched = open("unpatched.file", "rb")
>>> hashes = pyrsync2.blockchecksums(unpatched)
# In the remote machine receiving hashes
>>> import pyrsync2
>>> patchedfile = open("patched.file", "rb")
>>> delta = pyrsync2.rsyncdelta(patchedfile, hashes)
# In the origin machine with the unpatched file after receiving delta
>>> unpatched.seek(0)
>>> save_to = open("locally-patched.file", "wb")
>>> pyrsync2.patchstream(unpatched, save_to, delta)
Note that this library currently supports only Python 3.
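To show what the three steps above exchange (hashes, then a delta, then a patched file), here is a deliberately simplified sketch built only on hashlib from the standard library. It is not pyrsync2's implementation or API. The function names and the tiny block size are assumptions made for illustration, and it only matches blocks at fixed offsets, whereas the real rsync algorithm uses a rolling checksum to find matches at any offset.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size, chosen only to keep the example readable


def block_checksums(data, block_size=BLOCK_SIZE):
    """Step 1: return one SHA-256 digest per fixed-size block of the old data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def make_delta(new_data, hashes, block_size=BLOCK_SIZE):
    """Step 2: describe new_data as block indexes (int) or literal bytes."""
    index_by_hash = {digest: index for index, digest in enumerate(hashes)}
    delta = []
    for i in range(0, len(new_data), block_size):
        block = new_data[i:i + block_size]
        index = index_by_hash.get(hashlib.sha256(block).digest())
        # Known block: send only its index. Unknown block: send the bytes.
        delta.append(index if index is not None else block)
    return delta


def apply_delta(old_data, delta, block_size=BLOCK_SIZE):
    """Step 3: rebuild the new data from the old data plus the delta."""
    output = bytearray()
    for item in delta:
        if isinstance(item, int):
            output += old_data[item * block_size:(item + 1) * block_size]
        else:
            output += item
    return bytes(output)


old = b"abcdefghijkl"    # stands in for the unpatched file
new = b"abcdXXXXijklmn"  # stands in for the patched file
hashes = block_checksums(old)
delta = make_delta(new, hashes)
rebuilt = apply_delta(old, delta)
```

Only the blocks that differ travel as literal bytes in the delta; blocks the receiver already has are sent as small indexes, which is where the bandwidth saving comes from.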
We hope you find this article helpful in understanding how to use rsync in Python.
My name is Abid Ullah, and I am a software engineer. I love writing articles on programming, and my favorite topics are Python, PHP, JavaScript, and Linux. I tend to provide solutions to people in programming problems through my articles. I believe that I can bring a lot to you with my skills, experience, and qualification in technical writing.