How to Use Pickle to Save and Load Objects in Python
This article demonstrates how to save and reload objects in Python. We will look at how pickling and unpickling work, and then weigh the pros and cons of pickling.
Pickling and Unpickling
Serialization is the process of converting an object into a stream of bytes so that we can save the object's state on the hard drive. Many languages provide a way to serialize objects, and Python's approach is among the more flexible ones.
In Python, object serialization is called pickling, and de-serialization is called unpickling. Since nearly everything in Python is an object, we can serialize and de-serialize almost anything. However, you should be careful and understand the purpose of pickling before using it.
Before jumping into theoretical details, let's demonstrate pickling in Python using the pickle module.
Pickling in Python
For pickling, first import the pickle module.
import pickle
We can serialize almost any object in Python. Let’s take a dictionary object for demonstration purposes.
bio_dict = {"name": "Zohaib", "age": 29, "profession": "Engineer"}
The pickle module provides the .dump() function to serialize the object.
with open("bio_dict.pickle", "wb") as file_name:
    pickle.dump(bio_dict, file_name)
In the above code, we opened the file bio_dict.pickle in binary write mode (i.e., wb) and then used the .dump() function to pickle the bio_dict dictionary into that file. As a result, the dictionary is converted into a byte stream and stored on disk.
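To make the "byte stream" idea more concrete, the following is a minimal sketch that writes the same dictionary to a temporary file and reads the raw bytes back. The temporary directory and the variable names `path` and `raw` are illustrative choices, not part of the original example.

```python
import pickle
import tempfile
from pathlib import Path

bio_dict = {"name": "Zohaib", "age": 29, "profession": "Engineer"}

# Write the pickle into a throwaway directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "bio_dict.pickle"
    with path.open("wb") as f:
        pickle.dump(bio_dict, f)
    # Read the file back as raw bytes, without unpickling it
    raw = path.read_bytes()

# The file content is an opaque bytes object, not readable text.
# With pickle protocol 2 or later, the stream starts with the
# PROTO opcode (0x80) and ends with the STOP opcode (".").
print(type(raw), len(raw))
```

This is why the file must be opened in binary mode: the content is not text and should not be decoded as such.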
Unpickling in Python
When the pickled object needs to be used again, it can be de-serialized. For that, we can use the pickle.load() function as demonstrated in the code below:
with open("bio_dict.pickle", "rb") as file_name:
    unpickled_dict = pickle.load(file_name)
print("The retrieved dictionary is:", unpickled_dict)
We opened the previously saved (.pickle) file in binary read mode and then used the .load() function to get the object again. The above code produces the following output.
The retrieved dictionary is: {'name': 'Zohaib', 'age': 29, 'profession': 'Engineer'}
We can check whether the original and unpickled objects are equal using the following code.
assert bio_dict == unpickled_dict
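Equality does not mean the two names refer to the same object. As a small sketch, the round trip below uses pickle.dumps() and pickle.loads(), the in-memory counterparts of .dump() and .load(), to show that unpickling yields an independent copy. The names `payload` and `clone` are illustrative.

```python
import pickle

bio_dict = {"name": "Zohaib", "age": 29, "profession": "Engineer"}

# Serialize to a bytes object in memory instead of a file
payload = pickle.dumps(bio_dict)

# De-serialize the bytes back into a new dictionary
clone = pickle.loads(payload)

# Same content, but a different object in memory
assert clone == bio_dict
assert clone is not bio_dict

# Changing the copy leaves the original untouched
clone["age"] = 30
print(bio_dict["age"], clone["age"])
```

Because the copy is independent, later changes to the original object are not reflected in anything that was pickled earlier.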
Advantages and Disadvantages of Pickling
The advantages of pickling in Python are as follows:
- Almost any Python object can be serialized with the pickle module, unlike JSON and other serialization formats, which support only a limited set of types.
- Pickled objects can be reused later. For example, if you ran a large computation and pickled the results, you do not need to repeat that computation the next time you start a new Python session. Instead, you unpickle the saved objects and use them in a new computation, which saves time and resources.
- A pickled file can be read by multiple threads or processes, each of which gets its own copy of the object, which can help with parallel computations.
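The "compute once, reuse later" advantage can be sketched as a small cache helper. The helper name `load_or_compute`, the stand-in function `slow_squares`, and the file name `squares.pickle` are all hypothetical and exist only for this illustration.

```python
import pickle
import tempfile
from pathlib import Path

calls = []  # records how many times the "expensive" function actually runs


def slow_squares():
    # Stand-in for a time-consuming computation
    calls.append(1)
    return {n: n * n for n in range(1000)}


def load_or_compute(cache_path, compute):
    """Return the pickled result if the cache file exists; otherwise compute and save it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "squares.pickle"
    first = load_or_compute(cache_file, slow_squares)   # computes and pickles
    second = load_or_compute(cache_file, slow_squares)  # loads from the pickle file

print(len(calls))
```

In a real project the cache file would live in a persistent location rather than a temporary directory, so that a new Python session can pick it up.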
Pickling also has several drawbacks that you should be aware of before using it.
- We should never unpickle data that comes from an untrusted source. Unpickling can execute arbitrary code, so a tampered or maliciously crafted pickle file may cause serious security issues.
- Unlike JSON files, pickled files are not human-readable.
- JSON files are supported by many languages. Pickle files, on the other hand, are specific to Python and may not be supported in other languages, so you may need third-party libraries to serve as an adapter.
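One way to reduce the risk described in the first point is to limit which classes and functions an unpickler may look up. The sketch below follows the "restricting globals" pattern from the pickle documentation, which overrides Unpickler.find_class(). The names `SAFE_BUILTINS`, `RestrictedUnpickler`, and `restricted_loads` are illustrative, and the allow-list is only an assumption about what a given application might need. This narrows what a pickle can reference, but it should not be treated as making untrusted data safe in general.

```python
import builtins
import collections
import io
import pickle

# Illustrative allow-list: only these built-in names may be resolved
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Permit a small set of harmless built-ins and reject everything else
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers of basic types need no global lookup and load normally
plain = restricted_loads(pickle.dumps({"name": "Zohaib", "scores": [1, 2, 3]}))

# An object whose class is not on the allow-list is rejected
try:
    restricted_loads(pickle.dumps(collections.OrderedDict(a=1)))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(plain, blocked)
```

When the data only needs to hold basic types, a format such as JSON avoids the problem entirely.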
What Else Can Be Pickled
This topic deserves a longer discussion than we have room for here. In short, you can pickle many Python objects, such as functions defined at module level, Pandas data frames, and many others.
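As a rough illustration, the sketch below round-trips a few standard-library objects, including a function, and shows one thing the standard pickle module does not handle: a lambda. The list of samples is an arbitrary choice for demonstration.

```python
import datetime
import fractions
import math
import pickle

# A mix of objects: a function, class instances, and built-in containers
samples = [
    math.sqrt,                    # functions are pickled by reference to their name
    fractions.Fraction(3, 4),
    datetime.date(2024, 1, 31),
    {1, 2, 3},
    (1, "two", 3.0),
]

restored = [pickle.loads(pickle.dumps(obj)) for obj in samples]

# A lambda has no importable name, so the standard pickle module rejects it
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(restored == samples, lambda_pickled)
```

Because a function is stored by name rather than by its code, the module that defines it must be importable wherever the pickle is loaded.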
Pickle is also a highly helpful tool for machine learning since it lets you preserve your models, reduce time-consuming retraining, and share, commit, and reload previously trained models.
You can follow this guide for storing your machine learning model using pickle.