How to Locally Connect to a MongoDB Database Using Python
- Store Data in MongoDB
- Locally Connect to a MongoDB Database Using Python
- Create a Collection in Python
- Insert Documents in Python
- Query in Python
- Index in Python and MongoDB
Python is one of the most widely used programming languages for data science. Combined with MongoDB and its flexible, dynamic schema, it is an excellent choice for building modern web applications, JSON APIs, and data processors, to mention a few examples.
MongoDB also includes a native Python driver and a team of engineers committed to ensuring that MongoDB and Python function seamlessly together.
Python provides extensive support for common data manipulation and processing operations. For example, Python’s native dictionary and list data structures place it second only to JavaScript when handling JSON documents, making it ideal for working with BSON.
PyMongo, the official Python MongoDB driver library, is similarly simple and provides an intuitive API for accessing databases, collections, and documents.
Objects fetched from MongoDB using PyMongo are compatible with dictionaries and lists, allowing simple manipulation, iteration, and printing.
Store Data in MongoDB
MongoDB stores data in JSON-like documents:
# MongoDB document (JSON-style)
document_1 = {
    "_id": "BF00001CFOOD",
    "item_name": "Bread",
    "quantity": 2,
    "ingredients": "all-purpose flour",
}
Python dictionaries look like this:
# Python dictionary
dict_1 = {
    "item_name": "blender",
    "max_discount": "10%",
    "batch_number": "RR450020FRG",
    "price": 440,
}
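Because the two shapes line up, a Python dictionary can be serialized to JSON text and read back unchanged. The following is a minimal sketch that uses only the standard-library json module. It does not touch MongoDB; the round trip is shown only to illustrate the correspondence.

```python
import json

# The same dictionary shown above
dict_1 = {
    "item_name": "blender",
    "max_discount": "10%",
    "batch_number": "RR450020FRG",
    "price": 440,
}

# Serialize the dictionary to JSON text
as_json = json.dumps(dict_1)

# Parse the JSON text back into a Python dictionary
round_tripped = json.loads(as_json)

print(as_json)
print(round_tripped == dict_1)
```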
Prerequisites and Installation of Python
Download and install Python on your machine. Type python in your command-line window to confirm that the installation works.
You should see something like the following:
Python 3.9.1 (tags/v3.9.1:9cf6752, Feb 5 2021, 10:34:40) [MSC v.1927 64 bit (AMD64)] on win32
>>>
You can follow the Python MongoDB examples in this tutorial even if you are new to Python.
Locally Connect to a MongoDB Database Using Python
PyMongo provides a suite of libraries for working with MongoDB in Python. To get PyMongo up and running, open the command prompt and type the following:
python -m pip install pymongo
This Python MongoDB tutorial uses a MongoDB SRV URI, so install dnspython as well:
python -m pip install dnspython
Now you can use PyMongo as a Python MongoDB library in your code with an import statement. But first, you need a MongoDB database to connect to.
So, the first step in connecting Python to MongoDB is to set up a MongoDB cluster.
Next, write the PyMongo code in a pymongo_test_insert.py file in any subdirectory. Any simple text editor, such as Textpad or Notepad, will suffice.
Add the following lines to create the MongoDB client:
from pymongo import MongoClient


def get_database():
    # Provide the MongoDB URI to connect Python to MongoDB using PyMongo
    CONNECTION_STRING = (
        "mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/myFirstDatabase"
    )
    # Create a connection using MongoClient
    client = MongoClient(CONNECTION_STRING)
    # Return the database used in this example
    return client["user_shopping_list"]


# This guard lets other files import and reuse get_database()
if __name__ == "__main__":
    # Get the database
    dbname = get_database()
The function uses CONNECTION_STRING to create the Mongo client and get the database connection. Replace the cluster name, username, and password with your own values first.
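If the username or password contains reserved characters such as @, :, or /, they need to be percent-escaped before they go into the URI. Here is a minimal sketch using the standard-library urllib.parse.quote_plus. The build_connection_string helper and the sample credentials are hypothetical, and the URI layout simply copies the CONNECTION_STRING template.

```python
from urllib.parse import quote_plus


def build_connection_string(username, password, cluster_name, database):
    """Return an SRV connection string with escaped credentials.

    Hypothetical helper: the URI layout copies the template used in
    get_database(); adjust it to match your own cluster.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{cluster_name}.mongodb.net/{database}"


# Made-up credentials, for illustration only
uri = build_connection_string("app_user", "p@ss:word/1", "cluster0", "myFirstDatabase")
print(uri)
```

The resulting string can be passed to MongoClient in place of the hard-coded CONNECTION_STRING.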
In this Python MongoDB tutorial, you'll make a shopping list and add a few products to it. The code above refers to a database called user_shopping_list for this purpose.
However, MongoDB does not actually create a database until you add collections and documents to it. So, next, let's make a collection.
Create a Collection in Python
To create a collection, pass the collection name to the database. Make sure the indentation is correct when you copy the code into your .py file.
collection_name = dbname["user_1_items"]
This will create a collection named user_1_items in the user_shopping_list database.
Insert Documents in Python
Use the PyMongo insert_many() method to insert many documents at once.
item1 = {
    "_id": "U1IT00001",
    "item_name": "Blender",
    "max_discount": "10%",
    "batch_number": "RR450020FRG",
    "price": 440,
    "category": "kitchen appliance",
}
item2 = {
    "_id": "U1IT00002",
    "item_name": "Egg",
    "category": "food",
    "quantity": 12,
    "price": 50,
    "item_description": "brown country eggs",
}
collection_name.insert_many([item1, item2])
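Because item1 and item2 set _id explicitly, those values must be unique within the collection. Running the same insert a second time fails with a duplicate-key error. The following is a minimal sketch of a check that catches repeated _id values inside one batch before it is sent. find_duplicate_ids is a hypothetical helper, not part of PyMongo, and it does not look at what is already stored in the database.

```python
from collections import Counter


def find_duplicate_ids(documents):
    """Return the _id values that occur more than once in a batch.

    Hypothetical helper: documents without an _id are skipped, because
    a unique one is generated for them automatically on insert.
    """
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [doc_id for doc_id, n in counts.items() if n > 1]


# Made-up batch for illustration
batch = [
    {"_id": "U1IT00001", "item_name": "Blender"},
    {"_id": "U1IT00002", "item_name": "Egg"},
    {"_id": "U1IT00001", "item_name": "Toaster"},  # repeats the first _id
    {"item_name": "Bread"},  # no _id
]

duplicates = find_duplicate_ids(batch)
print(duplicates)
```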
Insert the third document without specifying the _id field; one is then generated automatically. This time, you'll include a field with a date data type.
Use the Python dateutil module to add a date in PyMongo. ISODate is a Mongo shell function, so it does not work in Python.
Install the package with the python -m pip install python-dateutil command. Then, in pymongo_test_insert.py, add the following:
from dateutil import parser
expiry_date = "2021-07-13T00:00:00.000Z"
expiry = parser.parse(expiry_date)
item3 = {
    "item_name": "Bread",
    "quantity": 2,
    "ingredients": "all-purpose flour",
    "expiry_date": expiry,
}
collection_name.insert_one(item3)
The insert_one() method inserts a single document.
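If you would rather not add a dependency, the same timestamp can be parsed with the standard library. This is an alternative to the dateutil approach, not what the tutorial's script uses. The format string is an assumption that matches this exact timestamp layout, and reading the trailing Z with %z requires Python 3.7 or later.

```python
from datetime import datetime, timezone

expiry_date = "2021-07-13T00:00:00.000Z"

# %f reads the fractional seconds and %z reads the trailing "Z" as UTC
expiry = datetime.strptime(expiry_date, "%Y-%m-%dT%H:%M:%S.%f%z")

print(expiry.isoformat())
```

The resulting timezone-aware datetime can be stored in the expiry_date field in the same way as the value returned by parser.parse().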
First, use the command line to navigate to the location where you saved pymongo_test_insert.py. Then, run the file with the python pymongo_test_insert.py command.
Query in Python
You can view all the documents together using find(). For that, create a separate file named pymongo_test_query.py:
# Get the database using the function you defined in the pymongo_test_insert file
from pymongo_test_insert import get_database

dbname = get_database()
# Get the existing collection
collection_name = dbname["user_1_items"]
item_details = collection_name.find()
for item in item_details:
    # This does not give a very readable output
    print(item)
Use the command line to navigate to the folder where you saved pymongo_test_query.py. Then, run the program with the python pymongo_test_query.py command.
The data is viewable, but the format isn't ideal. So, print only the item names and their categories instead:
print(item["item_name"], item["category"])
Even though MongoDB returns all of the data, you get a Python KeyError on the third document, because it has no category field. You can use pandas DataFrames to handle missing data like this.
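Where pandas is more than you need, dict.get() returns a default value instead of raising KeyError. A minimal sketch follows; the items list stands in for the documents a cursor would yield, and the "n/a" placeholder is an arbitrary choice.

```python
# Stand-ins for the documents returned by find()
items = [
    {"_id": "U1IT00001", "item_name": "Blender", "category": "kitchen appliance"},
    {"_id": "U1IT00002", "item_name": "Egg", "category": "food"},
    {"item_name": "Bread", "quantity": 2},  # no "category" field
]

rows = []
for item in items:
    # get() falls back to the default when the key is missing
    rows.append((item.get("item_name", "n/a"), item.get("category", "n/a")))

for name, category in rows:
    print(name, category)
```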
DataFrames are two-dimensional data structures used in data processing. PyMongo's find() method returns dictionary objects that can be converted into a DataFrame with just one line of code.
Install the pandas library with:
python -m pip install pandas
Replace the for loop with the following code to handle the KeyError in one step:
from pandas import DataFrame
# convert the dictionary objects to a data frame
items_dfs = DataFrame(item_details)
print(items_dfs)
Remember to comment out the print(item["item_name"], item["category"]) line. In the DataFrame, pandas fills the missing values with NaN and NaT.
Index in Python and MongoDB
The number of documents and collections in a real-world database is continually growing. In a large collection, searching for specific documents (for example, records that include all-purpose flour as an ingredient) can take a long time.
Indexes make database operations such as sort, count, and match faster, more efficient, and less expensive.
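To see why an index helps, compare a full scan with a lookup table built ahead of time. The sketch below is a plain-Python analogy, not how MongoDB implements indexes internally. The sample documents and the scan and build_index helpers are made up for illustration.

```python
from collections import defaultdict

# Made-up documents for illustration
documents = [
    {"_id": 1, "item_name": "Bread", "ingredients": "all-purpose flour"},
    {"_id": 2, "item_name": "Egg", "ingredients": "egg"},
    {"_id": 3, "item_name": "Cake", "ingredients": "all-purpose flour"},
]


def scan(docs, field, value):
    """Without an index: inspect every document."""
    return [doc["_id"] for doc in docs if doc.get(field) == value]


def build_index(docs, field):
    """Build a lookup table from field value to matching _id values."""
    index = defaultdict(list)
    for doc in docs:
        if field in doc:
            index[doc[field]].append(doc["_id"])
    return index


ingredient_index = build_index(documents, "ingredients")

# Both approaches find the same documents, but the indexed lookup
# does not have to visit every document at query time.
print(scan(documents, "ingredients", "all-purpose flour"))
print(ingredient_index["all-purpose flour"])
```

The trade-off is that the lookup table has to be built and kept up to date as documents change, which is also why indexes add some cost to writes.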
MongoDB defines indexes at the collection level. Add a few more documents to the collection first, so that the index is more meaningful.
Using the insert_many() method, you can insert multiple documents at once.