How to Use Pipeline in Lookup Operator in MongoDB
- What Is the Aggregation Pipeline
-
What Is the
$lookup
Operator in MongoDB -
Use Pipeline in
$lookup
Operator to Join Conditions in MongoDB
This tutorial teaches how to use the pipeline in the lookup
operator in MongoDB. Before moving on, it is mandatory to have enough knowledge about the aggregation pipeline and $lookup
operator to understand the use of the pipeline in the $lookup
operator while using MongoDB.
If you already know these concepts, you can quickly move to this tutorial’s last two code examples.
What Is the Aggregation Pipeline
It is a procedure of collecting data and returning the computed results. This process collects data from different documents, groups them as per specified conditions, and perform various kinds of operation on the grouped data.
For instance, average, sum, maximum and minimum. It is just like the SQL aggregation functions.
In MongoDB, we can use aggregation in the following three ways.
-
Aggregation Pipeline - contains various stages to transform the provided documents. Every stage accepts the set of documents and generates another set of resultant documents that are further passed to the next stage, and this process continues until the final stage.
-
Map-reduce Function - We use this function to aggregate results on a large scale. It has two functions,
map
andreduce
.The
map
method groups all documents while thereduce
method performs operations on grouped data. -
Single-purpose Aggregation - the simplest form of aggregation used to perform aggregation tasks but lacks some features compared to the aggregation pipeline method. We use this type of aggregation to perform tasks within a particular document, for instance, count the distinct values within a specific document(s).
You may also read this to dive deeper and know aggregation pipelines.
What Is the $lookup
Operator in MongoDB
This operator is used to perform left outer join to merge data from one document to another within the same database. It filters the documents from a joined collection for further processing.
We can also use this operator to add an extra field to an existing document.
The $lookup
operator adds a new array attribute (field) whose values (elements) match documents from the joined collection. Then these transformed documents are passed to the next stage.
The $lookup
operator has three different syntaxes we can use considering the project requirements. This tutorial uses the $lookup
syntax for Join conditions & Subqueries on the Joined Collection
.
To practice with an example code, let’s prepare the sample collections with data.
Example Code:
db.createCollection('collection1');
db.createCollection('collection2');
db.collection1.insertMany([
{"shopId": "001", "shopPosId": "001", "description": "description for 001"},
{"shopId": "002", "description": "description for 002"},
{"shopId": "003", "shopPosId": "003", "description": "description for 003"},
{"shopId": "004", "description": "description for 004"}
]);
db.collection2.insertMany([
{"shopId": "001", "shopPosId": "0078", "clientUid": "474192"},
{"shopId": "002", "shopPosId": "0012", "clientUid": "474193"},
{"shopId": "003", "shopPosId": "0034", "clientUid": "474194"},
{"shopId": "004", "shopPosId": "0056", "clientUid": "474195"}
]);
Now, we can execute the following commands to see the inserted documents in each collection.
db.collection1.find();
db.collection2.find();
Use Pipeline in $lookup
Operator to Join Conditions in MongoDB
To learn how to use pipeline in the $lookup
operator, let’s join documents from two collections where the collection1.shopId
is equal to the collection2.shopId
, and the collection1
does not contain the shopPosId
field.
Only those documents will be joined from both collections that will satisfy both conditions. See the example code given below.
Example Code:
db.collection2.aggregate([
{
"$lookup": {
"from": "collection1",
"let": { "shopId": "$shopId" },
"pipeline": [{
"$match": {
"$and": [
{"$expr": {"$eq": ['$shopId', '$$shopId'] }},
{ "shopPosId": { "$exists": false } }
]
}
}],
"as": "shopDescription"
}
}
]).pretty();
Output:
Did you carefully observe the output given above? Only those documents are joined from both collections that meet both conditions in the pipeline (the collection1.shopId
is equal to the collection2.shopId
, and the collection1
does not contain the shopPosId
field).
Further, those documents that do not match these conditions have an empty array named shopDescription
(see the red boxes in the above results). We can display only those resultant documents containing a non-empty shopDescription
array (see the following query).
Example Code:
db.collection2.aggregate([
{
"$lookup": {
"from": "collection1",
"let": { "shopId": "$shopId" },
"pipeline": [{
"$match": {
"$and": [
{"$expr": {"$eq": ['$shopId', '$$shopId'] }},
{ "shopPosId": { "$exists": false } }
]
}
}],
"as": "shopDescription"
}
},
{
"$match":{
"shopDescription": { $exists: true, $not: {$size: 0} }
}
}
]).pretty();
Output: