Project Nested Fields in MongoDB
- Understanding Nested Fields in MongoDB
- Use the $project Aggregation Stage to Project Nested Fields in MongoDB
- Use the $unset Aggregation Stage to Get Nested Fields Excluding the Specified Ones in MongoDB
- Use a forEach() Loop to Get Nested Fields in MongoDB
- Use the mapReduce() Method to Project Nested Fields in MongoDB
- Use the $addFields Aggregation Stage to Project Nested Fields in MongoDB
- Use the Dot Notation to Project Nested Fields in MongoDB
- Use the $map and the $mergeObjects Aggregation Pipeline to Project Nested Fields in MongoDB
- Conclusion
MongoDB, a NoSQL database, offers powerful features for handling complex data structures. One common task is projecting nested fields, which involves extracting specific elements from nested documents.
Today, we will learn how to use the $project, $unset, and $addFields aggregation stages, the forEach() loop, the mapReduce() method, the dot notation, and the $map and $mergeObjects operators to project nested fields while querying data in MongoDB. Each method comes with an explanation and example code.
Understanding Nested Fields in MongoDB
Nested fields in MongoDB refer to documents embedded within other documents. These nested documents can contain their own set of fields, forming a hierarchical structure.
Consider the following example:
{
"_id": 1,
"name": "John Doe",
"address": {
"street": "123 Main St",
"city": "New York",
"country": "USA"
}
}
In this document, the address field is nested, containing subfields like street, city, and country.
In MongoDB, we can retrieve all documents using the find() method, but what if we only want access to specific nested fields? This is where we use projection.
We can project nested fields in various ways. Here, we will learn about the following solutions to project nested fields:
- Use the $project aggregation stage
- Use the $unset aggregation stage
- Use the forEach() loop
- Use the mapReduce() function
- Use the $addFields aggregation stage
- Use the dot notation
- Use the $map and $mergeObjects aggregation pipeline
To try the above approaches, let’s create a collection named nested containing one document. You can run the query given below to follow along.
Example Code:
// MongoDB version 5.0.8
db.nested.insertOne(
{
"name": {
"first_name": "Mehvish",
"last_name": "Ashiq",
},
"contact": {
"phone":{"type": "manager", "number": "123456"},
"email":{ "type": "office", "mail": "delfstack@example.com"}
},
"country_name" : "Australien",
"posting_locations" : [
{
"city_id" : 19398,
"city_name" : "Bondi Beach (Sydney)"
},
{
"city_id" : 31101,
"city_name" : "Rushcutters Bay (Sydney)"
},
{
"city_id" : 31022,
"city_name" : "Wolly Creek (Sydney)"
}
],
"regions" : {
"region_id" : 796,
"region_name" : "Australien: New South Wales (Sydney)"
}
}
);
Run the db.nested.find().pretty(); command in the Mongo shell to see the inserted data.
Use the $project Aggregation Stage to Project Nested Fields in MongoDB
The $project stage is a fundamental component of the MongoDB Aggregation Pipeline. It allows for the reshaping and transformation of documents, enabling precise control over which fields are included or excluded from the output.
Syntax:
db.collection.aggregate([
{
$project: {
field1: <expression>,
field2: <expression>,
...
}
}
])
- db.collection.aggregate([...]): Initiates the aggregation pipeline on a specific collection.
- $project: Specifies the $project stage.
- field1: <expression>, field2: <expression>, ...: Define the fields you want to include in the output. The <expression> can be a direct field, a transformation, or an evaluation.
When dealing with nested documents, it’s common to need specific fields from within those nested structures. The $project stage excels at this task.
Example Code:
// MongoDB version 5.0.8
var current_location = "posting_locations";
var project = {};
project["id"] = "$"+current_location+".city_id";
project["name"] = "$"+current_location+".city_name";
project["regions"] = 1;
var find = {};
find[current_location] = {"$exists":true};
db.nested.aggregate([
{ $match : find },
{ $project : project }
]).pretty()
Output:
{
_id: ObjectId("..."),
regions: {
region_id: 796,
region_name: 'Australien: New South Wales (Sydney)'
},
id: [
19398,
31101,
31022
],
name: [
'Bondi Beach (Sydney)',
'Rushcutters Bay (Sydney)',
'Wolly Creek (Sydney)'
]
}
Here, we save the name of the first-level field, posting_locations, in a variable called current_location.
Then, we use that variable to build the field paths $posting_locations.city_id and $posting_locations.city_name and save them in the project object, using bracket notation to create its properties. Because posting_locations is an array of documents, each path resolves to an array holding that subfield's value from every element. Additionally, we set project["regions"] to 1 to include the regions field as it is.
Next, we have another object named find that we use in the aggregate() method to match the documents. It keeps only documents in which the posting_locations field exists. In the aggregate() method, we use the $match stage to match the documents and $project to project the fields, whether nested or at the first level.
We use $project to specify what fields we want to display in the output. We can use the following solution if we are interested in projecting the specified nested fields only without any filter query.
Example Code:
// MongoDB version 5.0.8
var current_location = "posting_locations";
// Pass the pipeline as an array of stages, matching the syntax shown earlier
db.nested.aggregate([{
$project: {
"_id": 0,
"city_id": "$" + current_location + ".city_id",
"city_name": "$" + current_location + ".city_name",
"regions": 1
}
}]).pretty();
Output:
{
regions: {
region_id: 796,
region_name: 'Australien: New South Wales (Sydney)'
},
city_id: [
19398,
31101,
31022
],
city_name: [
'Bondi Beach (Sydney)',
'Rushcutters Bay (Sydney)',
'Wolly Creek (Sydney)'
]
}
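The projection object used in the two examples above is ordinary JavaScript, so it can be generated by a small helper. The following is a minimal sketch in plain JavaScript; buildProjection and its parameters are hypothetical names introduced here for illustration and are not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $project spec that
// pulls the given subfields out of an array or embedded document.
function buildProjection(parentField, subfields, keepFields = []) {
  const spec = { _id: 0 }; // suppress _id, as in the second example
  for (const sub of subfields) {
    // e.g. "$posting_locations.city_id" is a field path expression
    spec[sub] = "$" + parentField + "." + sub;
  }
  for (const field of keepFields) {
    spec[field] = 1; // include the field unchanged
  }
  return spec;
}

const projectSpec = buildProjection(
  "posting_locations",
  ["city_id", "city_name"],
  ["regions"]
);
console.log(JSON.stringify(projectSpec, null, 2));
```

In the Mongo shell, the resulting object could then be passed to the stage as db.nested.aggregate([{ $project: projectSpec }]).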
Use the $unset Aggregation Stage to Get Nested Fields Excluding the Specified Ones in MongoDB
The $unset stage, available since MongoDB 4.2, is part of MongoDB’s aggregation pipeline. It removes specific fields from documents, excluding them from the final output.
This can be particularly useful when dealing with complex nested documents where it is shorter to list what to drop than what to keep.
Example Code:
// MongoDB version 5.0.8
// Pass the pipeline as an array of stages
db.nested.aggregate([{
$unset: ["posting_locations.city_id", "contact", "regions", "name", "_id"]
}]).pretty()
Output:
{
country_name: 'Australien',
posting_locations: [
{
city_name: 'Bondi Beach (Sydney)'
},
{
city_name: 'Rushcutters Bay (Sydney)'
},
{
city_name: 'Wolly Creek (Sydney)'
}
]
}
Here, we use the $unset stage, which removes the specified field or array of fields from each document in the pipeline output. It is shorthand for a $project stage that excludes those fields.
We use the dot notation to reach fields in embedded documents or in arrays of documents, as in posting_locations.city_id. A field that does not exist in a document is simply ignored.
Do not confuse this stage with the $unset update operator. When the update operator is used with the positional $ to match an array element, it replaces the matching element with null instead of removing it from the array. This behavior keeps the element positions and array size consistent.
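To make the effect of excluding dotted paths concrete, here is a minimal sketch in plain JavaScript that emulates it on an in-memory object. It is an illustration only, not MongoDB's implementation; unsetPath and unsetPaths are hypothetical names, and the sample object is a shortened version of the document inserted earlier.

```javascript
// Emulation only: removes one dotted path from a value without mutating it.
function unsetPath(value, parts) {
  if (Array.isArray(value)) {
    // A path that crosses an array applies to every element.
    return value.map((item) => unsetPath(item, parts));
  }
  if (value === null || typeof value !== "object") {
    return value; // nothing to remove on scalars
  }
  const [head, ...rest] = parts;
  const copy = { ...value };
  if (!(head in copy)) {
    return copy; // missing field: nothing happens
  }
  if (rest.length === 0) {
    delete copy[head];
  } else {
    copy[head] = unsetPath(copy[head], rest);
  }
  return copy;
}

function unsetPaths(doc, paths) {
  return paths.reduce((acc, p) => unsetPath(acc, p.split(".")), doc);
}

const sample = {
  country_name: "Australien",
  posting_locations: [
    { city_id: 19398, city_name: "Bondi Beach (Sydney)" },
    { city_id: 31101, city_name: "Rushcutters Bay (Sydney)" },
  ],
  regions: { region_id: 796 },
};

const trimmed = unsetPaths(sample, [
  "posting_locations.city_id",
  "regions",
  "missing.field",
]);
console.log(JSON.stringify(trimmed, null, 2));
```

The array keeps both elements and only loses city_id, which mirrors the shape of the stage's output shown above.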
Use a forEach() Loop to Get Nested Fields in MongoDB
In the Mongo shell, forEach() is available both as the standard JavaScript array method and as a cursor method, so it can iterate over the elements of an array or over the documents returned by a query. This can be particularly useful when dealing with nested arrays or documents.
Syntax For Arrays:
array.forEach(function(currentValue, index, arr) { ... }, thisValue)
- currentValue: The current element being processed in the array.
- index (Optional): The index of the current element being processed in the array.
- arr (Optional): The array that forEach() is being applied to.
- thisValue (Optional): A value to use as this when executing the callback function.
Syntax For Cursors (used in queries):
cursor.forEach(function(doc) { ... })
- doc: The current document being processed in the cursor.
Note: The callback function used with forEach() can take up to three arguments for arrays (currentValue, index, and arr), but for cursors, it typically takes a single argument (doc) since cursors represent a stream of documents.
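As a small illustration of the array form, the sketch below walks a nested array with forEach() and uses both the element and its index. It is plain JavaScript on an in-memory object; sampleDoc is a shortened, assumed copy of the document inserted earlier.

```javascript
// Shortened copy of the sample document (assumption for this sketch).
const sampleDoc = {
  posting_locations: [
    { city_id: 19398, city_name: "Bondi Beach (Sydney)" },
    { city_id: 31101, city_name: "Rushcutters Bay (Sydney)" },
  ],
};

const labels = [];
// The callback receives the current element and its index.
sampleDoc.posting_locations.forEach(function (location, index) {
  labels.push(index + ": " + location.city_name);
});
console.log(labels);
```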
Example Code:
// MongoDB version 5.0.8
var bulk = db.newcollection.initializeUnorderedBulkOp(),
counter = 0;
db.nested.find().forEach(function(doc) {
var document = {};
document["name"] = doc.name.first_name + " " + doc.name.last_name;
document["phone"] = doc.contact.phone.number;
document["mail"] = doc.contact.email.mail;
bulk.insert(document);
counter++;
if (counter % 1000 == 0) {
bulk.execute();
bulk = db.newcollection.initializeUnorderedBulkOp();
}
});
if (counter % 1000 != 0) { bulk.execute(); }
Output:
{
acknowledged: true,
insertedCount: 1,
insertedIds: {
'0': ObjectId("...")
},
matchedCount: 0,
modifiedCount: 0,
deletedCount: 0,
upsertedCount: 0,
upsertedIds: {}
}
Next, execute the command below on your Mongo shell to see the projected fields.
// MongoDB version 5.0.8
db.newcollection.find().pretty();
Output:
{
_id: ObjectId("..."),
name: 'Mehvish Ashiq',
phone: '123456',
mail: 'delfstack@example.com'
}
In this example, suppose we want to grab certain nested fields and insert them into a new collection. Inserting the transformed documents one at a time can be slow when the nested collection is large, because every insert is a separate round trip to the server.
We can avoid this slow insert performance by using the unordered Bulk API (initializeUnorderedBulkOp()). It queues the insert operations, sends them to the server together, and reports the outcome of the whole batch when execute() is called.
So, we use the Bulk API to insert the desired data structure into the newcollection collection, where the new documents are created inside the forEach() loop of the nested collection’s cursor. To create new properties, we use the bracket notation.
The code assumes a large amount of data, so it sends the operations to the server in batches of 1000.
As a result, it performs well because we send one request per 1000 documents instead of one request per document.
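The batching logic itself can be tried without a database. The following sketch, in plain JavaScript, replaces bulk.execute() with a stand-in flush callback; processInBatches, fakeDocs, and the callback are hypothetical names used only for this illustration.

```javascript
// Sketch of the batching pattern; `flush` stands in for bulk.execute().
function processInBatches(docs, batchSize, flush) {
  let batch = [];
  let flushes = 0;
  for (const doc of docs) {
    batch.push(doc);
    if (batch.length === batchSize) {
      flush(batch); // one round trip for the whole batch
      flushes++;
      batch = []; // start a fresh batch
    }
  }
  if (batch.length > 0) {
    flush(batch); // send the final, partially filled batch
    flushes++;
  }
  return flushes;
}

const fakeDocs = Array.from({ length: 2500 }, (_, i) => ({ n: i }));
const sentSizes = [];
const roundTrips = processInBatches(fakeDocs, 1000, (b) => sentSizes.push(b.length));
console.log(roundTrips, sentSizes); // 3 [ 1000, 1000, 500 ]
```

With 2500 documents and a batch size of 1000, the callback runs three times instead of 2500, which is the saving the paragraph above describes.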
Use the mapReduce() Method to Project Nested Fields in MongoDB
The mapReduce() method in MongoDB allows for flexible data processing by applying JavaScript functions to collections. It performs two primary steps: mapping and reducing. Note that map-reduce is deprecated starting in MongoDB 5.0 in favor of the aggregation pipeline, so prefer the aggregation stages in this guide for new code.
The mapping step processes each document, while the reducing step aggregates and summarizes the mapped data.
Syntax:
db.collection.mapReduce(
<mapFunction>,
<reduceFunction>,
{
out: <output>,
query: <query>,
sort: <sort>,
limit: <limit>,
finalize: <finalize>,
scope: <scope>,
jsMode: <boolean>,
verbose: <boolean>,
bypassDocumentValidation: <boolean>
}
)
Here’s an explanation of the parameters:
- <mapFunction> (function): A JavaScript function that processes each document and emits key-value pairs. It takes the form function() {...}.
- <reduceFunction> (function): A function that aggregates the values emitted for a key. It takes the form function(key, values) {...}. When every key has a single value, it acts only as a placeholder because it is never called.
- out (string or document): Specifies where to output the results. It can be either a collection name (string) or a document that defines the output options. For example: { out: "output_collection" }.
- query (document): Specifies the query filter to select documents for processing. This parameter is optional.
- sort (document): Specifies the order in which documents are processed. This parameter is optional.
- limit (number): Limits the number of documents processed by mapReduce(). This parameter is optional.
- finalize (function): An optional JavaScript function that can be used to further process the result values after the reduce step.
- scope (document): A document that defines the global variables accessible in the map, reduce, and finalize functions.
- jsMode (Boolean): Specifies whether the intermediate data is converted to BSON between the map and reduce steps. When set to true, the emitted values stay as JavaScript objects, which can be faster for smaller result sets. The default is false.
- verbose (Boolean): When set to true, the result information includes timing details.
- bypassDocumentValidation (Boolean): When set to true, the map-reduce operation bypasses document validation during the operation.
Keep in mind that both the mapFunction and reduceFunction should be valid JavaScript functions. The mapFunction emits key-value pairs, and the reduceFunction aggregates the values for a specific key.
Remember to replace placeholders like <mapFunction>, <reduceFunction>, etc., with your actual functions or values when using mapReduce() in MongoDB.
When dealing with nested documents, mapReduce() provides a way to selectively project specific fields. This can be especially useful when you need to extract and process specific elements from deeply nested data structures.
Example Code:
// MongoDB version 5.0.8
function map() {
for(var i in this.posting_locations) {
emit({
"country_id" : this.country_id,
"city_id" : this.posting_locations[i].city_id,
"region_id" : this.regions.region_id
},1);
}
}
function reduce(id,docs) {
return Array.sum(docs);
}
db.nested.mapReduce(map,reduce,{ out : "map_reduce_output" } )
Now, run the following query to see the output.
// MongoDB version 5.0.8
db.map_reduce_output.find().pretty();
Output:
{
"_id" : {
"country_id" : undefined,
"city_id" : 19398,
"region_id" : 796
},
"value" : 1
}
{
"_id" : {
"country_id" : undefined,
"city_id" : 31022,
"region_id" : 796
},
"value" : 1
}
{
"_id" : {
"country_id" : undefined,
"city_id" : 31101,
"region_id" : 796
},
"value" : 1
}
For this example code, we use the mapReduce() function to perform map-reduce on all documents of the nested collection. Note that country_id shows up as undefined in the output because the sample document has no country_id field (it stores country_name instead). The process has three steps, briefly explained below.
- Define the map() function to process every input document. In this function, the this keyword refers to the current document being processed by the map-reduce operation, and the emit() function associates a key with a value. MongoDB then groups the emitted values by key.
- Define the corresponding reduce() function, which is the actual place where aggregation of data takes place. It takes two arguments, a key and the array of values emitted for that key; our code example names them id and docs. Here, the reduce() function reduces the docs array to the sum of its elements. MongoDB does not call reduce() for a key that has only one value, which is the case for every key in this example, so each value is the 1 emitted by map().
- Finally, we perform map-reduce on all the documents in the nested collection by using the map() and reduce() functions. We use out to save the output in the specified collection, which is map_reduce_output in this case.
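The three steps above can be sketched in memory to show how emitted values are grouped and reduced. This is plain JavaScript for illustration only, not how MongoDB runs map-reduce: runMapReduce is a hypothetical helper, emit is passed in as an argument (in MongoDB it is a global), and the two input documents are made up so that one key receives more than one value.

```javascript
// In-memory emulation of the map and reduce steps; illustration only.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map(); // serialized key -> { key, values }
  const emit = (key, value) => {
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { key, values: [] });
    groups.get(id).values.push(value);
  };
  for (const doc of docs) mapFn.call(doc, emit); // `this` is the document
  return [...groups.values()].map(({ key, values }) => ({
    _id: key,
    // reduce is skipped for keys that received a single value
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

function mapCities(emit) {
  for (const loc of this.posting_locations) {
    emit({ city_id: loc.city_id, region_id: this.regions.region_id }, 1);
  }
}
const sumValues = (key, values) => values.reduce((a, b) => a + b, 0);

// Made-up input: city 19398 appears in two documents.
const docsForMapReduce = [
  { posting_locations: [{ city_id: 19398 }, { city_id: 31101 }], regions: { region_id: 796 } },
  { posting_locations: [{ city_id: 19398 }], regions: { region_id: 796 } },
];
const mrResult = runMapReduce(docsForMapReduce, mapCities, sumValues);
console.log(JSON.stringify(mrResult, null, 2));
```

Here the key for city 19398 collects two values and is reduced to 2, while the key for city 31101 keeps its single emitted value of 1.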
Use the $addFields Aggregation Stage to Project Nested Fields in MongoDB
The $addFields stage in MongoDB’s aggregation framework allows for the addition of new fields to documents in the result set. It is particularly valuable when you want to augment existing documents with additional information or extract nested fields.
The $addFields stage is used within an aggregation pipeline. Its syntax is as follows:
{
$addFields: {
newField1: expression1,
newField2: expression2,
// ...
}
}
- newField: The name of the new field to be added.
- expression: An expression that defines the value of the new field. This can be a direct value, a computation, or a reference to an existing field.
Consider a collection of documents representing users:
{
"_id": 1,
"name": "John Doe",
"address": {
"street": "123 Main St",
"city": "New York",
"country": "USA"
}
}
Here, the address field is nested, containing subfields like street, city, and country.
Let’s explore how to use the $addFields stage to extract and project nested fields from our example documents.
db.users.aggregate([
{
$addFields: {
"street": "$address.street",
"city": "$address.city",
"country": "$address.country"
}
}
])
The db.users.aggregate([...]) call initiates an aggregation pipeline on the users collection. Then, the $addFields stage allows us to add new fields to the documents.
Here, we add three new top-level fields (street, city, and country) to each document. Their values are copied from the nested address document, which itself stays in the output because $addFields keeps all existing fields.
Output:
{
"_id": 1,
"address": {
"city": "New York",
"country": "USA",
"street": "123 Main St"
},
"city": "New York",
"country": "USA",
"name": "John Doe",
"street": "123 Main St"
}
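To see what a field path such as $address.street resolves to, the sketch below emulates $addFields in plain JavaScript for simple field-path expressions only. It is an illustration under that assumption, not MongoDB's implementation; resolvePath and addFields are hypothetical names.

```javascript
// Resolve a field path such as "$address.street" against a document.
function resolvePath(doc, expr) {
  return expr
    .slice(1) // drop the leading "$"
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Emulates $addFields for simple field-path expressions only.
function addFields(doc, spec) {
  const result = { ...doc }; // existing fields are kept
  for (const [name, expr] of Object.entries(spec)) {
    const value = resolvePath(doc, expr);
    if (value !== undefined) result[name] = value; // missing paths add nothing
  }
  return result;
}

const user = {
  _id: 1,
  name: "John Doe",
  address: { street: "123 Main St", city: "New York", country: "USA" },
};
const flattened = addFields(user, {
  street: "$address.street",
  city: "$address.city",
});
console.log(flattened.street, flattened.city);
```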
Use the Dot Notation to Project Nested Fields in MongoDB
Dot notation is a powerful and intuitive approach for projecting specific fields within nested documents. It involves using dots (.) to navigate through the document’s structure.
Syntax:
{ "outerField.innerField": 1, "outerField.anotherInnerField": 1, ... }
Let’s break down the components:
- "outerField.innerField": 1: This notation indicates that we want to include the innerField from the outerField nested document. The value 1 is used to include the field.
- "outerField.anotherInnerField": 1: Similarly, this includes the anotherInnerField from the outerField nested document.
Let’s delve into practical examples to illustrate the application of dot notation for nested field projection in MongoDB.
Example 1: Basic Field Projection
Consider a collection of user documents with nested address fields:
db.users.insertMany([
{
"_id": 1,
"name": "John Doe",
"address": {
"street": "123 Main St",
"city": "New York",
"country": "USA"
}
},
{
"_id": 2,
"name": "Jane Doe",
"address": {
"street": "456 Oak Ave",
"city": "Los Angeles",
"country": "USA"
}
}
])
Now, let’s use dot notation to project specific fields:
db.users.find({}, { "address.street": 1, "address.city": 1 })
In this example, the query instructs MongoDB to include the street and city fields from the address nested document. The _id field is returned as well because it is included by default. The result will be:
{ "_id": 1, "address": { "street": "123 Main St", "city": "New York" } }
{ "_id": 2, "address": { "street": "456 Oak Ave", "city": "Los Angeles" } }
Example 2: Projecting Multiple Nested Fields
You can project multiple nested fields in a single query:
db.users.find({}, { "address.street": 1, "address.city": 1, "address.country": 1 })
The result will include the street, city, and country fields from the address nested document.
{
"_id": 1,
"address": {
"city": "New York",
"country": "USA",
"street": "123 Main St"
}
},
{
"_id": 2,
"address": {
"city": "Los Angeles",
"country": "USA",
"street": "456 Oak Ave"
}
}
Example 3: Excluding Specific Fields
Dot notation can also be used to exclude specific fields:
db.users.find({}, { "address.country": 0 })
In this query, the country field from the address nested document will be excluded, and every other field is returned.
{
"_id": 1,
"address": {
"city": "New York",
"street": "123 Main St"
},
"name": "John Doe"
},
{
"_id": 2,
"address": {
"city": "Los Angeles",
"street": "456 Oak Ave"
},
"name": "Jane Doe"
}
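The inclusion behavior from Examples 1 and 2 can be approximated on an in-memory object. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation; projectInclude is a hypothetical name, and the sketch handles embedded documents only, not arrays.

```javascript
// Emulates an inclusion projection with dotted paths on embedded documents.
// Arrays are not handled in this sketch.
function projectInclude(doc, paths) {
  const out = "_id" in doc ? { _id: doc._id } : {}; // _id is included by default
  for (const path of paths) {
    const keys = path.split(".");
    let source = doc;
    let target = out;
    for (let i = 0; i < keys.length; i++) {
      const key = keys[i];
      if (source == null || typeof source !== "object" || !(key in source)) break;
      if (i === keys.length - 1) {
        target[key] = source[key]; // copy the leaf value
      } else {
        if (!(key in target)) target[key] = {}; // create the parent on demand
        target = target[key];
        source = source[key];
      }
    }
  }
  return out;
}

const jane = {
  _id: 2,
  name: "Jane Doe",
  address: { street: "456 Oak Ave", city: "Los Angeles", country: "USA" },
};
const janeProjected = projectInclude(jane, ["address.street", "address.city"]);
console.log(JSON.stringify(janeProjected));
```

Only the requested subfields are copied, so name and address.country are absent from the result.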
Use the $map and the $mergeObjects Aggregation Pipeline to Project Nested Fields in MongoDB
The $map operator is an aggregation expression operator in MongoDB. It applies an expression to each element in an array and returns an array with the modified elements.
The $mergeObjects operator combines multiple objects into a single document. This is particularly useful when you want to merge fields from different documents or when dealing with nested documents.
Scenario: Extracting Nested Fields
Consider the following example, where each document contains information about a person, including their name and address:
{
"_id": 1,
"name": "John Doe",
"address": {
"street": "123 Main St",
"city": "New York",
"country": "USA"
}
}
We want to extract the street and city fields from the address nested document.
Method 1: Using $map and $mergeObjects
db.people.aggregate([
{
$project: {
address: {
$mergeObjects: [
{
street: { $map: { input: [{}], as: 'el', in: '$address.street' } }
},
{
city: { $map: { input: [{}], as: 'el', in: '$address.city' } }
}
]
}
}
}
])
First, the $project stage is used to shape the output document.
Then, $mergeObjects combines multiple objects into one. Here, we pass two separate objects, one for street and one for city.
Inside each object, $map iterates over an array with a single empty object {} and evaluates the $address.street or $address.city expression once. Because $map always returns an array, each value comes back wrapped in a single-element array, as the output shows.
Finally, the merged result is placed in the address field.
Output:
{
"_id": 1,
"address": {
"city": [
"New York"
],
"street": [
"123 Main St"
]
}
}
Method 2: Combining $map and $mergeObjects for Multiple Documents
db.people.aggregate([
{
$project: {
addresses: {
$map: {
input: [{}],
as: 'el',
in: {
$mergeObjects: [
{
street: '$$el.address.street',
city: '$$el.address.city'
}
]
}
}
}
}
}
])
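Here, we use $map to iterate over an array with a single empty object {}. Inside the $map, $mergeObjects builds an object from $$el.address.street and $$el.address.city.
Note that $$el refers to the current array element, which is the empty object here, not to the document being processed. Both paths therefore resolve to missing fields, and the result is an addresses array holding one empty object. Referencing $address.street and $address.city instead would read the values from the document itself.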
Output:
{
"_id": 1,
"addresses": [
{}
]
}
Method 3: Handling Arrays of Nested Documents
db.people.aggregate([
{
$project: {
addresses: {
$map: {
input: '$addresses',
as: 'el',
in: {
$mergeObjects: [
{
street: '$$el.address.street',
city: '$$el.address.city'
}
]
}
}
}
}
}
])
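This pipeline is meant for documents in which addresses is an array of nested documents.
We use $map to iterate over each element in the addresses array. Inside the $map, $mergeObjects combines the street and city fields from the address nested document of each element.
The sample document has no addresses field, so the input resolves to null and $map returns null, as the output below shows.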
Output:
{
"_id": 1,
"addresses": null
}
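To show what Method 3 is aiming for, the sketch below applies the same idea in plain JavaScript to a hypothetical document that does contain an addresses array. The document shape is an assumption made for this illustration and does not appear in the examples above; Array.prototype.map plays the role of $map, and Object.assign plays the role of $mergeObjects.

```javascript
// Hypothetical document with an array of nested address documents.
const person = {
  _id: 1,
  name: "John Doe",
  addresses: [
    { address: { street: "123 Main St", city: "New York", country: "USA" } },
    { address: { street: "456 Oak Ave", city: "Los Angeles", country: "USA" } },
  ],
};

// map() visits each array element; Object.assign() merges the picked fields.
const projected = {
  _id: person._id,
  addresses: person.addresses.map((el) =>
    Object.assign({}, { street: el.address.street }, { city: el.address.city })
  ),
};
console.log(JSON.stringify(projected, null, 2));
```

Each element of the resulting addresses array holds only street and city, while country is dropped.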
Conclusion
In this guide, we’ve explored seven effective methods for projecting nested fields in MongoDB, a versatile NoSQL database. These methods equip you to handle complex data structures efficiently.
- $project Aggregation Stage: Reshapes and transforms documents, enabling precise control over included/excluded fields.
- $unset Aggregation Stage: Removes specific fields from documents, useful for complex nested structures.
- forEach() Loop: JavaScript method for iterating over elements in an array or cursor, handy for nested arrays/documents.
- mapReduce() Method: Applies JavaScript functions to collections, involving mapping and reducing steps.
- $addFields Aggregation Stage: Adds new fields to documents, augmenting them with extra information or extracting nested fields.
- Dot Notation: Intuitively navigates through nested documents, enabling precise field selection.
- $map and $mergeObjects Pipeline: Employs operators to project specific fields from complex nested structures.
By incorporating these techniques, you’ll adeptly process and transform data to suit your application’s needs.