How to Fuzzy Search in MongoDB
- What Is Fuzzy Search
- Create a Sample Collection in MongoDB
- Use the $regex Operator to Perform Fuzzy Search in MongoDB
- Use the $text Query to Perform Fuzzy Search in MongoDB
- Use JavaScript’s Fuse.js Library to Perform Fuzzy Search in MongoDB
This tutorial explains what fuzzy search is and how to perform one in MongoDB.
We start with the $regex operator and the $text query, then use a JavaScript library named Fuse.js to run a fuzzy search over the documents.
What Is Fuzzy Search
Fuzzy search finds text that closely, but not exactly, matches the search term. It is useful for returning relevant results even when the search term is misspelled.
For instance, Google shows web pages relevant to a query even when the query is mistyped. Regular expressions (also called regex) are a quick way to approximate this through pattern matching, although a regex matches a pattern rather than correcting a typo.
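One common way to put a number on how closely two strings match is the Levenshtein edit distance. The sketch below is only an illustration of that idea; the function name levenshtein is our own, and nothing here is part of MongoDB or of the libraries used later in this tutorial.

```javascript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, or substitutions needed to turn `a` into `b`.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into an empty string takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein('jahson', 'johnson')); // 2
```

A small distance means the strings are close, so a misspelled term such as jahson can still be related to johnson. The comparison is case-sensitive; lowercase both strings first if case should be ignored.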
Create a Sample Collection in MongoDB
We will move from basic to more advanced ways of doing a fuzzy search. To practice, let’s create a sample collection named collection_one in which every document has a single field, name.
The _id field is created automatically; we don’t have to create it. You can use the following queries to set this up.
Example Code:
> db.createCollection('collection_one')
> db.collection_one.insertMany([
{ name : 'Mehvish Ashiq'},
{ name : 'Jennifer Johnson'},
{ name : 'Natalie Robinson'},
{ name : 'John Ferguson'},
{ name : 'Samuel Patterson'},
{ name : 'Salvatore Callahan'},
{ name : 'Mikaela Christensen'}
])
> db.collection_one.find()
OUTPUT:
{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddaf"), "name" : "Jennifer Johnson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb0"), "name" : "Natalie Robinson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb1"), "name" : "John Ferguson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb3"), "name" : "Salvatore Callahan" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb4"), "name" : "Mikaela Christensen" }
Use the $regex Operator to Perform Fuzzy Search in MongoDB
Example Code:
> db.collection_one.find({"name": /m/})
OUTPUT:
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
In this code, we searched the name field and retrieved all documents where name contains the lowercase letter m.
As you can see, we got only one record, although two more documents start with a capital M. To handle this, we can use the i modifier as follows, which makes the search case-insensitive.
Example Code:
> db.collection_one.find({"name": /m/i})
OUTPUT:
{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb4"), "name" : "Mikaela Christensen" }
This shows that a correctly designed regular expression is very important; otherwise, we may get misleading results. We can write the same query with the $regex operator and $options as well.
Example Code (case-insensitive search):
> db.collection_one.find({'name': {'$regex': 'm','$options': 'i'}})
OUTPUT:
{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb4"), "name" : "Mikaela Christensen" }
Similarly, we can get all the documents where the name ends with the two letters on.
Example Code:
> db.collection_one.find({name:{'$regex' : 'on$', '$options' : 'i'}})
OUTPUT:
{ "_id" : ObjectId("62939a37b3a0d806d251ddaf"), "name" : "Jennifer Johnson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb0"), "name" : "Natalie Robinson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb1"), "name" : "John Ferguson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
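When the search term comes from user input, characters such as . or * would otherwise be read as regex syntax. The following is a minimal sketch of building the same kind of case-insensitive filter from an arbitrary term; escapeRegex and buildNameFilter are hypothetical helper names of our own, not MongoDB functions.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so that input such as "a.b" is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a case-insensitive "contains" filter for the name field.
function buildNameFilter(term) {
  return { name: { $regex: escapeRegex(term), $options: 'i' } };
}

const filter = buildNameFilter('m');
console.log(filter);
// In the shell, this object could be passed as: db.collection_one.find(filter)
```

Because the term is escaped, the filter only ever performs a literal substring match, which keeps unexpected input from changing the meaning of the query.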
Use the $text Query to Perform Fuzzy Search in MongoDB
The $text query will not work on our sample collection, collection_one, because it does not have a text index. So, we create the index as follows.
Example Code:
> db.collection_one.createIndex({name:"text"});
The above statement will also create the specified collection if it does not already exist. Remember that a text index can cover one field or several fields, separated by commas.
See the following example.
db.collection_name.createIndex({name:"text", description:"text"});
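Once the index is created, we can search as given below. Note that $text matches whole words (after case folding and stemming), so it does not tolerate misspelled terms the way a true fuzzy search does.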
Example Code:
> db.collection_one.find({ $text: { $search: "Mehvish" } } )
OUTPUT:
{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
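A $text query can also rank its matches by relevance using the textScore metadata that MongoDB computes for each matching document. The sketch below only builds the query pieces; buildTextSearch is a hypothetical helper name, and it assumes the text index on name already exists.

```javascript
// Build the pieces of a $text query that also ranks results by relevance.
function buildTextSearch(term) {
  return {
    filter: { $text: { $search: term } },
    // textScore is computed by MongoDB for each document that matches.
    projection: { score: { $meta: 'textScore' } },
    sort: { score: { $meta: 'textScore' } },
  };
}

const q = buildTextSearch('Mehvish');
console.log(JSON.stringify(q.filter)); // {"$text":{"$search":"Mehvish"}}
// In the shell: db.collection_one.find(q.filter, q.projection).sort(q.sort)
```

Sorting by score matters most when the search string contains several words, since documents that match more of them are listed first.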
Use JavaScript’s Fuse.js Library to Perform Fuzzy Search in MongoDB
Example Code (the fuzzysearch.js file):
const Fuse = require('fuse.js');
const { MongoClient } = require('mongodb');

const url = 'mongodb://localhost:27017/';

async function main() {
  // Connect to the local server; a failed connection rejects and is reported below.
  const client = await MongoClient.connect(url);
  try {
    const dbo = client.db('FuseFuzzySearch');
    const personObj = [
      {name: 'Mehvish Ashiq'}, {name: 'Jennifer Johnson'},
      {name: 'Natalie Robinson'}, {name: 'John Ferguson'},
      {name: 'Samuel Patterson'}, {name: 'Salvatore Callahan'},
      {name: 'Mikaela Christensen'}
    ];
    // Wait for the insert to finish before searching or closing the connection.
    await dbo.collection('person').insertMany(personObj);
    // Fuse.js searches the in-memory array, not the collection itself.
    const options = {includeScore: true, keys: ['name']};
    const fuse = new Fuse(personObj, options);
    const result = fuse.search('jahson');
    console.log(result);
  } finally {
    // Always close the connection, even if the insert fails.
    await client.close();
  }
}

main().catch(console.error);
OUTPUT:
[
{
item: { name: 'Jennifer Johnson', _id: 6293aa0340aa3b21483d9885 },
refIndex: 1,
score: 0.5445835311565898
},
{
item: { name: 'John Ferguson', _id: 6293aa0340aa3b21483d9887 },
refIndex: 3,
score: 0.612592665952338
},
{
item: { name: 'Natalie Robinson', _id: 6293aa0340aa3b21483d9886 },
refIndex: 2,
score: 0.6968718698752637
},
{
item: { name: 'Samuel Patterson', _id: 6293aa0340aa3b21483d9888 },
refIndex: 4,
score: 0.6968718698752637
}
]
In this code example, we first imported the fuse.js library. Next, we connected to MongoDB.
If the connection fails for any reason, an error is thrown. Otherwise, we select a database named FuseFuzzySearch, which MongoDB creates when the first document is inserted.
Then, we create an array named personObj containing all the documents we want to insert into the person collection. An error is raised if anything goes wrong while inserting the data.
Next, we create a Fuse object, passing it the personObj array and an options object with keys and includeScore, and call search to run the fuzzy search and get the results given above. Note that Fuse.js searches the in-memory array rather than querying MongoDB.
Here, keys specifies the fields on which the search is performed. The includeScore option is optional, but it is useful because it reports the matching score of each result.
A score of 0 means a perfect match, while a score of 1 means a complete mismatch. All available options are listed in the Fuse.js documentation.
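Since lower scores mean closer matches, the results can be trimmed after the search. The sketch below copies the scores from the output shown above (rounded) and keeps only the closest matches; namesWithin is a hypothetical helper of our own, not part of Fuse.js, which has its own threshold option for controlling strictness during the search.

```javascript
// Results shaped like the Fuse.js output shown above (scores rounded).
const results = [
  { item: { name: 'Jennifer Johnson' }, refIndex: 1, score: 0.5446 },
  { item: { name: 'John Ferguson' }, refIndex: 3, score: 0.6126 },
  { item: { name: 'Natalie Robinson' }, refIndex: 2, score: 0.6969 },
  { item: { name: 'Samuel Patterson' }, refIndex: 4, score: 0.6969 },
];

// Hypothetical helper: keep matches whose score is at most maxScore
// (lower scores are closer matches) and return just the names.
function namesWithin(results, maxScore) {
  return results
    .filter((r) => r.score <= maxScore)
    .map((r) => r.item.name);
}

console.log(namesWithin(results, 0.6)); // [ 'Jennifer Johnson' ]
```

The cutoff of 0.6 is an arbitrary value chosen for illustration; a suitable limit depends on the data and on how forgiving the search should be.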
Finally, do not forget to close the connection. There are many other libraries that you can also explore.