How to Use Grep With Multiple Strings
As a Bash scriptwriter, you may find yourself in a situation where you need to parse a wall of text for relevant information. Sometimes that information is unordered, which requires you to figure out a pattern to catch all relevant data.
The best tool for this job in Linux is grep
written by Ken Thompson around 1973. grep
is available across all modern UNIX systems.
This tutorial will extensively cover the use of grep
from basic examples such as capturing a single phrase to capturing multiple patterns using RegEx or fixed strings, assuming a Bash command line.
Using grep
to Capture Simple Phrases
The simplest way to use grep
is to find the occurrences of a phrase in a file. Given a target word and a file, we can search for the word in the file as shown.
user@linux:~$ cat file.txt
UNIX
tutorial
word
words
sword
tests
Linux
user@linux:~$ grep word file.txt
word
words
sword
As you can see above, all words that contain the substring word
are captured.
You can also capture the output of a program and grep
the output for a phrase, as shown. We’ll continue to use the file as an example, but you can do this with any program that prints to stdout
.
user@linux:~$ cat file.txt | grep word
word
words
sword
If you prefer that grep
only print the phrases that match exactly (i.e., have spaces around them and are not substrings of other words), you can use the -w
/--word-regexp
flag to enable whole word matching.
You can use the same idea to match phrases if they appear as a single line, with -x
/--line-regexp
.
user@linux:~$ cat file.txt | grep -w word
word
grep
With Multiple Strings
To use multiple phrases, separated by newlines, to capture relevant matches in a file or text stream from a program, you can use the -F
/--fixed-strings
to specify them.
You can pass in a string shown below instead for a small number of matches, with a dollar sign indicating a newline.
grep -F "words$word" file.txt
# or
fgrep "words$word" file.txt
For a larger list from a file, you can use cat
to print out the file as an argument to grep
and reuse the same syntax.
user@linux:~$ cat match.txt
word
sword
user@linux:~$ fgrep "$(cat match.txt)" file.txt
word
words
sword
grep
With RegEx
This section will come in extremely handy if you’re familiar with RegEx. Using -E
/--extended-regexp
, you can specify a RegEx pattern to capture more complicated phrasing that cannot catch with a single or multiple phrases.
Given a file that contains email addresses and URLs at random, we may wish to filter out lines that match emails, or URLs, with separate invocations of the grep
command.
A simple, mostly naive, RegEx pattern to capture emails would be [^\@]+\@[^\.]+.*
. To use this with grep
, you can do the following:
user@linux:~$ cat file.txt
user@linux.com
linux@torvalds.com
not a URL or email
https://www.google.com/
https://apple.com/
not an email or URL
user@linux:~$ egrep '[^\@]+\@[^\.]+.*' file.txt
user@linux.com
linux@torvalds.com
Another example that utilizes RegEx is to specify multiple patterns to see if every single one of them exists in a file. To do this, we have the following pattern.
Note the comparison between the two RegEx patterns used in the example below - one uses the OR
operator, and the other is written such that the line must contain all three words.
user@linux:~$ cat file.txt
apple banana grape
bus lamppost bench
apple bench grape
bus grape lamppost
yellow apple bus
user@linux:~$ grep -P 'apple|banana|grape' file.txt
apple banana grape
apple bench grape
banana grape apple
bus grape lamppost
yellow apple bus
user@linux:~$ grep -P '^(?=.*apple)(?=.*banana)(?=.*grape)' file.txt
apple banana grape
banana grape apple
Remember that grep
is not the only string matching tool available to you in UNIX systems. You may also make use of awk
to capture complicated patterns. You can also use sed
to replace phrases based on matching criteria.
This tutorial derived information from the grep
manual page, which you can access by typing man grep
in any UNIX terminal or on this page.