The grep command is a string and pattern-matching utility that displays matching lines from one or more files. It also works with piped output from other commands. We show you how.
The Story Behind grep
The grep command is famous in Linux and Unix circles for three reasons. Firstly, it is tremendously useful. Secondly, the wealth of options can be overwhelming. Thirdly, it was written overnight to satisfy a particular need. The first two are bang on; the third is slightly off.
Ken Thompson had extracted the regular expression search capabilities from the ed editor (pronounced ee-dee) and created a little program, for his own use, to search through text files. His department head at Bell Labs, Doug McIlroy, approached Thompson and described the problem one of his colleagues, Lee McMahon, was facing.
McMahon was trying to identify the authors of the Federalist Papers through textual analysis. He needed a tool that could search for phrases and strings within text files. Thompson spent about an hour that evening making his tool a general utility that could be used by others and renamed it grep. He took the name from the ed command string g/re/p, which translates as “global regular expression print.”
Simple Searches With grep
To search for a string within a file, pass the search term and the file name on the command line:
Matching lines are displayed. In this case, it is a single line. The matching text is highlighted. This is because on most distributions grep is aliased to:
alias grep='grep --colour=auto'
Let’s look at results where there are multiple lines that match. We’ll look for the word “Average” in an application log file. Because we can’t recall if the word is in lowercase in the log file, we’ll use the -i (ignore case) option:
grep -i Average geek-1.log
Every matching line is displayed, with the matching text highlighted in each one.
We can display the non-matching lines by using the -v (invert match) option.
grep -v Mem geek-1.log
There is no highlighting because these are the non-matching lines.
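To make the difference between the two options concrete, here is a minimal sketch. The three-line log and its contents are made up for illustration (they stand in for geek-1.log), and -c is used only to print a count of the selected lines.

```shell
# Hypothetical three-line log standing in for geek-1.log.
log=$(mktemp)
printf '%s\n' 'Average load: 0.42' 'MemFree: 2048 kB' 'average wait: 3 ms' > "$log"

# -i ignores case, so both "Average" and "average" are selected.
# -c prints the number of selected lines instead of the lines themselves.
matching=$(grep -i -c average "$log")

# -v inverts the match: only lines that do not contain "Mem" are selected.
non_matching=$(grep -v -c Mem "$log")

echo "matching: $matching, non-matching: $non_matching"
rm -f "$log"
```

With this sample data, both counts should come out as two.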
We can make grep completely silent by using the -q (quiet) option. The result is passed to the shell as a return value from grep. A result of zero means the string was found, and a result of one means it was not found. We can check the return code using the $? special parameter:
grep -q average geek-1.log
echo $?
grep -q howtogeek geek-1.log
echo $?
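Because the result of a quiet search lives in the exit status, it is most useful inside a script. The following is a small sketch of that idea, assuming a made-up two-line log file; the variable names are illustrative only.

```shell
# Hypothetical sample log; the contents are invented for this example.
log=$(mktemp)
printf '%s\n' 'Average load: 0.42' 'MemFree: 2048 kB' > "$log"

# -q prints nothing; "if" tests the exit status directly.
if grep -q -i average "$log"; then
    found_average=yes
else
    found_average=no
fi

# Capture the exit status explicitly: 0 means found, 1 means not found.
status=0
grep -q howtogeek "$log" || status=$?

echo "average found: $found_average, howtogeek exit status: $status"
rm -f "$log"
```

Testing the command directly in the if statement avoids having to read $? on a separate line.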
Recursive Searches With grep
To search through nested directories and subdirectories, use the -r (recursive) option. Note that you provide a path on the command line rather than a file name. Here we’re searching in the current directory “.” and any subdirectories:
grep -r -i memfree .
The output includes the directory and filename of each matching line.
We can make grep follow symbolic links by using the -R (recursive dereference) option. We’ve got a symbolic link in this directory, called logs-folder. We can see where it points with ls:
ls -l logs-folder
Let’s repeat our last search with the -R (recursive dereference) option:
grep -R -i memfree .
The symbolic link is followed and the directory it points to is searched by grep too.
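The difference between the two recursive options can be reproduced in a throwaway directory. This sketch assumes GNU grep, where -r does not follow symbolic links it meets during the walk; the directory layout and file names are hypothetical.

```shell
# Hypothetical layout: project/sub/a.log, plus project/logs-folder -> outside/
work=$(mktemp -d)
mkdir -p "$work/project/sub" "$work/outside"
printf 'MemFree: 100 kB\n' > "$work/project/sub/a.log"
printf 'MemFree: 200 kB\n' > "$work/outside/b.log"
ln -s "$work/outside" "$work/project/logs-folder"

# -l lists file names only. With GNU grep, -r skips the symlink it finds...
r_files=$(cd "$work/project" && grep -r -i -l memfree . | sort)

# ...while -R follows it into the linked directory.
R_files=$(cd "$work/project" && grep -R -i -l memfree . | sort)

printf -- '-r found:\n%s\n-R found:\n%s\n' "$r_files" "$R_files"
rm -rf "$work"
```

Other grep implementations may treat -r and -R differently, so it is worth checking the local man page.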
Searching for Whole Words
grep will match a line if the search target appears anywhere in that line, including inside another string. Look at this example. We’re going to search for the word “free.”
grep -i free geek-1.log
The results are lines that have the string “free” in them, but they’re not separate words. They’re part of the string “MemFree.”
To force grep to match separate “words” only, use the -w (word regexp) option.
grep -w -i free geek-1.log
This time there are no results because the search term “free” does not appear in the file as a separate word.
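A quick way to see the effect of -w is to compare counts with and without it. This is a sketch using an invented two-line file, in which “free” appears once inside “MemFree” and once as a word of its own.

```shell
# Hypothetical sample: "free" inside another string, and as a separate word.
sample=$(mktemp)
printf '%s\n' 'MemFree: 2048 kB' 'free space: 10 GB' > "$sample"

# Without -w, the substring inside "MemFree" counts as a match.
substring_hits=$(grep -c -i free "$sample")

# With -w, only the standalone word matches.
word_hits=$(grep -c -w -i free "$sample")

echo "substring matches: $substring_hits, whole-word matches: $word_hits"
rm -f "$sample"
```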
Using Multiple Search Terms
The -E (extended regexp) option allows you to search for multiple words. (The -E option replaces the deprecated egrep version of grep.)
This command searches for two search terms, “average” and “memfree.”
grep -E -w -i "average|memfree" geek-1.log
All of the matching lines are displayed for each of the search terms.
You can also search for multiple terms that are not necessarily whole words, but they can be whole words too.
The -e (patterns) option allows you to use multiple search terms on the command line. We’re making use of the regular expression bracket feature to create a search pattern. It tells grep to match any one of the characters contained within the brackets “[].” This means grep will match either “kB” or “KB” as it searches.
Both strings are matched, and, in fact, some lines contain both strings.
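As an illustration of both approaches, the sketch below runs an -E alternation and a pair of -e patterns against a made-up four-line log. The patterns MemFree and [kK]B are assumptions chosen to fit the description above, not necessarily the exact command used for the original output.

```shell
# Hypothetical log lines, invented for this example.
log=$(mktemp)
printf '%s\n' 'MemFree: 2048 kB' 'SwapTotal: 1024 KB' 'Uptime: 3 days' 'average: 0.5' > "$log"

# -E enables alternation: a line is selected if it contains either term.
alternation=$(grep -E -i -c 'average|memfree' "$log")

# Each -e adds one pattern; [kK]B matches "kB" or "KB".
# The brackets are quoted so the shell does not treat them as a glob.
multi=$(grep -c -e MemFree -e '[kK]B' "$log")

echo "alternation: $alternation lines, multiple -e: $multi lines"
rm -f "$log"
```

A line that matches more than one pattern is still printed (and counted) only once.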
Matching Lines Exactly
The -x (line regexp) option will only match lines where the entire line matches the search term. Let’s search for a date and time stamp that we know appears only once in the log file:
grep -x "20-Jan-06 15:24:35" geek-1.log
The single line that matches is found and displayed.
The opposite of that is only showing the lines that don’t match. This can be useful when you’re looking at configuration files. Comments are great, but sometimes it’s hard to spot the actual settings in amongst them all. The /etc/sudoers file is a good example. We can effectively filter out the comment lines like this:
sudo grep -v "#" /etc/sudoers
That’s much easier to parse.
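One caveat is that excluding every line containing “#” also hides settings that carry a trailing comment. A possible refinement, sketched here against an invented configuration file rather than the real /etc/sudoers, is to anchor the pattern so that only whole-line comments are removed.

```shell
# Hypothetical config file; the contents are invented for this example.
conf=$(mktemp)
printf '%s\n' \
    '# a full-line comment' \
    'Defaults env_reset' \
    '   # an indented comment' \
    'root ALL=(ALL:ALL) ALL  # trailing note' > "$conf"

# Dropping every line that contains "#" also drops the root rule.
loose=$(grep -v '#' "$conf")

# Anchoring to the start of the line (allowing leading whitespace)
# removes only the lines that are entirely comments.
strict=$(grep -v '^[[:space:]]*#' "$conf")

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$conf"
```

Which form is appropriate depends on how the file in question uses the “#” character.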
Only Displaying Matching Text
There may be an occasion when you don’t want to see the entire matching line, just the matching text. The -o (only matching) option does just that.
grep -o MemFree geek-1.log
The display is reduced to showing only the text that matches the search term, instead of the entire matching line.
Counting With grep
grep isn’t just about text. It can provide numerical information too. We can make grep count for us in different ways. If we want to know how many lines in a file contain a search term, we can use the -c (count) option.
grep -c average geek-1.log
grep reports that the search term appears on 240 lines in this file.
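Note that -c counts matching lines, not individual occurrences. If a line can contain the term more than once, one way to count every occurrence is to combine -o with wc -l. The sketch below uses an invented three-line file to show the two numbers diverging.

```shell
# Hypothetical sample: the first line contains the term twice.
sample=$(mktemp)
printf '%s\n' 'average average' 'average' 'none' > "$sample"

# -c counts the lines that contain at least one match.
line_count=$(grep -c average "$sample")

# -o prints each match on its own line, so wc -l counts occurrences.
# tr strips the padding some wc implementations add.
occurrences=$(grep -o average "$sample" | wc -l | tr -d ' ')

echo "matching lines: $line_count, occurrences: $occurrences"
rm -f "$sample"
```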
You can make grep display the line number for each matching line by using the -n (line number) option.
grep -n Jan geek-1.log
The line number for each matching line is displayed at the start of the line.
To reduce the number of results that are displayed, use the -m (max count) option. We’re going to limit the output to five matching lines:
grep -m5 -n Jan geek-1.log
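Here is a small sketch of -n and -m working together, using a made-up four-line log so the line numbers are easy to verify by eye.

```shell
# Hypothetical log; three of the four lines contain "Jan".
log=$(mktemp)
printf '%s\n' '20-Jan-06 start' '21-Feb-06 idle' '22-Jan-06 stop' '23-Jan-06 start' > "$log"

# -n prefixes each match with its line number; -m 2 stops after two matches.
first_two=$(grep -m 2 -n Jan "$log")

printf '%s\n' "$first_two"
rm -f "$log"
```

The line numbers refer to positions in the original file, so the second match is reported as line 3, not line 2.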
Being able to see some additional lines (possibly non-matching lines) for each matching line is often useful. It can help distinguish which of the matched lines are the ones you are interested in.
To show some lines after the matching line, use the -A (after context) option. We’re asking for three lines in this example:
grep -A 3 -x "20-Jan-06 15:24:35" geek-1.log
To see some lines from before the matching line, use the -B (before context) option.
grep -B 3 -x "20-Jan-06 15:24:35" geek-1.log
And to include lines from before and after the matching line, use the -C (context) option.
grep -C 3 -x "20-Jan-06 15:24:35" geek-1.log
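The three context options are easiest to compare on numbered input. This sketch feeds the numbers 1 to 7 to grep, matches the line that is exactly “4”, and joins the output onto one line with paste so the selected window is visible at a glance. The helper function name is invented for the example.

```shell
# Hypothetical input: the numbers 1 to 7, one per line.
numbers() { printf '%s\n' 1 2 3 4 5 6 7; }

# -A 2: the match plus the two lines after it.
after=$(numbers | grep -A 2 -x 4 | paste -s -d ' ' -)

# -B 2: the two lines before the match, then the match.
before=$(numbers | grep -B 2 -x 4 | paste -s -d ' ' -)

# -C 1: one line on each side of the match.
around=$(numbers | grep -C 1 -x 4 | paste -s -d ' ' -)

echo "after: $after | before: $before | around: $around"
```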
Showing Matching Files
To see the names of the files that contain the search term, use the -l (files with match) option. To find out which C source code files contain references to the sl.h header file, use this command:
grep -l "sl.h" *.c
The file names are listed, not the matching lines.
And of course, we can look for files that don’t contain the search term. The -L (files without match) option does just that.
grep -L "sl.h" *.c
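The two options can be tried side by side in a scratch directory. The sketch below creates two hypothetical source files, only one of which includes sl.h. It also adds -F (fixed strings), because the dot in “sl.h” is otherwise a regular expression wildcard that matches any character.

```shell
# Hypothetical source tree: a.c includes sl.h, b.c does not.
src=$(mktemp -d)
printf '%s\n' '#include "sl.h"' 'int main(void) { return 0; }' > "$src/a.c"
printf '%s\n' '#include <stdio.h>' 'int helper(void) { return 1; }' > "$src/b.c"

# -l lists the files that contain the term.
# -F treats the pattern as a literal string, so "." only matches a dot.
with_header=$(cd "$src" && grep -l -F 'sl.h' *.c)

# -L lists the files that do not contain it.
without_header=$(cd "$src" && grep -L -F 'sl.h' *.c)

echo "with: $with_header, without: $without_header"
rm -rf "$src"
```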
Start and End of Lines
We can force grep to only display matches that are either at the start or the end of a line. The “^” regular expression operator matches the start of a line. Practically all of the lines within the log file will contain spaces, but we’re going to search for lines that have a space as their first character:
grep "^ " geek-1.log
The lines that have a space as the first character—at the start of the line—are displayed.
To match the end of the line, use the “$” regular expression operator. We’re going to search for lines that end with “00.”
grep "00$" geek-1.log
The display shows the lines that have “00” as their final characters.
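Both anchors can be checked against a small sample. In this sketch the three log lines are invented: one starts with a space, and two end in “00”.

```shell
# Hypothetical log: one line starts with a space, two lines end in "00".
log=$(mktemp)
printf '%s\n' ' indented entry 100' 'plain entry 200' 'plain entry 305' > "$log"

# "^ " only matches a space in the first column.
leading_space=$(grep -c '^ ' "$log")

# "00$" only matches "00" as the last two characters of the line.
ends_00=$(grep -c '00$' "$log")

echo "leading space: $leading_space, ending in 00: $ends_00"
rm -f "$log"
```

Single quotes keep the shell from interpreting the “$” before grep sees it.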
Using Pipes with grep
Of course, you can pipe input to grep, pipe the output from grep into another program, and have grep nestled in the middle of a pipe chain.
Let’s say we want to see all occurrences of the string “ExtractParameters” in our C source code files. We know there’s going to be quite a few, so we pipe the output into less:
grep "ExtractParameters" *.c | less
The output is presented in less. This lets you page through the file listing and use less's search facility.
If we pipe the output from grep into wc and use the -l (lines) option, we can count the number of lines in the source code files that contain “ExtractParameters”. (We could achieve this using the grep -c (count) option, but this is a neat way to demonstrate piping out of grep.)
grep "ExtractParameters" *.c | wc -l
With the next command, we’re piping the output from ls into grep and piping the output from grep into sort. We’re listing the files in the current directory, selecting those with the string “Aug” in them, and sorting them by file size:
ls -l | grep "Aug" | sort -k5,5n
Let’s break that down:
- ls -l: Perform a long format listing of the files using ls.
- grep “Aug”: Select the lines from the ls listing that have “Aug” in them. Note that this would also find files that have “Aug” in their names.
- sort -k5,5n: Sort the output from grep numerically on the fifth column (file size).
We get a sorted listing of all the files modified in August (regardless of year), in ascending order of file size.
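Because the output of ls depends on what is in the current directory, the pipeline is easier to study with canned input. The sketch below replaces ls -l with a hypothetical function that prints three invented listing lines, then filters and sorts them the same way. It uses sort -k5,5n, the POSIX key syntax for a numeric sort on the fifth field.

```shell
# Hypothetical stand-in for "ls -l"; the files and owner are invented.
listing() {
    printf '%s\n' \
        '-rw-r--r-- 1 dave dave 900 Aug  3 10:00 big.txt' \
        '-rw-r--r-- 1 dave dave  12 Jul  9 09:00 skip.txt' \
        '-rw-r--r-- 1 dave dave  40 Aug 21 11:30 small.txt'
}

# Keep the August lines, then sort numerically on field 5 (the size).
sorted=$(listing | grep 'Aug' | sort -k5,5n)

# The smallest August file is the last field of the first sorted line.
smallest=$(printf '%s\n' "$sorted" | head -n 1 | awk '{print $NF}')

printf '%s\n' "$sorted"
echo "smallest August file: $smallest"
```

The July entry is dropped by grep before sort ever sees it, which is the point of placing grep in the middle of the chain.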
grep: Less a Command, More of an Ally
grep is a terrific tool to have at your disposal. It dates from 1974 and is still going strong because we need what it does, and nothing does it better.
Combining grep with some regular expressions-fu really takes it to the next level.