grep
(search files for text)
A useful tool for searching through files is grep. It works like the "find in file" (Ctrl-F or Command-F) search in Microsoft Word, or the search box for your computer's files and folders. The general syntax for grep is:
grep <pattern> <file(s) to search>
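As a concrete instance of that syntax, here is a minimal sketch. The sample file and its three lines are made up for illustration; any text file would do.

```shell
# Create a small scratch file to search (hypothetical contents).
sample=$(mktemp)
printf '%s\n' 'the joy of cooking' 'nothing to see here' 'enjoy the show' > "$sample"

# grep <pattern> <file>: print every line that contains the text "joy".
matches=$(grep 'joy' "$sample")
echo "$matches"

rm -f "$sample"
```

Note that grep prints the whole line for each match, and that 'joy' matches inside a longer word such as "enjoy".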
There is more to grep than plain text matching. Where grep especially shines is in matching complex patterns expressed as regular expressions (commonly shortened to regex). Standard grep is quite feature-rich, so we will highlight just a few key features to get you started.
Adding the color flag (grep --color) will highlight the matched portion of the line in red. The red highlight makes it possible to see exactly which part of the string is matched. Some people find it so helpful that they define a shell alias to make grep expand into grep --color so they never have to be without it. You may want to do the same.
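One way to set up such an alias is sketched below. It assumes a bash-like shell and that the line goes in a startup file such as ~/.bashrc; the file name is an assumption and your setup may use a different one.

```shell
# Make every interactive use of grep behave as if --color had been typed.
alias grep='grep --color'
```

Aliases normally expand only in interactive shells, so scripts that call grep are unaffected.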
Here are some of the core metacharacters that you will often use:
. matches any single character
e.g., 'a..' matches 'abc' and also 'adf'
* matches zero or more repeats of the character to the left of the *
e.g., 'ab*' matches 'abbbbb' and also 'a'
^ matches the beginning of the line
$ matches the end of the line
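The metacharacters above can be tried out on a tiny word list. This is a sketch only: the six words are invented for the example, and the variable names are arbitrary.

```shell
# A tiny hypothetical word list, one word per line.
words=$(printf '%s\n' abc adf ab a abbbbb cab)

# . matches any single character: 'a..' needs an a followed by two more characters.
dot=$(printf '%s\n' "$words" | grep 'a..')

# ^ anchors to the start of the line: only lines beginning with "ab".
starts=$(printf '%s\n' "$words" | grep '^ab')

# $ anchors to the end of the line: only lines ending with "ab".
ends=$(printf '%s\n' "$words" | grep 'ab$')

# ^ and $ together force the whole line to be an a followed by zero or more b's.
whole=$(printf '%s\n' "$words" | grep '^ab*$')

echo "$dot"      # abc, adf, abbbbb
echo "$starts"   # abc, ab, abbbbb
echo "$ends"     # ab, cab
echo "$whole"    # ab, a, abbbbb
```

Without the anchors, a pattern may match anywhere in the line, which is why 'a..' also selects abbbbb.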
Note that the * symbol does what we call greedy matching: it tries to match as many characters as possible. So 'ab*' tries to match as many b characters as possible rather than none at all.
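Greedy matching can be observed with the -o flag, which prints only the matched portion of each line rather than the whole line. A small sketch with made-up input strings:

```shell
# 'ab*' grabs the a and every b that follows it, stopping at the c.
greedy=$(echo 'abbbbbc' | grep -o 'ab*')
echo "$greedy"   # abbbbb

# With no b after the a, 'b*' matches zero characters and the match is just a.
zero=$(echo 'xay' | grep -o 'ab*')
echo "$zero"     # a
```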
However, a naive greedy matching strategy would sometimes miss matches. For example, when using 'ab*b' to match 'abb', if the 'b*' is matched to 'bb', then there is no text left for the final b in the pattern to match. Grep is smart enough to backtrack after this failed attempt: it tries matching 'b*' to the next-longest string, 'b', so that the final b in the pattern can match the final b in the text.
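The outcome of this behavior can be checked directly. The sketch below again relies on the -o flag to print only the matched text; it shows the result, not the internal steps grep takes to reach it.

```shell
# 'b*' cannot keep both b's, or the final b in the pattern would have nothing to match.
full=$(echo 'abb' | grep -o 'ab*b')
echo "$full"      # abb

# With a single b in the text, 'b*' must match zero characters.
minimal=$(echo 'ab' | grep -o 'ab*b')
echo "$minimal"   # ab
```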
A dictionary word list is available on myth in the file /usr/share/dict/words. This file is a good one to grep for practice, e.g., try grep joy /usr/share/dict/words or grep 'b.b' /usr/share/dict/words and see which words match. Here are some suggested exercises for practice in forming regular expressions:
- match all words that end with zy
- match all words that start with k and end with k
- match all words that are exactly 7 letters long
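As a warm-up before attempting these, here is a sketch of two related patterns run on a tiny invented word list. They are deliberately not the answers to the exercises, but they combine the same metacharacters; on myth the same patterns could be pointed at /usr/share/dict/words instead.

```shell
# A tiny hypothetical word list, one word per line.
list=$(printf '%s\n' sky stay yes toy sy)

# Starts with s and ends with y: '.*' stands for any run of characters in between.
s_to_y=$(printf '%s\n' "$list" | grep '^s.*y$')

# Exactly three characters long: three dots pinned between ^ and $.
three=$(printf '%s\n' "$list" | grep '^...$')

echo "$s_to_y"   # sky, stay, sy
echo "$three"    # sky, yes, toy
```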
Note that certain punctuation characters such as * and $ have special meaning to the shell and may be transformed before the arguments are passed along to the program. It is best to get into the habit of enclosing the pattern argument in single quotes when invoking grep to ensure the pattern is received as intended.
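To see what the shell can do to an unquoted pattern, the sketch below uses printf in place of grep so the argument that would be received is printed as-is. The scratch directory and the file name abacus.txt are made up for the demonstration.

```shell
# Work in a scratch directory containing one file whose name starts with "ab".
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch abacus.txt

# Unquoted: the shell expands ab* to matching file names before the program runs.
unquoted=$(printf '%s\n' ab*)

# Single-quoted: the pattern is passed through untouched.
quoted=$(printf '%s\n' 'ab*')

echo "unquoted: $unquoted"   # unquoted: abacus.txt
echo "quoted:   $quoted"     # quoted:   ab*

# Clean up the scratch directory.
cd - > /dev/null || exit 1
rm -rf "$scratch"
```

Had this been grep, the unquoted form would have searched for the text abacus.txt rather than for the pattern 'ab*'.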