Unix Reference

grep (search files for text)

A useful tool for searching through files is grep. It works like the "find in file" search (Ctrl-F or Command-F) in Microsoft Word, or like the search box for your computer's files and folders. The general syntax for grep is

    grep <pattern> <file(s) to search> 
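
As a small illustration of this syntax, the sketch below builds a throwaway file and searches it for a fixed string. The variable names and the file contents are made up for the example.

```shell
# Create a small sample file to search (contents are made up for illustration).
sample=$(mktemp)
printf 'enjoy\njoyful\nsorrow\n' > "$sample"

# grep <pattern> <file>: print every line of the file that contains "joy".
matches=$(grep 'joy' "$sample")
printf '%s\n' "$matches"

# Clean up the temporary file.
rm -f "$sample"
```

This should print the lines enjoy and joyful, since grep reports whole lines that contain the pattern anywhere within them.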

There is more to grep than plain text matching. Where grep especially shines is in matching complex patterns expressed as regular expressions (commonly shortened to regex). Standard grep is quite feature-rich, so we will highlight just a few key features to get you started.

Adding the color flag (grep --color) will highlight the matched portion of the line in red. The red highlight makes it possible to see exactly which part of the string is matched. Some people find it so helpful that they define a shell alias to make grep expand into grep --color so they never have to be without it. You may want to do the same.
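
One possible way to set this up is sketched below. The --color=always form is an assumption about your grep (GNU and BSD grep both accept it); it forces highlighting even when the output is captured rather than shown on a terminal, which makes the effect observable in a script.

```shell
# Force highlighting so the color escape codes appear in the captured text.
# Plain --color normally highlights only when writing to a terminal.
highlighted=$(printf 'enjoy\n' | grep --color=always 'joy')
printf '%s\n' "$highlighted"

# Define the alias described above. To make it permanent, put this line in
# your shell startup file (for example ~/.bashrc if you use bash).
alias grep='grep --color'
```

On a terminal, the printed line should show joy highlighted within enjoy.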

Here are some of the core metacharacters that you will often use:

 .  matches any single character
    e.g., 'a..' matches 'abc' and also 'adf'

 *  matches zero or more repeats of the character to the left of *
    e.g., 'ab*' matches 'abbbbb' and also 'a'

 ^  matches the beginning of the line

 $  matches the end of the line
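
The four metacharacters above can be tried out on a tiny word list. This is only a sketch: the words and variable names are made up for illustration, and the input is piped in rather than read from a file.

```shell
# A tiny made-up word list, one word per line.
words=$(printf 'a\nabc\nadf\nabbbbb\ncab\nbob\n')

# '.' matches any single character: 'a..' needs an a followed by two characters.
dot=$(printf '%s\n' "$words" | grep 'a..')              # abc adf abbbbb

# '^' and '$' anchor the pattern to the start and the end of the line.
starts_with_a=$(printf '%s\n' "$words" | grep '^a')     # a abc adf abbbbb
ends_with_b=$(printf '%s\n' "$words" | grep 'b$')       # abbbbb cab bob

# '*' repeats the character to its left zero or more times:
# '^ab*$' matches lines consisting of an a followed only by b's.
a_then_bs=$(printf '%s\n' "$words" | grep '^ab*$')      # a abbbbb

printf '%s\n' "$dot" "$starts_with_a" "$ends_with_b" "$a_then_bs"
```

Note that without the anchors a pattern may match anywhere in the line, which is why 'a..' also accepts abbbbb.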

Note that the * symbol does what we call greedy matching: it tries to match as many characters as possible. Thus ab* tries to match as many bs as possible rather than matching none at all.

However, a naive greedy matching strategy would sometimes miss matches. For example, when using ab*b to match abb, if b* is matched to bb, there is no text left for the final b to match. grep is smart enough to backtrack after this failed attempt: it tries matching b* to the next-longest string, b, so that the final b in the pattern can match the final b in the text.
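
The behavior described above can be observed with the -o flag, which prints only the matched portion of each line. The flag is not covered elsewhere in this section, and the variable names are made up. The sketch shows only the visible result; how a particular grep arrives at it internally may differ.

```shell
# -o prints only the part of the line that matched, not the whole line.

# Greedy: 'ab*' takes every available b.
greedy=$(printf 'abbb\n' | grep -o 'ab*')        # abbb

# Zero repeats are allowed too, so a lone a still matches.
zero=$(printf 'a\n' | grep -o 'ab*')             # a

# 'ab*b' on abb: b* cannot keep both b's, because the final b in the
# pattern needs one. The match still succeeds and covers all of abb.
backtrack=$(printf 'abb\n' | grep -o 'ab*b')     # abb

printf '%s\n' "$greedy" "$zero" "$backtrack"
```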

A dictionary word list is available on myth in the file /usr/share/dict/words. This file is a good one to grep for practice, e.g. try grep joy /usr/share/dict/words or grep 'b.b' /usr/share/dict/words and see which words matched. Here are some suggested exercises to use as practice in forming regular expressions:

Note that certain punctuation characters such as * and $ have special meaning to the shell, which may transform them before the arguments are passed along to the program. It is best to get into the habit of enclosing the pattern argument in single quotes when invoking grep, to ensure the pattern is received as intended.
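
To see what the shell does to an unquoted pattern, the sketch below uses echo as a stand-in for grep, so that the argument the program would receive is printed directly. The scratch directory and the file name about.txt are made up for the example.

```shell
# Make a scratch directory holding one file whose name fits the glob ab*.
scratch=$(mktemp -d)
touch "$scratch/about.txt"

# Unquoted, the shell replaces ab* with the matching file name before the
# program runs; single-quoted, the pattern arrives untouched.
unquoted=$(cd "$scratch" && echo ab*)      # about.txt
quoted=$(cd "$scratch" && echo 'ab*')      # ab*

printf '%s\n' "$unquoted" "$quoted"

# Clean up the scratch directory.
rm -rf "$scratch"
```

Had the command been grep rather than echo, the unquoted form would have searched for the text about.txt instead of the intended pattern.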
