Friday, March 19, 2010

Shell Script: How to Use the grep Utility

The grep command selects and prints lines from a file (or a bunch of files) that match a pattern. Let's say your friend Bill sent you an email recently with his phone number, and you want to call him ASAP to order some books. Instead of launching your email program and sifting through all the messages, you can use grep to scan your in-box file for lines that mention Bill.
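
A minimal sketch of that search follows. The mailbox contents are made up for illustration, and a temporary file stands in for the real in-box, whose location depends on your system and mail program (mktemp is assumed to be available).

```shell
# Build a small stand-in for a mailbox file (hypothetical contents).
inbox=$(mktemp)
cat > "$inbox" <<'EOF'
From: alice@example.com
Subject: Lunch on Tuesday?
From: bill@example.com
Subject: new number
Hi, it's Bill. My new number is 555-0100.
EOF

# Print every line containing the string Bill (the match is case-sensitive,
# so the lowercase address line is not selected).
grep 'Bill' "$inbox"
```

With this sample file, only the last line is printed, because it is the only one containing Bill with a capital B.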

The most useful grep flags are shown here:

-i Ignore case (treat uppercase and lowercase as the same) when comparing.
-v Print only lines that do not match the pattern.
-c Print only a count of the matching lines.
-n Display the line number before each matching line.
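
The -n flag is not used in the examples further down, so here is a small sketch of it. It works on a temporary copy of the four-line poem.txt sample that appears later in this post; the variable name poem is only illustrative.

```shell
# Recreate the four-line sample poem in a temporary file.
poem=$(mktemp)
cat > "$poem" <<'EOF'
Mary had a little lamb
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich
EOF

# -n prefixes each matching line with its line number and a colon.
grep -n 'lamb' "$poem"
```

For this file the command prints lines 1 and 4, each prefixed with its number (1: and 4:).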

When grep performs its pattern matching, it treats the pattern you provide as a regular expression. Regular expressions can be very simple or quite complex, so we won't go into much detail here. Here are the most common types of regular expressions:

abc Match lines containing the string "abc" anywhere.
^abc Match lines starting with "abc".
abc$ Match lines ending with "abc".
a..c Match lines containing "a" and "c" separated by exactly two characters (each dot matches any single character).
a.*c Match lines containing "a" followed, anywhere later on the line, by "c" (the dot-asterisk matches zero or more of any character).
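
The difference between a..c and a.*c is easy to miss, so here is a short sketch that runs both against a handful of made-up test strings. The file contents and the variable name samples are assumptions chosen for illustration.

```shell
# Five made-up test lines, one per line.
samples=$(mktemp)
printf '%s\n' 'ac' 'abc' 'a12c' 'a12345c' 'cab' > "$samples"

# a..c needs exactly two characters between "a" and "c": only a12c matches.
grep 'a..c' "$samples"

# a.*c allows any number of characters, including none, between "a" and a
# later "c": ac, abc, a12c and a12345c match; cab does not.
grep 'a.*c' "$samples"
```

The line cab is rejected by both patterns because no "c" follows its "a".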


Regular expressions also come into play when using vi, sed, awk, and other Unix commands. If you want to master Unix, take time to understand regular expressions. Here is a sample poem.txt file and some grep commands to demonstrate regular-expression pattern matching:

Mary had a little lamb
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich

To print all lines containing spam (respecting uppercase and lowercase), enter

grep 'spam' poem.txt
Mary fried a lot of spam
Jill had a lamb spamwich

To print all lines containing spam (ignoring uppercase and lowercase), enter

grep -i 'spam' poem.txt
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich

To print just the number of lines containing spam (ignoring uppercase and lowercase, and counting lines where spam is part of a longer word such as spamwich), enter

grep -ic 'spam' poem.txt
3

To print all lines not containing spam (ignoring uppercase and lowercase), enter

grep -i -v 'spam' poem.txt
Mary had a little lamb

To print all lines starting with Mary, enter

grep '^Mary' poem.txt
Mary had a little lamb
Mary fried a lot of spam

To print all lines ending with ich, enter

grep 'ich$' poem.txt
Jack ate a Spam sandwich
Jill had a lamb spamwich

To print all lines containing had followed by lamb, enter

grep 'had.*lamb' poem.txt
Mary had a little lamb
Jill had a lamb spamwich
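
As noted earlier, the same regular expressions carry over to other tools. The sketch below reuses two of the patterns above with sed and awk, working on a temporary copy of the poem; the variable name poem is only illustrative.

```shell
# Recreate the four-line sample poem in a temporary file.
poem=$(mktemp)
cat > "$poem" <<'EOF'
Mary had a little lamb
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich
EOF

# sed: -n turns off automatic printing; /regex/p prints only matching lines.
sed -n '/^Mary/p' "$poem"

# awk: a bare /regex/ pattern prints every line that matches it.
awk '/ich$/' "$poem"
```

The sed command prints the two lines starting with Mary, and the awk command prints the two lines ending with ich, matching the grep results shown above.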
