Grep (short for Global Regular Expression Print) is a command used extensively as a text search tool. It scans one or more files for a specified pattern, which can be a plain text string or a regular expression, and prints each line that contains a match. Its syntax is as follows:
$ grep [options] pattern [files]
The following table shows common uses of the grep command:

| Command | Usage |
| --- | --- |
| grep 'student' /etc/passwd | Search for the string student in the file /etc/passwd and print all matching lines |
| grep -v 'student' /etc/passwd | Print all lines that do not contain the string student |
| grep -i 'STUDENT' /etc/passwd | Search for the string STUDENT in a case-insensitive manner and print all matching lines (-i ignores case) |
| grep -c 'student' /etc/passwd | Print the number of lines in the /etc/passwd file that contain the string student |
| grep -rl 'student' /etc/ | Search the directory recursively and print the names of the files that contain the string student |
| grep -rL 'student' /etc/ | Search the directory recursively and print the names of the files that do not contain the string student |
| grep -n 'student' /etc/passwd | Print each line containing the pattern student, prefixed with its line number |
| grep -A1 'student' /etc/passwd | Print each matching line plus one line after it |
| grep -B1 'student' /etc/passwd | Print each matching line plus one line before it |
| grep -C1 'student' /etc/passwd | Print each matching line plus one line before and one line after it |
| grep -a 'dir' /bin/mkdir | Treat the binary file /bin/mkdir as text and print the lines containing the string dir |
| grep 'root' /etc/passwd | Print the lines containing the string root anywhere on the line |
| grep '^root' /etc/passwd | Print the lines that begin with the string root |
| grep 'bash$' /etc/passwd | Print the lines that end with the string bash |
| grep '^$' <filename> | Print the empty lines from the file |
| grep -v '^$' <filename> | Print only the non-empty lines from the file |
| grep '[br]oot' /etc/passwd | Print the lines in the /etc/passwd file that contain boot or root (the character b or r, followed by the string oot) anywhere on the line |
| who \| grep 'student' | Print the lines containing the string student, reading input from stdin |
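The context options (-A, -B, and -C) are easier to follow on a small input. The following is a minimal sketch that assumes a hypothetical four-line sample file standing in for /etc/passwd; the user names in it are illustrative only:

```shell
# Hypothetical four-line sample standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
guest:x:1001:1001::/home/guest:/bin/sh
EOF

# -A1: the matching line plus one line after it.
after=$(grep -A1 '^student' "$sample")
# -B1: the matching line plus one line before it.
before=$(grep -B1 '^student' "$sample")
# -C1: the matching line plus one line on each side.
around=$(grep -C1 '^student' "$sample")

echo "--- -A1 ---"; echo "$after"
echo "--- -B1 ---"; echo "$before"
echo "--- -C1 ---"; echo "$around"

rm -f "$sample"
```

With a single match in the middle of the file, -A1 and -B1 each print two lines and -C1 prints three.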
An example of matching a string in a file using grep is shown in the following screenshot:
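As a rough text equivalent, the sketch below runs the same kind of search against a hypothetical sample file (not the real /etc/passwd) and also shows how -i changes the result:

```shell
# Hypothetical sample standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
student:x:1000:1000::/home/student:/bin/bash
guest:x:1001:1001::/home/guest:/bin/sh
EOF

# Case-sensitive search: matches the student line.
exact=$(grep 'student' "$sample")

# Upper-case pattern without -i: no match, so grep exits with status 1.
upper=$(grep 'STUDENT' "$sample" || true)

# The same pattern with -i ignores case and matches again.
nocase=$(grep -i 'STUDENT' "$sample")

echo "exact:  $exact"
echo "upper:  ${upper:-<no match>}"
echo "nocase: $nocase"

rm -f "$sample"
```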
An example of printing those lines that do not contain the specified string using grep is shown in the following screenshot (some output stripped):
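The inverted match can be sketched as follows, again on a hypothetical sample file. The second command chains two grep calls through a pipe so that empty lines are dropped as well:

```shell
# Hypothetical sample with one empty line in it.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash

student:x:1000:1000::/home/student:/bin/bash
guest:x:1001:1001::/home/guest:/bin/sh
EOF

# Every line that does NOT contain "student" (the empty line is kept).
others=$(grep -v 'student' "$sample")

# Same, then drop empty lines by reading the first grep's output on stdin.
others_nonempty=$(grep -v 'student' "$sample" | grep -v '^$')

echo "$others_nonempty"

rm -f "$sample"
```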
The grep command can be used with the -c option to count the lines that match a specified pattern. The following example counts the CPUs in a system. On x86 Linux systems, /proc/cpuinfo typically contains one model name line per logical CPU, so the result may be higher than the number of physical cores:
$ grep -c name /proc/cpuinfo
The following screenshot shows how to use the grep command to count the lines that contain the string root in the /etc/passwd file:
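Note that -c counts matching lines, not individual occurrences. The sketch below contrasts the two on a hypothetical sample file; it relies on the -o option, which is available in GNU and BSD grep but is not part of POSIX:

```shell
# Hypothetical sample: "root" appears several times on some lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
EOF

# Number of LINES that contain "root".
lines=$(grep -c 'root' "$sample")

# Number of OCCURRENCES: -o prints each match on its own line, wc -l counts them.
hits=$(grep -o 'root' "$sample" | wc -l | tr -d ' ')

echo "matching lines: $lines, total occurrences: $hits"

rm -f "$sample"
```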
An example of printing the line number, along with the matching lines, using grep is shown in the following screenshot:
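A minimal sketch of the -n output format, assuming a hypothetical sample file. Each result is prefixed with the line number and a colon, which makes the number easy to extract in a script:

```shell
# Hypothetical sample standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
EOF

# -n prefixes each matching line with "<line number>:".
numbered=$(grep -n '^student' "$sample")

# Strip everything from the first colon onward to keep only the number.
lineno=${numbered%%:*}

echo "$numbered"
echo "match found on line $lineno"

rm -f "$sample"
```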
An example of printing the lines that begin with a specified string is shown in the following screenshot:
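The effect of the ^ and $ anchors can be sketched by counting matches with and without them. The sample file below is hypothetical:

```shell
# Hypothetical sample: "root" appears at the start of one line
# and in the middle of another.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
guest:x:1001:1001::/home/guest:/bin/sh
EOF

anywhere=$(grep -c 'root' "$sample")    # "root" anywhere on the line
at_start=$(grep -c '^root' "$sample")   # "root" only at the start of the line
bash_end=$(grep -c 'bash$' "$sample")   # "bash" only at the end of the line

echo "anywhere: $anywhere, at start: $at_start, ending in bash: $bash_end"

rm -f "$sample"
```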
Text extraction using sed and awk
It is often necessary to extract the same text repeatedly from a file. For such operations, where we need to edit a file at the same place or extract the same text from multiple files, we use sed and awk. There are many text extraction utilities; however, sed and awk use few system resources, execute quickly, and are simple to use.
sed
This is one of the oldest and most popular Unix text processing tools. It is a non-interactive stream editor. It is typically used for filtering text, as well as performing text substitution and the non-interactive editing of text files. There are two main ways of invoking the sed command, as follows:
- sed -e command <filename>: Specify editing commands at the command line, operate on the specified file, and display the output on the Terminal. The -e option can be repeated to specify multiple editing commands in a single invocation.
- sed -f scriptfile <filename>: Specify a script file containing sed commands to operate on the specified file and display the output on the Terminal.
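The two invocation styles can be compared side by side. The following sketch assumes a hypothetical two-line input file and a hypothetical script file; both forms should produce the same output:

```shell
# Hypothetical input file and sed script file.
sample=$(mktemp)
script=$(mktemp)
printf 'alpha beta\ngamma beta\n' > "$sample"

# Style 1: several editing commands on the command line, one -e each.
out_e=$(sed -e 's/alpha/ALPHA/' -e 's/beta/BETA/' "$sample")

# Style 2: the same commands stored one per line in a script file.
printf 's/alpha/ALPHA/\ns/beta/BETA/\n' > "$script"
out_f=$(sed -f "$script" "$sample")

echo "$out_e"
echo "$out_f"

rm -f "$sample" "$script"
```

A script file is usually the better choice once the list of commands grows, since it can be reviewed and reused.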
Now, we discuss the most popular operations performed using sed, for example, substitution. The following table lists the basic syntax for substitution operations:

| Command | Usage |
| --- | --- |
| sed 's/original_string/new_string/' file | Substitute the first occurrence of the original string in each line with a new string |
| sed 's/original_string/new_string/g' file | Substitute all occurrences of the original string in each line with a new string |
| sed '1,3s/original_string/new_string/g' file | Substitute all occurrences of the original string with a new string, but only on lines one to three |
| sed -i 's/original_string/new_string/g' file | Substitute all occurrences of the original string in each line with a new string, and save the changes in the same file |
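The difference between these forms shows up clearly on a small input. This is a minimal sketch using a hypothetical four-line file:

```shell
# Hypothetical input: the word "one" appears several times per line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
one one one
one two one
one
two one
EOF

# Without a flag, only the first occurrence on each line is replaced.
first=$(sed 's/one/1/' "$sample")

# With the g flag, every occurrence on each line is replaced.
all=$(sed 's/one/1/g' "$sample")

# With a 1,3 range, line 4 is left untouched.
ranged=$(sed '1,3s/one/1/g' "$sample")

echo "--- first only ---"; echo "$first"
echo "--- global ---";     echo "$all"
echo "--- lines 1-3 ---";  echo "$ranged"

rm -f "$sample"
```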
Using the sed utility with the print command:
The p command prints the matching lines, and the -n option suppresses the default output so that only the matching lines are displayed, as shown in the following example:
$ sed -n '1,3p' /etc/passwd
$ sed -n '/^root/p' /etc/passwd
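To see why -n matters, the sketch below runs the p command with and without it on a hypothetical three-line file. Without -n, sed prints every line once by default and the selected lines a second time:

```shell
# Hypothetical sample standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
EOF

# With -n: only lines 1-2 are printed.
with_n=$(sed -n '1,2p' "$sample")

# Without -n: all 3 lines are printed, plus lines 1-2 again.
without_n=$(sed '1,2p' "$sample")

# A pattern can be used as the address instead of a line range.
by_pattern=$(sed -n '/^root/p' "$sample")

echo "with -n:    $(printf '%s\n' "$with_n" | wc -l | tr -d ' ') lines"
echo "without -n: $(printf '%s\n' "$without_n" | wc -l | tr -d ' ') lines"
echo "$by_pattern"

rm -f "$sample"
```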
Using the sed utility with the substitute command:
The s command replaces the matching string with a new string. The s command can be prefixed with an address, either a line range or a pattern, to restrict the replacement to specific lines. The following example replaces bash with sh only on lines that begin with student:
$ sed '/^student/s/bash/sh/' /etc/passwd
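The following sketch applies the same kind of restricted substitution to a hypothetical two-line sample, once with a pattern in front of s and once with a line number, to show which lines are left alone:

```shell
# Hypothetical sample: both users have /bin/bash as their shell.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
student:x:1000:1000::/home/student:/bin/bash
EOF

# Pattern in front of s: only lines starting with "student" are edited.
by_pattern=$(sed '/^student/s/bash/sh/' "$sample")

# Line number in front of s: only line 1 is edited.
by_line=$(sed '1s/bash/sh/' "$sample")

echo "--- /^student/ ---"; echo "$by_pattern"
echo "--- line 1 ---";     echo "$by_line"

rm -f "$sample"
```

Since s without the g flag replaces the first bash found on the line, a stricter expression such as s/bash$/sh/ may be safer when the string could also appear earlier in the line.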
Using the sed utility with the delete command:
In the following example, the sed d command deletes the empty and commented lines from ntp.conf, and the -i.backup option saves a backup copy of the original file as ntp.conf.backup:
$ sed -i.backup '/^#/d;/^$/d' /etc/ntp.conf
Note:
Use the -i option with caution, because changes made inside the file are not reversible unless a backup suffix is given. It is safer to first run sed without the -i option and redirect the output to a new file.
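One cautious workflow is to preview the result in a separate file and only then edit in place with a backup suffix. This is a sketch on a throwaway file, not on the real /etc/ntp.conf; the server names in it are hypothetical:

```shell
# Throwaway config file with comments and an empty line (hypothetical content).
conf=$(mktemp)
printf '%s\n' '# time servers' '' 'server 0.pool.example.org' \
              '# fallback' 'server 1.pool.example.org' > "$conf"

# Step 1: preview. The original file is not modified.
preview="$conf.new"
sed '/^#/d;/^$/d' "$conf" > "$preview"
cat "$preview"

# Step 2: edit in place, keeping the original as <file>.backup.
sed -i.backup '/^#/d;/^$/d' "$conf"

kept=$(wc -l < "$conf" | tr -d ' ')
saved=$(wc -l < "$conf.backup" | tr -d ' ')
echo "lines kept: $kept, lines in backup: $saved"

rm -f "$conf" "$preview" "$conf.backup"
```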
awk
The awk command is used to extract data from a file and print specific contents. It is often used to restructure data and construct reports. Its name is derived from the last names of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. Its main features include the following:
- It is an interpreted programming language similar to C
- It is used for data manipulation in files, and for retrieving and processing text from files
- It views files as records and fields
- It has arithmetic and string operators
- It has variables, conditional statements, and loops
- It reads from a file or from a standard input device and outputs to a standard output device such as a Terminal
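Several of these features can be seen together in one short program. The sketch below assumes a hypothetical whitespace-separated score file; the names and numbers are made up for illustration:

```shell
# Hypothetical input: a name followed by two numeric scores per record.
sample=$(mktemp)
cat > "$sample" <<'EOF'
alice 30 40
bob 10 5
carol 25 25
EOF

report=$(awk '
{
    total = 0
    for (i = 2; i <= NF; i++)    # loop over the numeric fields of the record
        total += $i              # arithmetic on field values
    if (total >= 50)             # conditional statement
        status = "high"
    else
        status = "low"
    print $1, total, status
    grand += total               # variable that persists across records
}
END { print "records:", NR, "grand total:", grand }
' "$sample")

echo "$report"

rm -f "$sample"
```

Here each input line is a record, NF is the number of fields in it, and NR is the number of records read so far.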
Its general invoking syntax is as follows:
$ awk '/pattern/{command}' <filename>
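Either part of the pattern-action pair can be omitted. The following sketch, run against a hypothetical sample file, shows the three combinations:

```shell
# Hypothetical sample standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
guest:x:1001:1001::/home/guest:/bin/sh
EOF

# Pattern only: the default action is to print the matching line.
pattern_only=$(awk '/bash$/' "$sample")

# Action only: the action runs for every line.
action_only=$(awk '{ print NR ": " $0 }' "$sample")

# Pattern and action: the action runs only for matching lines.
both=$(awk '/bash$/ { print NR }' "$sample")

echo "$pattern_only"
echo "$action_only"
echo "$both"

rm -f "$sample"
```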
Printing a selected column or row from a file is the basic task generally performed using awk.
In the following example, the awk command is used to print the contents of a file line by line until the end of the file is reached:
$ awk '{ print $0}' /etc/passwd
In the following example, the awk command is used to print the first field (column) of each line containing the string student. Here, the -F option is used to set the field separator to a colon (:):
$ awk -F: '/student/{ print $1}' /etc/passwd
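The effect of -F is easiest to see by comparing it with the default separator, which is whitespace. This sketch uses a single hypothetical passwd-style line whose comment field contains a space:

```shell
# Hypothetical passwd-style line; the fifth field contains a space.
sample=$(mktemp)
cat > "$sample" <<'EOF'
student:x:1000:1000:Student User:/home/student:/bin/bash
EOF

# With -F: the line splits into seven colon-separated fields.
nfields=$(awk -F: '{ print NF }' "$sample")
first=$(awk -F: '{ print $1 }' "$sample")
last=$(awk -F: '{ print $NF }' "$sample")     # $NF is the last field

# Without -F the line splits on whitespace, so $1 runs up to the first space.
default_first=$(awk '{ print $1 }' "$sample")

echo "fields: $nfields, first: $first, last: $last"
echo "first field with default separator: $default_first"

rm -f "$sample"
```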
In the following example, the awk command is used to print selected fields from each line in the /etc/passwd file that matches the pattern:
$ awk -F: '/student/{print "Username :", $1, "Shell :", $7}' /etc/passwd
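For reports, selected fields are often combined with a header and aligned columns. The sketch below is one possible variation, not the only approach: it uses printf for alignment and a numeric condition on the third field (the user ID) instead of a text pattern. The sample data and the 1000 threshold are assumptions for illustration:

```shell
# Hypothetical sample standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
student:x:1000:1000::/home/student:/bin/bash
guest:x:1001:1001::/home/guest:/bin/sh
EOF

report=$(awk -F: '
BEGIN      { printf "%-10s %s\n", "USER", "SHELL" }   # header, printed once
$3 >= 1000 { printf "%-10s %s\n", $1, $7 }            # rows with UID >= 1000
' "$sample")

echo "$report"

rm -f "$sample"
```

The BEGIN block runs before any input is read, which makes it a natural place for a header line.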