How to Use The awk Command in Linux [With Examples]

This guide shows you how to use the awk command in Linux, with plenty of useful everyday examples.

awk is both a command-line tool and a small programming language for searching and manipulating text. It is available on Linux and other Unix-like operating systems.

The awk command and the associated scripting language search files for text defined by a pattern and perform a specific action on the text which matches the pattern.

awk is a useful tool for extracting data and building reports from large text files or large numbers of text files – for example processing logs, or the output of data-recording devices like temperature probes, which have collected a lot of data over a period of time. It can also be used on the output from database queries.

There’s no need to install awk; it should already be available on your Linux System.

awk Syntax

The syntax for using the awk command in the terminal is as follows:

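awk [OPTIONS] 'PROGRAM' [INPUT FILES]

Note that:

  • PROGRAM is the search pattern and the actions to take – it’s the program you want awk to run on the supplied files
    • Wrap it in single quotes so that the shell does not interpret characters such as $ before awk sees them
    • It can also be supplied as a text file rather than inline by using the -f option
  • [INPUT FILES] are the files you wish awk to work on – one or more file names separated by spaces, or a shell wildcard pattern such as *.txt, which the shell expands into a list of files before awk runs
    • awk does not search directories itself, so pass it files rather than the path to a directory
    • If no input files are specified, awk reads from standard input, which is usually the piped output from another command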
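As a rough sketch of these invocation forms, the commands below run the same one-line program against a single file, against two files, and against piped input. The scratch directory and the second file, colours.txt, are assumptions made purely for illustration.

```shell
# Work in a throwaway directory so nothing existing is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'red rose' 'blue iris' > flowers.txt
printf '%s\n' 'green' 'orange' > colours.txt   # hypothetical second file

# One input file: print the first field of every line
from_file=$(awk '{ print $1 }' flowers.txt)

# Several input files, separated by spaces, are read one after another
from_two=$(awk '{ print $1 }' flowers.txt colours.txt)

# No input file: awk reads standard input, here fed by a pipe
from_pipe=$(cat flowers.txt | awk '{ print $1 }')

echo "$from_file"
echo "$from_two"
echo "$from_pipe"
```

The piped form produces the same result as naming the file directly, which is what makes awk easy to drop into the middle of a pipeline.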

awk Options

The following options can be supplied to the awk command:

-f program-file Program text is read from file instead of from the command-line. Multiple -f options are accepted.
-F value Sets the field separator, FS, to value.
-v var=value Assigns value to program variable var.
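Here is a minimal sketch of the three options in use. The colon-separated file stock.txt, the program file colour.awk and the variable name min are all hypothetical names chosen for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Hypothetical data: name:colour:quantity
printf '%s\n' 'rose:red:3' 'iris:blue:5' > stock.txt

# -F sets the field separator to a colon
names=$(awk -F: '{ print $1 }' stock.txt)

# -v assigns a value to the program variable "min" before the program runs
plenty=$(awk -F: -v min=4 '$3 >= min { print $1 }' stock.txt)

# -f reads the program text from a file instead of the command line
printf '%s\n' '{ print $2 }' > colour.awk
colours=$(awk -F: -f colour.awk stock.txt)

echo "$names"
echo "$plenty"
echo "$colours"
```

With this data, the second command prints only iris, because it is the only record whose third field is at least 4.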

For more implementation-specific options for your version of awk, you can check the manual by running:

man awk

Program Actions & Variables

The program you supply to awk will determine what it does to the text files you supply to it. An awk program takes the following format:

CONDITION { ACTION }
CONDITION { ACTION }
...

Where CONDITION is the pattern of text to match and ACTION is the action to take on the matched text. You can have as many conditions and actions as you please.
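To illustrate the format, here is a small sketch of a program with two condition and action pairs, run against a copy of the flowers.txt sample file used later in this guide. Every record is tested against each condition in turn, so one record can trigger more than one action. The output labels are made up for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'red rose' 'yellow daffodil' 'pink flamingo' 'white rose' \
    'blue iris' 'white lily' 'red peony' 'yellow orchid' 'purple foxglove' > flowers.txt

report=$(awk '
    /rose/        { print "rose found: " $0 }    # runs for records containing rose
    $1 == "white" { print "white flower: " $2 }  # runs when the first field is white
' flowers.txt)

echo "$report"
```

The record white rose satisfies both conditions, so it produces two lines of output. Either part of a pair can also be left out: a condition on its own prints the matching record, and an action on its own runs for every record.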

Actions

The actions supplied are commands that can include calculations, variables, and calling functions. Some built-in functions are implementation-specific, so it’s best to check your manual for these.
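The sketch below shows an action that combines a calculation, user-defined variables and a built-in function. The file prices.txt and its layout (name, quantity, unit price) are assumptions for illustration. toupper() and printf are part of standard awk, so they should behave the same across implementations.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Hypothetical data: name, quantity, unit price
printf '%s\n' 'rose 3 1.50' 'iris 5 2.00' 'lily 2 4.25' > prices.txt

totals=$(awk '{
    cost = $2 * $3                          # calculation stored in a variable
    sum += cost                             # running total across all records
    printf "%s %.2f\n", toupper($1), cost   # built-in function plus formatted output
}
END { printf "TOTAL %.2f\n", sum }' prices.txt)

echo "$totals"
```

Variables such as cost and sum do not need to be declared. awk creates them on first use, starting as empty or zero.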

Records

awk treats each line in a text file as a record unless you change the record separator variable, RS.

Fields

awk will use whitespace (spaces, tabs) to split a record into fields unless you set a different field separator with the -F option or the FS variable.
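As a sketch of changing both separators, the example below reads a hypothetical file, inline.txt, in which records are separated by semicolons and fields by commas. The file name and its contents are assumptions for illustration only.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Hypothetical data on a single line: records end with ; and fields split on ,
printf 'rose,red;iris,blue;lily,white' > inline.txt

# -F, sets the field separator; -v RS=";" sets the record separator
out=$(awk -F, -v RS=';' '{ print NR ": " $1 " is " $2 }' inline.txt)

echo "$out"
```

Although the file contains no line breaks at all, awk still sees three records of two fields each.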

Variables

awk has many built-in variables that you can use without having to define them yourself, which cover some common scenarios:

Variable Meaning
$0 Represents the entire record
$1, $2, $3 … Field variables – hold the text/values for the individual text fields in a record
NR / Number of Records Current count of the number of input records read so far from all files
FNR / File Number of Records Current count of the number of input records read so far in the current file – Automatically reset to zero each time a new file is started
NF / Number of Fields Number of fields in the current input record – The last field in a record can be referenced using $NF, the 2nd to last field using $(NF-1) and so on
FILENAME Name of the current input file
FS / Field Separator The character(s) used to separate the fields in a record. By default includes any space and tab characters
RS / Record Separator The character(s) used to separate the records in a file. New line by default
OFS / Output Field Separator Character(s) used to separate fields in Awk output. The default is a single space
ORS / Output Record Separator Character(s) used to separate records in Awk output. The default is a new line
OFMT / Output ForMaT Format for numeric output – Default format is “%.6g”
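To see several of these variables side by side, the sketch below prints FILENAME, NR, FNR, NF and $NF for every record in two small files. The file names a.txt and b.txt are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'red rose' 'blue iris' > a.txt
printf '%s\n' 'white lily of the valley' > b.txt

# Commas in print join the values with OFS, a single space by default
out=$(awk '{ print FILENAME, NR, FNR, NF, $NF }' a.txt b.txt)

echo "$out"
```

Note how NR keeps counting across both files while FNR starts again at 1 for the first record of b.txt.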

awk Usage Examples

For these examples, we will work on a single text file called flowers.txt, which contains the following text:

red rose
yellow daffodil
pink flamingo
white rose
blue iris
white lily
red peony
yellow orchid
purple foxglove

Print File Contents

The following awk command will output the contents of a file to the terminal using the awk print statement:

awk '{print}' flowers.txt

Print Number of Records (Lines) in File

awk 'END { print NR }' flowers.txt

This example will output the number of lines in the file:

9

Search for Text in File Using Regular Expressions

The following command will output the lines in a file describing only types of rose:

awk '/rose/' flowers.txt

Note that the text to search for is written as a regular expression (regex) between forward slashes.

This command will output:

red rose
white rose

More Regex

awk '/^p/' flowers.txt

This command will only output records starting with p:

pink flamingo
purple foxglove
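A couple of further patterns, offered as a sketch against a copy of the same sample file: $ anchors a match to the end of the record, and | inside parentheses matches either alternative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'red rose' 'yellow daffodil' 'pink flamingo' 'white rose' \
    'blue iris' 'white lily' 'red peony' 'yellow orchid' 'purple foxglove' > flowers.txt

# Records ending with the letter y
ends_y=$(awk '/y$/' flowers.txt)

# Records starting with either red or blue followed by a space
red_or_blue=$(awk '/^(red|blue) /' flowers.txt)

echo "$ends_y"
echo "$red_or_blue"
```

The first command matches white lily and red peony. The second matches red rose, blue iris and red peony.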

Using Field Variables

By using field variables, you can output only the first field for records starting with p:

awk '/^p/ {print $1;}' flowers.txt

Which will output:

pink
purple
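Field variables can also be printed in a different order. The sketch below swaps the two fields and, as an assumption for illustration, joins them with a comma by setting the OFS variable on the command line.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'red rose' 'yellow daffodil' 'pink flamingo' 'white rose' \
    'blue iris' 'white lily' 'red peony' 'yellow orchid' 'purple foxglove' > flowers.txt

# The comma between $2 and $1 tells print to join them with OFS
swapped=$(awk -v OFS=',' '/^p/ { print $2, $1 }' flowers.txt)

echo "$swapped"
```

Without the comma, as in print $2 $1, awk would concatenate the two fields with nothing between them.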

Processing Output from other Programs

You can pipe output from other Linux shell programs into awk for processing. This example takes the output of the ls -l command, which lists the contents of the current directory, and prints the 5th field of each line (the size of the file in bytes):

ls -l | awk '{print $5}'

Which will output something like:

3104
3072
224
256

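…(depending on how many files are in the current directory and how big they are). The first line of ls -l output is a total line with no 5th field, so awk prints an empty line for it.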
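Piped input can be summarised as well as printed. As a sketch, the example below creates two small files in a scratch directory (the names small.txt and large.txt are hypothetical) and adds up their sizes from the ls -l output.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'abc' > small.txt     # 3 bytes
printf 'hello' > large.txt   # 5 bytes

# The leading "total" line has only two fields, so NF > 2 skips it;
# every other line adds its 5th field (the size) to a running sum
total=$(ls -l | awk 'NF > 2 { sum += $5 } END { print sum }')

echo "$total"
```

The END block runs once after the last line of input has been read, which makes it the natural place to print a total.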

Using Built-In Variables

awk '{print NR "-" $2 }' flowers.txt

This command will print the current record number (file line number) followed by the second field – the name of the flower:

1-rose
2-daffodil
3-flamingo
4-rose
5-iris
6-lily
7-peony
8-orchid
9-foxglove

Combining Conditions

Conditions can be combined using && (logical AND). This command will print all records where the first field contains the text red and the second field has fewer than 5 characters:

awk '$1 ~ /red/ && length($NF) < 5 { print }' flowers.txt

Note:

  • $NF is used to reach the second field as an alternative to $2. This works because every record in the file has two fields, so NF (Number of Fields) is 2 and $NF refers to the last field
  • The length() function is used to calculate the length of the field

So it returns a single matching record from the example file:

red rose
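The other logical operators work in the same way. As a sketch against a copy of the same sample file, || matches when either condition is true, and ! negates a condition.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'red rose' 'yellow daffodil' 'pink flamingo' 'white rose' \
    'blue iris' 'white lily' 'red peony' 'yellow orchid' 'purple foxglove' > flowers.txt

# Either the first field is blue OR the second field is lily
either=$(awk '$1 == "blue" || $2 == "lily" { print }' flowers.txt)

# The first field is red AND the record does NOT contain rose
red_not_rose=$(awk '$1 == "red" && !/rose/ { print }' flowers.txt)

echo "$either"
echo "$red_not_rose"
```

The first command prints blue iris and white lily. The second prints only red peony.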

Conclusion

awk is included pretty much universally with Linux for a reason – it’s a staple tool for searching and processing text, which you can use for quickly finding log entries if something goes wrong on your system or for processing captured data for research use.

If you’ve ever tried to do anything more than a simple find/replace on a large collection of text files, you’ll know the value in being able to specifically make replacements or updates to all of your text programmatically, without having to run individual find/replace commands.

Brad Morton

I'm Brad, and I'm nearing 20 years of experience with Linux. I've worked in just about every IT role there is before taking the leap into software development. Currently, I'm building desktop and web-based solutions with NodeJS and PHP hosted on Linux infrastructure. Visit my blog or find me on Twitter to see what I'm up to.
