How to use the awk command on Linux




On Linux, awk is a command-line text manipulation dynamo, as well as a powerful scripting language. Here's an introduction to some of its coolest features.

How awk got its name

The awk command was named using the initials of the three people who wrote the original version in 1977: Alfred Aho, Peter Weinberger, and Brian Kernighan. These three men were from the legendary AT&T Bell Laboratories Unix pantheon. With the contributions of many others since then, awk has continued to evolve.

It's a full scripting language, as well as a complete text manipulation toolkit for the command line. If this article whets your appetite, you can look up every detail about awk and its functionality in its documentation.

Rules, patterns and actions

awk works on programs that contain rules made up of patterns and actions. The action is executed on the text that matches the pattern. Actions are enclosed in curly braces ({}). Together, a pattern and an action form a rule. The entire awk program is enclosed in single quotes (').

Let's take a look at the simplest type of awk program. It has no pattern, so it matches every line of text fed into it. This means the action is executed on every line. We'll use it on the output from the who command.

Here's the standard output from who:

  who

Perhaps we don't need all of that information, but just want to see the names on the accounts. We can pipe the output from who into awk, and then tell awk to print only the first field.

By default, awk considers a field to be a string of characters surrounded by whitespace, the start of a line, or the end of a line. Fields are identified by a dollar sign ($) and a number. So, $1 represents the first field, which we'll use with the print action to print the first field.

We type the following:

  who | awk '{print $1}'

awk prints the first field and throws away the rest of the line.
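
The same one-liner can be tried without depending on who is logged in. The sketch below assumes a couple of made-up lines shaped like who output (the user names, terminals, and times are hypothetical) and feeds them to awk with printf:

```shell
# Hypothetical sample resembling `who` output; names and times are invented.
who_sample='dave     tty1   2024-01-15 09:12
mary     pts/0  2024-01-15 10:47'

# No pattern, so the action runs on every line; $1 is the first field.
printf '%s\n' "$who_sample" | awk '{print $1}'
# prints:
# dave
# mary
```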

We can print as many fields as we want. If we add a comma as a separator, awk prints a space between each field.

We type the following to also print the time the person logged in (field four):

  who | awk '{print $1, $4}'
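
To see what the comma actually does, it may help to compare it with leaving the comma out. This is a small sketch on a hypothetical line shaped like who output; without the comma, awk concatenates the two fields with nothing in between:

```shell
# Hypothetical single line resembling `who` output.
who_line='dave tty1 2024-01-15 09:12'

# With a comma, the fields are joined by the output field separator (a space by default).
printf '%s\n' "$who_line" | awk '{print $1, $4}'   # prints: dave 09:12

# Without a comma, the two strings are concatenated directly.
printf '%s\n' "$who_line" | awk '{print $1 $4}'    # prints: dave09:12
```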

There are a couple of special field identifiers. These represent the entire line of text and the last field in the line of text:

  • $0: Represents the entire line of text.
  • $1: Represents the first field.
  • $2: Represents the second field.
  • $7: Represents the seventh field.
  • $45: Represents the 45th field.
  • $NF: NF stands for "number of fields," so $NF represents the last field.

We type the following to display a small text file that contains a short quote attributed to Dennis Ritchie:

  cat dennis_ritchie.txt

We want awk to print the first, second, and last field of the quote. Note that although it wraps around in the terminal window, it's just a single line of text.

We type the following command:

  awk '{print $1, $2, $NF}' dennis_ritchie.txt

We don't know that "simplicity" is the 18th field in the line of text, and we don't care. What we do know is that it's the last field, and we can use $NF to get its value. The period is just considered another character in the body of the field.
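
The difference between NF and $NF can be checked on any line of text. The sketch below uses a made-up five-word line rather than the contents of dennis_ritchie.txt:

```shell
# Hypothetical one-line input; the trailing period is part of the last field.
text_line='one two three four five.'

# $NF is the value of the last field.
printf '%s\n' "$text_line" | awk '{print $1, $2, $NF}'   # prints: one two five.

# NF on its own is the number of fields on the line.
printf '%s\n' "$text_line" | awk '{print NF}'            # prints: 5
```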

Add output field separators

You can also tell awk to print a particular character between fields instead of the default space character. The default output from the date command is slightly peculiar because the time sits right in the middle of it. However, we can type the following and use awk to extract the fields we want:

  date 
  date | awk '{print $2, $3, $6}'

We'll use the OFS (output field separator) variable to put a separator between the month, day, and year. Note that below, the OFS assignment sits inside the single quotes (') but outside the curly braces ({}):

  date | awk 'OFS="/" {print $2, $3, $6}'
  date | awk 'OFS="-" {print $2, $3, $6}'
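
Placing the assignment in the pattern position works because its value is a non-empty string, which awk treats as true. As an alternative sketch (not what the article's commands use), OFS can also be set with the -v option or inside a BEGIN rule. The input below is a hypothetical line shaped like the date output described above:

```shell
# Hypothetical line in the layout assumed above: month, day, and year are fields 2, 3, and 6.
date_line='Tue Feb 11 14:02:31 EST 2020'

# Set OFS from the command line with -v.
printf '%s\n' "$date_line" | awk -v OFS='/' '{print $2, $3, $6}'          # prints: Feb/11/2020

# Set OFS in a BEGIN rule, before any input is read.
printf '%s\n' "$date_line" | awk 'BEGIN {OFS = "-"} {print $2, $3, $6}'   # prints: Feb-11-2020
```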

The BEGIN and END rules

A BEGIN rule is executed once before any text processing starts. In fact, it's executed before awk even reads any text. An END rule is executed after all processing has completed. You can have multiple BEGIN and END rules, and they'll execute in order.

For our example of a BEGIN rule, we'll print the entire quote from the dennis_ritchie.txt file we used previously, with a title above it.

We type this command:

  awk 'BEGIN {print "Dennis Ritchie"} {print $0}' dennis_ritchie.txt

Note the BEGIN rule has its own set of actions enclosed within its own set of curly braces ({}).

We can use the same technique with the command we used previously to pipe output from who into awk. To do so, we type the following:

  who | awk 'BEGIN {print "Active Sessions"} {print $1, $4}'
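
The commands above only exercise BEGIN. As a minimal sketch of BEGIN and END working together, the following runs on three made-up input lines and uses the built-in NR variable (the number of records read so far) to report a total at the end:

```shell
# Hypothetical three-line input.
report_input='first
second
third'

printf '%s\n' "$report_input" |
  awk 'BEGIN {print "Start of report"}
       {print NR ": " $0}
       END {print "End of report, " NR " lines read"}'
# prints:
# Start of report
# 1: first
# 2: second
# 3: third
# End of report, 3 lines read
```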

Input Field Separators

If you want awk to work with text that doesn't use whitespace to separate fields, you have to tell it which character the text uses as the field separator. For example, the /etc/passwd file uses a colon (:) to separate fields.

We'll use that file and the -F (separator string) option to tell awk to use the colon (:) as the separator. We type the following to tell awk to print the name of the user account and the home folder:

  awk -F: '{print $1, $6}' /etc/passwd

The output contains the name of the user account (or application or daemon name) and the home folder (or the location of the application).
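
To experiment without reading the real file, the same command can be run on a few lines laid out like /etc/passwd. The records in this sketch are invented:

```shell
# Hypothetical /etc/passwd-style records; names and paths are made up.
passwd_sample='root:x:0:0:root:/root:/bin/bash
dave:x:1000:1000:Dave:/home/dave:/bin/bash'

# -F: makes the colon the input field separator.
printf '%s\n' "$passwd_sample" | awk -F: '{print $1, $6}'
# prints:
# root /root
# dave /home/dave
```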


Add Patterns

If we're only interested in regular user accounts, we can add a pattern to our print action to filter out all other entries. Because regular user accounts on most Linux distributions have user ID numbers equal to or greater than 1,000, we can base our filter on that information.

We type the following to perform our print action only when the third field ( $ 3 ) has a value of 1,000 or higher:

  awk -F: '$3 >= 1000 {print $1, $6}' /etc/passwd

The pattern should immediately precede the action with which it's associated.

We can use a BEGIN rule to provide a title for our little report. We type the following, using the (\n) notation to insert a newline character into the title string:

  awk -F: 'BEGIN {print "User Accounts\n-------------"} $3 >= 1000 {print $1, $6}' /etc/passwd
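
Here is the same report as a sketch on invented records, so the effect of the numeric pattern is easy to see: only the lines whose third field is 1000 or more reach the print action, while the BEGIN rule runs once regardless.

```shell
# Hypothetical /etc/passwd-style records; two system accounts and two regular users.
accounts_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
dave:x:1000:1000:Dave:/home/dave:/bin/bash
mary:x:1001:1001:Mary:/home/mary:/bin/bash'

printf '%s\n' "$accounts_sample" |
  awk -F: 'BEGIN {print "User Accounts\n-------------"} $3 >= 1000 {print $1, $6}'
# prints:
# User Accounts
# -------------
# dave /home/dave
# mary /home/mary
```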

Patterns are full-fledged regular expressions, and they're one of the glories of awk.

Let's say we want to see the universally unique identifiers (UUIDs) of the mounted file systems. If we search through the /etc/fstab file for occurrences of the string "UUID," it ought to return that information for us.

We use the search pattern "/UUID/" in our command:

  awk '/UUID/ {print $0}' /etc/fstab

It finds all occurrences of "UUID" and prints those lines. We would have gotten the same result without the print action because the default action prints the entire line of text. For clarity, though, it's often useful to be explicit. When you look through a script or your history file, you'll be glad you left clues for yourself.

The first line found was a comment line, and although the "UUID" string is in the middle of it, awk still found it. We can tweak the regular expression and tell awk to process only lines that start with "UUID." To do so, we type the following, which includes the start of line token (^):

  awk '/^UUID/ {print $0}' /etc/fstab

That's better! Now we only see real mount instructions. To further refine the output, we type the following and limit the display to the first field:

  awk '/^UUID/ {print $1}' /etc/fstab

If we had multiple file systems mounted on this machine, we'd get a neat table of their UUIDs.
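
The effect of the anchor can be reproduced on a small stand-in for /etc/fstab. The entries in this sketch, including the UUID values, are invented:

```shell
# Hypothetical fstab-style content: one comment, two UUID mounts, one device mount.
fstab_sample='# Use blkid to print the UUID of a device
UUID=0000-example-1 / ext4 defaults 0 1
UUID=0000-example-2 /home ext4 defaults 0 2
/dev/sdb1 /data ext4 defaults 0 2'

# Unanchored: also matches the comment line (3 lines; the default action prints them).
printf '%s\n' "$fstab_sample" | awk '/UUID/'

# Anchored with ^: only lines that start with UUID, first field only.
printf '%s\n' "$fstab_sample" | awk '/^UUID/ {print $1}'
# prints:
# UUID=0000-example-1
# UUID=0000-example-2
```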

Built-in functions

awk has many functions that you can call up and use in your own programs, both from the command line and in scripts. If you do some digging, you will find it very fruitful.

To demonstrate the general technique for calling a function, we'll look at some numeric ones. For example, the following prints the square root of 625:

  awk 'BEGIN {print sqrt(625)}'

This command prints the arctangent of 0 (zero) and -1 (which happens to be the mathematical constant, pi):

  awk 'BEGIN {print atan2(0, -1)}'

In the following command, we modify the result of the atan2() function before we print it:

  awk 'BEGIN {print atan2(0, -1) * 100}'

Functions can accept expressions as parameters. For example, here's a convoluted way to ask for the square root of 25:

  awk 'BEGIN {print sqrt((2 + 3) * 5)}'
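
The same calling convention applies to awk's string functions. The sketch below repeats two of the numeric calls and adds length() and toupper(), which are standard awk built-ins but are not covered in the text above; printf is used for pi so the number of decimal places is explicit:

```shell
# Numeric built-ins from the examples above.
awk 'BEGIN {print sqrt(625)}'                        # prints: 25
awk 'BEGIN {printf "%.4f\n", atan2(0, -1)}'          # prints: 3.1416

# String built-ins (an addition for illustration): length and upper-casing.
awk 'BEGIN {print length("awk"), toupper("awk")}'    # prints: 3 AWK
```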

awk Scripts

If your command line gets complicated, or you develop a routine you know you'll want to use again, you can transfer your awk command into a script.

In our example script, we're going to do all of the following:

  • Tell the shell which executable file should be used to run the script.
  • Prepare awk to use the FS field separator variable to read input text with fields separated by colons (:).
  • Use the OFS output field separator to tell awk to use colons (:) to separate fields in the output.
  • Set a counter to 0 (zero).
  • Set the second field of each line of text to a blank value (it's always an "x," so we don't need to see it).
  • Print the line with the modified second field.
  • Increase the counter.
  • Print the value of the counter.

Our script is shown below.


The BEGIN rule carries out the preparatory steps, while the END rule displays the counter value. The middle rule (which has no name and no pattern, so it matches every line) modifies the second field, prints the line, and increments the counter.

The first line of the script tells the shell which executable to use (awk, in our example) to run the script. It also passes the -f (filename) option to awk, which tells awk to read its program from a file, in this case the script itself. We'll pass the name of the file to be processed to the script when we run it.

We have included the script below as text so that you can cut and paste:

#!/usr/bin/awk -f

BEGIN {
  # set the input and output field separators
  FS = ":"
  OFS = ":"
  # zero the accounts counter
  accounts = 0
}
{
  # set field 2 to nothing
  $2 = ""
  # print the entire line
  print $0
  # count another account
  accounts++
}
END {
  # print the results
  print accounts " accounts.\n"
}

Save this in a file called omit.awk. To make the script executable, we type the following using chmod:

  chmod +x omit.awk

Now, we'll run it and pass the /etc/passwd file to the script. This is the file awk will process for us, using the rules within the script:

  ./omit.awk /etc/passwd

The file is processed and each line is displayed.

The "x" items in the second field have been removed, but keep in mind that the field separators are still present. The lines are counted and the total is displayed at the bottom of the output.
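
For comparison, the same three rules can be written inline on the command line. This is a sketch that assumes two invented /etc/passwd-style records as input rather than the real file:

```shell
# Hypothetical /etc/passwd-style records; names and paths are made up.
omit_sample='root:x:0:0:root:/root:/bin/bash
dave:x:1000:1000:Dave:/home/dave:/bin/bash'

printf '%s\n' "$omit_sample" |
  awk 'BEGIN {FS = OFS = ":"; accounts = 0}
       {$2 = ""; print $0; accounts++}
       END {print accounts " accounts.\n"}'
# prints the two records with an empty second field (root::0:0:..., dave::1000:1000:...),
# then "2 accounts." followed by a blank line
```

Assigning to $2 makes awk rebuild the line using OFS, which is why the colons survive even though the field itself is now empty.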

awk does not mean awkward

awk doesn't stand for awkward; it stands for elegance. It's been described as a processing filter and a report writer. More accurately, it's both of these, or rather, a tool you can use for both of these tasks. In just a few lines, awk achieves what requires extensive coding in a traditional language.

That power is harnessed by the simple concept of rules that contain patterns, which select the text to process, and actions that define the processing.



