
How to use the uniq command on Linux




The Linux uniq command scans your text files looking for unique or duplicated lines. In this guide, we cover its versatility and features, as well as how you can get the most out of this handy tool.

Find matching lines of text on Linux

The uniq command is fast, flexible, and good at what it does. However, like many Linux commands, it has a few quirks, which is fine as long as you know about them. If you take the plunge without a little prior knowledge, you could easily be left scratching your head at the results. We will point out these quirks as we go.

The uniq command is a perfect fit for the single-minded, "do one thing and do it well" camp. That is why it is particularly well suited to working with pipes and playing its part in command pipelines. One of its most frequent collaborators is sort, because uniq needs sorted input to work on.

Let's start!

RELATED: How to use pipes on Linux

Run uniq without options

We have a text file that contains the lyrics of the Robert Johnson song I Believe I'll Dust My Broom. Let's see what uniq makes of it.

We type the following to pipe the output into less:

  uniq dust-my-broom.txt | less 


We get the entire song, including the duplicated lines, in less:


That does not look like either the unique lines or the duplicated lines.

Right, because this is the first quirk. If you run uniq with no options, it collapses each run of adjacent identical lines into a single line. This is not the same as the -u (unique lines) option, which prints only the lines that are never repeated. The reason you still see duplicate lines is that, for uniq to consider a line a duplicate, it must be adjacent to its duplicate, which is where sort comes in.
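To see the adjacency rule in isolation, here is a minimal sketch that uses a made-up three-word sample instead of the song file. The sample words and the variable names are our own assumptions, chosen only for illustration.

```shell
# Hypothetical sample: "apple" appears twice, but not on adjacent lines.
sample=$(printf '%s\n' apple banana apple)

# Unsorted: no two neighboring lines match, so uniq removes nothing.
unsorted=$(printf '%s\n' "$sample" | uniq)

# Sorted first: the two "apple" lines become neighbors and collapse into one.
sorted=$(printf '%s\n' "$sample" | sort | uniq)

# For comparison, -u keeps only the lines that are never repeated.
only_unique=$(printf '%s\n' "$sample" | sort | uniq -u)

printf 'unsorted:\n%s\n\nsorted:\n%s\n\nwith -u:\n%s\n' "$unsorted" "$sorted" "$only_unique"
```

The unsorted run returns all three lines, the sorted run returns one "apple" and one "banana", and the -u run returns only "banana".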

When we sort the file, the duplicated lines are grouped together and uniq treats them as duplicates. We run sort on the file, pipe the sorted output into uniq, and then pipe the final output into less.

To do this, we type the following:

  sort dust-my-broom.txt | uniq | less 


A sorted list of lines appears in less.


The line "I believe I will dust my broom" certainly appears more than once in the song. In fact, it is repeated twice within the first four lines.

So why does it appear in a list of unique lines? Because the first time a line appears in the file, it is unique; only the subsequent entries are duplicates. You can think of the output as the first occurrence of each distinct line.

Let's sort again and redirect the output to a new file. That way, we do not have to use sort in every command.

We type the following command:

  sort dust-my-broom.txt > sort.txt 


Now we have a pre-sorted file to work with.

Count Duplicates

You can use the -c (count) option to print the number of times each line appears in a file.

Type the following command:

  uniq -c sort.txt | less 


Each line starts with the number of times that line appears in the file. However, you will notice that the first line is empty. This indicates that there are five blank lines in the file.
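As a small sketch of what the counts look like, the following uses a hypothetical pre-sorted sample (one blank line, two "apple" lines, one "banana" line) rather than the song file. The variable name is ours.

```shell
# Hypothetical pre-sorted sample: one blank line, "apple" twice, "banana" once.
counts=$(printf '%s\n' '' apple apple banana | uniq -c)

# Each output line is "<count> <text>". The amount of padding in front of
# the count differs between uniq implementations, so do not rely on it.
printf '%s\n' "$counts"
```

The blank line is counted like any other line, which is why a count with nothing after it shows up at the top of the output.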


If you want the output sorted in numerical order, you can pipe the output from uniq into sort. In our example, we will use the -r (reverse) and -n (numeric sort) options, and pipe the results into less.

We type the following:

  uniq -c sort.txt | sort -rn | less 


The list is sorted in descending order based on the frequency of the appearance of each line.
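The same pipeline can be tried on a tiny made-up sample. This is only a sketch; the fruit names and the variable name are assumptions for illustration.

```shell
# Hypothetical pre-sorted sample with three different frequencies.
ranked=$(printf '%s\n' apple banana banana banana cherry cherry | uniq -c | sort -rn)

# sort -n compares the leading counts numerically; -r puts the largest first.
printf '%s\n' "$ranked"
```

Here "banana" (three copies) comes first, followed by "cherry" (two) and "apple" (one).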


Show only duplicated lines

If you only want to see the lines that are repeated in a file, you can use the -d (repeated) option. No matter how many times a line is duplicated in a file, it is listed only once.

To use this option, we type the following:

  uniq -d sort.txt 


The duplicated lines are listed for us. You will notice the blank line at the top, which means the file contains duplicate blank lines; it is not a gap left by uniq to cosmetically offset the listing.
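A minimal sketch of -d on a hypothetical pre-sorted sample (the words and the variable name are ours) shows that a line is reported once whether it has two copies or three:

```shell
# Hypothetical pre-sorted sample: "apple" x3, "banana" x1, "cherry" x2.
repeated=$(printf '%s\n' apple apple apple banana cherry cherry | uniq -d)

# Each repeated line is reported once; the unrepeated "banana" is left out.
printf '%s\n' "$repeated"
```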


We can also combine the -d (repeated) and -c (count) options and pipe the output through sort. This gives us a sorted list of the lines that appear at least twice.

Type the following to use this option:

  uniq -d -c sort.txt | sort -rn 


Show all duplicated lines

If you want to see a list of every duplicated line, with an entry for each time the line appears in the file, you can use the -D (all duplicate lines) option.

To use this option, type the following:

  uniq -D sort.txt | less 


The list contains an entry for every copy of each duplicated line.
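For comparison with -d, here is a sketch of -D on the same kind of hypothetical sample. We are assuming GNU uniq here; other implementations may not offer -D.

```shell
# -D is documented for GNU uniq; availability elsewhere is an assumption to check.
# Hypothetical pre-sorted sample: "apple" x3, "banana" x1, "cherry" x2.
all_dupes=$(printf '%s\n' apple apple apple banana cherry cherry | uniq -D)

# Every copy of a repeated line is printed; the unrepeated "banana" is dropped.
printf '%s\n' "$all_dupes"
```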


If you use the --group option, uniq prints every line and separates each group of identical lines with a blank line placed before each group (prepend), after each group (append), or both before and after each group (both).

We use append as our modifier, so we type the following:

  uniq --group=append sort.txt | less 


The groups are separated by blank lines for easier reading.
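The grouping can be sketched on a small made-up sample. We assume GNU uniq, which provides --group; the function name group_demo and the sample words are hypothetical.

```shell
# --group is a GNU uniq option. With "append", every line is printed and a
# blank line is added after each group of identical lines.
group_demo() {
  # Hypothetical pre-sorted sample: three groups of identical lines.
  printf '%s\n' apple apple banana cherry cherry | uniq --group=append
}

group_demo
```

The five input lines come back as eight output lines: each of the three groups is followed by one blank line.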


Checking a certain number of characters

By default, uniq checks the entire length of each line. If you want to restrict the checks to a certain number of characters, you can use the -w (check chars) option.

In this example, we repeat the last command, but limit the comparisons to the first three characters. To do so, we type the following command:

  uniq -w 3 --group=append sort.txt | less 


The results and groupings that we receive are very different.


All lines that begin with "I b" are grouped because those parts of the lines are identical, so they are considered duplicates.

Similarly, all lines that begin with "I'm" are treated as duplicates, even if the rest of the text is different.
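Here is a minimal sketch of the three-character comparison without the grouping, using hypothetical lines of our own rather than the song lyrics. We assume GNU uniq, which provides -w.

```shell
# -w is a GNU uniq option. In this hypothetical sample the lines all differ
# in full, but several share their first three characters.
first_three=$(printf '%s\n' 'I believe it' 'I bet so' "I'm here" "I'm there" 'You said' | uniq -w 3)

# Only the first line of each run with the same first three characters is kept.
printf '%s\n' "$first_three"
```

The "I b" pair and the "I'm" pair each collapse to their first line, and "You said" is left alone.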

Ignore a certain number of characters

There are some cases where it may be useful to skip a certain number of characters at the beginning of each line, for example, when the lines in a file are numbered. Or suppose you need uniq to jump over a timestamp and start checking the lines from character six instead of from the first character.

Below is a version of our sorted file with numbered lines.


If we want uniq to skip the first three characters of each line before it starts its comparison checks, we can use the -s (skip chars) option by typing the following:

  uniq -s 3 -d -c numbered.txt 


The lines are detected as duplicates and counted correctly. Note that the line numbers shown are those of the first copy of each duplicate.
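A small sketch with a hypothetical numbered sample shows the same effect. The two-digit numbering plus one space is our assumption, so that exactly three characters need to be skipped.

```shell
# Hypothetical numbered sample: a two-digit number and a space (three
# characters in total) precede the text on every line.
skipped=$(printf '%s\n' '01 apple' '02 apple' '03 banana' '04 cherry' '05 cherry' | uniq -s 3 -d -c)

# Each repeated line is shown with its count and the number of its first copy.
printf '%s\n' "$skipped"
```

Only "apple" and "cherry" are reported, each with a count of two and with the line numbers 01 and 04 of their first copies.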

You can also skip fields (a series of characters and some white space) instead of characters. We use the -f (skip fields) option to tell uniq how many fields to ignore at the start of each line.

We type the following to tell uniq to ignore the first field:

  uniq -f 1 -d -c numbered.txt 


We get the same results as when we told uniq to skip three characters at the start of each line.
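The two options agree when every line number has the same width. As a sketch of where they can diverge, the hypothetical sample below uses numbers of different widths (9, 10, 11); the sample and the variable names are our own assumptions.

```shell
# Hypothetical sample whose leading numbers differ in width.
numbered=$(printf '%s\n' '9 apple' '10 apple' '11 banana')

# -f 1 skips the whole first field, however many characters it contains.
by_field=$(printf '%s\n' "$numbered" | uniq -f 1 -d)

# -s 3 always skips exactly three characters, so it compares "pple" with
# "apple" here and finds no duplicates.
by_char=$(printf '%s\n' "$numbered" | uniq -s 3 -d)

printf 'with -f 1: %s\nwith -s 3: %s\n' "$by_field" "$by_char"
```

Skipping a field reports "9 apple" as a duplicate, while skipping three characters reports nothing for this input.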

Ignore case

By default, uniq is case-sensitive. If the same letter appears in uppercase in one line and in lowercase in another, uniq considers the lines to be different.

For example, view the output of the following command:

  uniq -d -c sort.txt | sort -rn 


The lines "I Believe I will dust my broom" and "I believe I will dust my broom" are not treated as duplicates because of the difference in case on the "B" in "believe."

If we include the -i (ignore case) option, these lines are treated as duplicates. We type the following:

  uniq -d -c -i sort.txt | sort -rn 


The lines are now treated as duplicates and grouped together.
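The difference is easy to see side by side on a hypothetical sample in which one line differs from its neighbors only by a capital letter. The sample text and the variable names are ours.

```shell
# Hypothetical sample: the first line differs from the others only in case.
lines=$(printf '%s\n' 'Dust my broom' 'dust my broom' 'dust my broom')

# Default comparison: the capitalized line is counted separately.
case_sensitive=$(printf '%s\n' "$lines" | uniq -c)

# With -i, all three lines fall into a single group.
ignore_case=$(printf '%s\n' "$lines" | uniq -c -i)

printf 'default:\n%s\n\nwith -i:\n%s\n' "$case_sensitive" "$ignore_case"
```

Without -i we get counts of one and two; with -i we get a single count of three, shown against the first line of the group.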


Linux puts a multitude of special utilities at your disposal. Like many of them, uniq is not a tool that you use every day.

That is why a large part of becoming proficient in Linux is remembering which tool will solve your current problem, and where you can find it again. With practice, though, you will be well on your way.

Or you can always just search How-To Geek; we probably have an article about it.



