The uniq command combs through your text files in search of unique or duplicated lines. In this guide, we cover its versatility and features, as well as how you can get the most out of this handy utility.
Find matching lines of text on Linux
The uniq command is fast, flexible, and good at what it does. However, like many Linux commands, it has a few quirks that are worth knowing about.
The uniq command sits squarely in the single-minded, "do one thing and do it well" camp. That is why it is particularly well suited to working with pipes and playing its part in command pipelines. One of its most common collaborators is sort, because uniq needs sorted input to work on.
RELATED:  How to use pipes on Linux
Run uniq without options
We have a text file containing the lyrics of the Robert Johnson song I Believe I'll Dust My Broom. Let's see what uniq makes of it.

We type the following to pipe the output into less:

uniq dust-my-broom.txt | less
The entire song, including the duplicated lines, is displayed in less.

That doesn't look like either the unique lines or the duplicated lines.
Exactly, and that's the first quirk. If you run uniq without options, it behaves as if you had used the -u (unique lines) option. This tells uniq to print only the unique lines from the file. The reason you still see duplicated lines is that, for uniq to consider a line a duplicate, it must be adjacent to its duplicate, which is where sort comes in.
When we sort the file, it groups the duplicated lines together, and uniq treats them as duplicates. We'll use sort on the file, pipe the sorted output into uniq, and then pipe the final output into less.

To do this, we type the following:
sort dust-my-broom.txt | uniq | less
A sorted list of lines appears in less.
The line "I believe I'll dust my broom" definitely appears more than once in the output. In fact, it is repeated twice within the first four lines of the song.

So why does it show up in a list of unique lines? Because the first time a line appears in the file, it is unique; only the subsequent copies are duplicates. You can think of the output as the first appearance of each unique line.
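To see the adjacency quirk on a small scale, here is a quick sketch using throwaway lines fed in with printf (the sample words are made up for illustration; this assumes GNU coreutils uniq):

```shell
# Non-adjacent duplicates survive a plain uniq...
printf 'red\nblue\nred\n' | uniq
# red
# blue
# red

# ...but sorting first makes the duplicates adjacent, so uniq collapses them.
printf 'red\nblue\nred\n' | sort | uniq
# blue
# red
```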
Let's sort the file again and redirect the output to a new file. That way, we don't have to run sort in every command.
We type the following command:
sort dust-my-broom.txt > sort.txt
Now we have a pre-sorted file to work with.
You can use the
-c (count) option to print the number of times each line appears in a file.
Type the following command:
uniq -c sort.txt | less
Each line starts with the number of times that line appears in the file. However, you will notice that the first line is empty. This indicates that there are five blank lines in the file.
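As a quick illustration with made-up lines (assuming GNU coreutils uniq), -c prefixes each line with its count once sort has grouped the duplicates:

```shell
# Sort first so duplicates are adjacent, then count each group.
# Prints each line prefixed with its count: 1 for coffee, 3 for tea.
printf 'tea\ncoffee\ntea\ntea\n' | sort | uniq -c
```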
If you want the output sorted in numerical order, you can pipe the output from uniq into sort. In our example, we'll use the -r (reverse) and -n (numeric sort) options, and pipe the results into less. We type the following:
uniq -c sort.txt | sort -rn | less
The list is sorted in descending order based on the frequency of the appearance of each line.
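The same pipeline works on any input. Here is a minimal sketch with invented one-letter lines, showing the most frequent line rising to the top:

```shell
# Count each line, then sort the counts numerically in reverse order,
# so the most frequent line comes first: b (3), then a (2), then c (1).
printf 'b\na\nb\nc\nb\na\n' | sort | uniq -c | sort -rn
```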
Show duplicated lines only
If you only want to see the lines that are repeated in a file, you can use the
-d (repeated) option. No matter how many times a line is duplicated in a file, it is listed only once.
To use this option, we type the following:
uniq -d sort.txt
The duplicated lines are displayed. You'll see the blank line at the top, which means the file contains duplicated blank lines; it is not a space added by uniq to cosmetically offset the listing.
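A compact sketch of -d on invented, already-sorted input; each duplicated line appears exactly once, no matter how many copies exist:

```shell
# The input is already sorted, so duplicates are adjacent.
printf 'a\na\na\nb\nc\nc\n' | uniq -d
# a
# c
```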
We can also combine the -d (repeated) and -c (count) options and pipe the output through sort. This gives us a sorted list of the lines that appear at least twice.
Type the following to use this option:
uniq -d -c sort.txt | sort -rn
Show all duplicated lines
If you want to see a list of every duplicated line, with an entry for each time a line appears in the file, you can use the -D (all duplicate lines) option.
To use this option, type the following:
uniq -D sort.txt | less
The list contains an entry for every copy of each duplicated line.
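Contrasting -D with -d on invented input (GNU coreutils): -D keeps every copy of each duplicated line, while unique lines are dropped:

```shell
# "b" is unique, so it disappears; both duplicate groups keep all copies.
printf 'a\na\nb\nc\nc\nc\n' | uniq -D
# a
# a
# c
# c
# c
```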
If you use the --group option, each group of lines is printed with a blank line before it (prepend), after it (append), or both before and after it (both).

We'll use append as our modifier, so we type the following:
uniq --group=append sort.txt | less
The groups are separated by blank lines for easier reading.
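Here is a minimal sketch of --group=append on invented lines (GNU coreutils only). Note that --group prints every line, including unique ones, and appends a blank line after each group:

```shell
# Two groups: {a, a} and {b}; a blank line follows each group.
printf 'a\na\nb\n' | uniq --group=append
```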
Checking a certain number of characters
By default, uniq checks the full length of each line. If you want to limit the checks to a certain number of characters, you can use the -w (check chars) option.
In this example, we repeat the last command but limit the comparisons to the first three characters. To do this, we type the following command:
uniq -w 3 --group=append sort.txt | less
The results and groupings that we receive are very different.
All lines that begin with "I b" are grouped together because those parts of the lines are identical, so they are considered duplicates.

Likewise, all lines that begin with "I'm" are treated as duplicates, even though the rest of the text differs.
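A small sketch of -w with invented lines; here, only the first five characters are compared, so lines sharing that prefix collapse into one:

```shell
# "apple pie" and "apple tart" match on their first 5 characters ("apple"),
# so only the first of the pair is printed.
printf 'apple pie\napple tart\nbanana\n' | uniq -w 5
# apple pie
# banana
```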
Ignore a certain number of characters
There are cases where it can be useful to skip a certain number of characters at the beginning of each line, for example, when the lines in a file are numbered. Or suppose you need uniq to jump over a timestamp and start checking lines from character six instead of from the first character.
Below is a version of our sorted file with numbered lines.
If we want uniq to skip the first three characters of each line before comparing, we can use the -s (skip characters) option by typing the following:

uniq -s 3 -d -c numbered.txt
The lines are detected as duplicates and counted correctly. Note that the line numbers shown are those of the first copy of each duplicate.
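A self-contained sketch of the same idea with invented numbered lines; -s 3 skips the three-character number prefix, so lines that differ only in their numbers count as duplicates:

```shell
# Each line starts with a 3-character prefix such as "01 ".
# After skipping it, "hello" appears twice, so that group is counted as 2
# and shown as its first copy, "01 hello".
printf '01 hello\n02 hello\n03 world\n' | uniq -s 3 -d -c
```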
You can also skip fields (a run of characters followed by whitespace) instead of characters. We use the -f (fields) option to tell uniq which fields to ignore.
We type the following to tell uniq to skip the first field:
uniq -f 1 -d -c numbered.txt
We get the same results as when we told
uniq to skip three characters at the start of each line.
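A quick sketch with invented lines; -f 1 skips the first whitespace-delimited field, so the line numbering is ignored during comparison:

```shell
# After skipping the number field, "apple" is duplicated;
# -d prints the first copy of the duplicated group.
printf '1 apple\n2 apple\n3 pear\n' | uniq -f 1 -d
# 1 apple
```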
Ignore case

By default, uniq is case-sensitive. If the same letter appears capitalized on one line and in lowercase on another, uniq considers the lines to be different.
For example, view the output of the following command:
uniq -d -c sort.txt | sort -rn
The lines "I Believe I'll dust my broom" and "I believe I'll dust my broom" are not treated as duplicates because of the difference in case on the "B" in "believe".
If we include the -i (ignore case) option, these lines are treated as duplicates. We type the following:

uniq -d -c -i sort.txt | sort -rn
The lines are now treated as duplicates and grouped together.
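A minimal sketch on invented lines showing -i folding case before comparing:

```shell
# Without -i these would be three different lines; with -i they form one
# group of 3 (shown as its first copy, "Hello"), plus "world" with 1.
printf 'Hello\nhello\nHELLO\nworld\n' | uniq -i -c
```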
Linux puts a multitude of special-purpose utilities at your disposal. Like many of them,
uniq is not a tool you'll use every day.

That's why a big part of becoming proficient with Linux is remembering which tool will solve your current problem, and where to find it again. With practice, though, you'll be well on your way.
Or, you can always just search How-To Geek; we probably have an article about it.