How to Sort Data Based on the Second Column of a File in Bash
This article explains how to sort data based on the second column of a file in Bash.
Overview of the sort Command in Bash
The sort command arranges the lines of a file in a specific order. By default, it compares lines as text, using the collation rules of the current locale (plain ASCII byte order in the C locale).
Numeric sorting is also available through the -n option. With plain ASCII ordering, the sort command behaves as follows:
- Lines that begin with a number are displayed before lines that begin with letters.
- Lines that begin with a letter that comes later in alphabetical order will come after lines that start with an earlier letter.
- Lines that begin with a letter in uppercase are displayed before those that start with the same letter in lowercase.
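The ordering rules above can be checked with a short sketch. The sample words are made up for illustration, and LC_ALL=C is set as an assumption so that the comparison uses plain ASCII byte order; other locales may order uppercase and lowercase letters differently.

```shell
# Hypothetical sample lines: a number, mixed-case words, lowercase words.
# LC_ALL=C forces byte-order comparison for this one command.
sorted=$(printf '%s\n' banana Apple 42 apple cherry | LC_ALL=C sort)

printf '%s\n' "$sorted"
# 42
# Apple
# apple
# banana
# cherry
```

The digit comes first, then the uppercase A, then the lowercase letters in alphabetical order.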
Suppose we create a data file named filename.csv and display its contents in Bash using the cat command.
Syntax to Display a File in Bash:
$ cat filename.csv
Sort a Mixed-Case File:
In a file that mixes uppercase and lowercase letters, the uppercase letters are sorted first and the lowercase letters after them when plain ASCII ordering is in effect (for example, in the C locale). Other locales may interleave the two cases.
There are numerous options for sorting, including:
- -k: Sorts on a specified field (column). For example, the argument -k5 sorts on a key that starts at the fifth field of each line, not the fifth character, and runs to the end of the line; -k5,5 restricts the key to the fifth field alone. Fields are separated by whitespace by default; use -t to choose another delimiter, such as a comma.
- -n: Performs a numeric sort, which means that the key is compared as a number rather than as text.
- -r: Reverses the sort order. Another way to write it is --reverse.
- -i: Ignores characters that cannot be printed. Another way to write it is --ignore-nonprinting.
- -b: Ignores leading blanks when locating the sort key, which is helpful because fields are delimited by whitespace by default. Another way to write it is --ignore-leading-blanks.
- -f: Disregards capitalization, so that A and a compare as equal. Another way to write it is --ignore-case.
- -t [new separator]: Uses the given character instead of whitespace as the field separator. Another way to write it is --field-separator.
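To make the effect of a few of these options concrete, here is a minimal sketch on made-up input. The sample values and variable names are hypothetical, and LC_ALL=C is assumed so that the text comparison uses byte order.

```shell
# Text comparison: "10" and "100" sort before "9" because '1' < '9'.
as_text=$(printf '%s\n' 10 9 100 | LC_ALL=C sort)

# -n compares the values as numbers.
as_numbers=$(printf '%s\n' 10 9 100 | LC_ALL=C sort -n)

# -r reverses the order; combined with -n it gives largest first.
descending=$(printf '%s\n' 10 9 100 | LC_ALL=C sort -n -r)

# -f ignores case, so "apple" sorts before "Banana" here.
ignore_case=$(printf '%s\n' cherry Banana apple | LC_ALL=C sort -f)

printf '%s\n' "$as_text" "---" "$as_numbers" "---" "$descending" "---" "$ignore_case"
```

Without -n, the numbers would come out as 10, 100, 9, which is rarely what is wanted for numeric columns.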
Sort Data Based on Second Column of a File in Bash
Using the -k option, sort lets us sort a file by columns. Let’s begin by making a file with multiple columns. By default, sort treats whitespace as the column separator, so a comma-separated file also needs the -t option (written as -t,) to make the comma the separator.
For example, we use -k2 to sort on the second column. We have already created a file named filename.csv. The data in that file is shown below:
Bonie,22
Julie,23
Henry,15
Flamingo,34
Peter,11
Use the cat
command to view the items in the file before sorting.
$ cat filename.csv
OUTPUT:
Bonie,22
Julie,23
Henry,15
Flamingo,34
Peter,11
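Syntax to Sort Data Based on Second Column of a File in Bash:
$ sort -t, -k2 -n filename.csv
Here, -t, sets the comma as the field separator, -k2 selects the second column as the sort key, and -n compares that key numerically. Without -t, the whole line counts as a single field, the second field is empty, and sort falls back to ordering the lines alphabetically.
Output after sorting the Second Column:
Peter,11
Henry,15
Bonie,22
Julie,23
Flamingo,34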
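The whole procedure can be tried end to end with the sketch below. It recreates the sample data in a temporary directory (the workdir variable and the use of mktemp are assumptions made for illustration, so that no existing filename.csv is overwritten) and sorts on the second comma-separated field. The key is written as -k2,2n, which limits it to the second field and compares it numerically.

```shell
# Create the sample file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/filename.csv" <<'EOF'
Bonie,22
Julie,23
Henry,15
Flamingo,34
Peter,11
EOF

# Ascending numeric sort on the second comma-separated field.
by_second=$(sort -t, -k2,2n "$workdir/filename.csv")

# Same key, descending.
by_second_desc=$(sort -t, -k2,2nr "$workdir/filename.csv")

# Without -t, there is no second field, so sort falls back to
# comparing whole lines (byte order under LC_ALL=C).
no_separator=$(LC_ALL=C sort -k2 -n "$workdir/filename.csv")

printf '%s\n' "$by_second"
# Peter,11
# Henry,15
# Bonie,22
# Julie,23
# Flamingo,34

rm -r "$workdir"
```

Comparing by_second with no_separator shows why the field separator matters: the latter is ordered by name, not by the number in the second column.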