How to Sort Pandas DataFrame by One Column's Values
This article introduces the pandas.DataFrame.sort_values method for sorting a DataFrame by its values, along with two of its options: ascending, which sets the sorting order, and na_position, which determines where NaN values appear in the sorted result.
Consider the following DataFrame:
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["g", "t", "n", "w", "n", "g"],
        "col2": [5, 2, 5, 1, 3, 6],
        "col3": [0, 7, 2, 8, 1, 2],
    }
)
print(df)
Running this code prints the DataFrame, which is not yet sorted:
col1 col2 col3
0 g 5 0
1 t 2 7
2 n 5 2
3 w 1 8
4 n 3 1
5 g 6 2
Now we can sort the DataFrame with the code below.
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["g", "t", "n", "w", "n", "g"],
        "col2": [5, 2, 5, 1, 3, 6],
        "col3": [0, 7, 2, 8, 1, 2],
    }
)
print(df.sort_values(by=["col1"]))
This sorts the DataFrame by col1. Running the code above produces the following output.
col1 col2 col3
0 g 5 0
5 g 6 2
2 n 5 2
4 n 3 1
1 t 2 7
3 w 1 8
We can also sort by more than one column. Change the last line of the code above as follows:
print(df.sort_values(by=["col1", "col2"]))
Output:
col1 col2 col3
0 g 5 0
5 g 6 2
4 n 3 1
2 n 5 2
1 t 2 7
3 w 1 8
Rows that share the same value in col1 are now further sorted by col2.
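As a minimal sketch of the two calls above, the snippet below compares the resulting row order by looking at the index labels. It also illustrates that sort_values returns a new DataFrame and leaves df itself unchanged unless the result is assigned back. The variable names by_col1 and by_col1_col2 are illustrative, and kind="mergesort" is an assumption added here only so that rows with equal col1 values keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["g", "t", "n", "w", "n", "g"],
        "col2": [5, 2, 5, 1, 3, 6],
        "col3": [0, 7, 2, 8, 1, 2],
    }
)

# Sort by a single column; mergesort is stable, so ties keep their original order.
by_col1 = df.sort_values(by="col1", kind="mergesort")

# Sort by col1 first, then break ties using col2.
by_col1_col2 = df.sort_values(by=["col1", "col2"])

print(by_col1.index.tolist())       # row labels after sorting by col1
print(by_col1_col2.index.tolist())  # row labels after sorting by col1, then col2
print(df.index.tolist())            # the original DataFrame is not modified
```

Note that by accepts either a single column name or a list of names.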
DataFrame Sorting Order - Argument ascending
By default, sort_values sorts in ascending order. To sort the DataFrame in descending order, set ascending=False.
print(df.sort_values(by=["col1", "col2"], ascending=False))
Output:
col1 col2 col3
3 w 1 8
1 t 2 7
2 n 5 2
4 n 3 1
5 g 6 2
0 g 5 0
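The ascending argument also accepts a list of booleans, one per column named in by, so each column can have its own direction. The following is a small sketch under that assumption, sorting col1 in ascending order and col2 in descending order; the variable name mixed is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["g", "t", "n", "w", "n", "g"],
        "col2": [5, 2, 5, 1, 3, 6],
        "col3": [0, 7, 2, 8, 1, 2],
    }
)

# One boolean per entry in `by`: col1 ascending, col2 descending.
mixed = df.sort_values(by=["col1", "col2"], ascending=[True, False])
print(mixed)
```

Within each group of equal col1 values, the row with the larger col2 value now comes first.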
DataFrame Sorting Order - Argument na_position
na_position specifies where NaN values are placed after sorting. Its default value is last, which puts NaN at the end of the sorted result; first puts NaN at the beginning.
Consider the following Series:
import numpy as np
import pandas as pd
s = pd.Series([np.nan, 2, 4, 10, 7])
print(s.sort_values(na_position="last"))
After running the code, we will get the following output.
1 2.0
2 4.0
4 7.0
3 10.0
0 NaN
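For comparison, here is a short sketch on the same Series showing the default behavior, na_position="first", and a descending sort. The variable names are illustrative. It suggests that the NaN position is controlled by na_position alone, independently of ascending.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 2, 4, 10, 7])

# Default: na_position="last", so NaN ends up at the end.
default_order = s.sort_values()

# Explicitly move NaN to the beginning.
nan_first = s.sort_values(na_position="first")

# Descending sort: NaN still goes last unless na_position says otherwise.
descending = s.sort_values(ascending=False)

print(default_order.index.tolist())
print(nan_first.index.tolist())
print(descending.index.tolist())
```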