How to Apply a Function to Multiple Columns in Pandas DataFrame
This article introduces how to apply a function to multiple columns in a Pandas DataFrame. We will use the DataFrame below in all the code examples.
import pandas as pd
import numpy as np
df = pd.DataFrame(
    [[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)
print(df)
Output:
   a  b   c   d
0  5  6   7   8
1  1  9  12  14
2  4  8  10   6
Use apply() to Apply Functions to Columns in Pandas
The apply() method applies a function along an axis of a DataFrame. The axis parameter controls what the function receives: with axis=0 (the default), each column is passed to the function; with axis=1, each row is passed.
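To see what the function receives under each axis setting, here is a minimal sketch using the same sample DataFrame. The helper value_range is a hypothetical name introduced only for illustration; it returns the spread between the largest and smallest value of whatever Series it is given.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)


def value_range(series):
    # Hypothetical helper: largest value minus smallest value of one Series
    return series.max() - series.min()


# axis=0: the function is called once per column, result is indexed by column name
per_column = df.apply(value_range, axis=0)

# axis=1: the function is called once per row, result is indexed by row label
per_row = df.apply(value_range, axis=1)

print(per_column)
print(per_row)
```

The index of each result shows which way the function was applied: column names for axis=0 and row labels for axis=1.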
In the example below, we define a function that adds 1 to its input and use it to increment every value in the sample DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame(
[[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)
def x(a):
    # a is one row of the DataFrame (a Series); adding 1 increments every value in it
    return a + 1
df_new = df.apply(x, axis=1)
print("The original dataframe:")
print(df)
print("The new dataframe:")
print(df_new)
Output:
The original dataframe:
   a  b   c   d
0  5  6   7   8
1  1  9  12  14
2  4  8  10   6
The new dataframe:
   a   b   c   d
0  6   7   8   9
1  2  10  13  15
2  5   9  11   7
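When only some columns should change, one option is to select those columns first and assign the result back. The sketch below assumes we only want to increment columns a and b; the names add_one, cols, and df_subset are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)


def add_one(values):
    # Receives one column at a time (default axis=0) and increments it
    return values + 1


# Work on a copy so the original DataFrame stays unchanged
df_subset = df.copy()
cols = ["a", "b"]  # assumed choice of columns for this sketch

# Apply the function only to the selected columns and write the result back
df_subset[cols] = df_subset[cols].apply(add_one)
print(df_subset)
```

Columns c and d are left as they were, because apply() only ever sees the selected columns.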
We can also apply a function to multiple columns, as shown below:
import pandas as pd
import numpy as np
df = pd.DataFrame(
[[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)
print("The original dataframe:")
print(df)
def func(x):
    # x is one row (a Series); .iloc gives positional access to its first two values
    return x.iloc[0] + x.iloc[1]
df["e"] = df.apply(func, axis=1)
print("The new dataframe:")
print(df)
Output:
The original dataframe:
   a  b   c   d
0  5  6   7   8
1  1  9  12  14
2  4  8  10   6
The new dataframe:
   a  b   c   d   e
0  5  6   7   8  11
1  1  9  12  14  10
2  4  8  10   6  12
The newly appended e column is the sum of the values in columns a and b. With axis=1, each row is passed to the function as a Series. Its values can be accessed by position, as in the example above, or by column name, as shown below.
import pandas as pd
import numpy as np
df = pd.DataFrame(
[[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)
print("The original dataframe:")
print(df)
df["e"] = df.apply(lambda x: x.a + x.b, axis=1)
print("The new dataframe:")
print(df)
This performs the same operation as the previous example, this time with a lambda function. x.a and x.b refer to the values in columns a and b of the row being processed.
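For a simple sum like this one, the same result can likely be obtained without apply() by adding the two columns directly. The sketch below compares both approaches on the sample DataFrame; the variable names via_apply and via_columns are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 6, 7, 8], [1, 9, 12, 14], [4, 8, 10, 6]], columns=["a", "b", "c", "d"]
)

# Row-wise apply: the lambda is called once per row, with label-based access
via_apply = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Column arithmetic: pandas adds the two columns element by element
via_columns = df["a"] + df["b"]

print(via_apply.tolist())
print(via_columns.tolist())
```

Row-wise apply() remains useful when the per-row logic is hard to express as column arithmetic, for example when it involves branching on several values.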
Manav is an IT professional with extensive experience as a core developer on many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.