Pandas DataFrame.apply() Function
The pandas.DataFrame.apply() method applies a function along an axis of the calling Pandas DataFrame: to each column by default, or to each row when axis=1.
Syntax of pandas.DataFrame.apply():
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)
Parameters
func | The function to apply to each column or row.
axis | The axis along which the function is applied: 0 or 'index' passes each column to func (default); 1 or 'columns' passes each row.
raw | Boolean. If False (default), each row or column is passed as a Series object; if True, it is passed as an ndarray object.
result_type | One of 'expand', 'reduce', 'broadcast', or None (default). Controls the type of the output; it only takes effect when axis=1 (columns). New in version 0.23.0.
args | Positional arguments passed to func in addition to the row or column.
**kwds | Keyword arguments passed to func.
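As a minimal sketch of how args, **kwds and raw fit together, the snippet below forwards extra arguments to a helper function. The helper scale_and_shift and its parameters factor and offset are hypothetical names introduced only for illustration; they are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})


# Hypothetical helper (not part of pandas): scales a column, then adds an offset.
def scale_and_shift(column, factor, offset=0):
    return column * factor + offset


# args supplies factor positionally; offset is forwarded through **kwds.
scaled = df.apply(scale_and_shift, args=(10,), offset=1)
print(scaled)

# With raw=True each column arrives as a NumPy ndarray instead of a Series.
spread = df.apply(lambda values: values.max() - values.min(), raw=True)
print(spread)
```

Here scaled holds every value multiplied by 10 plus 1, and spread holds the difference between the largest and smallest value of each column.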
Return
It returns a Series or a DataFrame containing the result of applying the input function along the specified axis.
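The type of the returned object depends on what the applied function returns. The short sketch below, using a small DataFrame made up for illustration, contrasts a function that reduces each column to a single value with one that returns a column of the same length.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})

# A function that reduces each column to one value yields a Series.
reduced = df.apply(lambda col: col.max())
print(type(reduced).__name__)

# A function that returns a column of the same length yields a DataFrame.
transformed = df.apply(lambda col: col + 1)
print(type(transformed).__name__)
```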
Example Codes: DataFrame.apply() Method
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print(df)
modified_df=df.apply(lambda x: x**2)
print(modified_df)
Output:
X Y
0 1 4
1 2 1
2 3 8
X Y
0 1 16
1 4 1
2 9 64
We use the DataFrame.apply() method to apply the lambda function lambda x: x**2 to the DataFrame. The function receives each column as a Series, so x**2 squares every element of that column.
Lambda functions are a compact way to define small anonymous functions in Python. lambda x: x**2 represents a function that takes x as input and returns x**2 as output.
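To make the role of the lambda concrete, the sketch below rewrites it as a named function and checks that both forms give the same result. The function name square is a hypothetical name chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})


# Hypothetical named equivalent of lambda x: x**2.
def square(column):
    # column is a pandas Series holding one column of df
    return column ** 2


by_lambda = df.apply(lambda x: x ** 2)
by_named = df.apply(square)

print(by_named.equals(by_lambda))  # True
print(by_named.equals(df ** 2))    # True: same as squaring the DataFrame directly
```

For a simple element-wise operation like this one, writing df ** 2 directly is usually shorter; apply() becomes useful when the function needs the whole column or row.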
Example Codes: Apply Function to Each Column With DataFrame.apply()
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)
modified_df=df.apply(np.sum)
print("Modified DataFrame")
print(modified_df)
Output:
Original DataFrame
X Y
0 1 4
1 2 1
2 3 8
Modified DataFrame
X 6
Y 13
dtype: int64
Here, np.sum is applied to each column because axis=0 (the default value) in this case. So, we get the sum of the elements in each column after using the df.apply() method.
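The axis can also be given by name: axis='index' means the same as axis=0. The sketch below compares both spellings and, as a cross-check, the built-in DataFrame.sum() method, which computes the same column totals.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})

by_position = df.apply(np.sum, axis=0)    # axis given as a number
by_name = df.apply(np.sum, axis='index')  # same axis given by name
builtin = df.sum()                        # built-in column sum for comparison

print(by_position)
print(by_name.equals(by_position))
```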
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print(df)
modified_df=df.apply(lambda x: (x**2) if x.name == 'X' else x)
print(modified_df)
Output:
X Y
0 1 4
1 2 1
2 3 8
X Y
0 1 4
1 4 1
2 9 8
If we wish to apply the function only to certain columns, we add an if condition to the function to filter the columns. In the example, the function modifies only the values of the column named X; x.name holds the label of the column being processed.
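An alternative to filtering inside the function is to select the columns first and call apply() only on that selection. This is a sketch of one possible approach, not the method used in the example above; the variable name cols is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})

cols = ['X']  # hypothetical list of columns to transform

# Work on a copy so the original DataFrame stays unchanged.
modified_df = df.copy()
modified_df[cols] = df[cols].apply(lambda col: col ** 2)
print(modified_df)
```

Only the selected columns are passed to the function, so the lambda does not need a condition on the column name.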
Example Codes: Apply Function to Each Row With DataFrame.apply() Method
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)
modified_df=df.apply(np.sum, axis=1)
print("Modified DataFrame")
print(modified_df)
Output:
Original DataFrame
X Y
0 1 4
1 2 1
2 3 8
Modified DataFrame
0 5
1 3
2 11
dtype: int64
Here, np.sum is applied to one row at a time because we have set axis=1 in this case. So, we get the sum of the elements in each row after using the df.apply() method.
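With axis=1, each row reaches the function as a Series indexed by the column names, so individual values can be read by label. The sketch below adds the two columns explicitly and compares the result with the built-in df.sum(axis=1).

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})

# Each row is a Series indexed by column name, so row['X'] reads one cell.
by_label = df.apply(lambda row: row['X'] + row['Y'], axis=1)
print(by_label)

# The built-in row sum gives the same totals.
builtin = df.sum(axis=1)
print(builtin.tolist() == by_label.tolist())
```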
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print(df)
modified_df=df.apply(lambda x: (x**2) if x.name in [0,1] else x,
axis=1)
print(modified_df)
Output:
X Y
0 1 4
1 2 1
2 3 8
X Y
0 1 16
1 4 1
2 3 8
If we wish to apply the function only to certain rows, we add an if condition to the function to filter the rows. In the example, the function modifies only the values of the rows with index 0 and 1, i.e. the first and second rows.
Example Codes: DataFrame.apply() Method With result_type Parameter
If we use the default value of the result_type parameter, i.e. None, the type of the result depends on what the applied function returns: list-like return values are kept as they are, one per row, in a Series.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)
modified_df=df.apply(lambda x:[1,1],axis=1)
print("Modified DataFrame")
print(modified_df)
Output:
Original DataFrame
X Y
0 1 4
1 2 1
2 3 8
Modified DataFrame
0 [1, 1]
1 [1, 1]
2 [1, 1]
dtype: object
In the above example, each row is passed to the function one at a time, and the function returns [1,1] for every row, so the result is a Series whose values are the lists [1, 1].
If we wish to change the type of the result after the function operates on the DataFrame, we can set result_type according to our needs.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)
modified_df=df.apply(lambda x:[1,1],
axis=1,
result_type='expand')
print("Modified DataFrame")
print(modified_df)
Output:
Original DataFrame
X Y
0 1 4
1 2 1
2 3 8
Modified DataFrame
0 1
0 1 1
1 1 1
2 1 1
Setting result_type='expand' expands each list-like return value into the columns of a DataFrame.
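The remaining options can be sketched in the same way. The example below uses result_type='broadcast', which keeps the original index and column labels, and also shows that returning a Series from the function lets us name the resulting columns. The labels foo and bar are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 1, 8]})

# 'broadcast' writes the returned values into the original shape and labels.
broadcast = df.apply(lambda row: [1, 2], axis=1, result_type='broadcast')
print(broadcast)

# Returning a Series expands to columns named after its index (labels are made up).
named = df.apply(lambda row: pd.Series([1, 2], index=['foo', 'bar']), axis=1)
print(named)
```

With 'broadcast', the returned list must match the number of columns in the original DataFrame, since the values are placed back into the same shape.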
Suraj Joshi is a backend software engineer at Matrice.ai.