How to Iterate Through Rows of a DataFrame in Pandas
- index Attribute to Iterate Through Rows in Pandas DataFrame
- loc[] Method to Iterate Through Rows of DataFrame in Python
- iloc[] Method to Iterate Through Rows of DataFrame in Python
- pandas.DataFrame.iterrows() to Iterate Over Rows Pandas
- pandas.DataFrame.itertuples to Iterate Over Rows Pandas
- pandas.DataFrame.apply to Iterate Over Rows Pandas
We can loop through the rows of a Pandas DataFrame using the index attribute of the DataFrame. We can also iterate through the rows using the loc[] and iloc[] indexers and the iterrows(), itertuples(), and apply() methods of DataFrame objects.
We will use the below dataframe as an example in the following sections.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
print(df)
Output:
       Date  Income_1  Income_2
0  April-10        10        20
1  April-11        20        30
2  April-12        10        10
3  April-13        15         5
4  April-14        10        40
5  April-16        12        13
index Attribute to Iterate Through Rows in Pandas DataFrame
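The index attribute of a Pandas DataFrame holds the row labels, from the top row to the bottom row. For a DataFrame created without an explicit index, as in our example, it is a RangeIndex that counts up from 0. We can loop over these labels to iterate over rows in Pandas.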
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
# Loop over the row labels and look up each column value by label.
for i in df.index:
    print(
        "Total income in "
        + df["Date"][i]
        + " is:"
        + str(df["Income_1"][i] + df["Income_2"][i])
    )
Output:
Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
It adds Income_1 and Income_2 for each row and prints the total income.
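To see what the loop above actually iterates over, we can inspect df.index directly. This is a minimal sketch on the same example DataFrame; the variable name labels is introduced here only for illustration.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

# With no explicit index, pandas assigns a RangeIndex starting at 0.
print(df.index)  # RangeIndex(start=0, stop=6, step=1)

# Iterating the index yields the row labels one by one.
labels = list(df.index)  # "labels" is an illustrative name
print(labels)  # [0, 1, 2, 3, 4, 5]
```

If the DataFrame had been built with a different index, the loop would yield those labels instead of 0 to 5.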
loc[] Method to Iterate Through Rows of DataFrame in Python
loc[] selects rows and columns by label, so it can be used to access one row at a time. When we use loc[] inside a loop over the row labels, we can iterate through the rows of the DataFrame.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
# loc[row_label, column_name] returns a single value.
for i in range(len(df)):
    print(
        "Total income in "
        + df.loc[i, "Date"]
        + " is:"
        + str(df.loc[i, "Income_1"] + df.loc[i, "Income_2"])
    )
Output:
Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
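Here, range(len(df)) generates the integers from 0 to len(df) - 1. They match the row labels only because this DataFrame uses the default integer index. For a DataFrame with a custom index, loop over df.index instead.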
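Because loc[] works with labels rather than positions, the range(len(df)) pattern breaks once the index is no longer 0, 1, 2, and so on. The sketch below assumes a hypothetical DataFrame with the custom integer labels 10, 20, and 30; the names label_zero_found and lines are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date": ["April-10", "April-11", "April-12"],
        "Income_1": [10, 20, 10],
        "Income_2": [20, 30, 10],
    },
    index=[10, 20, 30],  # assumption: custom row labels instead of 0, 1, 2
)

# 0 is a position here, not a label, so the label lookup fails.
try:
    df.loc[0, "Date"]
    label_zero_found = True
except KeyError:
    label_zero_found = False

# Looping over df.index always yields labels that loc[] accepts.
lines = []
for label in df.index:
    total = df.loc[label, "Income_1"] + df.loc[label, "Income_2"]
    lines.append("Total income in " + df.loc[label, "Date"] + " is:" + str(total))

print(label_zero_found)
print(lines)
```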
iloc[] Method to Iterate Through Rows of DataFrame in Python
The Pandas DataFrame iloc attribute is very similar to the loc attribute. The only difference is that with loc we specify the label (name) of the row or column to be accessed, while with iloc we specify its integer position.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
# iloc[row_position, column_position] returns a single value.
for i in range(len(df)):
    print(
        "Total income in " + df.iloc[i, 0] + " is:" + str(df.iloc[i, 1] + df.iloc[i, 2])
    )
Output:
Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
Here, index 0 represents the first column of the DataFrame, i.e. Date, index 1 represents the Income_1 column, and index 2 represents the Income_2 column.
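Hard-coded column positions such as 0, 1, and 2 stop being correct if the column order changes. One possible way to avoid this, sketched below, is to look the positions up by name with df.columns.get_loc() before the loop. The variable names date_pos, inc1_pos, inc2_pos, and lines are illustrative only.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

# Translate column names into integer positions once, outside the loop.
date_pos = df.columns.get_loc("Date")
inc1_pos = df.columns.get_loc("Income_1")
inc2_pos = df.columns.get_loc("Income_2")

lines = []
for i in range(len(df)):
    total = df.iloc[i, inc1_pos] + df.iloc[i, inc2_pos]
    lines.append("Total income in " + df.iloc[i, date_pos] + " is:" + str(total))

print(lines)
```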
pandas.DataFrame.iterrows() to Iterate Over Rows Pandas
pandas.DataFrame.iterrows() yields, for each row, a pair made of the row's index label and the row's data as a Series. Hence, we can use this method to iterate over rows in a Pandas DataFrame.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
# Each iteration yields the row label and the row as a Series.
for index, row in df.iterrows():
    print(
        "Total income in "
        + row["Date"]
        + " is:"
        + str(row["Income_1"] + row["Income_2"])
    )
Output:
Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
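Because iterrows() packs each row into a single Series, the values of a row share one dtype, so column dtypes are not always preserved. The sketch below uses a small hypothetical DataFrame named mixed (not the example above) with one integer column and one float column to illustrate the effect.

```python
import pandas as pd

# Hypothetical data: one integer column and one float column.
mixed = pd.DataFrame({"count": [1, 2], "ratio": [0.5, 1.5]})

# Take the first (index, Series) pair produced by iterrows().
_, first_row = next(mixed.iterrows())

print(mixed["count"].dtype)  # int64: the column itself holds integers
print(first_row.dtype)       # float64: the row Series is upcast to fit both values
print(first_row["count"])    # 1.0
```

When the original dtypes matter, itertuples() is usually the safer choice, since it does not build a Series per row.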
pandas.DataFrame.itertuples to Iterate Over Rows Pandas
pandas.DataFrame.itertuples returns an iterator that yields a named tuple for each row, with the index as the first field and the column values as the remaining fields. Hence, we can also use this method to iterate over rows in a Pandas DataFrame.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
# Column values are available as attributes of each named tuple.
for row in df.itertuples():
    print("Total income in " + row.Date + " is:" + str(row.Income_1 + row.Income_2))
Output:
Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
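itertuples() also accepts the index and name parameters. As a minimal sketch, passing index=False drops the index field and name=None makes the method yield plain tuples, which can be unpacked directly in the for statement. The names date, inc1, inc2, and lines are illustrative only.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

lines = []
# index=False omits the row label; name=None yields regular tuples
# in column order: (Date, Income_1, Income_2).
for date, inc1, inc2 in df.itertuples(index=False, name=None):
    lines.append("Total income in " + date + " is:" + str(inc1 + inc2))

print(lines)
```

Unpacking depends on the column order, so this form suits DataFrames whose layout is known and stable.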
pandas.DataFrame.apply to Iterate Over Rows Pandas
pandas.DataFrame.apply applies the given function along the given axis of the DataFrame and returns the result as a Series or a DataFrame, depending on what the function returns.
Syntax:
DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)
Here, func represents the function to be applied and axis represents the axis along which the function is applied. We can use axis=1 or axis='columns' to apply the function to each row.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})
print(
    df.apply(
        lambda row: "Total income in "
        + row["Date"]
        + " is:"
        + str(row["Income_1"] + row["Income_2"]),
        axis=1,
    )
)
Output:
0    Total income in April-10 is:30
1    Total income in April-11 is:50
2    Total income in April-12 is:20
3    Total income in April-13 is:20
4    Total income in April-14 is:50
5    Total income in April-16 is:25
dtype: object
Here, the lambda keyword is used to define an inline function that is applied to each row.
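For a simple calculation like this one, the same totals can also be computed without visiting rows one by one. The sketch below is not row iteration but column-wise (vectorized) arithmetic, shown only for comparison. The column name Total and the variable messages are illustrative assumptions.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]
df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

# Adding two columns adds them element by element, row for row.
df["Total"] = df["Income_1"] + df["Income_2"]  # "Total" is an illustrative column name

# String concatenation also works column-wise once the numbers are cast to str.
messages = "Total income in " + df["Date"] + " is:" + df["Total"].astype(str)

for message in messages:
    print(message)
```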
Suraj Joshi is a backend software engineer at Matrice.ai.