Pandas iterrows() - Iterate over DataFrame Rows - Examples
Pandas - Iterate over Rows - iterrows()
To iterate over the rows of a Pandas DataFrame, use the DataFrame.iterrows() method, which returns an iterator yielding the index and row data for each row.
In this tutorial, we will go through examples demonstrating how to iterate over rows of a DataFrame using iterrows().
Syntax of iterrows()
The syntax of iterrows() is
DataFrame.iterrows()
iterrows() takes no arguments. It returns a generator that iterates over the rows of the DataFrame and yields an (index, data) pair for each row:
- index - the index of the row in the DataFrame. This is a label for a single index, or a tuple of labels for a multi-index.
- data - the row data as a Pandas Series.
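As a minimal sketch of the description above, the generator can be advanced by hand with Python's built-in next() to pull out a single (index, data) pair. The two-row DataFrame and the variable names rows, index and data are chosen only for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (values are not factual)
df = pd.DataFrame({'name': ['apple', 'banana'], 'calories': [68, 74]})

rows = df.iterrows()        # the generator returned by iterrows()
index, data = next(rows)    # first (index, data) pair

print(index)                # 0
print(data['name'])         # apple
print(data['calories'])     # 68
```

In practice a for loop does this unpacking on every iteration, as the examples below show.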
Examples
1. Pandas iterrows() - Iterate over rows of DataFrame
In this example, we will initialize a DataFrame with four rows and iterate through them using a Python for loop and the iterrows() method.
Python Program
import pandas as pd

# create a DataFrame with four rows
df_marks = pd.DataFrame({
    'name': ['apple', 'banana', 'orange', 'mango'],
    'calories': [68, 74, 77, 78]})

# iterate through each row of the DataFrame
for index, row in df_marks.iterrows():
    print(index, ':', row['name'], 'has', row['calories'], 'calories.')
During each iteration, we can access both the index of the row and the contents of the row.
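The following sketch looks a little closer at what each row carries, assuming the same df_marks DataFrame as in the program above. It relies on two standard Series attributes: row.name, which holds the row's index label, and row.index, which holds the DataFrame's column names. The list seen is a hypothetical helper used only to collect the results.

```python
import pandas as pd

df_marks = pd.DataFrame({
    'name': ['apple', 'banana', 'orange', 'mango'],
    'calories': [68, 74, 77, 78]})

seen = []  # hypothetical helper: collects (label, column names) per row
for index, row in df_marks.iterrows():
    # row.name is the same label that iterrows() yields as index;
    # row.index lists the column names, which act as keys into the row
    seen.append((row.name, list(row.index)))
    print(row.name == index, list(row.index))
```

This is why row['name'] and row['calories'] work inside the loop: the column names are the keys of the row Series.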
Explanation
- The program imports the pandas library, which is essential for creating and manipulating tabular data.
- A DataFrame named df_marks is created using the pd.DataFrame() function. It contains two columns: 'name', listing the names of fruits, and 'calories', listing their respective calorie values.
- The iterrows() method is used to iterate through each row of the DataFrame. This method generates an iterator that yields an index (representing the row index) and a row (a Series object containing the data of that row).
- Inside the loop, each row's 'name' and 'calories' values are accessed using the column names as keys.
- A descriptive sentence is printed for each row, showing the index, the fruit's name, and its corresponding calorie value.
Output
0 : apple has 68 calories.
1 : banana has 74 calories.
2 : orange has 77 calories.
3 : mango has 78 calories.
Please note that the calories information is not factual. The example is for demonstrating the usage of iterrows().
2. iterrows() yields index, Series
In the previous example, we have seen that we can access index and row data.
In this example, we will investigate the types of the index and the row data that iterrows() yields during iteration.
Python Program
import pandas as pd

# create a DataFrame with four rows
df_marks = pd.DataFrame({
    'name': ['apple', 'banana', 'orange', 'mango'],
    'calories': [68, 74, 77, 78]})

# print the type of the index and of the row data for each row
for index, row in df_marks.iterrows():
    print(type(index), type(row))
Explanation
- The program imports the pandas library, which is used for data analysis and manipulation.
- A DataFrame named df_marks is created using the pd.DataFrame() function. It contains two columns: 'name', which lists fruit names, and 'calories', which lists the corresponding calorie values.
- The iterrows() method is used to iterate over each row in the DataFrame. This method generates an iterator that yields a tuple for each row. The tuple contains two elements:
  - index: The index of the current row.
  - row: A Series object representing the data of the current row.
- The program iterates through the rows of the DataFrame using a for loop, printing the data types of the index and row for each iteration.
- The output shows that index is of type int and row is of type pandas.Series.
Output
<class 'int'> <class 'pandas.core.series.Series'>
<class 'int'> <class 'pandas.core.series.Series'>
<class 'int'> <class 'pandas.core.series.Series'>
<class 'int'> <class 'pandas.core.series.Series'>
We did not provide an index for the DataFrame, so it uses the default index: integers starting at zero and incrementing by one. That is why iterrows() returned each index as an integer.
iterrows() returns the row data as Pandas Series.
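The index type follows whatever index the DataFrame has. As a rough sketch, the program below builds two variants of the fruit data: one with string labels and one with a two-level multi-index. The labels 'a' and 'b', the 'fruit' level, and the variable names are assumptions made up for this illustration.

```python
import pandas as pd

data = {'name': ['apple', 'banana'], 'calories': [68, 74]}

# String labels (hypothetical): iterrows() yields each label as a str
df_labeled = pd.DataFrame(data, index=['a', 'b'])
labeled_types = [type(index) for index, row in df_labeled.iterrows()]
print(labeled_types)    # [<class 'str'>, <class 'str'>]

# Two-level multi-index (hypothetical): each index is a tuple of labels
multi = pd.MultiIndex.from_tuples([('fruit', 'apple'), ('fruit', 'banana')])
df_multi = pd.DataFrame(data, index=multi)
multi_indexes = [index for index, row in df_multi.iterrows()]
print(multi_indexes)    # [('fruit', 'apple'), ('fruit', 'banana')]
```

In both cases the row data is still a Pandas Series; only the type of the index changes.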
Summary
In this Pandas Tutorial, we used DataFrame.iterrows() to iterate over the rows of Pandas DataFrame, with the help of detailed example programs.