Pandas iterrows() - Iterate over DataFrame Rows - Examples


Pandas - Iterate over Rows - iterrows()

To iterate over the rows of a Pandas DataFrame, use the DataFrame.iterrows() method, which returns an iterator that yields the index and the row data for each row.

https://youtu.be/Z8myvjgSSJs

In this tutorial, we will go through examples demonstrating how to iterate over rows of a DataFrame using iterrows().


Syntax of iterrows()

The syntax of iterrows() is

DataFrame.iterrows()

iterrows yields

  • index - the index of the row in the DataFrame. This is a single label for a single-level index, or a tuple of labels for a MultiIndex.
  • data - the row data as a Pandas Series.
  • it - the generator returned by iterrows(), which produces one (index, data) pair per row of the DataFrame.
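The following is a minimal sketch of what a single (index, data) pair looks like when the DataFrame has a MultiIndex. The sample data and the names df, rows, index and data are hypothetical and are not part of the examples in this tutorial.

```python
import pandas as pd

# Hypothetical sample data with a two-level MultiIndex.
df = pd.DataFrame(
    {'calories': [68, 74, 77]},
    index=pd.MultiIndex.from_tuples(
        [('fruit', 'apple'), ('fruit', 'banana'), ('citrus', 'orange')]
    ),
)

# iterrows() returns a generator; next() pulls the first (index, data) pair.
rows = df.iterrows()
index, data = next(rows)

print(index)              # the row label; a tuple because df has a MultiIndex
print(type(data))         # the row data is a pandas Series
print(data['calories'])   # values are looked up by column name
```

Calling next() again on rows would return the pair for the second row, which is what a for loop does implicitly.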

Examples

1. Pandas iterrows() - Iterate over rows of DataFrame

In this example, we will initialize a DataFrame with four rows and iterate through them using a Python for loop and the iterrows() method.

Python Program

import pandas as pd

# create the DataFrame
df_marks = pd.DataFrame({
    'name': ['apple', 'banana', 'orange', 'mango'],
    'calories': [68, 74, 77, 78]})

# iterate through each row of the DataFrame
for index, row in df_marks.iterrows():
    print(index, ': ', row['name'], 'has', row['calories'], 'calories.')

During each iteration, we can access both the index of the row and the contents of the row.

Explanation

  1. The program imports the pandas library, which is essential for creating and manipulating tabular data.
  2. A DataFrame named df_marks is created using the pd.DataFrame() function. It contains two columns: 'name', listing the names of fruits, and 'calories', listing their respective calorie values.
  3. The iterrows() method is used to iterate through each row of the DataFrame. This method generates an iterator that yields an index (representing the row index) and a row (a Series object containing the data of that row).
  4. Inside the loop, each row's 'name' and 'calories' values are accessed using the column names as keys.
  5. print() writes one line for each row, showing the index, the fruit's name, and its corresponding calorie value in a descriptive sentence. Because print() separates its arguments with a space and the second argument ': ' already ends with a space, two spaces appear after the colon.
  6. The resulting lines are listed in the Output section that follows.

Output

0 :  apple has 68 calories.
1 :  banana has 74 calories.
2 :  orange has 77 calories.
3 :  mango has 78 calories.

Please note that the calories information is not factual. The example is for demonstrating the usage of iterrows().


2. iterrows() yields index, Series

In the previous example, we have seen that we can access index and row data.

In this example, we will investigate the type of row data that iterrows() returns during iteration.

Python Program

import pandas as pd

# create the DataFrame
df_marks = pd.DataFrame({
    'name': ['apple', 'banana', 'orange', 'mango'],
    'calories': [68, 74, 77, 78]})

# iterate through each row of the DataFrame
for index, row in df_marks.iterrows():
    print(type(index), type(row))

Explanation

  1. The program imports the pandas library, which is used for data analysis and manipulation.
  2. A DataFrame named df_marks is created using the pd.DataFrame() function. It contains two columns: 'name', which lists fruit names, and 'calories', which lists the corresponding calorie values.
  3. The iterrows() method is used to iterate over each row in the DataFrame. This method generates an iterator that yields a tuple for each row. The tuple contains two elements:
    • index: The index of the current row.
    • row: A Series object representing the data of the current row.
  4. The program iterates through the rows of the DataFrame using a for loop, printing the data types of the index and row for each iteration.
  5. The output shows that index is of type int and row is of type pandas.Series.

Output

<class 'int'> <class 'pandas.core.series.Series'>
<class 'int'> <class 'pandas.core.series.Series'>
<class 'int'> <class 'pandas.core.series.Series'>
<class 'int'> <class 'pandas.core.series.Series'>

We did not provide an index for the DataFrame, so it uses the default index: integers starting at zero and incrementing by one. That is why iterrows() returned each index as an integer.
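For contrast, here is a small sketch of the same loop on a DataFrame that is given explicit row labels. The string labels 'a' and 'b' and the name index_types are hypothetical, chosen only to illustrate that the type of index follows the labels of the DataFrame.

```python
import pandas as pd

# Same columns as in the example, but with hypothetical string row labels.
df_marks = pd.DataFrame(
    {'name': ['apple', 'banana'], 'calories': [68, 74]},
    index=['a', 'b'],
)

# Collect the type of the index yielded for each row.
index_types = [type(index) for index, row in df_marks.iterrows()]
print(index_types)
```

With string labels, each index yielded by iterrows() is a str rather than an int.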

iterrows() returns the row data as Pandas Series.
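One consequence of receiving each row as a single Series is that the original column dtypes are not necessarily preserved across a row, since a Series holds one dtype. The sketch below, which uses a hypothetical one-row DataFrame with columns named 'int' and 'float', illustrates this under that assumption.

```python
import pandas as pd

# Hypothetical DataFrame with one integer column and one float column.
df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])

# Take the Series for the first row from the (index, Series) pair.
index, row = next(df.iterrows())

print(df['int'].dtype)    # dtype of the column in the DataFrame
print(row['int'].dtype)   # dtype of the same value inside the row Series
```

Here the integer value is upcast to a float so that both values fit in one Series. If the column dtypes matter, it may be safer to read the values from the DataFrame columns rather than from the row Series.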


Summary

In this Pandas tutorial, we used DataFrame.iterrows() to iterate over the rows of a Pandas DataFrame, and examined the index and Series that it yields for each row.



