Pandas DataFrame.iterrows



The DataFrame.iterrows method in pandas iterates over the rows of a DataFrame, yielding one (index, Series) pair per row. It is a convenient way to perform row-wise operations when the DataFrame is small.


Syntax

The syntax for DataFrame.iterrows is:

DataFrame.iterrows()

Here, DataFrame refers to the pandas DataFrame over which the iteration is performed.


Returns

An iterator that yields tuples of the form (index, Series), where:

  • index: The index of the row.
  • Series: A pandas Series containing the data for the row.
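The shape of each yielded pair can be checked directly. The following is a minimal sketch using a small, made-up numeric DataFrame (the column names and values are only for illustration). It also shows a side effect of packing a whole row into a single Series: when the columns have different numeric types, the row's values are converted to one common type.

```python
import pandas as pd

# Numeric-only frame: one integer column, one float column (illustrative data)
df = pd.DataFrame({'Age': [25, 30], 'Salary': [70000.5, 80000.0]})

# Take the first (index, Series) pair from the iterator
index, row = next(df.iterrows())

print(index)                 # the row label
print(type(row).__name__)    # Series
print(list(row.index))       # column names become the Series index
print(df['Age'].dtype)       # integer dtype in the DataFrame
print(row.dtype)             # float64: Age was upcast to fit in one Series
```

If column types matter for later processing, it is safer to read values from the DataFrame columns than from the row Series.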

Examples

Iterating Over Rows in a DataFrame

Use iterrows to iterate through each row of a DataFrame and access its data.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Iterate over rows
print("Iterating over rows:")
for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}, Salary: {row['Salary']}")

Output

Iterating over rows:
Index: 0, Name: Arjun, Age: 25, Salary: 70000.5
Index: 1, Name: Ram, Age: 30, Salary: 80000.0
Index: 2, Name: Priya, Age: 35, Salary: 90000.0

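Processing Rows Using iterrows

You can use iterrows to read each row and build derived values from it, such as a formatted message. Note that the row object may be a copy rather than a view, so assigning to it should not be relied on to change the original DataFrame.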

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Print a custom message for each row
print("Adding custom messages dynamically:")
for index, row in df.iterrows():
    print(f"{row['Name']} earns {row['Salary']} at the age of {row['Age']}.")

Output

Adding custom messages dynamically:
Arjun earns 70000.5 at the age of 25.
Ram earns 80000.0 at the age of 30.
Priya earns 90000.0 at the age of 35.
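The point that changes to the row object do not reach the original DataFrame can be seen in a short experiment. This is a sketch under stated assumptions: the new_ages list and the ages_after_loop variable are hypothetical names introduced here, and the DataFrame mixes text and numeric columns, a case in which each row is a copy.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arjun', 'Ram', 'Priya'], 'Age': [25, 30, 35]})

new_ages = []  # hypothetical helper list for the updated values
for index, row in df.iterrows():
    row['Age'] = row['Age'] + 1   # changes only the temporary row Series
    new_ages.append(row['Age'])   # keep the value to write back later

ages_after_loop = df['Age'].tolist()
print(ages_after_loop)            # [25, 30, 35]: the DataFrame is unchanged

# Write the collected values back in a single assignment after the loop
df['Age'] = new_ages
print(df['Age'].tolist())         # [26, 31, 36]
```

Collecting the results and assigning them after the loop avoids modifying the DataFrame while it is being iterated.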

Using iterrows to Create a New Column

Generate new column values based on row-wise operations using iterrows.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Add a new column based on row data
discounted_salaries = []
for index, row in df.iterrows():
    discounted_salary = row['Salary'] * 0.9  # Apply a 10% discount
    discounted_salaries.append(discounted_salary)

df['Discounted Salary'] = discounted_salaries
print("DataFrame with Discounted Salaries:")
print(df)

Output

DataFrame with Discounted Salaries:
    Name  Age   Salary  Discounted Salary
0  Arjun   25  70000.5           63000.45
1    Ram   30  80000.0           72000.00
2  Priya   35  90000.0           81000.00

Performance Consideration

iterrows is slow on large DataFrames because it constructs a new Series object for every row. For better performance, prefer vectorized operations where possible, or use itertuples when an explicit row loop is still needed.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Use itertuples for better performance
print("Using itertuples for better performance:")
for row in df.itertuples():
    print(f"Name: {row.Name}, Age: {row.Age}, Salary: {row.Salary}")

Output

Using itertuples for better performance:
Name: Arjun, Age: 25, Salary: 70000.5
Name: Ram, Age: 30, Salary: 80000.0
Name: Priya, Age: 35, Salary: 90000.0
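The vectorized alternative mentioned in this section can be sketched in the same way. This minimal example reuses the Salary data and the 10% reduction from the earlier new-column example, and replaces the explicit loop with one column-wide multiplication.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Salary': [70000.5, 80000.0, 90000.0]
})

# One column-wide multiplication replaces the row-by-row loop
df['Discounted Salary'] = df['Salary'] * 0.9
print(df)
```

When a calculation can be written as an expression over whole columns like this, no Python-level loop is needed at all.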

Summary

In this tutorial, we explored the DataFrame.iterrows method in pandas. Key takeaways include:

  • Using iterrows to iterate over rows in a DataFrame as (index, Series) pairs.
  • Processing rows one at a time (changes made to the row object do not carry over to the original DataFrame).
  • Considering performance implications for large DataFrames.

iterrows is useful for row-wise operations, but on larger datasets vectorized operations or itertuples are usually faster.

