Pandas DataFrame.iterrows
The DataFrame.iterrows method in pandas is used to iterate over rows of a DataFrame as (index, Series) pairs. This is a convenient way to perform row-wise operations on a DataFrame.
Syntax
The syntax for DataFrame.iterrows is:
DataFrame.iterrows()
Here, DataFrame refers to the pandas DataFrame over which the iteration is performed.
Returns
An iterator that yields tuples of the form (index, Series), where:
- index: The index of the row.
- Series: A pandas Series containing the data for the row.
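Because each row comes back as a single Series, values from columns with different dtypes are coerced to one common dtype, so column dtypes are not necessarily preserved. The following is a minimal sketch of that effect; the two-column frame and its column names (qty, price) are hypothetical and not part of the examples in this tutorial.

```python
import pandas as pd

# Hypothetical frame: one integer column and one float column
df = pd.DataFrame({'qty': [1, 2], 'price': [1.5, 2.5]})

# Column dtypes as stored in the DataFrame
column_dtypes = df.dtypes.astype(str).to_dict()

# Take the first (index, Series) pair produced by iterrows
index, row = next(df.iterrows())

# The row Series has a single dtype, so the integer value is upcast to float
row_dtype = str(row.dtype)
qty_value = row['qty']

print(column_dtypes)
print(row_dtype)
print(qty_value)
```

If the original dtypes matter for a row-wise computation, itertuples or column-wise access is usually the safer choice.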
Examples
Iterating Over Rows in a DataFrame
Use iterrows to iterate through each row of a DataFrame and access its data.
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Iterate over rows
print("Iterating over rows:")
for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}, Salary: {row['Salary']}")
Output
Iterating over rows:
Index: 0, Name: Arjun, Age: 25, Salary: 70000.5
Index: 1, Name: Ram, Age: 30, Salary: 80000.0
Index: 2, Name: Priya, Age: 35, Salary: 90000.0
Modifying Rows Using iterrows
You can use iterrows to process each row and build derived values from it. However, the row object is generally a copy of the row's data, so assigning to it won't affect the original DataFrame.
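To make this caveat concrete, here is a minimal sketch using the same sample data. It first assigns to the row object and shows that the DataFrame is unchanged, then updates the DataFrame itself through df.at in a separate loop over the index labels. The flat raise of 1000.0 is a hypothetical value chosen for illustration, and the copy behaviour shown is what to expect for a mixed-dtype frame like this one.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
})

# Assigning to the row Series changes only the temporary copy
for index, row in df.iterrows():
    row['Salary'] = 0.0

salaries_after_row_assignment = df['Salary'].tolist()
print(salaries_after_row_assignment)

# Writing through df.at with each index label updates the DataFrame itself
for index in df.index:
    df.at[index, 'Salary'] = df.at[index, 'Salary'] + 1000.0  # hypothetical flat raise

salaries_after_at = df['Salary'].tolist()
print(salaries_after_at)
```

The second loop iterates over df.index rather than iterrows so that the DataFrame is not modified while its rows are being iterated.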
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Add a custom message for each row
print("Adding custom messages dynamically:")
for index, row in df.iterrows():
    print(f"{row['Name']} earns {row['Salary']} at the age of {row['Age']}.")
Output
Adding custom messages dynamically:
Arjun earns 70000.5 at the age of 25.
Ram earns 80000.0 at the age of 30.
Priya earns 90000.0 at the age of 35.
Using iterrows to Create a New Column
Generate new column values based on row-wise operations using iterrows.
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Add a new column based on row data
discounted_salaries = []
for index, row in df.iterrows():
    discounted_salary = row['Salary'] * 0.9  # Apply a 10% discount
    discounted_salaries.append(discounted_salary)
df['Discounted Salary'] = discounted_salaries
print("DataFrame with Discounted Salaries:")
print(df)
Output
DataFrame with Discounted Salaries:
    Name  Age   Salary  Discounted Salary
0  Arjun   25  70000.5           63000.45
1    Ram   30  80000.0           72000.00
2  Priya   35  90000.0           81000.00
Performance Consideration
iterrows is not the most efficient method for processing large DataFrames because it creates a new Series object for each row. For better performance, consider vectorized operations or itertuples.
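As a rough illustration of the vectorized alternative, the discounted-salary column from the earlier example can be computed with a single column-wise multiplication instead of a loop. This is a minimal sketch on the same sample data; it also builds the row-by-row result only to check that both approaches agree.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
})

# Row-by-row version, as in the iterrows example
looped = [row['Salary'] * 0.9 for _, row in df.iterrows()]

# Vectorized version: one multiplication applied to the whole column
df['Discounted Salary'] = df['Salary'] * 0.9

# Compare the two results value by value, allowing for float rounding
vectorized = df['Discounted Salary'].tolist()
results_agree = all(abs(a - b) < 1e-9 for a, b in zip(looped, vectorized))
print(results_agree)
```

On a three-row frame the difference is negligible, but the vectorized form avoids creating one Series per row, which is where the cost of iterrows comes from on large DataFrames.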
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Use itertuples for better performance
print("Using itertuples for better performance:")
for row in df.itertuples():
    print(f"Name: {row.Name}, Age: {row.Age}, Salary: {row.Salary}")
Output
Using itertuples for better performance:
Name: Arjun, Age: 25, Salary: 70000.5
Name: Ram, Age: 30, Salary: 80000.0
Name: Priya, Age: 35, Salary: 90000.0
Summary
In this tutorial, we explored the DataFrame.iterrows method in pandas. Key takeaways include:
- Using iterrows to iterate over rows in a DataFrame as (index, Series) pairs.
- Modifying rows dynamically (though changes won't affect the original DataFrame).
- Considering performance implications for large DataFrames.
While iterrows is useful for row-wise operations, for larger datasets, vectorized operations or itertuples are often more efficient.