Pandas DataFrame.apply: Apply a Function to Rows or Columns of a DataFrame
Pandas DataFrame.apply
The DataFrame.apply method in pandas applies a function along an axis of a DataFrame, that is, to each column or to each row. The function receives one column or row at a time, so it can compute anything from a simple aggregate to a custom transformation.
Syntax
The syntax for DataFrame.apply is:
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), by_row='compat', engine='python', engine_kwargs=None, **kwargs)
Here, DataFrame refers to the pandas DataFrame on which the function is applied.
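As a rough illustration of how this signature maps to a call, the sketch below passes func alone, then selects the axis by name, then forwards a keyword argument through **kwargs. The helper spread and its scale parameter are hypothetical names made up for this example; they are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

# Hypothetical helper: the range (max - min) of the values, times a scale factor
def spread(values, scale=1):
    return (values.max() - values.min()) * scale

# Default axis=0: spread is called once per column
per_column = df.apply(spread)

# axis='columns' is the same as axis=1: spread is called once per row
per_row = df.apply(spread, axis='columns')

# Extra keyword arguments are forwarded to the function
scaled = df.apply(spread, axis=0, scale=10)

print(per_column)
print(per_row)
print(scaled)
```

Here per_column and scaled are indexed by the column labels, while per_row is indexed by the row labels.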
Parameters
| Parameter | Description |
|---|---|
| func | The function to apply to each row or column of the DataFrame. This can be a built-in function, a lambda function, or a custom function. |
| axis | Specifies the axis along which the function is applied. Use 0 or 'index' to apply the function to each column, and 1 or 'columns' to apply the function to each row. Defaults to 0. |
| raw | Determines whether the function receives a Series (False) or a NumPy array (True). Defaults to False. |
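| result_type | Specifies the type of the result. Options are 'expand', 'reduce', 'broadcast', or None. Only takes effect when axis=1. Defaults to None. |
| args | A tuple of additional positional arguments to pass to the function. |
| by_row | Only has an effect when func is a list-like or dict-like of functions. If 'compat', pandas first tries to translate each function into an equivalent pandas method and falls back to backward-compatible behavior if that is not possible. If False, each function is passed a whole Series at once. Defaults to 'compat'. |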
| engine | Specifies the engine to use for computation. Options include 'python' or 'numba'. Defaults to 'python'. |
| engine_kwargs | A dictionary of additional keyword arguments to pass to the engine. |
| **kwargs | Additional keyword arguments to pass to the function. |
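
The raw and result_type parameters are easier to follow with a small example. The sketch below is illustrative only: value_range and the column names total and product are hypothetical names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# With raw=True the function receives a NumPy array instead of a Series,
# so only array operations (no index labels) are available inside it.
def value_range(values):
    return values.max() - values.min()

col_range = df.apply(value_range, raw=True)

# With axis=1 and result_type='expand', a list-like result per row
# is spread across the columns of a new DataFrame.
expanded = df.apply(
    lambda row: [row['A'] + row['B'], row['A'] * row['B']],
    axis=1,
    result_type='expand',
)
expanded.columns = ['total', 'product']

print(col_range)
print(expanded)
```

Without result_type='expand', the second call would return a Series whose values are lists rather than a DataFrame with two columns.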
Returns
A Series or DataFrame resulting from applying the function to the specified axis of the DataFrame.
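Whether the result is a Series or a DataFrame depends on what the function returns. As a minimal sketch, assuming the default axis=0: a function that returns one scalar per column gives a Series, while a function that returns a Series of the same length per column gives a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# One scalar per column: the result is a Series indexed by column label
largest = df.apply(lambda column: column.max())

# A same-length Series per column: the result is a DataFrame
centered = df.apply(lambda column: column - column.mean())

print(type(largest).__name__)
print(type(centered).__name__)
```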
Examples
Applying a Function to Each Column of a DataFrame
This example demonstrates how to use apply to calculate the sum of each column in a DataFrame.
Python Program
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Define a function to calculate the sum of a column
def column_sum(column):
    return column.sum()

# Apply the function to each column of the DataFrame
result = df.apply(column_sum)
print(result)
Output
A     6
B    15
C    24
dtype: int64
Applying a Function to Each Row of a DataFrame
This example shows how to use apply to calculate the sum of each row in a DataFrame.
Python Program
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Define a function to calculate the sum of a row
def row_sum(row):
    return row.sum()

# Apply the function to each row of the DataFrame
result = df.apply(row_sum, axis=1)
print(result)
Output
0    12
1    15
2    18
dtype: int64
Applying a Lambda Function to a DataFrame
This example uses a lambda function with apply to multiply every value in a DataFrame by 2. The lambda receives each column as a Series, and multiplying a Series by 2 doubles each of its elements.
Python Program
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Apply a lambda function to each column; x is a Series, so x * 2 doubles every element
result = df.apply(lambda x: x * 2)
print(result)
Output
   A   B   C
0  2   8  14
1  4  10  16
2  6  12  18
Applying a Function with Additional Arguments to a DataFrame
This example shows how to pass additional arguments to a function using apply. The function receives each column as a Series and adds the specified value to it, which increases every element by that value.
Python Program
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Define a function that adds a value to every element of a column
def add_value(column, value):
    return column + value

# Apply the function with an additional positional argument
result = df.apply(add_value, args=(10,))
print(result)
Output
    A   B   C
0  11  14  17
1  12  15  18
2  13  16  19
Summary
In this tutorial, we explored the DataFrame.apply method in pandas. Key takeaways include:
- Using apply to apply a function to each row or column of a DataFrame.
- Applying lambda functions for quick, inline operations.
- Passing additional arguments to the function using the args parameter.
- Understanding the flexibility of apply for performing complex operations on DataFrames.
- Using the by_row, engine, and engine_kwargs parameters for advanced use cases.