Pandas DataFrame.memory_usage
The DataFrame.memory_usage method in pandas returns the memory usage, in bytes, of each column of a DataFrame and, optionally, of its index. This makes it useful for finding the columns that dominate memory when working with large datasets.
Syntax
The syntax for DataFrame.memory_usage is:
DataFrame.memory_usage(index=True, deep=False)
Here, DataFrame refers to the pandas DataFrame whose memory usage is being calculated.
Parameters
| Parameter | Description |
|---|---|
| index | If True, includes the memory usage of the DataFrame's index in the result. Defaults to True. |
| deep | If True, inspects the Python objects held in object dtype columns and includes their memory usage; if False, only the array of references to those objects is counted. Defaults to False. |
Returns
A pandas Series whose index holds the column names and whose values are the memory usage of each column in bytes. If index=True, the memory usage of the index is included as the first entry, labeled Index.
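Because the result is an ordinary Series, it can be summed or indexed like any other. The following is a minimal sketch of getting a total footprint; the small DataFrame and the variable names are made up for illustration.

```python
import pandas as pd

# Illustrative data with two numeric columns
df = pd.DataFrame({
    "Age": [25, 30, 35],
    "Salary": [70000.5, 80000.0, 90000.0],
})

usage = df.memory_usage()        # Series: "Index" entry plus one entry per column
total_bytes = int(usage.sum())   # shallow total in bytes

print(usage.index.tolist())      # labels of the returned Series
print(f"Total: {total_bytes} bytes")
```

Summing the Series is a quick way to compare whole DataFrames, while the individual entries show which column is responsible for most of the total.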
Examples
Basic Memory Usage
Use memory_usage to calculate the memory usage of each column in a DataFrame, including the index.
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Calculate memory usage
print("Memory Usage:")
print(df.memory_usage())
Output
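Memory Usage:
Index 128
Name 24
Age 24
Salary 24
dtype: int64
Exact byte counts can vary with the pandas version and platform. Without deep=True, an object dtype column such as Name is counted only by its references (8 bytes each on a 64-bit system), not by the strings themselves.
Excluding Index Memory Usage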
Set index=False to exclude the memory usage of the DataFrame's index.
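One way to see what index=False leaves out is to compare the two totals. This is a small sketch with made-up data; the difference should match what the index reports for itself through Index.memory_usage.

```python
import pandas as pd

# Illustrative data with a default integer index
df = pd.DataFrame({
    "Age": [25, 30, 35],
    "Salary": [70000.5, 80000.0, 90000.0],
})

with_index = df.memory_usage().sum()
without_index = df.memory_usage(index=False).sum()

# The gap between the two totals is the index's own footprint
index_bytes = with_index - without_index
print(index_bytes == df.index.memory_usage())
```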
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Exclude index memory usage
print("Memory Usage (Excluding Index):")
print(df.memory_usage(index=False))
Output
Memory Usage (Excluding Index):
Name 24
Age 24
Salary 24
dtype: int64
Deep Introspection
Enable deep=True to calculate the memory usage of objects within object dtype columns.
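The effect of deep=True can be isolated by subtracting the shallow result from the deep one. The sketch below assumes the Name column is stored with object dtype, which is forced explicitly here so that the example does not depend on the default string dtype of the installed pandas version.

```python
import pandas as pd

# Force object dtype for the string column (an assumption made for this sketch)
df = pd.DataFrame({
    "Name": pd.Series(["Arjun", "Ram", "Priya"], dtype=object),
    "Age": [25, 30, 35],
})

shallow = df.memory_usage(index=False)          # object column: references only
deep = df.memory_usage(index=False, deep=True)  # also counts the string objects

extra = deep - shallow  # bytes that only deep=True accounts for
print(extra)
```

The difference is zero for the numeric column and positive for the object column, which is why deep=True matters mainly for DataFrames that hold strings or other Python objects.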
Python Program
import pandas as pd
# Create a DataFrame with object dtype columns
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)
# Perform deep introspection
print("Deep Memory Usage:")
print(df.memory_usage(deep=True))
Output
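Deep Memory Usage:
Index 128
Name 192
Age 24
Salary 24
dtype: int64
With deep=True, the Name figure includes the Python string objects themselves, not just the three 8-byte references, so the exact value depends on the Python and pandas versions.
Comparing Memory Usage for Different DataFrames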
You can compare memory usage of multiple DataFrames to identify potential optimizations.
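As a rough sketch of one such optimization, the snippet below measures a column of numeric strings before and after converting it to float64 with pd.to_numeric. The data is illustrative, and object dtype is forced explicitly as an assumption so the starting point is the same across pandas versions.

```python
import pandas as pd

# Salary values stored as strings (object dtype forced for this sketch)
df = pd.DataFrame({
    "Salary": pd.Series(["70000.5", "80000.0", "90000.0"], dtype=object),
})

before = df.memory_usage(index=False, deep=True)["Salary"]

# Store the values as float64 instead of Python strings
df["Salary"] = pd.to_numeric(df["Salary"])
after = df.memory_usage(index=False, deep=True)["Salary"]

print(f"before: {before} bytes, after: {after} bytes")
```

Measuring with deep=True on both sides keeps the comparison fair, since a shallow measurement would hide the cost of the string objects.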
Python Program
import pandas as pd
# Create two DataFrames with different data types
data1 = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
data2 = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': ['70000.5', '80000.0', '90000.0']
}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
# Compare memory usage
print("Memory Usage of Numeric DataFrame:")
print(df1.memory_usage(deep=True))
print("\nMemory Usage of Object DataFrame:")
print(df2.memory_usage(deep=True))
Output
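Memory Usage of Numeric DataFrame:
Index 128
Name 192
Age 24
Salary 24
dtype: int64

Memory Usage of Object DataFrame:
Index 128
Name 192
Age 24
Salary 176
dtype: int64
These figures are illustrative, since the exact values depend on the Python and pandas versions. The point of the comparison is that Salary stored as strings occupies several times the 24 bytes it needs as float64.
Summary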
In this tutorial, we explored the DataFrame.memory_usage method in pandas. Key takeaways include:
- Using memory_usage to calculate memory usage of a DataFrame
- Excluding the index memory usage
- Performing deep introspection for object dtype columns
The DataFrame.memory_usage method is a valuable tool for understanding and optimizing memory usage in pandas DataFrames.