Pandas DataFrame.memory_usage



The DataFrame.memory_usage method in pandas reports how many bytes each column of a DataFrame occupies and, optionally, how many its index occupies. It is useful for finding which columns dominate memory when working with large datasets.


Syntax

The syntax for DataFrame.memory_usage is:

DataFrame.memory_usage(index=True, deep=False)

Here, DataFrame refers to the pandas DataFrame whose memory usage is being calculated.


Parameters

Parameter    Description
index        If True, includes the memory usage of the DataFrame's index. Defaults to True.
deep         If True, performs a deep introspection to include memory usage of objects within object dtype columns. Defaults to False.

Returns

A pandas Series whose index holds the column names and whose values are the memory usage of each column in bytes. If index=True, the index memory usage is included as the first entry, labelled Index.
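
Because the result is an ordinary Series, the usual Series operations apply to it. The following is a minimal sketch, using a small DataFrame invented here for illustration, that sums the per-column figures to get a total footprint:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'Age': [25, 30, 35], 'Salary': [70000.5, 80000.0, 90000.0]})

usage = df.memory_usage(index=False)   # Series indexed by column name
total_bytes = usage.sum()              # total across all columns

print(usage)
print("Total bytes:", total_bytes)
```

The float64 column Salary holds three 8-byte values, so its entry is 24 bytes.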


Examples

Basic Memory Usage

Use memory_usage to calculate the memory usage of each column in a DataFrame, including the index.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Calculate memory usage
print("Memory Usage:")
print(df.memory_usage())

Output

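Memory Usage:
Index     128
Name       24
Age        24
Salary     24
dtype: int64

With the default deep=False, the object column Name counts only the 8-byte references to its strings (3 × 8 = 24 bytes on a 64-bit build), not the strings themselves. Exact figures, especially for the index, can vary between pandas versions.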

Excluding Index Memory Usage

Set index=False to exclude the memory usage of the DataFrame's index.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Exclude index memory usage
print("Memory Usage (Excluding Index):")
print(df.memory_usage(index=False))

Output

Memory Usage (Excluding Index):
Name      24
Age       24
Salary    24
dtype: int64
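
The index entry is simply one more row in the returned Series, so the index's share can be recovered by comparing the two calls. A minimal sketch, again with a hypothetical two-column DataFrame:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'Age': [25, 30, 35], 'Salary': [70000.5, 80000.0, 90000.0]})

with_index = df.memory_usage()                # first entry is labelled 'Index'
without_index = df.memory_usage(index=False)  # columns only

# Bytes attributable to the index alone
index_bytes = with_index.sum() - without_index.sum()
print("Index bytes:", index_bytes)
```

The difference between the two totals should match the Index entry of the first result.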

Deep Introspection

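Set deep=True to include the memory consumed by the Python objects stored in object dtype columns (here, the strings in Name), rather than only the references to them. Numeric columns such as Age and Salary are unaffected. The exact figure for an object column depends on the Python and pandas versions in use.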

Python Program

import pandas as pd

# Create a DataFrame with one object dtype column (Name)
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
df = pd.DataFrame(data)

# Perform deep introspection
print("Deep Memory Usage:")
print(df.memory_usage(deep=True))

Output

Deep Memory Usage:
Index     128
Name      192
Age        24
Salary     24
dtype: int64
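
To see what deep introspection adds, the shallow and deep results can be placed side by side. This is a minimal sketch rather than a definitive recipe: the Name column is forced to object dtype explicitly (an assumption made here so the behaviour does not depend on the pandas version's default string dtype), and the names shallow, deep, and comparison are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    # Force object dtype so the strings are stored as Python objects
    'Name': pd.Series(['Arjun', 'Ram', 'Priya'], dtype=object),
    'Age': [25, 30, 35],
})

shallow = df.memory_usage(index=False)            # references only
deep = df.memory_usage(index=False, deep=True)    # references plus the string objects

# One row per column, with the bytes that deep introspection adds
comparison = pd.DataFrame({'shallow': shallow, 'deep': deep})
comparison['extra'] = comparison['deep'] - comparison['shallow']
print(comparison)
```

The extra column is zero for the numeric column Age and positive for Name, since only object columns hold separately allocated Python objects.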

Comparing Memory Usage for Different DataFrames

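Comparing the memory usage of DataFrames that hold the same values under different data types helps identify potential optimizations. In the example below, Salary is stored as floats in the first DataFrame and as strings in the second.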

Python Program

import pandas as pd

# Create two DataFrames with different data types
data1 = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000.5, 80000.0, 90000.0]
}
data2 = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': ['70000.5', '80000.0', '90000.0']
}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Compare memory usage
print("Memory Usage of Numeric DataFrame:")
print(df1.memory_usage(deep=True))
print("\nMemory Usage of Object DataFrame:")
print(df2.memory_usage(deep=True))

Output

Memory Usage of Numeric DataFrame:
Index     128
Name      192
Age        24
Salary     24
dtype: int64

Memory Usage of Object DataFrame:
Index     128
Name      192
Age        24
Salary    176
dtype: int64
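
One optimization this comparison suggests is converting numeric strings to a numeric dtype. The sketch below assumes the strings all parse cleanly as numbers and uses pd.to_numeric for the conversion; the variable names before and after are illustrative.

```python
import pandas as pd

# Salary stored as Python strings (object dtype forced explicitly)
df = pd.DataFrame({
    'Salary': pd.Series(['70000.5', '80000.0', '90000.0'], dtype=object)
})

before = df['Salary'].memory_usage(index=False, deep=True)

# Parse the strings into a float64 column
df['Salary'] = pd.to_numeric(df['Salary'])

after = df['Salary'].memory_usage(index=False, deep=True)

print("Before:", before, "bytes")
print("After: ", after, "bytes")
```

After conversion the column holds three 8-byte floats, so it occupies 24 bytes, which is less than the string version.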

Summary

In this tutorial, we explored the DataFrame.memory_usage method in pandas. Key takeaways include:

  • Using memory_usage to calculate memory usage of a DataFrame
  • Excluding the index memory usage
  • Performing deep introspection for object dtype columns

The DataFrame.memory_usage method is a valuable tool for understanding and optimizing memory usage in pandas DataFrames.

