Pandas DataFrame.copy


The DataFrame.copy method in pandas creates a copy of a DataFrame. It returns a deep copy by default, or a shallow copy when deep=False is passed. Copying lets you modify a duplicate while the original DataFrame stays unchanged.
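To see why an explicit copy is needed, it helps to compare it with plain assignment, which only binds a second name to the same DataFrame. The following is a minimal sketch; the variable names alias and df_copy are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arjun', 'Ram'], 'Age': [25, 30]})

alias = df            # plain assignment: a second name for the same object
df_copy = df.copy()   # a new DataFrame with its own data

# Assigning through the alias also changes df, because alias is df
alias['Age'] = [26, 31]

print(alias is df)              # True
print(df_copy is df)            # False
print(df['Age'].tolist())       # [26, 31]
print(df_copy['Age'].tolist())  # [25, 30]
```

Only df_copy keeps the original ages, because it was created with copy() before the modification.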


Syntax

The syntax for DataFrame.copy is:

DataFrame.copy(deep=True)

Here, DataFrame refers to the pandas DataFrame being copied.


Parameters

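Parameter    Description
deep         bool, default True. If True, creates a deep copy: the data and index are copied, so the copy is independent of the original. If False, creates a shallow copy that references the original's data and index instead of copying them; whether later changes to one object show up in the other depends on the pandas version and its Copy-on-Write setting.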
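One caveat about deep=True: the pandas documentation notes that Python objects stored inside a DataFrame are not copied recursively, only the references to them. The sketch below illustrates this with a hypothetical Skills column that holds lists; the column name and values are assumptions made for this example.

```python
import pandas as pd

# 'Skills' is an object column whose cells are Python lists
df = pd.DataFrame({
    'Name': ['Arjun', 'Ram'],
    'Skills': [['SQL'], ['Python']]
})

df_copy = df.copy()  # deep=True by default

# The column itself was copied, but each cell still refers to the same list,
# so mutating the list in place is visible through both DataFrames.
df_copy.at[0, 'Skills'].append('Excel')

print(df.at[0, 'Skills'])
print(df.at[0, 'Skills'] is df_copy.at[0, 'Skills'])
```

For columns of plain numbers or strings this does not matter; it only applies when cells hold mutable Python objects such as lists or dicts.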

Returns

A new DataFrame object, either a deep or shallow copy of the original, based on the deep parameter.


Examples

Creating a Deep Copy

A deep copy is independent of the original DataFrame. Changes to the deep copy do not affect the original.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000, 80000, 90000]
}
df = pd.DataFrame(data)

# Create a deep copy of the DataFrame
df_copy = df.copy()

# Modify the copy
df_copy['Age'] = [26, 31, 36]

print("Original DataFrame:")
print(df)
print("\nDeep Copy DataFrame:")
print(df_copy)

Output

Original DataFrame:
    Name  Age  Salary
0  Arjun   25   70000
1    Ram   30   80000
2  Priya   35   90000

Deep Copy DataFrame:
    Name  Age  Salary
0  Arjun   26   70000
1    Ram   31   80000
2  Priya   36   90000

Creating a Shallow Copy

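A shallow copy does not copy the data; it references the same data as the original DataFrame. In pandas versions that run without Copy-on-Write, a change made to the original is therefore reflected in the shallow copy, which is the behavior shown in this example. With Copy-on-Write, the default from pandas 3.0, the shallow copy still avoids an eager copy of the data, but changes to the original are no longer reflected in it, and vice versa.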

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000, 80000, 90000]
}
df = pd.DataFrame(data)

# Create a shallow copy of the DataFrame
df_shallow_copy = df.copy(deep=False)

# Modify the original DataFrame
df.loc[0, 'Age'] = 26

print("Original DataFrame:")
print(df)
print("\nShallow Copy DataFrame:")
print(df_shallow_copy)

Output

Original DataFrame:
    Name  Age  Salary
0  Arjun   26   70000
1    Ram   30   80000
2  Priya   35   90000

Shallow Copy DataFrame:
    Name  Age  Salary
0  Arjun   26   70000
1    Ram   30   80000
2  Priya   35   90000
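Printing the DataFrames shows the effect of sharing, but not the sharing itself. One way to inspect it is NumPy's np.shares_memory, applied to the arrays behind a column. This is a sketch under the assumption that to_numpy() returns a view for a plain integer column, which is expected for a column like Age but is not guaranteed for every dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 30, 35],
    'Salary': [70000, 80000, 90000]
})

shallow = df.copy(deep=False)
deep = df.copy()

# Compare the memory behind the 'Age' column of each copy with the original
shallow_shares = np.shares_memory(df['Age'].to_numpy(), shallow['Age'].to_numpy())
deep_shares = np.shares_memory(df['Age'].to_numpy(), deep['Age'].to_numpy())

print("Shallow copy shares memory with original:", shallow_shares)
print("Deep copy shares memory with original:", deep_shares)
```

The shallow copy is expected to report shared memory and the deep copy is not. The check reads the data without writing to it, so it does not depend on how writes are propagated.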

Checking the Independence of a Deep Copy

You can verify that changes to a deep copy do not affect the original DataFrame.

Python Program

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000, 80000, 90000]
}
df = pd.DataFrame(data)

# Create a deep copy of the DataFrame
df_copy = df.copy()

# Verify independence
print("Before modification:")
print("Original DataFrame:")
print(df)
print("\nDeep Copy DataFrame:")
print(df_copy)

# Modify the copy
df_copy['Salary'] = [71000, 81000, 91000]

print("\nAfter modifying the deep copy:")
print("Original DataFrame:")
print(df)
print("\nDeep Copy DataFrame:")
print(df_copy)

Output

Before modification:
Original DataFrame:
    Name  Age  Salary
0  Arjun   25   70000
1    Ram   30   80000
2  Priya   35   90000

Deep Copy DataFrame:
    Name  Age  Salary
0  Arjun   25   70000
1    Ram   30   80000
2  Priya   35   90000

After modifying the deep copy:
Original DataFrame:
    Name  Age  Salary
0  Arjun   25   70000
1    Ram   30   80000
2  Priya   35   90000

Deep Copy DataFrame:
    Name  Age  Salary
0  Arjun   25   71000
1    Ram   30   81000
2  Priya   35   91000
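The same check can be done programmatically instead of by comparing printed output. The sketch below uses DataFrame.equals against a snapshot taken before the modification; the names snapshot, original_unchanged and copy_differs are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35],
    'Salary': [70000, 80000, 90000]
})

snapshot = df.copy()   # reference state of the original
df_copy = df.copy()    # the copy that will be modified

df_copy['Salary'] = [71000, 81000, 91000]

# equals() compares shape, labels and values of two DataFrames
original_unchanged = df.equals(snapshot)
copy_differs = not df.equals(df_copy)

print("Original unchanged:", original_unchanged)  # True
print("Copy differs from original:", copy_differs)  # True
```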

Summary

In this tutorial, we explored the DataFrame.copy method in pandas. Key takeaways include:

  • Using deep=True to create independent copies of a DataFrame.
  • Using deep=False to create shallow copies that share data with the original.
  • Understanding when to use deep or shallow copies based on your data processing needs.

The DataFrame.copy method is an essential tool for safely modifying pandas DataFrames without altering the original data unintentionally.
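A common place to apply this is inside a function that transforms a DataFrame it receives. The sketch below copies the input first so the caller's DataFrame is left as it was; the function name add_bonus and the Bonus column are hypothetical and used only for this illustration.

```python
import pandas as pd

def add_bonus(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame with a 'Bonus' column; the input is not modified."""
    result = frame.copy()                      # work on an independent copy
    result['Bonus'] = result['Salary'] // 10   # hypothetical rule: 10% of salary
    return result

df = pd.DataFrame({
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Salary': [70000, 80000, 90000]
})

with_bonus = add_bonus(df)

print(df.columns.tolist())          # ['Name', 'Salary']
print(with_bonus.columns.tolist())  # ['Name', 'Salary', 'Bonus']
```

Without the copy() call, the assignment inside the function would add the Bonus column to the caller's DataFrame as a side effect.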

