Pandas DataFrame.copy
Pandas DataFrame.copy
The DataFrame.copy method in pandas is used to create a copy of a DataFrame. This method can create either a deep copy (default) or a shallow copy, depending on the specified parameters. It is useful for preserving the original DataFrame when making modifications to a duplicate.
Syntax
The syntax for DataFrame.copy is:
DataFrame.copy(deep=True)Here, DataFrame refers to the pandas DataFrame being copied.
Parameters
| Parameter | Description |
|---|---|
deep | If True (default), creates a deep copy of the DataFrame, including a copy of the data and index. If False, creates a shallow copy, where changes to the original DataFrame may affect the copy. |
Returns
A new DataFrame object, either a deep or shallow copy of the original, based on the deep parameter.
Examples
Creating a Deep Copy
A deep copy is independent of the original DataFrame. Changes to the deep copy do not affect the original.
Python Program
import pandas as pd
# Create a DataFrame
data = {
'Name': ['Arjun', 'Ram', 'Priya'],
'Age': [25, 30, 35],
'Salary': [70000, 80000, 90000]
}
df = pd.DataFrame(data)
# Create a deep copy of the DataFrame
df_copy = df.copy()
# Modify the copy
df_copy['Age'] = [26, 31, 36]
print("Original DataFrame:")
print(df)
print("\nDeep Copy DataFrame:")
print(df_copy)Output
Original DataFrame:
Name Age Salary
0 Arjun 25 70000
1 Ram 30 80000
2 Priya 35 90000
Deep Copy DataFrame:
Name Age Salary
0 Arjun 26 70000
1 Ram 31 80000
2 Priya 36 90000Creating a Shallow Copy
A shallow copy shares the same data with the original DataFrame, so changes to the original may affect the copy.
Python Program
import pandas as pd
# Create a DataFrame
data = {
'Name': ['Arjun', 'Ram', 'Priya'],
'Age': [25, 30, 35],
'Salary': [70000, 80000, 90000]
}
df = pd.DataFrame(data)
# Create a shallow copy of the DataFrame
df_shallow_copy = df.copy(deep=False)
# Modify the original DataFrame
df.loc[0, 'Age'] = 26
print("Original DataFrame:")
print(df)
print("\nShallow Copy DataFrame:")
print(df_shallow_copy)Output
Original DataFrame:
Name Age Salary
0 Arjun 26 70000
1 Ram 30 80000
2 Priya 35 90000
Shallow Copy DataFrame:
Name Age Salary
0 Arjun 26 70000
1 Ram 30 80000
2 Priya 35 90000Checking the Independence of a Deep Copy
You can verify that changes to a deep copy do not affect the original DataFrame.
Python Program
import pandas as pd
# Create a DataFrame
data = {
'Name': ['Arjun', 'Ram', 'Priya'],
'Age': [25, 30, 35],
'Salary': [70000, 80000, 90000]
}
df = pd.DataFrame(data)
# Create a deep copy of the DataFrame
df_copy = df.copy()
# Verify independence
print("Before modification:")
print("Original DataFrame:")
print(df)
print("\nDeep Copy DataFrame:")
print(df_copy)
# Modify the copy
df_copy['Salary'] = [71000, 81000, 91000]
print("\nAfter modifying the deep copy:")
print("Original DataFrame:")
print(df)
print("\nDeep Copy DataFrame:")
print(df_copy)Output
Before modification:
Original DataFrame:
Name Age Salary
0 Arjun 25 70000
1 Ram 30 80000
2 Priya 35 90000
Deep Copy DataFrame:
Name Age Salary
0 Arjun 25 70000
1 Ram 30 80000
2 Priya 35 90000
After modifying the deep copy:
Original DataFrame:
Name Age Salary
0 Arjun 25 70000
1 Ram 30 80000
2 Priya 35 90000
Deep Copy DataFrame:
Name Age Salary
0 Arjun 25 71000
1 Ram 30 81000
2 Priya 35 91000Summary
In this tutorial, we explored the DataFrame.copy method in pandas. Key takeaways include:
- Using
deep=Trueto create independent copies of a DataFrame. - Using
deep=Falseto create shallow copies that share data with the original. - Understanding when to use deep or shallow copies based on your data processing needs.
The DataFrame.copy method is an essential tool for safely modifying pandas DataFrames without altering the original data unintentionally.