Pandas DataFrame.insert
The DataFrame.insert method in pandas is used to insert a column into a DataFrame at a specified location. This is useful when you need to add new data to a DataFrame in a specific order.
Syntax
The syntax for DataFrame.insert is:
DataFrame.insert(loc, column, value, allow_duplicates=False)
Here, DataFrame refers to the pandas DataFrame into which the column is inserted.
Parameters
| Parameter | Description |
|---|---|
| loc | Integer position at which the new column is inserted. Must satisfy 0 <= loc <= number of columns. |
| column | Label of the column to insert, usually a string. |
| value | The values to insert. Can be a scalar, a list (or other array-like), or a Series. A list must have the same length as the DataFrame; a Series is aligned on the DataFrame's index. |
| allow_duplicates | If False, raises a ValueError if the column name already exists. Defaults to False. |
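
To make the allowed range of loc concrete, here is a minimal sketch that inserts at both ends of the range. The ID and City values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arjun', 'Ram', 'Priya'], 'Age': [25, 30, 35]})

# loc=0 places the new column before all existing columns
df.insert(loc=0, column='ID', value=[101, 102, 103])

# loc equal to the current number of columns appends at the end
df.insert(loc=len(df.columns), column='City', value='Chennai')

print(list(df.columns))
```

Any loc between these two extremes places the new column in front of the column currently at that position.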
Returns
None: Modifies the DataFrame in place.
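
Because the method works in place, assigning its result back to the DataFrame variable would replace the DataFrame with None. A short sketch of this behaviour, reusing the sample data from this tutorial:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arjun', 'Ram', 'Priya'], 'Age': [25, 30, 35]})

# insert() changes df itself; capture the return value to inspect it
result = df.insert(loc=1, column='Salary', value=[70000, 80000, 90000])

print(result)            # the return value of insert()
print(list(df.columns))  # df itself now has the new column
```

In practice this means calling df.insert(...) as a statement on its own rather than writing df = df.insert(...).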
Examples
Inserting a New Column at a Specific Position
Insert a new column into a DataFrame at a specified index.
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35]
}
df = pd.DataFrame(data)
# Insert a new column for Salary at index 1
df.insert(loc=1, column='Salary', value=[70000, 80000, 90000])
print("DataFrame after inserting the 'Salary' column:")
print(df)
Output
DataFrame after inserting the 'Salary' column:
    Name  Salary  Age
0  Arjun   70000   25
1    Ram   80000   30
2  Priya   90000   35
Inserting a Column with Scalar Values
Insert a column where all values are the same.
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35]
}
df = pd.DataFrame(data)
# Insert a column for Department with a scalar value
df.insert(loc=2, column='Department', value='Engineering')
print("DataFrame after inserting the 'Department' column:")
print(df)
Output
DataFrame after inserting the 'Department' column:
    Name  Age   Department
0  Arjun   25  Engineering
1    Ram   30  Engineering
2  Priya   35  Engineering
Preventing Duplicate Column Names
Use allow_duplicates=False to prevent adding columns with duplicate names.
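
For contrast, the sketch below shows what happens when allow_duplicates=True is passed instead. It assumes the DataFrame's flags permit duplicate labels, which is the default.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arjun', 'Ram', 'Priya'], 'Age': [25, 30, 35]})

# With allow_duplicates=True, a second 'Age' column is accepted
df.insert(loc=1, column='Age', value=[26, 31, 36], allow_duplicates=True)

print(list(df.columns))
# Selecting by position avoids the ambiguity of the repeated name
print(df.iloc[:, 1].tolist())
```

With duplicate names present, df['Age'] returns a DataFrame containing both columns rather than a single Series, which is one reason the default is to reject duplicates.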
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35]
}
df = pd.DataFrame(data)
# Attempt to insert a duplicate column name
try:
    df.insert(loc=1, column='Age', value=[26, 31, 36], allow_duplicates=False)
except ValueError as e:
    print("Error:", e)
Output
Error: cannot insert Age, already exists
Inserting a Column with a pandas Series
Insert a new column using a pandas Series.
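
One detail worth keeping in mind: a Series is matched to the DataFrame by index label, not by position. The sketch below uses a deliberately shuffled, incomplete Series (hypothetical data) to show the effect.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arjun', 'Ram', 'Priya'], 'Age': [25, 30, 35]})

# The Series has labels 2 and 0 only; label 1 is missing
salaries = pd.Series([70000, 80000], index=[2, 0])

df.insert(loc=2, column='Salary', value=salaries)

# Row 0 receives 80000, row 2 receives 70000, and row 1 becomes NaN
print(df)
```

If positional assignment is what you want, passing a plain list (or salaries.tolist() for a Series of matching length) sidesteps the alignment.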
Python Program
import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Arjun', 'Ram', 'Priya'],
    'Age': [25, 30, 35]
}
df = pd.DataFrame(data)
# Create a Series for Salaries
salaries = pd.Series([70000, 80000, 90000])
# Insert the Series as a new column
df.insert(loc=2, column='Salary', value=salaries)
print("DataFrame after inserting the 'Salary' column as a Series:")
print(df)
Output
DataFrame after inserting the 'Salary' column as a Series:
    Name  Age  Salary
0  Arjun   25   70000
1    Ram   30   80000
2  Priya   35   90000
Summary
In this tutorial, we explored the DataFrame.insert method in pandas. Key takeaways include:
- Using loc to specify the position of the new column.
- Inserting values as scalars, lists, or pandas Series.
- Preventing duplicate column names with allow_duplicates=False.
The DataFrame.insert method is a flexible and efficient way to add columns to a pandas DataFrame at specific positions.