How to Add Column to Pandas DataFrame? Examples
Pandas DataFrame - Add Column
To add a new column to an existing Pandas DataFrame, assign the new column values to the DataFrame, indexed by the new column name.
In this tutorial, we shall learn how to add a column to a DataFrame with the help of example programs.
Syntax to add column
The syntax to add a column to DataFrame is:
mydataframe['new_column_name'] = column_values
where mydataframe is the dataframe to which you would like to add the new column, and new_column_name is the label of that column. For column_values, you can provide either a list with one value per row or a single value that is used as the default for every row.
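The following is a minimal sketch of how this assignment syntax behaves, not part of the original examples. The dataframe df and the column labels 'a' and 'b' are hypothetical names chosen only for illustration. It shows that a label that does not exist yet creates a new column at the end, while assigning to a label that already exists replaces that column's values instead of adding another column.

```python
import pandas as pd

# Hypothetical dataframe used only for this illustration.
df = pd.DataFrame({'a': [1, 2, 3]})

# 'b' does not exist yet, so this assignment creates it as the last column.
df['b'] = [10, 20, 30]
columns_after_add = list(df.columns)

# 'b' exists now, so this assignment overwrites its values;
# the number of columns stays the same.
df['b'] = 0
columns_after_reassign = list(df.columns)

print(columns_after_add)
print(columns_after_reassign)
print(df)
```

So if the intent is to add a column, it can be worth checking that the label is not already present in mydataframe.columns.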
Examples
1. Add column to DataFrame
In this example, we will create a dataframe df_marks and add a new column named geometry.
Python Program
import pandas as pd
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}
#create dataframe
df_marks = pd.DataFrame(mydictionary)
print('Original DataFrame\n--------------')
print(df_marks)
#add column
df_marks['geometry'] = [81, 92, 67, 76]
print('\n\nDataFrame after adding "geometry" column\n--------------')
print(df_marks)
Explanation
- The program imports the pandas library, which provides data structures and functions for efficient data analysis.
- A dictionary named mydictionary is defined with keys ('names', 'physics', 'chemistry', and 'algebra'), each associated with a list of values representing the data for those columns.
- A DataFrame named df_marks is created using the pd.DataFrame() function, which converts the dictionary into a structured table format.
- The original DataFrame is printed, displaying all columns and their respective rows of data.
- A new column named 'geometry' is added to the DataFrame. The values for this column are provided as a list: [81, 92, 67, 76], with each value corresponding to a row in the DataFrame.
- The updated DataFrame, now including the new 'geometry' column, is printed to display its revised structure.
Output
Original DataFrame
--------------
   names  physics  chemistry  algebra
0   Somu       68         84       78
1   Kiku       74         56       88
2   Amol       77         73       82
3   Lini       78         69       87
DataFrame after adding "geometry" column
--------------
   names  physics  chemistry  algebra  geometry
0   Somu       68         84       78        81
1   Kiku       74         56       88        92
2   Amol       77         73       82        67
3   Lini       78         69       87        76
The column is added to the dataframe with the specified list as column values.
The length of the list you provide for the new column must equal the number of rows in the dataframe. If it does not, you will get an error similar to the following.
ValueError: Length of values does not match length of index
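To see this error in practice, here is a small sketch under a few assumptions: it uses a shortened version of df_marks with only two columns, and the variable error_message is a hypothetical name introduced for the illustration. A list of three values is assigned to a dataframe with four rows, and the resulting ValueError is caught so that the program can continue.

```python
import pandas as pd

# Shortened version of df_marks, used only for this illustration.
df_marks = pd.DataFrame({'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                         'physics': [68, 74, 77, 78]})

error_message = None
try:
    # Three values for a dataframe with four rows: the lengths do not match.
    df_marks['geometry'] = [81, 92, 67]
except ValueError as err:
    error_message = str(err)

# The exact wording of the message can differ between pandas versions.
print(error_message)

# The failed assignment did not add the column.
print(list(df_marks.columns))
```

Comparing len(column_values) with len(df_marks) before the assignment is one simple way to avoid the error.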
2. Add column to DataFrame with a default value
In this example, we will create a dataframe df_marks and add a new column called geometry with a default value for each of the rows in the dataframe.
Python Program
import pandas as pd
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}
#create dataframe
df_marks = pd.DataFrame(mydictionary)
print('Original DataFrame\n--------------')
print(df_marks)
#add column
df_marks['geometry'] = 65
print('\n\nDataFrame after adding "geometry" column\n--------------')
print(df_marks)
Explanation
- The program imports the pandas library, which is used for data analysis and manipulation.
- A dictionary named mydictionary is created with keys ('names', 'physics', 'chemistry', 'algebra') and corresponding lists of values representing data for each column.
- A DataFrame named df_marks is created using the pd.DataFrame() function, which organizes the data into a tabular format with columns and rows.
- The original DataFrame is printed to display all the columns and their respective rows of data.
- A new column named 'geometry' is added to the DataFrame by assigning the value 65 to it. This value is broadcast, meaning every row in the new column has the value 65.
- The modified DataFrame, now including the new 'geometry' column, is printed to show the updated structure.
Output
Original DataFrame
--------------
   names  physics  chemistry  algebra
0   Somu       68         84       78
1   Kiku       74         56       88
2   Amol       77         73       82
3   Lini       78         69       87
DataFrame after adding "geometry" column
--------------
   names  physics  chemistry  algebra  geometry
0   Somu       68         84       78        65
1   Kiku       74         56       88        65
2   Amol       77         73       82        65
3   Lini       78         69       87        65
The column is added to the dataframe with the specified value as the default value in every row.
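The broadcasting behaviour described above can also be checked in code rather than by reading the printed output. The sketch below assumes a shortened version of df_marks; the names all_rows_have_default and row_count_matches are hypothetical and introduced only for this check.

```python
import pandas as pd

# Shortened version of df_marks, used only for this illustration.
df_marks = pd.DataFrame({'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                         'physics': [68, 74, 77, 78]})

# A single value is repeated for every row of the dataframe.
df_marks['geometry'] = 65

# Check that every row received the value and that no rows were added or dropped.
all_rows_have_default = bool((df_marks['geometry'] == 65).all())
row_count_matches = len(df_marks['geometry']) == len(df_marks)

print(all_rows_have_default)
print(row_count_matches)
```

Because the single value is repeated to fit the dataframe, the same assignment works unchanged for any number of rows.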
Summary
In this Pandas Tutorial, we learned how to add a new column to a Pandas DataFrame, using either a list of values or a single default value, with the help of Python examples.