How to Add a Column to a Pandas DataFrame, with Examples


Pandas DataFrame - Add Column

To add a new column to an existing Pandas DataFrame, assign the column values to the DataFrame, indexed by the new column name.

In this tutorial, we shall learn how to add a column to a DataFrame with the help of detailed example programs.


Syntax to add column

The syntax to add a column to DataFrame is:

mydataframe['new_column_name'] = column_values

where mydataframe is the DataFrame to which you would like to add the new column, and new_column_name is the label of that column. For column_values, you can provide either a list with one value per row, or a single value that is used as the default for every row.
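As a minimal sketch of both forms of this syntax, the snippet below uses a small hypothetical DataFrame named df with a single column a. The names df, a, b, and c are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical two-row DataFrame, used only to illustrate the syntax.
df = pd.DataFrame({'a': [1, 2]})

# List form: one value per row, in row order.
df['b'] = [10, 20]

# Scalar form: the single value is repeated for every row.
df['c'] = 0

# New columns are appended after the existing ones.
print(df.columns.tolist())
print(df)
```

Each assignment modifies df in place, so no reassignment of the DataFrame is needed.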


Examples

1. Add column to DataFrame

In this example, we will create a dataframe df_marks and add a new column with name geometry.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

#create dataframe
df_marks = pd.DataFrame(mydictionary)
print('Original DataFrame\n--------------')
print(df_marks)

#add column
df_marks['geometry'] = [81, 92, 67, 76]
print('\n\nDataFrame after adding "geometry" column\n--------------')
print(df_marks)

Explanation

  1. The program imports the pandas library, which provides data structures and functions for efficient data analysis.
  2. A dictionary named mydictionary is defined with keys ('names', 'physics', 'chemistry', and 'algebra'), each associated with a list of values representing the data for those columns.
  3. A DataFrame named df_marks is created using the pd.DataFrame() function, which converts the dictionary into a structured table format.
  4. The original DataFrame is printed, displaying all columns and their respective rows of data.
  5. A new column named 'geometry' is added to the DataFrame. The values for this column are provided as a list: [81, 92, 67, 76], with each value corresponding to a row in the DataFrame.
  6. The updated DataFrame, now including the new 'geometry' column, is printed to display its revised structure.

Output

Original DataFrame
--------------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87


DataFrame after adding "geometry" column
--------------
  names  physics  chemistry  algebra  geometry
0  Somu       68         84       78        81
1  Kiku       74         56       88        92
2  Amol       77         73       82        67
3  Lini       78         69       87        76

The column is added to the dataframe with the specified list as column values.

The length of the list you provide for the new column must equal the number of rows in the DataFrame. Otherwise, pandas raises an error similar to the following.

ValueError: Length of values does not match length of index
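The following sketch reproduces this error on purpose by assigning three values to a four-row DataFrame. It uses a shortened version of df_marks, and the variable error_message is an assumption introduced only to capture the text of the exception. The exact wording of the message may vary between pandas versions.

```python
import pandas as pd

# Shortened version of df_marks with four rows.
df_marks = pd.DataFrame({'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                         'physics': [68, 74, 77, 78]})

error_message = None
try:
    # Three values for a four-row DataFrame: one too few.
    df_marks['geometry'] = [81, 92, 67]
except ValueError as err:
    error_message = str(err)
    print('ValueError:', error_message)

# Show which columns exist after the failed assignment.
print(df_marks.columns.tolist())
```

Catching ValueError like this can be useful when the column values come from an external source whose length is not known in advance.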

2. Add column to DataFrame with a default value

In this example, we will create a dataframe df_marks and add a new column called geometry with a default value for each of the rows in the dataframe.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

#create dataframe
df_marks = pd.DataFrame(mydictionary)
print('Original DataFrame\n--------------')
print(df_marks)

#add column
df_marks['geometry'] = 65
print('\n\nDataFrame after adding "geometry" column\n--------------')
print(df_marks)

Explanation

  1. The program imports the pandas library, which is used for data analysis and manipulation.
  2. A dictionary named mydictionary is created with keys ('names', 'physics', 'chemistry', 'algebra') and corresponding lists of values representing data for each column.
  3. A DataFrame named df_marks is created using the pd.DataFrame() function, which organizes the data into a tabular format with columns and rows.
  4. The original DataFrame is printed to display all the columns and their respective rows of data.
  5. A new column named 'geometry' is added to the DataFrame by assigning the value 65 to it. This value is broadcast, meaning all rows in the new column will have the value 65.
  6. The modified DataFrame, now including the new 'geometry' column, is printed to show the updated structure.

Output

Original DataFrame
--------------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87


DataFrame after adding "geometry" column
--------------
  names  physics  chemistry  algebra  geometry
0  Somu       68         84       78        65
1  Kiku       74         56       88        65
2  Amol       77         73       82        65
3  Lini       78         69       87        65

The column is added to the dataframe with the specified value in every row.
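To confirm that the single value really fills every row, a short check along the following lines can be used. This is only a sketch on a shortened version of df_marks, not part of the example program above.

```python
import pandas as pd

# Shortened version of df_marks with four rows.
df_marks = pd.DataFrame({'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                         'physics': [68, 74, 77, 78]})

# Assign a single value; pandas repeats it for each row.
df_marks['geometry'] = 65

# The new column has one entry per row, and every entry is 65.
print(df_marks['geometry'].tolist())
print(len(df_marks['geometry']) == len(df_marks))
```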


Summary

In this Pandas Tutorial, we learned how to add a new column to a Pandas DataFrame, either from a list of values or with a single default value, with the help of detailed Python examples.

