How to set Column as Index in Pandas DataFrame?

Pandas – Set Column as Index

By default, pandas creates an integer index (0, 1, 2, and so on) for a DataFrame. If required, you can set a specific column of the DataFrame as the index instead.
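As a minimal sketch of the default behaviour, the snippet below builds a small DataFrame without specifying an index and inspects the index that pandas assigns. The two-row sample data is hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({'rollno': [21, 23], 'name': ['Amol', 'Lini']})

# No index was specified, so pandas assigns a RangeIndex starting at 0.
index_type = type(df.index).__name__
index_labels = list(df.index)

print(index_type)    # RangeIndex
print(index_labels)  # [0, 1]
```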

To set a column as the index of a DataFrame, call the DataFrame.set_index() method with the column name passed as the argument.


You can also set up a MultiIndex with multiple columns in the index. In this case, pass a list of the required column names to the set_index() method.

Syntax of set_index()

The syntax of set_index() to set a column as index is

myDataFrame.set_index('column_name')

where myDataFrame is the DataFrame for which you would like to set column_name column as index.

To set up a MultiIndex, use the following syntax.

myDataFrame.set_index(['column_name_1', 'column_name_2'])

You can pass as many column names as required.

Note that the set_index() method does not modify the original DataFrame. It returns a new DataFrame with the column set as index, so assign the result to a variable if you want to keep it.
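The following sketch illustrates this point: the result of set_index() is stored under a new name, and the original DataFrame is checked afterwards. The sample data and the variable name indexed are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({'rollno': [21, 23], 'name': ['Amol', 'Lini']})

# set_index() returns a new DataFrame; df itself is left untouched.
indexed = df.set_index('rollno')

print(list(df.columns))       # ['rollno', 'name']
print(list(indexed.columns))  # ['name']
print(indexed.index.name)     # rollno
```

If the returned DataFrame is not assigned to anything, the change is simply lost, which is why the examples in this tutorial write df = df.set_index(...).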

Example 1: Set Column as Index in Pandas DataFrame

In this example, we take a DataFrame and set one of its columns, rollno, as the index.

Python Program

import pandas as pd

#initialize a dataframe
df = pd.DataFrame(
	[[21, 'Amol', 72, 67],
	[23, 'Lini', 78, 69],
	[32, 'Kiku', 74, 56],
	[52, 'Ajit', 54, 76]],
	columns=['rollno', 'name', 'physics', 'botony'])

print('DataFrame with default index\n', df)

#set column as index
df = df.set_index('rollno')

print('\nDataFrame with column as index\n',df)

Output

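DataFrame with default index
    rollno  name  physics  botony
0      21  Amol       72      67
1      23  Lini       78      69
2      32  Kiku       74      56
3      52  Ajit       54      76

DataFrame with column as index
         name  physics  botony
rollno
21      Amol       72      67
23      Lini       78      69
32      Kiku       74      56
52      Ajit       54      76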

The column rollno of the DataFrame is set as index.

Also, compare the output of the original DataFrame with the output of the DataFrame that has rollno as index. In the original DataFrame, there is a separate index column (the first column) with no column name. In the second DataFrame, the existing rollno column acts as the index, so it takes the first place and is no longer listed among the data columns.
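One practical consequence is that rows can then be looked up by their rollno value. The sketch below assumes a two-row subset of the example data and uses the variable names indexed and row purely for illustration.

```python
import pandas as pd

# A two-row subset of the example data, used only for illustration.
df = pd.DataFrame(
    [[21, 'Amol', 72, 67],
     [23, 'Lini', 78, 69]],
    columns=['rollno', 'name', 'physics', 'botony'])

indexed = df.set_index('rollno')

# rollno values are now the row labels, so .loc selects a row by roll number.
row = indexed.loc[23]

print(row['name'])            # Lini
print(list(indexed.columns))  # ['name', 'physics', 'botony']
```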

Example 2: Set MultiIndex for Pandas DataFrame

In this example, we pass multiple column names as a list to the set_index() method to set up a MultiIndex for the Pandas DataFrame.

Python Program

import pandas as pd

#initialize a dataframe
df = pd.DataFrame(
	[[21, 'Amol', 72, 67],
	[23, 'Lini', 78, 69],
	[32, 'Kiku', 74, 56],
	[52, 'Ajit', 54, 76]],
	columns=['rollno', 'name', 'physics', 'botony'])

print('DataFrame with default index\n', df)

#set multiple columns as index
df = df.set_index(['rollno','name'])

print('\nDataFrame with MultiIndex\n',df)

Output

D:\>python example1.py
DataFrame with default index
    rollno  name  physics  botony
0      21  Amol       72      67
1      23  Lini       78      69
2      32  Kiku       74      56
3      52  Ajit       54      76

DataFrame with MultiIndex
              physics  botony
rollno name
21     Amol       72      67
23     Lini       78      69
32     Kiku       74      56
52     Ajit       54      76
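To check what the resulting MultiIndex looks like, the sketch below inspects its levels and selects a single value using a tuple of index labels. It assumes a two-row subset of the example data, and the variable names multi and physics_mark are hypothetical.

```python
import pandas as pd

# A two-row subset of the example data, used only for illustration.
df = pd.DataFrame(
    [[21, 'Amol', 72, 67],
     [23, 'Lini', 78, 69]],
    columns=['rollno', 'name', 'physics', 'botony'])

multi = df.set_index(['rollno', 'name'])

# The index now has one level per column passed to set_index().
print(multi.index.nlevels)      # 2
print(list(multi.index.names))  # ['rollno', 'name']

# A row is identified by a tuple with one label per index level.
physics_mark = multi.loc[(21, 'Amol'), 'physics']
print(physics_mark)             # 72
```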

Summary

In this Pandas Tutorial, we learned how to set a specific column of the DataFrame as index, and how to set up a MultiIndex from multiple columns, using the set_index() method.