Python – Pandas DataFrame – mean()

Python Pandas – Mean of DataFrame

To calculate the mean of a Pandas DataFrame, use the pandas.DataFrame.mean() method. With mean(), you can calculate the mean along either axis, or the mean of the complete DataFrame.

Example 1: Mean along columns of DataFrame

In this example, we calculate the mean along the columns. The result is the average marks obtained by the students in each subject.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

# create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

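# calculate mean of each numeric column
# numeric_only=True skips the string column 'names';
# pandas 2.0 and later raise a TypeError if it is included
mean = df_marks.mean(numeric_only=True)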
print('\nMean\n------')
print(mean)

Output

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
physics      74.25
chemistry    70.50
algebra      83.75
dtype: float64

The mean() function returns a Pandas Series with one entry per numeric column. Calculating along the columns (axis=0) is the default behavior, so for this case you need not pass the axis argument. If you want to state explicitly that the mean is calculated along the columns, pass axis=0 as shown below. Here, numeric_only=True leaves the string column names out of the calculation.

df_marks.mean(axis=0, numeric_only=True)
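
As a quick check of the default described above, the following sketch compares the call without an axis argument against the explicit axis=0 and axis='index' forms. It assumes the same marks data as Example 1; the variable names default_mean, explicit_mean and labelled_mean are only illustrative.

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# no axis argument: mean of each numeric column
default_mean = df_marks.mean(numeric_only=True)

# the same calculation with the axis stated explicitly
explicit_mean = df_marks.mean(axis=0, numeric_only=True)
labelled_mean = df_marks.mean(axis='index', numeric_only=True)

# both comparisons print True
print(default_mean.equals(explicit_mean))
print(default_mean.equals(labelled_mean))
```

All three calls should return the same Series, so which form to use is a matter of readability.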

Example 2: Mean of DataFrame

In this example, we calculate the mean of all the numeric values in the DataFrame.

From the previous example, we have seen that mean() by default calculates the mean of each column and returns a Pandas Series. Applying mean() to that Series returns the mean of the complete DataFrame. This works here because every column holds the same number of values; when some values are missing, the mean of the column means can differ from the mean of all the values.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

# create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

# calculate mean of the whole DataFrame:
# column means first (string column 'names' skipped), then the mean of those
mean = df_marks.mean(numeric_only=True).mean()
print('\nMean\n------')
print(mean)

Output

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
76.16666666666667
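
The chained call is a mean of column means, so it matches the mean of all the values only when every column has the same number of non-missing entries. The sketch below illustrates this with a hypothetical scores DataFrame in which one chemistry mark is missing. np.nanmean is used here as one way to average every value directly; it is an illustration, not the method the example above relies on.

```python
import numpy as np
import pandas as pd

# hypothetical data: the last chemistry mark is missing
scores = pd.DataFrame({
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, np.nan],
})

# mean of the two column means; each column skips its own NaN values
mean_of_means = scores.mean().mean()

# mean of the seven values that are present, ignoring the NaN
overall_mean = np.nanmean(scores.to_numpy())

print(mean_of_means)
print(overall_mean)
```

The two results differ because the chemistry column contributes three values rather than four, yet its column mean still carries the same weight as the physics mean in the chained call.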

Example 3: Mean of DataFrame along Rows

In this example, we calculate the mean along the rows by passing axis=1. Here, the mean along rows gives the average marks obtained by each student, which is also the percentage if each subject is marked out of 100.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

# create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

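# calculate mean along rows
# numeric_only=True skips the string column 'names';
# pandas 2.0 and later raise a TypeError if it is included
mean = df_marks.mean(axis=1, numeric_only=True)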
print('\nMean\n------')
print(mean)

# display names and average marks
print('\nAverage marks or percentage for each student')
print(pd.concat([df_marks['names'], mean], axis=1))

Output

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
0    76.666667
1    72.666667
2    77.333333
3    78.000000
dtype: float64

Average marks or percentage for each student
  names          0
0  Somu  76.666667
1  Kiku  72.666667
2  Amol  77.333333
3  Lini  78.000000
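
The column of averages above is labelled 0 because the Series returned by mean() has no name. A minimal sketch of one way to label it, assuming the illustrative column name average, is to rename the Series before concatenating:

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# row-wise mean of the numeric columns, given a name ('average' is illustrative)
row_mean = df_marks.mean(axis=1, numeric_only=True).rename('average')

# the Series name becomes the column header in the combined DataFrame
report = pd.concat([df_marks['names'], row_mean], axis=1)
print(report)
```

The values are the same as in the output above; only the column header changes.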

Summary

In this Pandas Tutorial, we learned how to calculate the mean of a whole DataFrame, the mean along its columns, and the mean along its rows.