How to Calculate Mean of Pandas DataFrame? Python Examples


Python Pandas - Mean of DataFrame

To calculate the mean of a Pandas DataFrame, use the pandas.DataFrame.mean() method. With mean(), you can calculate the mean along either axis, or over the complete DataFrame.

In this tutorial, you'll learn how to find the mean of a DataFrame along columns, along rows, or over the complete DataFrame using the DataFrame.mean() method, with examples.


Examples

1. Find Mean along columns of DataFrame

In this example, we calculate the mean along the columns. The result gives the average marks obtained by the students in each subject.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
	'physics': [68, 74, 77, 78],
	'chemistry': [84, 56, 73, 69],
	'algebra': [78, 88, 82, 87]}

# Create DataFrame 
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

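# Calculate mean of each numeric column; numeric_only=True skips the
# 'names' column (pandas 2.0+ raises TypeError on it otherwise)
mean = df_marks.mean(numeric_only=True)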
print('\nMean\n------')
print(mean)

Output

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
physics      74.25
chemistry    70.50
algebra      83.75
dtype: float64

The mean() method returns a Pandas Series holding one mean per numeric column. Computing along columns (axis=0) is the default, so you need not pass the axis argument here. In pandas 2.0 and later, numeric_only defaults to False, so a DataFrame that contains a string column such as names needs numeric_only=True for that column to be skipped. To state the axis explicitly, pass axis=0 as shown below.

df_marks.mean(axis=0, numeric_only=True)
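As a minimal sketch of two ways to leave the names column out of the calculation, the snippet below compares numeric_only=True with selecting the subject columns by hand. The list name subjects is introduced here only for illustration; the data is the same as in the example above.

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# Option 1: let mean() skip the non-numeric columns
means_a = df_marks.mean(axis=0, numeric_only=True)

# Option 2: pick the numeric columns explicitly, then take the mean
subjects = ['physics', 'chemistry', 'algebra']  # illustrative helper list
means_b = df_marks[subjects].mean()

# Both approaches give the same Series of column means
print(means_a.equals(means_b))
```

Selecting columns explicitly can be the safer choice when the DataFrame also holds numeric columns, such as an ID, that should not be averaged.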

2. Find Mean of complete DataFrame

In this example, we calculate a single mean over all the numeric values in the DataFrame.

From the previous example, we have seen that mean() by default calculates the mean of each column and returns a Pandas Series. Calling mean() on that Series gives the mean of the column means. This equals the mean of all the values when every column has the same number of non-missing values, as is the case here.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
	'physics': [68, 74, 77, 78],
	'chemistry': [84, 56, 73, 69],
	'algebra': [78, 88, 82, 87]}

# Create DataFrame 
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

# Calculate mean of the whole DataFrame (numeric columns only):
# first the column means, then the mean of those means
mean = df_marks.mean(numeric_only=True).mean()
print('\nMean\n------')
print(mean)

Output

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
76.16666666666667
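The mean of the column means matches the mean of all values only when the columns contain equally many non-missing values. The sketch below uses a small made-up DataFrame (not the marks data above) with one missing value to show the two results diverging. Dividing the total sum by the total count is one possible way to get the mean over all values.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing chemistry mark
df = pd.DataFrame({
    'physics': [60.0, 80.0],
    'chemistry': [90.0, np.nan],
})

# Mean of the column means: (70 + 90) / 2
mean_of_means = df.mean().mean()

# Mean over every non-missing value: (60 + 80 + 90) / 3
mean_of_values = df.sum().sum() / df.count().sum()

print(mean_of_means)
print(mean_of_values)
```

Here mean() skips the missing value within each column, so the chemistry column contributes a mean based on a single mark, which weighs that mark more heavily in the first result.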

3. Find Mean of DataFrame along Rows

In this example, we calculate the mean across the numeric columns of each row by passing axis=1. Here, the mean along rows gives the average marks obtained by each student, which is also the percentage if each subject is marked out of 100.

Python Program

import pandas as pd

mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
	'physics': [68, 74, 77, 78],
	'chemistry': [84, 56, 73, 69],
	'algebra': [78, 88, 82, 87]}

# Create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

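# Calculate mean along rows; numeric_only=True skips the 'names'
# column (pandas 2.0+ raises TypeError on it otherwise)
mean = df_marks.mean(axis=1, numeric_only=True)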
print('\nMean\n------')
print(mean)

# Display names and average marks
print('\nAverage marks or percentage for each student')
print(pd.concat([df_marks['names'], mean], axis=1))

Output

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
0    76.666667
1    72.666667
2    77.333333
3    78.000000
dtype: float64

Average marks or percentage for each student
  names          0
0  Somu  76.666667
1  Kiku  72.666667
2  Amol  77.333333
3  Lini  78.000000
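In the output above, the column of row means is labelled 0 because the Series returned by mean() has no name. A minimal sketch of one way to give it a descriptive label is shown below, using DataFrame.assign(). The column name average is chosen here only for illustration.

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# Row means over the numeric columns only
row_means = df_marks.mean(axis=1, numeric_only=True)

# Keep the names column and attach the row means under a
# descriptive label ('average' is an illustrative name)
result = df_marks[['names']].assign(average=row_means)

print(result)
```

Because assign() aligns on the index, each student's name stays paired with that student's mean.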

Summary

In this Pandas Tutorial, we have learned how to calculate the mean of a DataFrame along columns, the mean of the complete DataFrame, and the mean of a DataFrame along rows.