Pandas DataFrame.fillna: Fill Missing Values in a DataFrame


Pandas DataFrame.fillna

The DataFrame.fillna method in pandas is used to fill missing (NA/NaN) values in a DataFrame using a specified method or value.


Syntax

The syntax for DataFrame.fillna is:

DataFrame.fillna(value=None, *, method=None, axis=None, inplace=False, limit=None, downcast=None)

Here, DataFrame refers to the pandas DataFrame on which missing values need to be filled.


Parameters

Parameter   Description
value       Scalar, dict, Series, or DataFrame – Value used to replace NaN values. A dict or Series maps column names to fill values, and a DataFrame supplies values by matching index and column labels. Columns that are not in the mapping are left unfilled.
method      String, default None – Fill method: 'ffill' (forward fill) or 'bfill' (backward fill). Deprecated since pandas 2.1; DataFrame.ffill() and DataFrame.bfill() are the replacements.
axis        {0 or 'index', 1 or 'columns'} – Axis along which missing values are filled.
inplace     Boolean, default False – If True, modifies the DataFrame in place.
limit       Integer, default None – With method set, the maximum number of consecutive NaN values to fill in each gap. With value set, the maximum number of NaN values filled along the axis.
downcast    Dictionary, default None – Maps columns to a smaller dtype to downcast to where possible. Deprecated since pandas 2.2.

Returns

A DataFrame with missing values filled, or None if inplace=True.
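
The two modes can be compared in a short sketch. The DataFrame and the variable names below are illustrative and do not come from the examples in this tutorial.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3]})

# Default (inplace=False): a new DataFrame comes back and df keeps its NaN
filled = df.fillna(0)
nan_left_in_original = int(df['A'].isna().sum())

# inplace=True: df itself is modified, so the return value is not used here
df.fillna(0, inplace=True)
nan_left_after_inplace = int(df['A'].isna().sum())

print(nan_left_in_original, nan_left_after_inplace)
```

Assigning the result of the default call to a new name is usually the safer choice, because the original data stays available for comparison.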


Examples

1. Filling Missing Values with a Scalar Value

This example demonstrates how to fill all NaN values with a specific scalar value.

Python Program

import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# Fill NaN values with 0
df_filled = df.fillna(0)
print(df_filled)

Output

     A    B
0  1.0  0.0
1  2.0  2.0
2  0.0  3.0
3  4.0  0.0

2. Filling Missing Values with a Dictionary

You can fill NaN values with different values for each column using a dictionary.

Python Program

import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# Fill NaN values with different values for each column
df_filled_dict = df.fillna({'A': 100, 'B': 200})
print(df_filled_dict)

Output

       A      B
0    1.0  200.0
1    2.0    2.0
2  100.0    3.0
3    4.0  200.0
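
Because value also accepts a Series, the per-column fill values do not have to be typed by hand. The sketch below fills each column with its own mean. Using the mean is an assumption made for illustration, not a recommendation from this tutorial, and the names column_means and df_filled_mean are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# df.mean() returns a Series indexed by column name; NaN values are skipped
column_means = df.mean()

# Each column's NaN values are replaced by that column's own mean
df_filled_mean = df.fillna(column_means)
print(df_filled_mean)
```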

3. Forward Filling (Propagating Last Valid Observation)

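Using method='ffill', the last valid value in each column propagates forward into the NaN values that follow it. A NaN at the start of a column stays unfilled because no earlier value exists. Note that the method parameter is deprecated since pandas 2.1; DataFrame.ffill() produces the same result.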

Python Program

import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# Forward fill missing values
df_ffill = df.fillna(method='ffill')
print(df_ffill)

Output

     A    B
0  1.0  NaN
1  2.0  2.0
2  2.0  3.0
3  4.0  3.0
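
The same forward fill can be written with the dedicated DataFrame.ffill method, which avoids the deprecated method parameter. This is a minimal sketch on the same data; the name df_ffill_direct is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# Equivalent to a forward fill: each NaN takes the last valid value above it
df_ffill_direct = df.ffill()
print(df_ffill_direct)
```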

4. Backward Filling (Using Next Valid Observation)

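Using method='bfill', each NaN is filled with the next valid value below it in the same column. A NaN at the end of a column stays unfilled because no later value exists. As with forward filling, the method parameter is deprecated since pandas 2.1; DataFrame.bfill() produces the same result.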

Python Program

import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# Backward fill missing values
df_bfill = df.fillna(method='bfill')
print(df_bfill)

Output

     A    B
0  1.0  2.0
1  2.0  2.0
2  4.0  3.0
3  4.0  NaN
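
Directional fills normally run down each column, but the axis parameter can point them across columns instead. The sketch below uses DataFrame.bfill with axis=1, so a NaN in column A is filled from column B in the same row. The name df_bfill_cols is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# axis=1 fills along each row: a NaN takes the next valid value to its right
df_bfill_cols = df.bfill(axis=1)
print(df_bfill_cols)
```

Column B is the last column, so its NaN values have nothing to their right and remain unfilled.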

5. Limiting the Number of Values Filled

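You can set a limit on the number of consecutive NaN values to fill. In this DataFrame every gap is a single NaN, so limit=1 gives the same result as an unrestricted forward fill.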

Python Program

import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, None]
})

# Forward fill with a limit of 1
df_limited = df.fillna(method='ffill', limit=1)
print(df_limited)

Output

     A    B
0  1.0  NaN
1  2.0  2.0
2  2.0  3.0
3  4.0  3.0
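
The effect of limit is easier to see when a gap is longer than the limit. The sketch below uses a hypothetical single-column DataFrame, df_gap, with three consecutive NaN values, and shows limit with both a directional fill and a fill value.

```python
import pandas as pd

# Illustrative data: a gap of three consecutive NaN values
df_gap = pd.DataFrame({'A': [1, None, None, None, 5]})

# Forward fill at most one NaN per gap; the other two stay NaN
df_ffill_limited = df_gap.ffill(limit=1)

# With a fill value, limit caps how many NaN values are filled in the column
df_value_limited = df_gap.fillna(0, limit=2)

print(df_ffill_limited)
print(df_value_limited)
```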

Summary

In this tutorial, we explored the DataFrame.fillna method in pandas. Key takeaways include:

  • fillna replaces NaN values with a scalar, a dictionary, a Series, or another DataFrame.
  • Forward filling (ffill) propagates the last valid value.
  • Backward filling (bfill) uses the next valid value to fill gaps.
  • The limit parameter caps the number of NaN values filled.
  • inplace=True modifies the original DataFrame instead of returning a new one.
