Pandas コラム fillna

Suraj Joshi 2023年1月30日
  1. DataFrame.fillna() メソッド
  2. DataFrame.fillna() メソッドを用いて指定した値で DataFrame 全体を埋める
  3. 指定したカラムの NaN の値を指定した値で埋める
Pandas コラム fillna

このチュートリアルでは、DataFrame.fillna() メソッドを使って、NaN の値を指定した値で埋める方法を説明します。

この記事では、以下の DataFrame を使用します。

import numpy as np
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

print(student_df)

出力:

   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma          30.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN

DataFrame.fillna() メソッド

構文

DataFrame.fillna(
    value=None, method=None, axis=None, inplace=False, limit=None, downcast=None
)

DataFrame.fillna() メソッドを用いると、DataFrameNaN の値を指定した value または method で埋めることができます。

DataFrame.fillna() メソッドを用いて指定した値で DataFrame 全体を埋める

import numpy as np
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)
filled_df = student_df.fillna(0)

print("DataFrame with NaN values")
print(student_df, "\n")

print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")

出力:

DataFrame with NaN values
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma          30.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN 

After applying fillna() to the DataFrame:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      0.0       Bob           0.0   0.0
3    504.0      Emma          30.0  16.0
4    505.0      Luna           0.0  18.0
5    506.0     Anish           0.0   0.0 

DataFrame student_df のすべての NaN 値を DataFrame.fillna() メソッドの引数として渡された 0 で置き換えます。

import numpy as np
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)
filled_df = student_df.fillna(method="ffill")

print("DataFrame with NaN values")
print(student_df, "\n")

print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")

出力:

DataFrame with NaN values
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma          30.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN 

After applying fillna() to the DataFrame:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2    502.0       Bob         400.0  18.0
3    504.0      Emma          30.0  16.0
4    505.0      Luna          30.0  18.0
5    506.0     Anish          30.0  18.0 

student_df に含まれるすべての NaN 値を NaN 値と同じ列にある NaN 値の前の値で埋めます。

指定したカラムの NaN の値を指定した値で埋める

特定の値を指定した値で埋めるには、カラム名をキーに、そのカラムの NaN 値に使用する値を値として fillna() メソッドに辞書を渡します。

import numpy as np
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 300, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)
filled_df = student_df.fillna({"Age": 17, "Income(in $)": 300})

print("DataFrame with NaN values")
print(student_df, "\n")

print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")

出力:

DataFrame with NaN values
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma         300.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN 

After applying fillna() to the DataFrame:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob         300.0  17.0
3    504.0      Emma         300.0  16.0
4    505.0      Luna         300.0  18.0
5    506.0     Anish         300.0  17.0 

このメソッドは Age カラムの NaN 値をすべて値 17 で埋め、Income(in $) カラムの NaN 値をすべて値 300 で埋めます。Roll No カラムの NaN 値はそのまま残されます。

著者: Suraj Joshi
Suraj Joshi avatar Suraj Joshi avatar

Suraj Joshi is a backend software engineer at Matrice.ai.

LinkedIn

関連記事 - Pandas NaN