Alternative to the TimeGrouper Function in Pandas
Grouping is one of the most common operations when working with data. To understand a dataset, we often need to group its rows to see relationships or summarize specific values.
Pandas offers several ways to group DataFrames based on different requirements. One of them was TimeGrouper, which grouped data based on time objects, but it has long been deprecated.
This article discusses the alternative to TimeGrouper in Pandas and how to use it.
The TimeGrouper Function Is Deprecated
TimeGrouper, which was used together with groupby(), has been deprecated since Pandas version 0.21.0 in favor of pd.Grouper(), which groups data according to a groupby instruction for a target object, including time objects.
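As a minimal sketch of the migration, assuming a small made-up DataFrame with a daily DatetimeIndex (the names df, value, and monthly are illustrative and not part of the original example), the old call appears only as a comment and the pd.Grouper() replacement is what actually runs. A pd.offsets.MonthEnd() object is passed as the frequency so that the sketch does not depend on which string alias a given pandas version accepts.

```python
import pandas as pd

# Hypothetical sample data: 40 daily values starting on 2022-01-01.
df = pd.DataFrame(
    {"value": range(40)},
    index=pd.date_range("2022-01-01", periods=40),
)

# Old style (deprecated since pandas 0.21.0, shown for comparison only):
#   df.groupby(pd.TimeGrouper(freq="M")).sum()

# Replacement: pd.Grouper accepts a freq argument in the same way.
monthly = df.groupby(pd.Grouper(freq=pd.offsets.MonthEnd())).sum()
print(monthly)
```

The result has one row per month end, covering the 31 January days and the 9 February days in the sample.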
Use the Grouper() Function
As stated, Grouper() lets us specify a groupby() instruction for an object. Its key parameter selects the column on which the grouping occurs, and its freq parameter sets the time frequency when that column (or the index) holds DateTime values.
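The key parameter matters when the dates sit in an ordinary column rather than in the index. The following is a small sketch under that assumption; the column names date and value and the sample rows are invented for illustration.

```python
import pandas as pd

# Hypothetical data: the dates live in a regular column, not the index.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2022-01-05", "2022-01-20", "2022-02-03", "2022-02-17", "2022-03-01"]
        ),
        "value": [1, 2, 3, 4, 5],
    }
)

# key names the DateTime column to group on; freq sets the bucket size
# (here one bucket per month end).
by_month = df.groupby(pd.Grouper(key="date", freq=pd.offsets.MonthEnd()))["value"].sum()
print(by_month)
```

Without key, Grouper() falls back to the index, which is why the article's main example, built on a DatetimeIndex, does not need it.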
For example, pd.Grouper() belongs inside groupby() when we need to group on non-DateTime columns in addition to DateTime columns. If we only need to group on a frequency, we can use resample() instead.
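To make the comparison concrete, here is a short sketch, assuming a made-up DataFrame with a daily DatetimeIndex (all names are illustrative), that computes monthly means once with resample() and once with groupby() plus pd.Grouper() and then checks whether the two results match.

```python
import pandas as pd

# Hypothetical sample data: 40 daily values starting on 2022-01-01.
df = pd.DataFrame(
    {"value": range(40)},
    index=pd.date_range("2022-01-01", periods=40),
)

# An offset object is used instead of a string alias so the sketch
# does not depend on the alias spelling of a particular pandas version.
month_end = pd.offsets.MonthEnd()

# Frequency-only grouping, written two ways.
via_resample = df.resample(month_end).mean()
via_grouper = df.groupby(pd.Grouper(freq=month_end)).mean()

same = via_resample.equals(via_grouper)
print(same)
```

When a second, non-DateTime key is needed, only the groupby() form extends naturally, because the extra column can be added to the list of grouping keys.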
Let’s illustrate how Grouper() works by grouping on a non-DateTime column together with the month-end frequency, which is selected by passing M to the freq argument. Note that pandas 2.2 and later prefer the alias ME and warn when M is used.
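First, let’s create the data we will be grouping, using the numpy library to generate random values. Because the values are random, your output will differ from the numbers shown below.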
Code:
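import pandas as pd
import numpy as np

# 50 daily rows starting on 2022-01-01:
# column a holds a random label (x or y), column b a random float in [0, 1).
df = pd.DataFrame(
    {
        "a": np.random.choice(["x", "y"], size=50),
        "b": np.random.rand(50),
    },
    index=pd.date_range("2022", periods=50),
)
print(df.head())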
Output:
            a         b
2022-01-01  x  0.365385
2022-01-02  y  0.484075
2022-01-03  y  0.863266
2022-01-04  x  0.319142
2022-01-05  x  0.386386
Now that we have the data, let’s use Grouper() to group it by the month-end frequency and calculate the average of each group.
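# numeric_only=True skips the text column a, which pandas 2.0 and later
# no longer drop automatically when averaging.
newDf = df.groupby(pd.Grouper(freq="M")).mean(numeric_only=True)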
print(newDf)
Output:
                   b
2022-01-31  0.582896
2022-02-28  0.451495
So, we have successfully grouped the data based on the month-end frequency. We can also group the data by both the a column and the month-end frequency, again calculating the average of each group.
otherDf = df.groupby([pd.Grouper(freq="M"), "a"]).mean()
print(otherDf)
Output:
                     b
           a
2022-01-31 x  0.401720
           y  0.473320
2022-02-28 x  0.760869
           y  0.312064
Olorunfemi is a lover of technology and computers. In addition, I write technology and coding content for developers and hobbyists. When not working, I learn to design, among other things.