Alternatively, if you want to use NumPy for speed (I measured roughly a 10x speedup), assign df.values (or df.to_numpy()) to a variable first, then fill it and assign it back:
def __call__(self, df):
    if self.fields_group is None:
        df.fillna(self.fill_value, inplace=True)
    else:
        cols = get_group_columns(df, self.fields_group)
        # this implementation is extremely slow
        # df.fillna({col: self.fill_value for col in cols}, inplace=True)
        #! similar to qlib.data.dataset.processor.Fillna, we use numpy to accelerate
        #! but instead, we assign the numpy array to a variable first
        df_values = df[cols].to_numpy()
        nan_select = np.isnan(df_values)
        #! then fill value and assign back
        df_values[nan_select] = self.fill_value
        df.loc[:, cols] = df_values
    return df
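For anyone who wants to try the idea outside qlib, here is a minimal standalone sketch of the same fill-and-assign-back pattern. The helper name `fill_columns_with_numpy`, the flat column names, and the toy data are assumptions made for illustration and are not part of qlib. It also passes `copy=True` to `to_numpy()`, which the snippet above does not, so the array is writable whether or not pandas would otherwise return a view.

```python
import numpy as np
import pandas as pd


def fill_columns_with_numpy(df, cols, fill_value=0.0):
    """Hypothetical helper: fill NaN only in `cols`, using a NumPy array."""
    # Take an explicit, writable copy of the selected columns.
    values = df[cols].to_numpy(copy=True)
    # Fill the NaN entries in the local array.
    values[np.isnan(values)] = fill_value
    # Write the filled array back into the original DataFrame.
    df.loc[:, cols] = values
    return df


# Toy data (assumed): two feature columns to fill, one label column to leave alone.
demo = pd.DataFrame(
    {"f1": [1.0, np.nan], "f2": [np.nan, 4.0], "label": [np.nan, 0.5]}
)
fill_columns_with_numpy(demo, ["f1", "f2"], fill_value=0.0)
print(demo)
```

The write-back through `df.loc` is what makes the change visible in the DataFrame. Columns outside `cols` (here `label`) keep their NaN values.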
🐛 Bug Description
The Fillna processor has no effect when fields_group is not None, because values assigned to df.values are not written back to the DataFrame.
To Reproduce
Use any model and specify fields_group for the Fillna processor.
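The effect can also be seen without qlib. This is a minimal sketch rather than the actual processor code. It assumes a toy DataFrame with mixed dtypes, because in that case pandas has to build a new array for `df.values`, so the write lands in a temporary copy.

```python
import numpy as np
import pandas as pd

# Toy data (assumed): a float column with a NaN plus an int column.
# Mixed dtypes mean df.values is a freshly built float64 array, not a view.
df = pd.DataFrame({"feat": [1.0, np.nan, 3.0], "count": [1, 2, 3]})

nan_select = np.isnan(df.values)
# This writes into a temporary array that is discarded right away.
df.values[nan_select] = 0.0

# The DataFrame itself still contains the NaN.
remaining_nans = int(df["feat"].isna().sum())
print(remaining_nans)
```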
Expected Behavior
No NaN values remain after calling Fillna.
Additional Notes
Same as the issue here: #1307 (comment).