Skip to content

EA.fillna(method) deprecation and EA.pad_or_backfill issues #54831

Closed
@jorisvandenbossche

Description

@jorisvandenbossche

I have been looking at the EA.fillna method keyword deprecation (original PR at #54001) from the point of view of a downstream EA implementation (from looking into geopandas CI's warnings for the release candidate)

Two issues I am noting:

1) When an external EA currently has a fillna method accepting the method keyword, the user sees three deprecation warnings (the third might be specific to the geopandas example, because of calling super().fillna(..) in GeometryArray.fillna()):

In [2]: s
Out[2]: 
0    POLYGON ((0.00000 0.00000, 1.00000 1.00000, 0....
1                                                 None
2    POLYGON ((0.00000 0.00000, -1.00000 1.00000, 0...
dtype: geometry

In [3]: s.fillna(method="ffill")
/home/joris/scipy/repos/geopandas/geopandas/geoseries.py:873: FutureWarning: GeoSeries.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.
  return super().fillna(
/home/joris/scipy/repos/geopandas/geopandas/geoseries.py:873: FutureWarning: ExtensionArray.fillna 'method' keyword is deprecated. In a future version. arr.pad_or_backfill will be called instead. 3rd-party ExtensionArray authors need to implement pad_or_backfill.
  return super().fillna(
/home/joris/scipy/repos/geopandas/geopandas/array.py:1100: FutureWarning: The 'method' keyword in GeometryArray.fillna is deprecated and will be removed in a future version. Use pad_or_backfill instead.
  return super().fillna(method=method, limit=limit)
Out[3]: 
0    POLYGON ((0.00000 0.00000, 1.00000 1.00000, 0....
1    POLYGON ((0.00000 0.00000, 1.00000 1.00000, 0....
2    POLYGON ((0.00000 0.00000, -1.00000 1.00000, 0...
dtype: geometry

This might be as simple as changing the second warning to a DeprecationWarning for now (the third one might not be easy to avoid, as mentioned above this is called in the geopandas implementation through super().fillna(), and we still want to deprecate direct usage of the EA.fillna method as well)

2) This added a new method to the ExtensionArray, which is pad_or_backfill. Some issues with this name:

  • Methods on the ExtensionArray are user-visible and public methods, so I don't think we should name it "pad_or_backfill". If we want to use this as the mechanism for EAs to override this behaviour, we should call it _pad_or_backfill (like the other official EA methods starting with an underscore), or actually _ffill_or_bfill to use the non-deprecated names.
  • Direct calls of EA.fillna(method=..) are deprecated and currently point to pad_or_backfill. If we don't make this public for end users (point above), then we don't have anything to point to. If we still want to keep this user-accessible at the EA level, we could also simply add ffill and bfill methods to the EA? (to make it consistent with the methods on the Series level)

The first option is very easy to do (simple search/replace on the name), but maybe adding both ffill and bfill as public methods (our own EAs could still dispatch to a shared implementation if that is the easiest) might be the cleaner option long term? And @jbrockmendel based on your comment in the original issue about adding those methods (#53621), I suppose you would be fine with that? (not sure if there was a specific reason to change your mind for the PR to use a combined method?)

Given that those methods need to be implemented by downstream EAs, ideally we get a final naming in for 2.1.0 (or otherwise a quick 2.1.1). But I could look into it tomorrow if we agree on the best path forward.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ExtensionArrayExtending pandas with custom dtypes or arrays.

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions